The embodiments discussed in the present disclosure are related to provision of semantic feedback on deep neural network (DNN) prediction for decision making.
Recent advancements in the field of neural networks have led to the development of various techniques for classification of data that may be associated with various real-time applications. For example, a trained Deep Neural Network (DNN) may be utilized in different applications for various classification tasks, such as classification or detection of different datapoints (e.g., images).
The subject matter claimed in the present disclosure is not limited to embodiments that solve any disadvantages or that operate only in environments such as those described above. Rather, this background is only provided to illustrate one example technology area where some embodiments described in the present disclosure may be practiced.
According to an aspect of an embodiment, operations by a device may include receiving a first dataset associated with a real-time application. The operations may further include predicting, by a Deep Neural Network (DNN) pre-trained for a classification task of the real-time application, a first class associated with a first datapoint of the received first dataset. The operations may further include determining a first set of feature scores for the first datapoint based on the predicted first class associated with the first datapoint. The operations may further include identifying a set of confusing class pairs associated with the DNN based on the predicted first class and a predetermined class associated with the first datapoint. The operations may further include clustering the received first dataset into one of a set of semantic classes based on the determined first set of feature scores, the predicted first class, and the identified set of confusing class pairs for each datapoint in the received first dataset. Each of the set of semantic classes may be indicative of a prediction accuracy associated with a set of datapoints clustered in a corresponding semantic class of the set of semantic classes. The operations may further include training a classifier based on the clustered first dataset, the determined first set of feature scores, and the set of semantic classes.
The objects and advantages of the embodiments will be realized and achieved at least by the elements, features, and combinations particularly pointed out in the claims.
Both the foregoing general description and the following detailed description are given as examples and are explanatory and are not restrictive of the invention, as claimed.
Example embodiments will be described and explained with additional specificity and detail through the use of the accompanying drawings in which:
all according to at least one embodiment described in the present disclosure.
Deep Neural Networks (DNNs) have achieved good classification accuracy in various classification tasks. However, in certain applications, such as autonomous driving or medical diagnosis, an incorrect detection or misclassification (in even a fraction of cases) may lead to financial or physical losses. This may raise an important concern regarding the reliability of the prediction results of the DNN and regarding the decisions or actions which may be taken based on the prediction results of the DNN. Conventional solutions in the field of explainable artificial intelligence (XAI) may provide an interpretation of the prediction output of the DNN in terms of a set of features that may be used to generate the predicted output of the DNN. Other conventional techniques may involve the use of a native confidence score associated with the DNN to quantify the reliability of the prediction output of the DNN. However, the native confidence score may not be readily interpretable to an end user (such as a non-technical person) and may not enable the user to take an appropriate decision or action based on the prediction result (i.e., the confidence score) of the DNN.
Some embodiments described in the present disclosure relate to methods and systems to effectively provide a semantic feedback on a prediction result of a Deep Neural Network (DNN) for decision making. In the present disclosure, a dataset may be clustered into a set of semantic classes based on a first set of feature scores, a class predicted by the DNN, and a set of confusing class pairs for each datapoint in the dataset. Each semantic class may be indicative of a prediction accuracy associated with a set of datapoints clustered in a corresponding semantic class of the set of semantic classes. A classifier may be trained based on the clustered dataset, the first set of feature scores, and the set of semantic classes. In an operational phase, the trained classifier may be used to classify a new datapoint into one of the set of semantic classes based on a second set of feature scores determined for the new datapoint and a class predicted by the DNN for the new datapoint. An action associated with the semantic class (i.e., the class determined for the new datapoint) may be further determined, and the determined action may be rendered for a user to aid the user in appropriate decision making based on the classified semantic class for the new datapoint.
According to one or more embodiments of the present disclosure, the technological field of classification by the DNN may be improved by configuring a computing system in a manner that the computing system is able to effectively provide a semantic feedback on a prediction result of the DNN, which may be understandable by an end user (such as a non-technical person) and may be used for appropriate decision making by the end user. The computing system may include a classifier, which may be trained to cluster a datapoint fed to the DNN into a semantic class indicative of a prediction accuracy of datapoints clustered into the semantic class, as compared to other conventional systems which may not provide user-interpretable feedback on DNN prediction results.
The system may be configured to receive a first dataset associated with a real-time application. Examples of datapoints in the first dataset may include, but are not limited to, image data, speech data, audio data, text data, or other forms of digital signals associated with a real-time application. The system may be further configured to control the DNN to predict a first class associated with a first datapoint of the received first dataset, where the DNN may be pre-trained for a classification task associated with the real-time application. For example, the classification task may be an image classification task and the first datapoint may include an input image of an animal (such as a cat). The system may be configured to control the DNN to predict the first class (for example, a label of a cat) based on the received first datapoint (for example, the input image of the cat).
The system may be further configured to determine a first set of feature scores for the first datapoint based on the predicted first class associated with the first datapoint. Examples of the first set of feature scores may include at least one of, but not limited to, a Likelihood-based Surprise Adequacy (LSA) score, a Distance-based Surprise Adequacy (DSA) score, a confidence score, a logit score, or a robustness diversity score, for the first datapoint. The LSA score may be indicative of whether the first datapoint is in-distribution with the first dataset. The DSA score may be indicative of whether the first datapoint is closer to the predicted first class than another class neighboring the predicted first class in a hyper-space associated with the DNN. The confidence score may correspond to a probability score of the DNN for the prediction of the first class associated with the first datapoint. The logit score may correspond to a score from a pre-softmax layer of the DNN for the prediction of the first class associated with the first datapoint. The robustness diversity score may be indicative of a degree of stability of a prediction by the DNN for one or more variations corresponding to the first datapoint.
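For illustration only, the computation of some such feature scores may be sketched as follows. This is an illustrative sketch and not the claimed implementation; the function names (`confidence_and_logit`, `dsa_score`) are hypothetical, the confidence and logit scores are derived from a logit vector via softmax, and the DSA score follows the common formulation of the distance to the nearest same-class activation normalized by the distance from that neighbor to the nearest other-class activation.

```python
import numpy as np

def confidence_and_logit(logits):
    """Confidence score (softmax probability) and logit score
    (pre-softmax value) for the class predicted by the DNN."""
    exp = np.exp(logits - np.max(logits))   # numerically stable softmax
    probs = exp / exp.sum()
    pred = int(np.argmax(logits))
    return probs[pred], logits[pred], pred

def dsa_score(activation, same_class_acts, other_class_acts):
    """Distance-based Surprise Adequacy: distance to the nearest
    activation of the predicted class, normalized by the distance
    from that neighbor to the nearest other-class activation.
    A lower value suggests the datapoint sits well inside its class."""
    d_same = np.linalg.norm(same_class_acts - activation, axis=1)
    nearest = same_class_acts[np.argmin(d_same)]
    d_other = np.linalg.norm(other_class_acts - nearest, axis=1).min()
    return d_same.min() / d_other

# Example: a 3-class logit vector.
logits = np.array([2.0, 0.5, -1.0])
conf, logit, pred = confidence_and_logit(logits)
```

The LSA score would additionally require a density estimate (e.g., kernel density estimation over training activations) and is omitted from this sketch.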
The system may be further configured to identify a set of confusing class pairs associated with the DNN based on the predicted first class and a predetermined class associated with the first datapoint. The system may be further configured to cluster the received first dataset into one of a set of semantic classes based on the determined first set of feature scores, the predicted first class, and the identified set of confusing class pairs for each datapoint in the received first dataset. Each of the set of semantic classes may be indicative of a prediction accuracy associated with a set of datapoints clustered in a corresponding semantic class of the set of semantic classes. Examples of the set of semantic classes may include at least one of, but not limited to, a likely-correct (or reliably correct) semantic class, a one-of-two semantic class, a likely-incorrect semantic class, or a do-not-know semantic class. The system may be further configured to train a classifier based on the clustered first dataset, the determined first set of feature scores, and the set of semantic classes. An example of the classifier may include, but is not limited to, a K-Nearest Neighbor (K-NN) classifier.
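One possible way to identify confusing class pairs and assign semantic classes may be sketched as below, assuming (hypothetically) that confusing pairs are read off a confusion matrix over the first dataset and that the semantic classes are assigned by a simple rule over the prediction, the ground truth, and the confidence score; the actual clustering rule of an embodiment may differ.

```python
import numpy as np

def confusing_pairs(pred_classes, true_classes, num_classes, thresh=0.1):
    """Build a confusion matrix over the dataset and collect class
    pairs whose mutual confusion rate reaches the threshold."""
    cm = np.zeros((num_classes, num_classes))
    for p, t in zip(pred_classes, true_classes):
        cm[t, p] += 1
    pairs = set()
    for t in range(num_classes):
        for p in range(num_classes):
            if t != p and cm[t].sum() > 0 and cm[t, p] / cm[t].sum() >= thresh:
                pairs.add(frozenset((t, p)))    # unordered pair
    return pairs

def semantic_class(pred, true, conf, pairs):
    """Assign one of four semantic classes (illustrative rule)."""
    if pred == true and conf > 0.9:
        return "likely-correct"
    if frozenset((pred, true)) in pairs:
        return "one-of-two"       # DNN often swaps these two classes
    if pred != true:
        return "likely-incorrect"
    return "do-not-know"          # correct but low-confidence
```

Each datapoint, once labeled with a semantic class, becomes a training example (feature-score vector plus semantic-class label) for the classifier.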
For example, in an operational phase, the system may be configured to receive a second datapoint (such as a new datapoint) associated with the real-time application. The system may be further configured to control the DNN to predict a second class associated with the second datapoint, where the DNN may be pre-trained for the classification task of the real-time application. The system may be further configured to determine a second set of feature scores for the second datapoint based on the predicted second class associated with the received second datapoint. The second set of feature scores may be similar to the first set of feature scores.
The system may be further configured to apply a pre-trained classifier on the received second datapoint, based on the determined second set of feature scores and the predicted second class. The classifier may be pre-trained to classify a datapoint into one of a set of semantic classes. The system may be further configured to classify the second datapoint into one of the set of semantic classes based on the application of the pre-trained classifier on the second datapoint. The system may be further configured to determine at least one action associated with the classified one of the set of semantic classes for the second datapoint. The system may be further configured to render the determined at least one action for the end user.
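The operational phase may be sketched as follows, assuming (as one example permitted by the disclosure) a K-Nearest Neighbor vote over feature-score vectors as the pre-trained classifier; the `ACTIONS` mapping is hypothetical and merely illustrates rendering an action per semantic class.

```python
import numpy as np

def knn_classify(features, train_feats, train_labels, k=3):
    """Classify a feature-score vector into a semantic class by a
    majority vote among the k nearest training vectors."""
    d = np.linalg.norm(train_feats - features, axis=1)
    nearest = np.argsort(d)[:k]
    votes = [train_labels[i] for i in nearest]
    return max(set(votes), key=votes.count)

# Hypothetical mapping from semantic class to a rendered action.
ACTIONS = {
    "likely-correct":   "accept the DNN prediction",
    "one-of-two":       "ask the user to choose between two classes",
    "likely-incorrect": "reject the prediction",
    "do-not-know":      "defer to a human expert",
}
```

In use, the second datapoint's feature scores are classified and the corresponding action is rendered for the end user.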
Typically, conventional systems may provide an interpretation of a prediction result of the DNN that may not be readily comprehensible by the end user. In other words, the interpretation by the conventional systems may not be intuitive enough for the user to base a decision or action on the interpretation. The disclosed system, on the other hand, may partition an input space of the DNN into various semantic classes that may be human-comprehensible. A semantic class may indicate a prediction accuracy of datapoints clustered in the semantic class. The clustering of a datapoint in a certain semantic class may be used to provide the user a qualitative feedback regarding a reliability of the DNN in the classification of the datapoint and may further enable the user to take an informed decision based on the qualitative feedback (i.e., in the form of the semantic class), such as whether or how to use the prediction output of the DNN.
Embodiments of the present disclosure are explained with reference to the accompanying drawings.
The electronic device 102 may be configured to control a pre-trained DNN 104 to predict a first class (for example, a label) associated with a first datapoint of the received first dataset. For example, in case the DNN 104 is pre-trained for an image classification task and the first datapoint is an image, the DNN 104 may be controlled to predict an object in the image, such as an animal (e.g., a cat), as the first class. In some embodiments, the electronic device 102 may control the DNN 104 to generate a first confidence score (i.e., a native confidence score as a probability value) which may indicate a probability of the prediction of the first class associated with the received first datapoint. Similarly, the electronic device 102 may be configured to control the pre-trained DNN 104 to predict the first class associated with each datapoint in the first dataset.
The electronic device 102 may be further configured to determine a first set of feature scores for the first datapoint based on the predicted first class associated with the first datapoint. Examples of the first set of feature scores may include at least one of, but not limited to, a Likelihood-based Surprise Adequacy (LSA) score, a Distance-based Surprise Adequacy (DSA) score, a confidence score, a logit score, or a robustness diversity score, for the first datapoint. Details of the first set of feature scores are provided, for example, in
The electronic device 102 may be further configured to identify a set of confusing class pairs associated with the DNN 104 based on the predicted first class and a predetermined class associated with the first datapoint. Similarly, the electronic device 102 may be configured to identify the set of confusing class pairs for each datapoint in the first dataset. The identification of the set of confusing class pairs for a datapoint is described further, for example, in
The electronic device 102 may be configured to cluster the received first dataset into one of a set of semantic classes based on the determined first set of feature scores, the predicted first class, and the identified set of confusing class pairs for each datapoint in the received first dataset. Each of the set of semantic classes may be indicative of a prediction accuracy associated with a set of datapoints clustered in a corresponding semantic class of the set of semantic classes. Examples of the set of semantic classes may include at least one of, but not limited to, a likely-correct semantic class, a one-of-two semantic class, a likely-incorrect semantic class, or a do-not-know semantic class. Details of the set of semantic classes are provided, for example, in
The electronic device 102 may be configured to receive a second datapoint associated with the real-time application, via the communication network 112. The second datapoint may be stored on the database 108 or the user-end device 110. The electronic device 102 may be further configured to control the pre-trained DNN 104 to predict a second class (i.e., a label) associated with the second datapoint. The electronic device 102 may be further configured to determine a second set of feature scores for the second datapoint based on the predicted second class associated with the received second datapoint.
The electronic device 102 may be further configured to apply a pre-trained classifier (e.g., the classifier 106) on the received second datapoint, based on the determined second set of feature scores and the predicted second class. The classifier 106 may be pre-trained to classify a datapoint into one of the set of semantic classes. The electronic device 102 may be further configured to classify the second datapoint into one of the set of semantic classes based on the application of the pre-trained classifier 106 on the second datapoint. The electronic device 102 may be further configured to determine at least one action associated with the classified one of the set of semantic classes for the second datapoint. The electronic device 102 may be further configured to render the determined at least one action for the user 114. The classification of a datapoint into one of the set of semantic classes based on a pre-trained classifier is described further, for example, in
Examples of the electronic device 102 may include, but are not limited to, an object detection engine, a recognition engine, a mobile device, a desktop computer, a laptop, a computer work-station, a training device, a computing device, a mainframe machine, a server, such as a cloud server, and a group of servers. In one or more embodiments, the electronic device 102 may include a user-end terminal device and a server communicatively coupled to the user-end terminal device. Examples of the user-end terminal device may include, but are not limited to, a mobile device, a desktop computer, a laptop, and a computer work-station. The electronic device 102 may be implemented using hardware including a processor, a microprocessor (e.g., to perform or control performance of one or more operations), a field-programmable gate array (FPGA), or an application-specific integrated circuit (ASIC). In some other instances, the electronic device 102 may be implemented using a combination of hardware and software.
Although in
The DNN 104 may comprise suitable logic, circuitry, interfaces, and/or code that may be configured to classify or recognize an input datapoint to generate an output result for the particular real-time application. For example, the pre-trained DNN 104 may recognize different objects in input images and may provide a unique label for each object in the input images. The unique label may correspond to different living entities (like humans, animals, or plants) or non-living entities (like, but not limited to, a vehicle, a building, a computer, or a book). In another example, the pre-trained DNN 104 related to an application of speech recognition may recognize different input audio samples to identify a source (e.g., a human speaker) of the audio sample. In an embodiment, the output unique label may correspond to a prediction result of the DNN 104 for the input datapoint. The DNN 104 may be configured to output a first confidence score (as a native confidence score) which may indicate a probability (between 0 and 1) of the output prediction result of the DNN 104. For example, for an input datapoint depicting an animal (like a cat), the trained DNN 104 may generate a higher first confidence score (for example, 0.95, which is close to 1.0) to predict the input datapoint with the unique label of the animal (for example, a cat). The DNN 104 may be implemented using hardware including a processor, a microprocessor (e.g., to perform or control performance of one or more operations), a field-programmable gate array (FPGA), or an application-specific integrated circuit (ASIC). In some other instances, the DNN 104 may be a code, a program, or a set of software instructions. The DNN 104 may be implemented using a combination of hardware and software.
In some embodiments, the DNN 104 may correspond to multiple recognition layers (not shown) for recognition of the input datapoints, where each successive layer may use an output of a previous layer as input. For example, the multiple recognition layers may include an input layer, one or more hidden layers, and an output layer. Each recognition layer may be associated with a plurality of neurons, each of which may be further associated with a plurality of weights. During training of the DNN 104, the number of recognition layers and the plurality of neurons in each layer may be determined from hyper-parameters of the DNN 104. Such hyper-parameters may be set before or while training the DNN 104 on a training dataset (for example, different images of a particular class). The DNN 104 may be trained to adjust the plurality of weights at different layers based on the input datapoints and the output result (i.e., a ground truth) of the DNN 104.
Each neuron or node of the DNN 104 may correspond to a mathematical function (e.g., a sigmoid function or a rectified linear unit) with a set of parameters, tunable to train the DNN 104 for the relationship between the first datapoint (for example, image data), as the input of the DNN 104, and the prediction result or class as the output of the DNN 104. The set of parameters may include, for example, a weight parameter, a regularization parameter, and the like. Each neuron may use the mathematical function to compute an output based on one or more inputs from neurons in other layer(s) (e.g., previous layer(s)) of the DNN 104. All or some of the neurons of the DNN 104 may correspond to the same or a different mathematical function.
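The neuron model described above may be sketched as follows (an illustrative sketch only; the sigmoid and rectified linear unit are the two example activation functions named in this disclosure).

```python
import math

def sigmoid(x):
    """Sigmoid activation: maps any real input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def relu(x):
    """Rectified linear unit: passes positive inputs, clamps negatives to 0."""
    return max(0.0, x)

def neuron(inputs, weights, bias, activation=sigmoid):
    """A single neuron: weighted sum of the inputs from the previous
    layer, plus a bias, passed through the activation function."""
    return activation(sum(w * i for w, i in zip(weights, inputs)) + bias)
```

Here the weights and bias are the tunable parameters adjusted during training.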
During training of the DNN 104, one or more parameters (like weights) of each neuron of the DNN 104 may be updated based on whether an output of the final layer for a given input (from the training dataset) matches a correct result based on a loss function for the DNN 104. This update process may be repeated for the same or a different input until a minimum of the loss function is achieved and a training error is minimized. Several methods for training the DNN are known in the art, for example, gradient descent, stochastic gradient descent, batch gradient descent, gradient boost, meta-heuristics, and the like. The DNN 104 may include code and routines configured to enable a computing device, such as the electronic device 102, to perform one or more operations for classification of one or more data inputs (for example, image data) into one or more outputs (i.e., class labels).
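The iterative update described above may be sketched, for gradient descent on a toy loss, as follows (illustrative only; a real DNN would compute the gradients via backpropagation over all layers).

```python
def gradient_descent_step(weights, grads, lr=0.1):
    """One gradient-descent update: move each weight a small step
    against its loss gradient."""
    return [w - lr * g for w, g in zip(weights, grads)]

# Toy example: minimize the loss (w - 3)^2, whose gradient is 2*(w - 3).
w = 0.0
for _ in range(100):
    grad = 2.0 * (w - 3.0)
    w -= 0.1 * grad
# Repeating the update drives w toward 3.0, the minimum of the loss.
```

The update is repeated until the loss stops decreasing, i.e., the training error is minimized.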
Examples of the DNN 104 may include, but are not limited to, a recurrent neural network (RNN), an artificial neural network (ANN), a convolutional neural network (CNN), a CNN-recurrent neural network (CNN-RNN), R-CNN, Fast R-CNN, Faster R-CNN, a Long Short Term Memory (LSTM) network based RNN, CNN+ANN, LSTM+ANN, a gated recurrent unit (GRU)-based RNN, a fully connected neural network, a Connectionist Temporal Classification (CTC) based RNN, a deep Bayesian neural network, a Generative Adversarial Network (GAN), and/or a combination of such networks.
The classifier 106 may comprise suitable logic, circuitry, interfaces, and/or code that may be configured to classify an input datapoint into one of the set of semantic classes. Each of the set of semantic classes may be indicative of a prediction accuracy associated with a set of datapoints clustered in a corresponding semantic class of the set of semantic classes. Examples of the set of semantic classes may include at least one of, but not limited to, a likely-correct semantic class, a one-of-two semantic class, a likely-incorrect semantic class, or a do-not-know semantic class. The classifier 106 may be pre-trained, by the electronic device 102, based on the clustered first dataset, the determined first set of feature scores, and the set of semantic classes for each datapoint in the first dataset related to the real-time application. The training of the classifier 106 is described, for example, in
The database 108 may comprise suitable circuitry, logic, interfaces, and/or code that may be configured to store a dataset (e.g., the first dataset) including a plurality of datapoints related to the real-time application. The electronic device 102 may receive the first dataset from the database 108. Further, the plurality of datapoints in the first dataset may be a set of training datapoints (or a training dataset) that may be used to train the DNN 104 and/or the classifier 106. The plurality of datapoints may further include a set of test datapoints (or a test dataset) which may be used to test the DNN 104 or the classifier 106. The database 108 may be a relational or a non-relational database that may include the training dataset or the test dataset. Also, in some cases, the database 108 may be stored on a server, such as a cloud server, or may be cached and stored on the electronic device 102. The server of the database 108 may be configured to receive a request to provide a dataset (e.g., the first dataset) or a new datapoint (e.g., the second datapoint) from the electronic device 102, via the communication network 112. In response, the server of the database 108 may be configured to retrieve and provide the requested dataset or a particular datapoint to the electronic device 102 based on the received request, via the communication network 112. In some embodiments, the database 108 may be configured to store the classifier 106. In some embodiments, the database 108 may be configured to store the pre-trained DNN 104 for the particular real-time application. Additionally, or alternatively, the database 108 may be implemented using hardware including a processor, a microprocessor (e.g., to perform or control performance of one or more operations), a field-programmable gate array (FPGA), or an application-specific integrated circuit (ASIC). In some other instances, the database 108 may be implemented using a combination of hardware and software.
The user-end device 110 may comprise suitable logic, circuitry, interfaces, and/or code that may be configured to store the real-time application where the specific classification task (i.e., the task for which the DNN 104 and the classifier 106 are trained) may be performed. In some embodiments, the user-end device 110 may deploy the pre-trained DNN 104 and the classifier 106 to provide a semantic feedback on the prediction results of the deployed pre-trained DNN 104. The user-end device 110 may utilize the deployed DNN 104 to perform the classification or detection task of the real-time application, train the classifier 106, and utilize the deployed classifier 106 for the determination of the semantic feedback (i.e., the set of semantic classes) on the prediction or classification result generated by the deployed DNN 104. For example, the user-end device 110 may be an electronic device which may receive an input image from an in-built camera or a server and may perform the image classification or recognition on the input image based on the trained DNN 104 deployed on the user-end device 110. The user-end device 110 may train the classifier 106 and further use the deployed trained classifier 106 to determine the semantic feedback on a reliability of the classification of the image (i.e., the predicted image class) performed by the DNN 104 (i.e., deployed on the user-end device 110). In another example, the user-end device 110 may be an autonomous vehicle which may receive real-time images from its surroundings and detect different objects captured in the images through the in-built trained DNN 104. In such a scenario, the user-end device 110 may use the pre-trained classifier 106 to determine the semantic feedback (i.e., the set of semantic classes) on the prediction output of the DNN 104, and indicate or warn about a potential mis-judgement or incorrect detection, performed by the DNN 104 of the autonomous vehicle, to a user associated with the autonomous vehicle.
In some embodiments, the user-end device 110 may take appropriate actions (for example, apply brakes or control steering of the autonomous vehicle) based on the incorrect detection or mis-judgement performed by the DNN 104, deployed in the autonomous vehicle.
In another example, the user-end device 110 may be an audio security system which may perform user authentication based on speech recognition performed by the DNN 104 trained on different speech data samples. Similarly, the user-end device 110 may validate the authentication of the user performed by the DNN 104, by use of the trained classifier 106 to validate the accuracy of the authentication, using the set of semantic classes. It should be noted here that the aforementioned examples are not to be construed as limiting for the disclosure, and the DNN 104 may be used in many possible applications which have not been mentioned for the sake of brevity. Examples of the user-end device 110 may include, but are not limited to, a mobile device, a desktop computer, a laptop, a computer work-station, a computing device, a mainframe machine, a server, such as a cloud server, and a group of servers.
The communication network 112 may include a communication medium through which the electronic device 102 may communicate with the server which may store the database 108 and the user-end device 110. Examples of the communication network 112 may include, but are not limited to, the Internet, a cloud network, a Wireless Fidelity (Wi-Fi) network, a Personal Area Network (PAN), a Local Area Network (LAN), and/or a Metropolitan Area Network (MAN). Various devices in the environment 100 may be configured to connect to the communication network 112, in accordance with various wired and wireless communication protocols. Examples of such wired and wireless communication protocols may include, but are not limited to, at least one of a Transmission Control Protocol and Internet Protocol (TCP/IP), User Datagram Protocol (UDP), Hypertext Transfer Protocol (HTTP), File Transfer Protocol (FTP), ZigBee, EDGE, IEEE 802.11, light fidelity (Li-Fi), 802.16, IEEE 802.11s, IEEE 802.11g, multi-hop communication, wireless access point (AP), device-to-device communication, cellular communication protocols, and/or Bluetooth (BT) communication protocols, or a combination thereof.
Modifications, additions, or omissions may be made to
The processor 204 may comprise suitable logic, circuitry, and/or interfaces that may be configured to execute program instructions associated with different operations to be executed by the electronic device 102. For example, some of the operations may include reception of the first dataset, control of the DNN 104 for the prediction of the first class associated with the first datapoint in the received first dataset, determination of the first set of feature scores for the first datapoint, identification of the set of confusing class pairs, clustering of the received first dataset, and the training of the classifier 106. The one or more operations may further include reception of the second datapoint, control of the DNN 104 for the prediction of the second class associated with the second datapoint, determination of the second set of feature scores for the second datapoint, application of the trained classifier 106 on the second datapoint, and classification of the second datapoint into one of the set of semantic classes. The processor 204 may include any suitable special-purpose or general-purpose computer, computing entity, or processing device including various computer hardware or software modules and may be configured to execute instructions stored on any applicable computer-readable storage media. For example, the processor 204 may include a microprocessor, a microcontroller, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a Field-Programmable Gate Array (FPGA), or any other digital or analog circuitry configured to interpret and/or to execute program instructions and/or to process data. Although illustrated as a single processor in
In some embodiments, the processor 204 may be configured to interpret and/or execute program instructions and/or process data stored in the memory 206 and/or the persistent data storage 208. In some embodiments, the processor 204 may fetch program instructions from the persistent data storage 208 and load the program instructions in the memory 206. After the program instructions are loaded into the memory 206, the processor 204 may execute the program instructions. Some of the examples of the processor 204 may be a GPU, a CPU, a RISC processor, an ASIC processor, a CISC processor, a co-processor, and/or a combination thereof.
The memory 206 may comprise suitable logic, circuitry, interfaces, and/or code that may be configured to store program instructions executable by the processor 204. In certain embodiments, the memory 206 may be configured to store operating systems and associated application-specific information. The memory 206 may include computer-readable storage media for carrying or having computer-executable instructions or data structures stored thereon. Such computer-readable storage media may include any available media that may be accessed by a general-purpose or special-purpose computer, such as the processor 204. By way of example, and not limitation, such computer-readable storage media may include tangible or non-transitory computer-readable storage media including Random Access Memory (RAM), Read-Only Memory (ROM), Electrically Erasable Programmable Read-Only Memory (EEPROM), Compact Disc Read-Only Memory (CD-ROM) or other optical disk storage, magnetic disk storage or other magnetic storage devices, flash memory devices (e.g., solid state memory devices), or any other storage medium which may be used to carry or store particular program code in the form of computer-executable instructions or data structures, and which may be accessed by a general-purpose or special-purpose computer. Combinations of the above may also be included within the scope of computer-readable storage media. Computer-executable instructions may include, for example, instructions and data configured to cause the processor 204 to perform a certain operation or group of operations associated with the electronic device 102.
The persistent data storage 208 may comprise suitable logic, circuitry, and/or interfaces that may be configured to store program instructions executable by the processor 204, operating systems, and/or application-specific information, such as logs and application-specific databases. The persistent data storage 208 may include computer-readable storage media for carrying or having computer-executable instructions or data structures stored thereon. Such computer-readable storage media may include any available media that may be accessed by a general-purpose or special-purpose computer, such as the processor 204.
By way of example, and not limitation, such computer-readable storage media may include tangible or non-transitory computer-readable storage media including Compact Disc Read-Only Memory (CD-ROM) or other optical disk storage, magnetic disk storage or other magnetic storage devices (e.g., Hard-Disk Drive (HDD)), flash memory devices (e.g., Solid State Drive (SSD), Secure Digital (SD) card, other solid state memory devices), or any other storage medium which may be used to carry or store particular program code in the form of computer-executable instructions or data structures and which may be accessed by a general-purpose or special-purpose computer. Combinations of the above may also be included within the scope of computer-readable storage media. Computer-executable instructions may include, for example, instructions and data configured to cause the processor 204 to perform a certain operation or group of operations associated with the electronic device 102.
In some embodiments, either of the memory 206 or the persistent data storage 208, or a combination thereof, may store the classifier 106 and the DNN 104 as software instructions. The processor 204 may fetch the software instructions related to the classifier 106 and the DNN 104 to perform different operations of the disclosed electronic device 102. In some embodiments, either of the memory 206 or the persistent data storage 208, or a combination thereof, may store the first dataset, the second datapoint, the training/test dataset, the first set of feature scores, the predicted first class, and/or the set of datapoints clustered in the set of semantic classes.
The I/O device 210 may include suitable logic, circuitry, interfaces, and/or code that may be configured to receive a user input. The I/O device 210 may be further configured to provide an output in response to the user input. For example, the I/O device 210 may receive a command (e.g., through a user-interface), a voice instruction, or a handwritten text as a user input from a user, where the received user input may be used to initiate the training of the DNN 104, training of the classifier 106, or to provide the semantic feedback (i.e. one of the set of semantic classes) on the prediction result of the trained DNN 104. The I/O device 210 may include various input and output devices, which may be configured to communicate with the processor 204 and other components, such as the network interface 212. The I/O device 210 may include an input device or an output device. Examples of the input device may include, but are not limited to, a touch screen (e.g., the display screen 210A), a keyboard, a mouse, a joystick, and/or a microphone. Examples of the output device may include, but are not limited to, a display (e.g., the display screen 210A) and a speaker.
The display screen 210A may comprise suitable logic, circuitry, interfaces, and/or code that may be configured to display the semantic class in which a datapoint may be classified by the classifier 106. Further, the display screen 210A may be configured to render an action associated with the classified semantic class. The display screen 210A may be configured to receive the user input from the user 114. In such cases, the display screen 210A may be a touch screen to receive the user input. The display screen 210A may be realized through several known technologies such as, but not limited to, a Liquid Crystal Display (LCD), a Light Emitting Diode (LED) display, a plasma display, an Organic LED (OLED) display, and/or other display technologies.
The network interface 212 may comprise suitable logic, circuitry, interfaces, and/or code that may be configured to establish a communication between the electronic device 102, the database 108, and the user-end device 110, via the communication network 112. The network interface 212 may be implemented by use of various known technologies to support wired or wireless communication of the electronic device 102 via the communication network 112. The network interface 212 may include, but is not limited to, an antenna, a radio frequency (RF) transceiver, one or more amplifiers, a tuner, one or more oscillators, a digital signal processor, a coder-decoder (CODEC) chipset, a subscriber identity module (SIM) card, and/or a local buffer.
The network interface 212 may communicate via wireless communication with networks, such as the Internet, an Intranet and/or a wireless network, such as a cellular telephone network, a wireless local area network (LAN) and/or a metropolitan area network (MAN). The wireless communication may use any of a plurality of communication standards, protocols and technologies, such as Global System for Mobile Communications (GSM), Enhanced Data GSM Environment (EDGE), wideband code division multiple access (W-CDMA), Long Term Evolution (LTE), code division multiple access (CDMA), time division multiple access (TDMA), Bluetooth, Wireless Fidelity (Wi-Fi) (such as IEEE 802.11a, IEEE 802.11b, IEEE 802.11g and/or IEEE 802.11n), voice over Internet Protocol (VoIP), light fidelity (Li-Fi), or Wi-MAX.
Modifications, additions, or omissions may be made to the example system 202 without departing from the scope of the present disclosure. For example, in some embodiments, the example system 202 may include any number of other components that may not be explicitly illustrated or described for the sake of brevity. In an example, the DNN 104 and the classifier 106 may be integrated in the electronic device 102 as shown in
At block 302, a first dataset may be received. The first dataset may be associated with a real-time application. The first dataset may include a plurality of datapoints including a first datapoint. For example, the first datapoint may include, but is not limited to, an image, audio/speech samples, text characters, software instructions, or other forms of digital signals, such as but not limited to, electrical bio-signals, motion data, or depth data. Examples of the real-time applications may include, but are not limited to, an image recognition application, an image classification application, a speech recognition application, a text recognition application, a malware detection application, an autonomous vehicle application, an anomaly detection application, a machine translation application, or a pattern recognition application for digital signals/data.
In some embodiments, the processor 204 may be configured to receive the first dataset (for example, a set of images) that may be stored in the memory 206, the persistent data storage 208, or the database 108, or a combination thereof. The first datapoint (for example, an image) of the first dataset may be received for classification or prediction into a particular class label, where the classification or prediction may be performed by the pre-trained DNN 104.
At block 304, a first class associated with the first datapoint of the received first dataset may be predicted. The pre-trained DNN 104 may be controlled to predict the first class associated with the received first datapoint. In one or more embodiments, the processor 204 may be configured to control the pre-trained DNN 104 to predict the first class for the received first datapoint. For example, in case the DNN 104 is pre-trained for image classification tasks and the first datapoint is an image, the pre-trained DNN 104 may predict the first class as a living object (e.g., an animal, plant, or a human) or a non-living object (e.g., a building, a vehicle, a street, a symbol, or any other object) for the image. In case the image input to the DNN 104 is of a dog, the DNN 104 may output a unique class label which may indicate the classification of the image into a dog label. The output class label may be considered as the prediction result of the DNN 104.
The DNN 104 may be configured to generate a first confidence score, as a native confidence score, of the DNN 104. The first confidence score may be a probability value (say, between 0 and 1) indicative of the prediction result of the DNN 104. In other words, the first confidence score generated by the DNN 104 may indicate the prediction of the first class (i.e. class label) for the received input image (or the datapoint). Similarly, the processor 204 may control the pre-trained DNN 104 to predict the first class associated with each datapoint in the received first dataset.
At block 306, a first set of feature scores may be determined for the first datapoint based on the predicted first class associated with the first datapoint. In one or more embodiments, the processor 204 may be configured to determine the first set of feature scores for the first datapoint based on the predicted first class associated with the first datapoint. Examples of the first set of feature scores may include at least one of, but not limited to, a Likelihood-based Surprise Adequacy (LSA) score, a Distance-based Surprise Adequacy (DSA) score, a confidence score, a logit score, or a robustness diversity score, for the first datapoint. Determination of the surprise adequacy-based scores is described next. For example, let 𝒩={n1, n2, . . . } be the set of neurons associated with the DNN 104, and let X={x1, x2, . . . } be a set of datapoints in the received first dataset. An activation value of a neuron n with respect to a datapoint x (e.g., the first datapoint) may be represented as αn(x). A vector of activation values over an ordered subset of neurons N (i.e., N⊆𝒩) may be represented as αN(x). The term αN(x) is referred to herein as an Activation Trace (AT) of x (i.e., the first datapoint) over the neurons in N. Similarly, a set of activation traces that may be observed over the neurons in N, for the set of datapoints X (i.e., the received first dataset), may be represented as AN(X)={αN(x)|x∈X}. For a training dataset (e.g., a dataset T) of the DNN 104, the processor 204 may determine a set of activation traces over the neurons in N for each datapoint in the training dataset T. Such a determined set of activation traces for the training dataset may be represented by AN(T).
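The activation-trace notation above may be illustrated with a minimal Python sketch; the single ReLU layer standing in for the DNN 104, its random weights, and the dataset sizes are all hypothetical assumptions, not part of the disclosed embodiments:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for one hidden layer of the DNN:
# 4 input features mapped onto 3 "neurons" (the ordered subset N).
W = rng.normal(size=(4, 3))

def activation_trace(x):
    """Return the activation trace alpha_N(x): the vector of activation
    values of the chosen neurons for datapoint x (ReLU activations)."""
    return np.maximum(W.T @ x, 0.0)

# The set of activation traces A_N(X) for a small dataset X.
X = rng.normal(size=(10, 4))
A_X = np.stack([activation_trace(x) for x in X])
print(A_X.shape)  # -> (10, 3): one 3-dimensional trace per datapoint
```

The surprise-adequacy scores below then operate purely on such trace matrices, not on the raw datapoints.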
For a new datapoint x (e.g., the first datapoint), the processor 204 may determine a degree of surprise associated with the new datapoint x (i.e., the first datapoint) with respect to the training dataset (i.e., the dataset T) based on a comparison of αN(x) for the new datapoint x (i.e., the first datapoint) and AN(T) for the training dataset T. The degree of surprise is referred to herein as a Surprise Adequacy score. In the present disclosure, the training dataset may correspond to the first dataset itself, of which the first datapoint may be a part. Thus, the processor 204 may determine the Surprise Adequacy score for the first datapoint (which may be represented as x) with respect to the first dataset (which may be represented as X, where x∈X), based on a comparison of αN(x) for the first datapoint and AN(X) for the first dataset.
The Likelihood-based Surprise Adequacy (LSA) score may be indicative of whether the first datapoint is in-distribution with the first dataset. In an example, the LSA score may be determined based on a Kernel Density Estimation (KDE) technique that may estimate a probability density function of a random variable such that the estimated probability density function may enable estimation of a relative likelihood of a certain value of the random variable. For example, the KDE technique may be used to determine the LSA score associated with the first datapoint by estimation of a probability density of each activation value in AN(X) and determination of a degree of surprise (i.e., a Surprise Adequacy score) of the first datapoint with respect to the estimated probability density.
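The LSA computation may be sketched as follows, with the LSA score taken as the negative log of the estimated density of the datapoint's activation trace. The hand-rolled Gaussian kernel density estimate, the bandwidth, and the two-neuron traces are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical activation traces A_N(X) of the training data (2 neurons).
A_train = rng.normal(loc=0.0, scale=1.0, size=(200, 2))

def gaussian_kde_density(point, samples, bandwidth=0.5):
    """Kernel density estimate at `point` from `samples` (Gaussian kernel)."""
    d = samples.shape[1]
    diff = (samples - point) / bandwidth
    kernel = np.exp(-0.5 * np.sum(diff**2, axis=1))
    return kernel.mean() / ((2 * np.pi) ** (d / 2) * bandwidth**d)

def lsa_score(trace, train_traces):
    """Likelihood-based Surprise Adequacy: negative log of the estimated
    probability density of the datapoint's activation trace."""
    return -np.log(gaussian_kde_density(trace, train_traces) + 1e-12)

in_dist = np.array([0.0, 0.0])   # near the bulk of the training traces
out_dist = np.array([6.0, 6.0])  # far from the training traces
print(lsa_score(in_dist, A_train) < lsa_score(out_dist, A_train))  # True
```

A low LSA score thus indicates that the datapoint is in-distribution with the first dataset, while a high score indicates surprise.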
The Distance-based Surprise Adequacy (DSA) score may be indicative of whether the first datapoint is closer to the predicted first class than another class neighboring to the predicted first class in a hyper-space associated with the DNN 104. For example, the DSA score associated with the first datapoint may be determined based on a distance between Activation Traces (ATs) as a degree of surprise (i.e., a Surprise Adequacy score). The DSA score associated with the first datapoint may be determined based on a Euclidean distance between the Activation Trace (AT) of the first datapoint and Activation Traces (ATs) associated with each datapoint in the first dataset. As an example, let C represent a set of classes associated with the DNN 104, AN(X) represent the set of Activation Traces associated with the first dataset, x represent the first datapoint, and cx (where cx∈C) represent the predicted first class associated with the first datapoint x. The processor 204 may determine a reference datapoint (e.g., xa) in the first dataset that may be a nearest neighbor of the first datapoint (i.e., x) and may be associated with the same first class cx. Herein, an activation trace (e.g., αN(xa)) of the reference datapoint (i.e., xa) may have a smaller Euclidean distance to the activation trace (e.g., αN(x)) of the first datapoint (i.e., x) than the activation traces of other datapoints in the first dataset that may be associated with the first class (i.e., cx). The Euclidean distance between the activation trace (e.g., αN(xa)) of the reference datapoint (i.e., xa) and the activation trace (e.g., αN(x)) of the first datapoint (i.e., x) may be represented by dista. The processor 204 may further determine a datapoint (e.g., xb) that may be the closest neighbor of the reference datapoint (i.e., xa) and may be associated with a class other than the first class (i.e., cx).
The processor 204 may determine a Euclidean distance (e.g., distb) between the activation trace (e.g., αN(xa)) of the reference datapoint (i.e., xa) and an activation trace (e.g., αN(xb)) of the datapoint xb. In an example, the processor 204 may determine the DSA score associated with the first datapoint based on a ratio of dista and distb, i.e., dista/distb.
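The dista/distb construction above may be sketched as follows; the toy activation traces, class labels, and query trace are hypothetical:

```python
import numpy as np

def dsa_score(at_x, ats, labels, cx):
    """Distance-based Surprise Adequacy for a datapoint with activation
    trace `at_x` predicted as class `cx`, given activation traces `ats`
    and predicted classes `labels` of the reference dataset."""
    ats = np.asarray(ats)
    labels = np.asarray(labels)
    same = ats[labels == cx]    # traces of datapoints in class cx
    other = ats[labels != cx]   # traces of datapoints in other classes
    # Reference datapoint x_a: nearest neighbour of x within class cx.
    d_same = np.linalg.norm(same - at_x, axis=1)
    x_a = same[np.argmin(d_same)]
    dist_a = d_same.min()
    # dist_b: distance from x_a to its closest neighbour outside class cx.
    dist_b = np.linalg.norm(other - x_a, axis=1).min()
    return dist_a / dist_b

ats = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]]
labels = [0, 0, 1]
# A trace close to class 0 and far from class 1 yields a small DSA score.
print(dsa_score(np.array([0.05, 0.0]), ats, labels, cx=0))
```

A small ratio indicates that the datapoint sits well inside the predicted class; a ratio near or above 1 indicates that a neighboring class is competitively close.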
The confidence score may correspond to a probability score of the DNN 104 for the prediction of the first class associated with the first datapoint. In other words, the confidence score may be a probability determined by the DNN 104 that the first class predicted by the DNN 104 for the first datapoint is accurate. The logit score may correspond to a score from a pre-softmax layer of the DNN 104 for the prediction of the first class associated with the first datapoint. The logit score may be an un-normalized score associated with the prediction output of the DNN 104 that may be obtained from the pre-softmax layer. A higher logit score for the first class associated with the first datapoint may be indicative that the prediction of the DNN 104 for the first datapoint is accurate.
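The relationship between the logit score (pre-softmax) and the confidence score (post-softmax probability) may be illustrated with a short sketch; the pre-softmax outputs are hypothetical:

```python
import numpy as np

def softmax(logits):
    """Normalize pre-softmax scores into a probability distribution."""
    z = logits - np.max(logits)  # shift for numerical stability
    e = np.exp(z)
    return e / e.sum()

# Hypothetical pre-softmax (logit) outputs of the DNN for one datapoint.
logits = np.array([2.0, 0.5, -1.0])
probs = softmax(logits)

predicted_class = int(np.argmax(probs))
confidence_score = float(probs[predicted_class])  # native confidence score
logit_score = float(logits[predicted_class])      # un-normalized logit score

print(predicted_class, round(confidence_score, 3))  # -> 0 0.786
```

The logit score preserves the relative magnitudes before normalization, which is why a higher logit for the predicted class tends to accompany a more reliable prediction.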
The robustness diversity score may be indicative of a degree of stability of a prediction by the DNN 104 for one or more variations corresponding to the first datapoint. In an embodiment, the processor 204 may determine a set of variations of the first datapoint based on application of one or more variations on the first datapoint. The robustness diversity score may be determined based on a ratio of a number of variations that may be accurately predicted by the DNN 104 to the total number of variations in the set of variations. For example, in case the first datapoint is an image of a cat, the processor 204 may determine the set of variations based on application of one or more rotation variations on the first datapoint. The application of the one or more rotation variations may yield one or more first images in which the cat may be rotated by different angles. Similarly, the processor 204 may apply one or more translation variations and/or one or more scaling variations on the first datapoint to obtain one or more second images and/or one or more third images, respectively. The set of variations of the first datapoint may be obtained as a collection of the one or more first images, the one or more second images, and the one or more third images. Similarly, the processor 204 may determine different variations, such as a zoom variation, a brightness variation, a contrast variation, a color variation, a flip variation, a sharpness variation, or a shear variation to determine different datapoints for the first datapoint of the first dataset. In an embodiment, the processor 204 may further control the DNN 104 to predict a class associated with each of the set of variations of the first datapoint. Thereafter, the processor 204 may determine a ratio of the number of variations for which the first class is accurately predicted to the total number of variations in the set of variations.
The robustness diversity score may be determined based on the ratio of accurately predicted variations (i.e., the variations for which the first class is correctly predicted by the DNN 104) to the total number of variations in the set of variations. Thus, based on the ratio, the robustness diversity score may indicate the degree of stability of the prediction by the DNN 104 for different variations applied to the first datapoint. Similarly, the processor 204 may be configured to determine the first set of feature scores (i.e., the LSA score, the DSA score, the confidence score, the logit score, and the robustness diversity score) for each datapoint in the first dataset.
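The ratio underlying the robustness diversity score may be sketched as follows; the stand-in model (a trivial sign-of-mean classifier) and the four variations are illustrative assumptions:

```python
import numpy as np

def robustness_diversity_score(model, variations, expected_class):
    """Fraction of the variations of a datapoint for which the model
    still predicts the expected (first) class."""
    correct = sum(1 for v in variations if model(v) == expected_class)
    return correct / len(variations)

# Hypothetical stand-in for the DNN: classifies by the sign of the mean pixel.
model = lambda img: 0 if np.mean(img) >= 0 else 1

base = np.full((4, 4), 0.5)                              # original "image"
variations = [base, base + 0.1, np.flip(base), -base]    # brightness/flip etc.
print(robustness_diversity_score(model, variations, expected_class=0))  # 0.75
```

Here three of the four variations keep the original class, so the score of 0.75 indicates a moderately stable prediction.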
At block 308, a set of confusing class pairs associated with the DNN 104 may be identified based on the predicted first class and a predetermined class associated with the first datapoint. In one or more embodiments, the processor 204 may be configured to identify the set of confusing class pairs associated with the DNN 104 based on the predicted first class and the predetermined class associated with the first datapoint. The predetermined class may correspond to a ground truth or expected class for the first datapoint. Similarly, the electronic device 102 may be configured to identify the set of confusing class pairs for each datapoint in the first dataset. The identification of the set of confusing class pairs for a datapoint is described further, for example, in
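One possible way to identify confusing class pairs is to count unordered mismatches between predicted and predetermined (ground-truth) classes; this counting scheme is an illustrative assumption rather than the specific identification detailed elsewhere in the disclosure:

```python
from collections import Counter

def confusing_class_pairs(predicted, ground_truth, top_k=2):
    """Return the class pairs most often confused by the DNN. The
    unordered pair (a, b) groups both directions of confusion."""
    counts = Counter(
        tuple(sorted((p, g)))
        for p, g in zip(predicted, ground_truth)
        if p != g  # only misclassified datapoints contribute
    )
    return [pair for pair, _ in counts.most_common(top_k)]

predicted    = [1, 7, 1, 7, 3, 5, 1]
ground_truth = [7, 1, 7, 7, 5, 3, 1]
print(confusing_class_pairs(predicted, ground_truth))  # [(1, 7), (3, 5)]
```

In this toy example, classes 1 and 7 are confused three times and classes 3 and 5 twice, so both pairs are reported as confusing.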
At block 310, the received first dataset may be clustered into one of a set of semantic classes based on the determined first set of feature scores, the predicted first class, and the identified set of confusing class pairs for each datapoint in the received first dataset. Each of the set of semantic classes may be indicative of a prediction accuracy associated with a set of datapoints clustered in corresponding semantic class of the set of semantic classes. The prediction accuracy associated with the set of semantic classes is described further, for example, in
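Purely for intuition, a rule-based sketch of how feature scores might map onto the four semantic classes could look as follows; every threshold here is hypothetical and not taken from the disclosure:

```python
def semantic_class(lsa, confidence, robustness, in_confusing_pair):
    """Illustrative rule-based assignment of a datapoint to one of the
    four semantic classes; all thresholds are hypothetical."""
    # In-distribution, confident, robust, not confusing -> likely correct.
    if lsa < 2.0 and confidence > 0.9 and robustness > 0.8 and not in_confusing_pair:
        return "likely-correct"
    # Two roughly equally likely classes from a confusing pair.
    if in_confusing_pair and 0.4 < confidence < 0.6:
        return "one-of-two"
    # Severely out of distribution with low confidence.
    if lsa > 10.0 and confidence < 0.3:
        return "likely-incorrect"
    # Everything left over after the above clusters.
    return "do-not-know"

print(semantic_class(lsa=1.2, confidence=0.95, robustness=0.9,
                     in_confusing_pair=False))  # likely-correct
```

In the disclosed embodiments the assignment is learned by clustering rather than fixed rules, but the sketch conveys what each semantic class is meant to capture.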
At block 312, the classifier 106 may be trained based on the clustered first dataset, the determined first set of feature scores, and the set of semantic classes. In one or more embodiments, the processor 204 may be configured to control a training of a classifier (e.g., the classifier 106 or a meta-classifier) based on the clustered first dataset, the determined first set of feature scores, and the set of semantic classes. Examples of the classifier 106 may include, but are not limited to, a decision tree classifier, a Support Vector Machine (SVM) classifier, a Naïve Bayes classifier, a Logistic Regression classifier, or a k-nearest neighbor (K-NN) classifier. In an example, the classifier 106 may be built and trained as a K-NN classifier with K=5 nearest neighbors. As an example, a Scikit-Learn library may be used to build and train the K-NN classifier. Based on the training of the K-NN classifier (e.g., the classifier 106), the K-NN classifier may output the set of confusing class pairs associated with the DNN 104.
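A minimal K-NN (K=5) over the feature-score vectors may be sketched without external dependencies as follows (an equivalent classifier could be built with Scikit-Learn's KNeighborsClassifier, as the disclosure suggests); the feature vectors and semantic-class labels are hypothetical:

```python
import numpy as np
from collections import Counter

def knn_predict(train_X, train_y, x, k=5):
    """Classify feature-score vector `x` by majority vote among its
    k nearest training vectors (Euclidean distance)."""
    dists = np.linalg.norm(np.asarray(train_X) - x, axis=1)
    nearest = np.argsort(dists)[:k]
    votes = Counter(np.asarray(train_y)[nearest])
    return votes.most_common(1)[0][0]

# Hypothetical feature-score vectors [LSA, DSA, confidence, logit, robustness]
# and the semantic class each training datapoint was clustered into.
train_X = [[1.0, 0.2, 0.95, 8.0, 0.9]] * 5 + [[12.0, 2.5, 0.20, 1.0, 0.2]] * 5
train_y = ["likely-correct"] * 5 + ["likely-incorrect"] * 5

print(knn_predict(train_X, train_y, np.array([1.1, 0.3, 0.9, 7.5, 0.85])))
```

A new datapoint whose feature scores resemble the in-distribution, high-confidence cluster is thus classified as likely-correct.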
In an embodiment, the processor 204 may control the training of the classifier 106 based on a relationship between the first set of feature scores associated with each datapoint in the first dataset and a semantic class of the set of semantic classes in which each datapoint in the first dataset may be clustered. In an example, the likely-correct semantic class may be a semantic class with datapoints that may be in-distribution with the training data (i.e., the first dataset). Further, in the likely-correct semantic class, a confidence associated with a prediction by the DNN 104 for associated datapoints may be high and the prediction for such datapoints may not be confusing. In addition, the prediction by the DNN 104 for such datapoints may be robust to variations in the datapoints clustered in the likely-correct semantic class. In another example, the one-of-two semantic class may be a semantic class with clustered datapoints, such that a prediction by the DNN 104 for such datapoints may have two equally likely classes with a similar confidence. Further, the prediction by the DNN 104 for such datapoints may not be robust. In yet another example, the likely-incorrect semantic class may be a semantic class with clustered datapoints that may be severely out of distribution with the training data (i.e., the first dataset). Further, a confidence associated with a prediction by the DNN 104 for such datapoints may be low. In addition, the prediction by the DNN 104 for such datapoints may not be robust to variations in the datapoints clustered in the likely-incorrect semantic class. In another example, the do-not-know semantic class may be a semantic class that may include datapoints of the first dataset that may remain after the clustering of the first dataset into the likely-correct semantic class, the one-of-two semantic class, and the likely-incorrect semantic class. The training of the classifier (e.g., the classifier 106) for the classification of a datapoint (i.e. 
input to a pre-trained DNN (e.g., the DNN 104)) into one of the set of semantic classes is described further, for example, in
Although the flowchart 300 is illustrated as discrete operations, such as 302, 304, 306, 308, 310, and 312, in certain embodiments, such discrete operations may be further divided into additional operations, combined into fewer operations, or eliminated, depending on the particular implementation without detracting from the essence of the disclosed embodiments.
At block 402, a second datapoint may be received. The second datapoint may be associated with a real-time application. For example, the second datapoint may include, but is not limited to, an image, audio/speech samples, text characters, software instructions, or other forms of digital signals, such as but not limited to, electrical bio-signals, motion data, or depth data. Examples of the real-time applications may include, but are not limited to, an image recognition application, an image classification application, a speech recognition application, a text recognition application, a malware detection application, an autonomous vehicle application, an anomaly detection application, a machine translation application, or a pattern recognition application for digital signals/data.
In some embodiments, the processor 204 may be configured to receive the second datapoint (for example, an image) that may be stored in the memory 206, the persistent data storage 208, or the database 108, or a combination thereof. The second datapoint (for example, the image) may be received for classification or prediction into a particular class label, where the classification or prediction may be performed by the pre-trained DNN 104. In an embodiment, the second datapoint may be a new datapoint which may not be included in the training dataset (i.e. first dataset) of the trained DNN 104.
At block 404, a second class associated with the received second datapoint may be predicted. The pre-trained DNN 104 may be controlled to predict the second class associated with the received second datapoint. In one or more embodiments, the processor 204 may be configured to control the pre-trained DNN 104 to predict the second class for the received second datapoint. For example, in case the DNN 104 is pre-trained for image classification tasks and the second datapoint is an image, the pre-trained DNN 104 may predict the second class as a living object (e.g., an animal, plant, or a human) or a non-living object (e.g., a building, a vehicle, a street, a symbol, or any other object) for the image. In case the image input to the DNN 104 is of a dog, the DNN 104 may output a unique class label which may indicate the classification of the image into a dog label. The output class label may be considered as the prediction result of the DNN 104. The DNN 104 may be configured to generate a second confidence score, as a native confidence score, of the DNN 104. The second confidence score may be a probability value (say, between 0 and 1) indicative of the prediction result of the DNN 104. In other words, the second confidence score generated by the DNN 104 may indicate the prediction of the second class (i.e. class label) for the received input image (or the datapoint).
At block 406, a second set of feature scores may be determined for the second datapoint based on the predicted second class associated with the second datapoint. In one or more embodiments, the processor 204 may be configured to determine the second set of feature scores for the second datapoint based on the predicted second class associated with the second datapoint. Examples of the second set of feature scores may include at least one of, but not limited to, a Likelihood-based Surprise Adequacy (LSA) score, a Distance-based Surprise Adequacy (DSA) score, a confidence score, a logit score, or a robustness diversity score, for the second datapoint. The determination of the second set of feature scores may be similar to the determination of the first set of feature scores, as described further, for example, in
At block 408, a pre-trained classifier (e.g., the classifier 106) may be applied on the received second datapoint, based on the determined second set of feature scores, and the predicted second class for the second datapoint. The classifier 106 (or a meta-classifier) may be pre-trained to classify a datapoint into one of the set of semantic classes. Herein, each of the set of semantic classes may be indicative of a prediction accuracy associated with a set of datapoints clustered in corresponding semantic class of the set of semantic classes. The prediction accuracy associated with the set of semantic classes is described further, for example, in
At block 410, the second datapoint may be classified into one of the set of semantic classes based on the application of the pre-trained classifier (e.g., the classifier 106) on the second datapoint. In one or more embodiments, the processor 204 may be configured to classify the second datapoint into one of the set of semantic classes based on the application of the pre-trained classifier 106 on the second datapoint. Examples of the set of semantic classes may include, but are not limited to, a likely-correct semantic class, a one-of-two semantic class, a likely-incorrect semantic class, or a do-not-know semantic class. The pre-trained classifier 106 may also output a confusing class pair associated with the second class in case the second datapoint is classified into the one-of-two semantic class. Each of the set of semantic classes may be easily understandable or interpretable by the end user (such as the user 114), in comparison to the confidence score or the prediction score provided by the DNN 104. The set of semantic classes may allow the end-user to make a more informed decision on how to use, or not use, the output of the DNN 104 in further actions.
At block 412, at least one action associated with the classified one of the set of semantic classes for the second datapoint may be determined. In one or more embodiments, the processor 204 may be configured to determine at least one action associated with the classified one of the set of semantic classes for the second datapoint. For example, in case the second datapoint is classified into the likely-correct semantic class, the processor 204 may determine the action as a reliable use (or acceptance) of the class label (i.e. output by the DNN 104 for the second datapoint) as-is with a high confidence, by the user 114 or by the user-end device 110. In another example, in case the second datapoint is classified into the likely-incorrect semantic class, the processor 204 may determine the action as a rejection of the class label output by the DNN 104 for the second datapoint and may further suggest use of human judgement by the user 114 to make a decision on the prediction for the second datapoint. Further, in case the semantic class for the second datapoint is determined as the one-of-two semantic class, the processor 204 may determine the action as a use of a tie-breaker solution by the user 114. The tie-breaker solution may be a manual or an automated solution. For example, the automated solution may include training and use of a local classifier to specifically predict an accurate class for the second datapoint based on the confusing class pair associated with the second class predicted by the DNN 104. The local classifier may be trained only to accurately predict the classes which may be included in the confusing class pair. For example, the DNN 104 may be trained to predict numerals (“0” to “9”) based on an input image (i.e. the second datapoint) including handwritten number(s). In such a case, the class labels for the numerals “1” and “7” may be considered as the confusing class pair and clustered in the one-of-two semantic class by the DNN 104 and/or by the classifier 106.
Thus, the tie-breaker solution may suggest an action to use the separate or the local classifier which is only trained to accurately classify “1” and “7” numeral images for better prediction and decision making by the user-end device 110 or the user 114. Further, in case the semantic class for the second datapoint is determined as the do-not-know semantic class, the processor 204 may determine the action as a use of a majority-vote solution by the user 114. The majority-vote solution may be a manual or an automated solution. For example, the automated solution may include training and using multiple DNNs for the classification task of the real-time application. The second datapoint may be fed to each DNN and prediction results of the DNNs may be compared. A class label predicted by a majority of the DNNs may be selected as a class label for the second datapoint as per the majority-vote solution.
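The majority-vote solution described above may be sketched as follows; the three stand-in model functions (and their fixed predictions) are hypothetical:

```python
from collections import Counter

def majority_vote(models, datapoint):
    """Majority-vote solution: feed the datapoint to several DNNs and
    select the class label predicted by most of them."""
    predictions = [m(datapoint) for m in models]
    return Counter(predictions).most_common(1)[0][0]

# Three stand-in "DNNs" that disagree on a handwritten-digit image.
models = [lambda x: "7", lambda x: "1", lambda x: "7"]
print(majority_vote(models, datapoint=None))  # -> 7
```

Two of the three models predict "7", so "7" is selected as the class label for the second datapoint under the majority-vote solution.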
At block 414, the determined at least one action may be rendered. In one or more embodiments, the processor 204 may be configured to render the determined at least one action. In an example, the processor 204 may render the determined action, predicted second class, and/or determined semantic class associated with the second datapoint as output information. In an embodiment, the output information may correspond to at least one of, but not limited to, a display of the determined action, predicted second class, and/or determined semantic class on the display screen 210A, a storage of the determined action, predicted second class, and/or determined semantic class in a log file, or a notification/alert based on the determined action, predicted second class, and/or determined semantic class. For example, the user-end device 110 (or the electronic device 102) may display the determined action, predicted second class, and/or determined semantic class associated with the second datapoint. In another example, the user-end device 110 (or the electronic device 102) may store the determined action, predicted second class, and/or determined semantic class associated with the second datapoint in a log file in a memory (such as the memory 206). For example, the log file may indicate how many times the DNN 104 has predicted correctly (or not), which may be used to determine the accuracy of the DNN 104 in a real-time operation (for example, in an autonomous vehicle). In certain scenarios, the determined action and/or determined semantic class may be stored in the log file along with the predicted second class, based on the determined semantic class. For example, the determined action, predicted second class, and/or determined semantic class may be stored in the log file when the semantic class is other than the likely-correct semantic class.
In another example, the output information may be indicated as a notification (for example, an alert or warning) to a user of the user-end device 110 (or the electronic device 102) based on the determined action, predicted second class, and/or determined semantic class. For example, in case the user-end device 110 is an autonomous vehicle and the semantic class is one of the one-of-two semantic class, the likely-incorrect semantic class, or the do-not-know semantic class, the user-end device 110 (i.e., the autonomous vehicle) may notify a user (for example, a passenger of the autonomous vehicle). The notification may include a warning or alert for the user to take control of the autonomous vehicle due to a potential wrong prediction of the second class (e.g., a wrong classification or mis-identification of an object that may be an obstacle) performed by the DNN 104 deployed in the user-end device 110. In some embodiments, the output information generated by the electronic device 102 may correspond to certain automatic actions to be taken, for example, in case an incorrect prediction of the DNN 104 is detected by the classifier 106 as a semantic class (for the second datapoint) that may be other than the likely-correct semantic class. For example, in case of detection of a mis-classification or incorrect prediction performed by the DNN 104 for the received second datapoint in the autonomous vehicle, the user-end device 110 or the electronic device 102 may generate the output information to automatically apply brakes, control steering, or significantly reduce the speed of the autonomous vehicle. Control may pass to end.
Although the flowchart 400 is illustrated as discrete operations, such as 402, 404, 406, 408, 410, 412, and 414, in certain embodiments, such discrete operations may be further divided into additional operations, combined into fewer operations, or eliminated, depending on the particular implementation, without detracting from the essence of the disclosed embodiments.
At block 502, the predetermined class associated with the first datapoint may be compared with the predicted first class associated with the first datapoint. In one or more embodiments, the processor 204 may be configured to compare the predetermined class associated with the first datapoint with the predicted first class associated with the first datapoint. The prediction of the first class is described, for example, at 304 in
At block 504, based on the comparison, a first instance of misclassification associated with a first class pair may be identified. The first class pair may include the predicted first class associated with the first datapoint and the predetermined class associated with the first datapoint. In one or more embodiments, the processor 204 may be configured to identify the first instance of misclassification associated with the first class pair including the predicted first class and the predetermined class. For example, if the predicted first class associated with the first datapoint is a cat class and the predetermined class associated with the first datapoint is a dog class, the processor 204 may determine that the first datapoint is misclassified. In such a case, the first instance of misclassification associated with the first class pair for the first datapoint may be identified. Herein, the first class pair may include the cat class (i.e., the predicted first class) and the dog class (i.e., the predetermined class) for the first datapoint.
At block 506, a count of instances of misclassifications associated with the first class pair may be determined based on the identified first instance of misclassification. In one or more embodiments, the processor 204 may be configured to determine the count of instances of misclassifications associated with the first class pair, based on the identified first instance of misclassification. Similar to the prediction of the first class associated with the first datapoint by the DNN 104, the processor 204 may control the DNN 104 to predict the first class associated with each datapoint in the received first dataset (as described at 304 in
At block 508, the first class pair may be identified as a confusing class pair in the set of confusing class pairs, based on the count of instances of the misclassifications for the first class pair and a threshold. In one or more embodiments, the processor 204 may be configured to identify the first class pair (e.g., the dog class and the cat class) as a confusing class pair in the set of confusing class pairs, based on the count of instances of the misclassifications for the first class pair and the threshold. For example, in case the threshold is 100 and the count of instances of the misclassifications associated with the first class pair (e.g., the dog class and the cat class) for different datapoints is 120, the first class pair may be identified as a confusing class pair in the set of confusing class pairs, as the count of instances of misclassifications for the first class pair is more than the threshold. In certain embodiments, the processor 204 may be configured to sort a set of class pairs (including the first class pair) in a descending order, based on the count of instances of misclassifications of each class pair. Further, the processor 204 may be configured to select or identify the first N class pairs (i.e., top N class pairs) in the sorted order as the set of confusing class pairs, rather than selecting all different class pairs in the set of confusing class pairs. In addition, the processor 204 may also select one or more class pairs from the set of class pairs as the confusing class pairs based on the threshold. For example, all the class pairs with a corresponding count of instances of misclassifications higher than the predefined threshold may be considered as the most confusing class pairs (i.e., the identified set of confusing class pairs) for the first dataset predicted by the DNN 104.
For example, based on the prediction of each datapoint of the first dataset, the class pair including the dog class and the cat class may have a higher count of misclassifications than other class pairs identified in the prediction of the first dataset. In another example, the class pair including the numeral “1” class label and the numeral “7” class label may be the most misclassified for a dataset, as both numerals “1” and “7” may have similar features in handwritten datapoints. Control may pass to end.
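The operations at blocks 502 through 508 may be sketched, for example, as follows; the threshold value, the class labels, and the synthetic counts are illustrative assumptions:

```python
from collections import Counter

def find_confusing_pairs(predicted, actual, threshold=100, top_n=None):
    """Count instances of misclassification per (predicted class,
    predetermined class) pair and return the pairs whose count exceeds
    the threshold, optionally keeping only the top N pairs when the
    counts are sorted in a descending order."""
    counts = Counter((p, a) for p, a in zip(predicted, actual) if p != a)
    ranked = counts.most_common()  # sorted by count, descending
    confusing = [(pair, n) for pair, n in ranked if n > threshold]
    if top_n is not None:
        confusing = confusing[:top_n]
    return confusing

# Synthetic example: the (cat, dog) pair is misclassified 120 times.
predicted = ["cat"] * 120 + ["dog"] * 30 + ["bird"] * 50
actual = ["dog"] * 120 + ["cat"] * 30 + ["bird"] * 50
print(find_confusing_pairs(predicted, actual, threshold=100))
# [(('cat', 'dog'), 120)]
```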
Although the flowchart 500 is illustrated as discrete operations, such as 502, 504, 506, and 508, in certain embodiments, such discrete operations may be further divided into additional operations, combined into fewer operations, or eliminated, depending on the particular implementation, without detracting from the essence of the disclosed embodiments.
With reference to
With reference to
In an embodiment, the first accuracy group may have a higher prediction accuracy than the second accuracy group.
For example, with reference to
With reference to
At block 608, a second set of datapoints in the second accuracy group may be stored in a Last-In-First-Out (LIFO) data structure, based on the number of datapoints in the second set and the minimum group size threshold (i.e., n). In one or more embodiments, the processor 204 may be configured to store the second set of datapoints in the second accuracy group in the LIFO data structure, based on the number of datapoints in the second set and the minimum group size threshold (i.e., n). The processor 204 may compare the number of datapoints in the second set with the minimum group size threshold (i.e., n). If the number of datapoints in the second set is determined to be more than the minimum group size threshold (for example, n=2), the processor 204 may store the second set of datapoints in the LIFO data structure. In an example, the LIFO data structure may be a stack data structure, which may be denoted by S. For example, the processor 204 may push the second set of datapoints of the second accuracy group into the stack data structure. The LIFO data structure may be stored in one or more of, but not limited to, the memory 206, the persistent data storage 208, or the database 108.
For example, with reference to
With reference to
With reference to
With reference to
For example, with reference to
With reference to
At block 618, a third set of datapoints may be retrieved from the LIFO data structure based on the determination that the LIFO data structure is not empty. In one or more embodiments, based on the determination that the LIFO data structure is not empty (as determined based on the check at 616), the processor 204 may be configured to retrieve the third set of datapoints from the LIFO data structure. Based on the last-in-first-out property of the LIFO data structure (e.g., the stack “S”), a set of datapoints that were most recently pushed into the LIFO data structure may be popped and retrieved as the third set of datapoints from the LIFO data structure (i.e., the stack “S”). In an embodiment, the first set of datapoints of the first accuracy group (i.e., stored in the LIFO data structure at 614) may be retrieved as the third set of datapoints. In another embodiment, the second set of datapoints of the second accuracy group (i.e., stored in the LIFO data structure at 608) may be retrieved as the third set of datapoints.
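The stack-based processing of the accuracy groups at blocks 608 through 618 may be sketched, for example, as follows; the score-threshold splitting function and the minimum group size threshold are simplified, hypothetical stand-ins for the clustering described above:

```python
def cluster_groups(dataset, split_fn, min_group_size=2):
    """Iteratively split samples into accuracy groups using a LIFO stack:
    a group larger than the minimum group size threshold is pushed back
    for further splitting; otherwise it is emitted as a final cluster."""
    stack = [dataset]  # the stack "S"; the full dataset is the first sample
    final_groups = []
    while stack:  # loop until the LIFO data structure is empty
        current = stack.pop()  # most recently pushed sample is retrieved first
        first_group, second_group = split_fn(current)
        if not first_group or not second_group:
            if current:
                final_groups.append(current)  # cannot be split any further
            continue
        for group in (first_group, second_group):
            if len(group) > min_group_size:
                stack.append(group)  # re-selected later as the current sample
            else:
                final_groups.append(group)
    return final_groups

# Hypothetical splitting criterion: separate datapoints by a score threshold.
split_by_score = lambda pts: ([p for p in pts if p >= 0.5],
                              [p for p in pts if p < 0.5])
print(cluster_groups([0.9, 0.8, 0.3], split_by_score))  # [[0.9, 0.8], [0.3]]
```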
At block 620, the retrieved third set of datapoints may be re-selected as the current sample to be clustered. In one or more embodiments, the processor 204 may be configured to re-select the retrieved third set of datapoints as the current sample. The re-selection of the retrieved third set of datapoints as the current sample to be further clustered may be similar to the selection of the received first dataset as the current sample to be clustered, as described at 602. For example, with reference to
With reference to
With reference to
At block 626, the classifier 106 may be trained based on the re-clustered first dataset, the determined third set of feature scores, and the set of semantic classes. In one or more embodiments, the processor 204 may be configured to control training of a classifier (e.g., the classifier 106) based on the re-clustered first dataset, the determined third set of feature scores, and the set of semantic classes. Examples of the classifier 106 may include, but are not limited to, a decision tree classifier, a Support Vector Machine (SVM) classifier, a Naïve Bayes classifier, a Logistic Regression classifier, or a k-nearest neighbor (K-NN) classifier. In an example, the classifier 106 may be built and trained as a K-NN classifier with K=5 nearest neighbors. As an example, a Scikit-Learn library may be used to build and train the K-NN classifier. For example, the classifier 106 may be trained on a relationship between the third set of feature scores (i.e., similar to the first set of feature scores and/or the second set of feature scores) and the clustered set of semantic classes for each datapoint in the first dataset as described, for example, at 312 in
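For illustration, a K-NN classification with K=5 over the feature scores may be sketched in plain Python as follows (in practice, the Scikit-Learn library mentioned above may be used to build the K-NN classifier instead); the feature-score values and the semantic-class labels are hypothetical:

```python
from collections import Counter
import math

def knn_predict(train_X, train_y, x, k=5):
    """Classify a feature-score vector x into a semantic class by a
    majority vote among its k nearest training datapoints, using the
    Euclidean distance."""
    dists = sorted((math.dist(x, xi), yi) for xi, yi in zip(train_X, train_y))
    labels = [yi for _dist, yi in dists[:k]]
    return Counter(labels).most_common(1)[0][0]

# Hypothetical feature scores paired with clustered semantic-class labels
# obtained from the training phase.
train_X = [[0.95, 0.05], [0.90, 0.10], [0.92, 0.08],
           [0.40, 0.55], [0.35, 0.60], [0.30, 0.65]]
train_y = ["likely-correct"] * 3 + ["likely-incorrect"] * 3
print(knn_predict(train_X, train_y, [0.93, 0.07]))  # likely-correct
```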
Although the flowchart 600 is illustrated as discrete operations, such as 602, 604, 606, 608, 610, 612, 614, 616, 618, 620, 622, 624, and 626, in certain embodiments, such discrete operations may be further divided into additional operations, combined into fewer operations, or eliminated, depending on the particular implementation, without detracting from the essence of the disclosed embodiments. It may be noted that the scenario 700 (including the program code) shown in
Therefore, the disclosed electronic device 102 may partition an input space of the DNN 104 into various semantic classes that may be human-comprehensible and may help humans (i.e., the user 114) understand the prediction output of the DNN 104 in a more meaningful way. A semantic class may indicate the prediction accuracy (such as described, for example, at 610 in
An exemplary dataset and experimental setup for the disclosure is presented in Table 1, as follows:
It should be noted that data provided in Table 1 may merely be taken as experimental data and may not be construed as limiting the present disclosure.
Exemplary experimental datasets and results of training phase of a classifier (i.e. described in
It should be noted that data provided in Table 2 may merely be taken as experimental data and may not be construed as limiting the present disclosure.
Exemplary experimental datasets and results of operational (i.e., real-time) phase (i.e. described in
It should be noted that data provided in Table 3 may merely be taken as experimental data and may not be construed as limiting the present disclosure.
Various embodiments of the disclosure may provide one or more non-transitory computer-readable storage media configured to store instructions that, in response to being executed, cause a system (such as the example system 202 or the electronic device 102) to perform operations. The operations may include receiving a first dataset associated with a real-time application. The operations may further include predicting, by a Deep Neural Network (DNN) pre-trained for a classification task of the real-time application, a first class associated with a first datapoint of the received first dataset. The operations may further include determining a first set of feature scores for the first datapoint based on the predicted first class associated with the first datapoint. The operations may further include identifying a set of confusing class pairs associated with the DNN based on the predicted first class and a predetermined class associated with the first datapoint. The operations may further include clustering the received first dataset into one of a set of semantic classes based on the determined first set of feature scores, the predicted first class, and the identified set of confusing class pairs for each datapoint in the received first dataset. Each of the set of semantic classes may be indicative of a prediction accuracy associated with a set of datapoints clustered in a corresponding semantic class of the set of semantic classes. The operations may further include training a classifier based on the clustered first dataset, the determined first set of feature scores, and the set of semantic classes.
Various embodiments of the disclosure may provide one or more non-transitory computer-readable storage media configured to store instructions that, in response to being executed, cause a system (such as the example system 202 or the electronic device 102) to perform operations. The operations may include receiving a second datapoint associated with a real-time application. The operations may further include predicting, by a Deep Neural Network (DNN) pre-trained for a classification task of the real-time application, a second class associated with the second datapoint. The operations may further include determining a second set of feature scores for the second datapoint based on the predicted second class associated with the received second datapoint. The operations may further include applying a pre-trained classifier on the received second datapoint, based on the determined second set of feature scores and the predicted second class. The classifier may be pre-trained to classify a datapoint into one of a set of semantic classes. Each of the set of semantic classes may be indicative of a prediction accuracy associated with a set of datapoints clustered in corresponding semantic class of the set of semantic classes. The operations may further include classifying the second datapoint into one of the set of semantic classes based on the application of the pre-trained classifier on the second datapoint. The operations may further include determining at least one action associated with the classified one of the set of semantic classes for the second datapoint. The operations may further include rendering the determined at least one action.
As used in the present disclosure, the terms “module” or “component” may refer to specific hardware implementations configured to perform the actions of the module or component and/or software objects or software routines that may be stored on and/or executed by general purpose hardware (e.g., computer-readable media, processing devices, etc.) of the computing system. In some embodiments, the different components, modules, engines, and services described in the present disclosure may be implemented as objects or processes that execute on the computing system (e.g., as separate threads). While some of the systems and methods described in the present disclosure are generally described as being implemented in software (stored on and/or executed by general purpose hardware), specific hardware implementations or a combination of software and specific hardware implementations are also possible and contemplated. In this description, a “computing entity” may be any computing system as previously defined in the present disclosure, or any module or combination of modules running on a computing system.
Terms used in the present disclosure and especially in the appended claims (e.g., bodies of the appended claims) are generally intended as “open” terms (e.g., the term “including” should be interpreted as “including, but not limited to,” the term “having” should be interpreted as “having at least,” the term “includes” should be interpreted as “includes, but is not limited to,” etc.).
Additionally, if a specific number of an introduced claim recitation is intended, such an intent will be explicitly recited in the claim, and in the absence of such recitation no such intent is present. For example, as an aid to understanding, the following appended claims may contain usage of the introductory phrases “at least one” and “one or more” to introduce claim recitations. However, the use of such phrases should not be construed to imply that the introduction of a claim recitation by the indefinite articles “a” or “an” limits any particular claim containing such introduced claim recitation to embodiments containing only one such recitation, even when the same claim includes the introductory phrases “one or more” or “at least one” and indefinite articles such as “a” or “an” (e.g., “a” and/or “an” should be interpreted to mean “at least one” or “one or more”); the same holds true for the use of definite articles used to introduce claim recitations.
In addition, even if a specific number of an introduced claim recitation is explicitly recited, those skilled in the art will recognize that such recitation should be interpreted to mean at least the recited number (e.g., the bare recitation of “two recitations,” without other modifiers, means at least two recitations, or two or more recitations). Furthermore, in those instances where a convention analogous to “at least one of A, B, and C, etc.” or “one or more of A, B, and C, etc.” is used, in general such a construction is intended to include A alone, B alone, C alone, A and B together, A and C together, B and C together, or A, B, and C together, etc.
Further, any disjunctive word or phrase presenting two or more alternative terms, whether in the description, claims, or drawings, should be understood to contemplate the possibilities of including one of the terms, either of the terms, or both terms. For example, the phrase “A or B” should be understood to include the possibilities of “A” or “B” or “A and B.”
All examples and conditional language recited in the present disclosure are intended for pedagogical objects to aid the reader in understanding the present disclosure and the concepts contributed by the inventor to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions. Although embodiments of the present disclosure have been described in detail, various changes, substitutions, and alterations could be made hereto without departing from the spirit and scope of the present disclosure.