The present disclosure relates generally to the field of machine learning, and in one aspect, but not by way of limitation, to systems and methods for building and using learning machines to understand and explain learning machines.
Learning machines are machines that can learn from data and perform tasks. Examples of learning machines include, but are not limited to, kernel machines, decision trees, decision forests, random forests, sum-product networks, Bayesian networks, Boltzmann machines, and neural networks. For example, graph-based learning machines such as neural networks, graph networks, sum-product networks, Boltzmann machines, and Bayesian networks typically consist of a group of nodes and interconnects that are able to process samples of data to generate an output for a given input, and learn from observations of the data samples to adapt or change. Such learning systems may be embodied in software executable by a processor or in hardware in the form of an integrated circuit chip or on a computer, or in a combination thereof.
One of the biggest challenges in using certain types of learning machines such as certain types of graph-based learning machines (e.g., neural networks, graph networks, Boltzmann machines, and sum-product networks) is that it is often extremely difficult to understand and explain the inner workings of such learning machines, which is commonly referred to as the ‘black box’ problem. The limitations in understanding and explaining such learning machines make it hard to trust the decisions made by such learning machines, make it hard to fix such learning machines when they generate incorrect outputs, make it hard to identify biases in the outputs generated by such learning machines, and make it difficult to design better learning machines.
A need therefore exists to develop systems and methods for building and using learning machines that can understand and explain other learning machines to address the above mentioned and other limitations and challenges.
The present disclosure provides practical applications and technical improvements to the field of machine learning, and more specifically to systems and methods for building and using learning machines to understand and explain learning machines.
In general, the present system may comprise a reference learning machine and an explainer learning machine being built for explaining and understanding the reference learning machine. The reference learning machines in the system may include, but are not limited to, sum-product networks, Bayesian networks, Boltzmann machines, and neural networks. The input signals can be grouped into one or more discrete states based on the outputs of the reference learning machine for a given input signal.
In some embodiments, the system feeds a set of test input signals through the reference learning machine and the outputs at the different components of the reference learning machine for each given test input signal are recorded. The recorded outputs at the different components of the reference learning machine for each given test input signal, along with the corresponding expected outputs of the reference learning machine for each given test input signal, are then used to update the parameters of the explainer learning machine. After the parameter update process, the explainer learning machine can then be queried for quantitative insights about the reference learning machine that include, but are not limited to: (1) the degree of importance of each component of the reference learning machine to the reference learning machine's generation of outputs for input signals associated with the possible states, (2) the set of components in the reference learning machine that have high degrees of importance to the reference learning machine's generation of outputs for a large number of states, (3) the set of components in the reference learning machine that have high degrees of importance to the reference learning machine's generation of outputs for only a small number of states, (4) the set of components in the reference learning machine that have low degrees of importance to the reference learning machine when generating outputs, (5) the degree of importance of each component of an input signal to the reference learning machine's generation of outputs associated with the possible states, and (6) a description of why the reference learning machine generated particular outputs given particular input signals.
In some embodiments, the reference learning machine in the system may be a graph-based reference learning machine including, but not limited to: neural networks, graph networks, sum-product networks, and Boltzmann machines, in which the components are nodes and interconnects, and in which the input signals can be grouped into one or more discrete states based on the outputs of the graph-based reference learning machine for a given input signal. The explainer learning machine being built for explaining and understanding the graph-based reference learning machine may comprise (1) an input signal component state importance assignment module, (2) a reference learning machine component state importance assignment module, (3) a reference learning machine component state importance classification module, (4) a set of matrices of parameters, with each matrix corresponding to a component in the reference learning machine, and each parameter in a matrix representing the normalized degree of importance of a component in the reference learning machine to the reference learning machine's generation of outputs given a particular state, and (5) a description generator module.
As used herein, a module may be implemented in software executable by one or more processors, in hardware or a combination thereof. Also as used herein, the term reference graph-based learning machine can be used interchangeably with the term graph-based reference learning machine.
The system may feed a set of test input signals through the reference graph-based learning machine and the outputs at each node in the reference graph-based learning machine for each given test input signal are recorded. The recorded outputs at each node in the graph-based learning machine for each given test input signal are then used to update each parameter in the set of parameter matrices in the explainer learning machine based on derived products of the recorded outputs along with the corresponding expected output of the reference graph-based learning machine for each given test input signal.
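By way of non-limiting illustration, the parameter update process described above may be sketched as follows, where the recorded outputs at each node and the corresponding expected outputs are assumed to be available as arrays; all function and variable names are hypothetical and the particular derived product (here, the magnitude of the recorded output) is only one of many possible choices:

```python
import numpy as np

def update_importance_matrices(node_outputs, expected_states, num_states):
    """Update one importance matrix per node from recorded outputs.

    node_outputs: dict mapping node id -> array of shape (num_samples,)
        holding the recorded output of that node for each test input.
    expected_states: array of shape (num_samples,) holding the state
        (e.g., class label) expected for each test input.
    Returns a dict mapping node id -> length-num_states vector of
    normalized degrees of importance.
    """
    matrices = {}
    for node, outputs in node_outputs.items():
        params = np.zeros(num_states)
        for out, state in zip(outputs, expected_states):
            # A derived product of the recorded output (here, its
            # magnitude) is accumulated against the expected state.
            params[state] += abs(out)
        total = params.sum()
        # Normalize so each parameter is a degree of importance in [0, 1].
        matrices[node] = params / total if total > 0 else params
    return matrices
```

The resulting vectors play the role of the rows of the parameter matrices: one normalized importance value per possible state for each node.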
After the parameter update process, the explainer learning machine can then be queried for: (1) the degree of importance of each node in the reference learning machine to the reference learning machine's generation of outputs given the possible states, (2) the set of nodes in the reference learning machine that have high degrees of importance to the reference learning machine's generation of outputs for a large number of states, (3) the set of nodes in the reference learning machine that have high degrees of importance to the reference learning machine's generation of outputs for only a small number of states, (4) the set of nodes in the reference learning machine that have low degrees of importance to the reference learning machine when generating outputs, (5) the degree of importance of each component of an input signal to the reference learning machine's generation of outputs given the possible states, and (6) a description of why the reference learning machine generated particular outputs given particular input signals.
In some embodiments, to respond to the query for the degree of importance of each node in the reference graph-based learning machine to the reference graph-based learning machine's generation of outputs for input signals given the possible states, the reference learning machine component state importance assignment module may retrieve the parameters from the set of parameter matrices and return them as the query result.
In some embodiments, to respond to the query for the set of nodes in the reference learning machine that have high degrees of importance to the reference learning machine's generation of outputs for only a small number of states, the reference learning machine component state importance classification module may analyze the statistical properties of the set of parameters in the set of parameter matrices corresponding to each node in the reference learning machine, and return the set of nodes that have aggregated parameter values above a given classification threshold and parameter value variances below another classification threshold, where the classification thresholds are either learned or predefined. The dominant state for a given node, which is determined by the reference learning machine component state importance classification module as the state associated with the highest degree of importance amongst the possible states for a given node, may also be returned as part of the query result.
In some embodiments, to respond to the query for the set of nodes in the reference learning machine that have high degrees of importance to the reference learning machine's generation of outputs for a large number of states, the reference learning machine component state importance classification module may analyze the statistical properties of the set of parameters in the set of parameter matrices corresponding to each node in the reference learning machine, and return the set of nodes that have aggregated parameter values above a given classification threshold and parameter value variances above another classification threshold, where the classification thresholds are either learned or predefined.
In some embodiments, to respond to the query for the set of nodes in the reference learning machine that have low degrees of importance to the reference learning machine when generating outputs, the reference learning machine component state importance classification module may analyze the statistical properties of the set of parameters in the parameter matrices corresponding to each node in the reference learning machine, and return the set of nodes that would not be returned as query results for either of the preceding two queries, that is, neither as nodes in the reference learning machine that have high degrees of importance to the reference learning machine's generation of outputs for only a small number of states nor as nodes in the reference learning machine that have high degrees of importance to the reference learning machine's generation of outputs for a large number of states.
In some embodiments, to respond to the query for the degree of importance of each component of an input signal to the reference learning machine's generation of outputs associated with the possible states, an input signal is fed through the reference graph-based learning machine and the outputs at each node of the reference graph-based learning machine for that test input signal are recorded. The recorded outputs at each node of the reference graph-based learning machine for that test input signal may then be fed, by the system, into the input signal component state importance assignment module, which queries the reference learning machine component state importance classification module for the set of nodes in the reference learning machine that have high degrees of importance to the reference learning machine's generation of outputs for only a small number of states, along with their dominant states. The input signal component state importance assignment module then may project derived products of the recorded outputs of the aforementioned set of nodes returned as query result by the reference learning machine component state importance classification module to the input signal domain, and these projected outputs may then be aggregated based on the dominant states of their associated nodes to determine the degree of importance of each component of an input signal to the reference learning machine's generation of outputs associated with the possible states. 
In some other embodiments, the values of a component of an input signal that has a degree of importance within a particular range may be replaced by alternative values to create an altered input signal, and the difference between the reference learning machine's outputs associated with the possible states when the input signal is fed through the reference learning machine and the reference learning machine's outputs associated with the possible states when the altered signal is fed through the reference learning machine may be computed and aggregated to provide an additional metric for the degree of importance of a component of an input signal to the reference learning machine's generation of outputs associated with the possible states. The range may include a lower bound and an upper bound, both of which may be set manually or determined in an automatic manner. By way of example, when an input signal (any input signal regardless of the source) is received by the reference learning machine, the reference learning machine will generate corresponding outputs (e.g., for an image classification network, the input signal is an image, and the output is the confidence that the image belongs to one of many categories). With the above described process, an additional metric may indicate how important each part of the input signal is to the corresponding output from the reference learning machine (e.g., in this example, the metric would indicate how important different parts of the image are to the confidence that the image belongs to one of many categories; for instance, the reference learning machine may classify the image as a dog because of the tail region of the input image).
In some embodiments, to respond to the query for a description of why the reference learning machine generated particular outputs given an input signal, the input signal, the expected outputs, the outputs generated by the reference learning machine given the input signal, and the degree of importance of each component of an input signal to the reference learning machine's generation of outputs associated with the possible states obtained by querying the explainer learning machine may be fed into the description generator module. The description generator module may construct a description (which can be in, but is not limited to, text, image, or audio format, or a combination of these) of why the reference learning machine generated particular outputs given an input signal that comprises, but is not limited to, some combination of: the input signal, the expected outputs, the outputs generated by the reference learning machine given the input signal, the dominant state of each component of an input signal, and the location of each component of an input signal within the input signal.
In some other embodiments, the explainer learning machine being built for explaining and understanding the reference graph-based learning machine may comprise (1) an input signal component state importance assignment module, (2) a reference learning machine component state importance assignment neural network, (3) a reference learning machine component state importance classification neural network, and (4) a description generator module.
The system may feed a set of test input signals through the reference graph-based learning machine and the outputs at each node of the reference graph-based learning machine for each given test input signal may be recorded. The recorded outputs at each node of the reference graph-based learning machine for each given test input signal may then be used to train the reference learning machine component state importance assignment neural network and the reference learning machine component state importance classification neural network based on derived products of the recorded outputs along with the corresponding expected output of the reference graph-based learning machine for each given test input signal.
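One possible, deliberately minimal stand-in for the reference learning machine component state importance assignment neural network is a single-layer softmax network trained on the derived products of the recorded node outputs against the expected outputs. This is only an illustrative sketch under that assumption; the function names, the choice of a one-layer network, and the hyperparameters are hypothetical:

```python
import numpy as np

def train_assignment_network(features, expected_states, num_states,
                             lr=0.1, epochs=200):
    """Train a minimal one-layer softmax network that maps derived
    products of a node's recorded outputs (features) to a distribution
    over states, interpreted as per-state degrees of importance."""
    n, d = features.shape
    W = np.zeros((d, num_states))
    b = np.zeros(num_states)
    onehot = np.eye(num_states)[expected_states]
    for _ in range(epochs):
        logits = features @ W + b
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        grad = (probs - onehot) / n                  # cross-entropy gradient
        W -= lr * features.T @ grad
        b -= lr * grad.sum(axis=0)
    return W, b

def importance(features, W, b):
    """Query the trained network: per-state degrees of importance."""
    logits = features @ W + b
    logits -= logits.max(axis=1, keepdims=True)
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)
```

In a practical embodiment the assignment network would typically be deeper and trained on richer derived products, but the query interface is the same: features in, per-state importance out.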
In some embodiments, after the neural network training process, the explainer learning machine can then be queried for: the degree of importance of each node in the reference learning machine to the reference learning machine's generation of outputs given the possible states, the set of nodes in the reference learning machine that have high degrees of importance to the reference learning machine's generation of outputs for a large number of states, the set of nodes in the reference learning machine that have high degrees of importance to the reference learning machine's generation of outputs for only a small number of states, the set of nodes in the reference learning machine that have low degrees of importance to the reference learning machine when generating outputs, and the degree of importance of each component of an input signal to the reference learning machine's generation of outputs given the possible states.
In some embodiments, to respond to the query for the degree of importance of each node of the reference graph-based learning machine to the reference graph-based learning machine's generation of outputs for input signals given the possible states, the reference learning machine component state importance assignment network is fed the derived products of the recorded outputs of each node along with the corresponding expected output of the graph-based learning machine for each given test input signal, and returns the network output (which is the degree of importance for each possible state) as the query result.
In some embodiments, to respond to the query for the set of nodes in the reference graph-based learning machine that have high degrees of importance to the reference learning machine's generation of outputs for only a small number of states, the reference learning machine component state importance classification network may be fed the degree of importance of each node in the reference graph-based learning machine to the reference graph-based learning machine's generation of outputs for input signals given the possible states from the reference learning machine component state importance assignment network, and the network outputs which of the states each node is associated with. In some embodiments, each node may be associated with three states (1. Node with high degree of importance to a large number of states, 2. Node with high degree of importance to a small number of states, and 3. Node with low degree of importance to the reference learning machine's output generation). The set of nodes classified as being nodes with high degree of importance to a small number of states may be returned as the query result. The dominant state for a given node, which is determined from the output of the reference learning machine component state importance assignment network as the state associated with the highest degree of importance amongst the possible states for a given node, may also be returned as part of the query result.
In some embodiments, to respond to the query for the set of nodes in the reference learning machine that have high degrees of importance to the reference learning machine's generation of outputs for a large number of states, the reference learning machine component state importance classification network may be fed the degree of importance of each node of the reference graph-based learning machine to the reference graph-based learning machine's generation of outputs for input signals given the possible states from the reference learning machine component state importance assignment network, and the network outputs which of the states each node is associated with. In some embodiments, each node may be associated with three states (1. Node with high degree of importance to a large number of states, 2. Node with high degree of importance to a small number of states, and 3. Node with low degree of importance to the reference learning machine's output generation). The set of nodes classified as being nodes with high degree of importance to a large number of states may be returned as the query result.
In some embodiments, to respond to the query for the set of nodes in the reference learning machine that have low degrees of importance to the reference learning machine when generating outputs, the reference learning machine component state importance classification network may be fed the degree of importance of each node of the reference graph-based learning machine to the reference graph-based learning machine's generation of outputs for input signals given the possible states from the reference learning machine component state importance assignment network, and the network outputs which of the states each node is associated with. In some embodiments, each node may be associated with three states (1. Node with high degree of importance to a large number of states, 2. Node with high degree of importance to a small number of states, and 3. Node with low degree of importance to the reference learning machine's output generation). The set of nodes classified as being nodes with low degree of importance may be returned as the query result.
In some embodiments, to respond to the query for the degree of importance of each component of an input signal to the reference learning machine's generation of outputs associated with the possible states, an input signal may be fed through the reference graph-based learning machine and the outputs at each node of the graph-based learning machine for that test input signal may be recorded. The recorded outputs at each node of the graph-based learning machine for that test input signal may then be fed into the input signal component state importance assignment module, which queries the reference learning machine component state importance classification network for the set of nodes in the reference learning machine that have high degrees of importance to the reference learning machine's generation of outputs for only a small number of states, along with their dominant states. The input signal component state importance assignment module may then project derived products of the recorded outputs of the aforementioned set of nodes returned as query result by the reference learning machine component state importance classification module to the input signal domain, and these projected outputs may then be aggregated based on the dominant states of their associated nodes to determine the degree of importance of each component of an input signal to the reference learning machine's generation of outputs associated with the possible states. 
In some other embodiments, the values of a component of an input signal that has a degree of importance within a particular range may be replaced by alternative values to create an altered input signal, and the difference between the reference learning machine's outputs associated with the possible states when the input signal is fed through the reference learning machine and the reference learning machine's outputs associated with the possible states when the altered signal is fed through the reference learning machine may be computed and aggregated to provide an additional metric for the degree of importance of a component of an input signal to the reference learning machine's generation of outputs associated with the possible states. The range may include a lower bound and an upper bound, both of which may be set manually or determined in an automatic manner.
In some embodiments, to respond to the query for a description of why the reference learning machine generated particular outputs given an input signal, the input signal, the expected outputs, the outputs generated by the reference learning machine given the input signal, and the degree of importance of each component of an input signal to the reference learning machine's generation of outputs associated with the possible states obtained by querying the explainer learning machine may be fed into the description generator module. The description generator module may construct a description (which can be in, but is not limited to, text, image, or audio format, or a combination of these) of why the reference learning machine generated particular outputs given an input signal that comprises, but is not limited to, some combination of: the input signal, the expected outputs, the outputs generated by the reference learning machine given the input signal, the dominant state of each component of an input signal, and the location of each component of an input signal within the input signal.
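As a non-limiting sketch of a text-format description generator, the query results may be composed into an explanation with a simple template; the field names (`location`, `dominant_state`) and the function signature are hypothetical placeholders for whatever representation a given embodiment uses:

```python
def generate_description(input_name, expected, generated, components):
    """Compose a text description of why the reference learning machine
    generated a particular output, from the explainer query results.

    components: list of dicts, each describing one important input-signal
    component with hypothetical keys 'location' and 'dominant_state'.
    """
    parts = [
        f"The reference learning machine output '{generated}' "
        f"(expected '{expected}') for input '{input_name}' because:"
    ]
    for comp in components:
        parts.append(
            f" - the component at {comp['location']} was most important "
            f"to the state '{comp['dominant_state']}'"
        )
    return "\n".join(parts)
```

An image- or audio-format description generator would follow the same pattern, substituting rendering of highlighted regions or synthesized speech for string concatenation.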
In some embodiments, the reference learning machine and the explainer learning machine built for explaining and understanding the reference learning machine may be embodied in software, or in hardware in the form of an integrated circuit chip, a digital signal processor chip, or on a computing device, or a combination thereof.
In this respect, before explaining at least one embodiment of the disclosure in detail, it is to be understood that the disclosure is not limited in its application to the details of construction and to the arrangements of the components set forth in the following description or the examples provided therein or illustrated in the drawings. Therefore, it will be appreciated that a number of variants and modifications can be made without departing from the teachings of the disclosure as a whole. Accordingly, the present system, method and apparatus are capable of other embodiments and of being practiced and carried out in various ways. Also, it is to be understood that the phraseology and terminology employed herein are for the purpose of description and should not be regarded as limiting.
As noted above, the present disclosure provides practical applications and technical improvements to the field of machine learning, and more specifically to systems and methods for building and using learning machines for understanding and explaining learning machines.
The present system and method will be better understood, and objects of the disclosure will become apparent, when consideration is given to the following detailed description thereof. Such description makes reference to the annexed drawings, wherein:
In the drawings, embodiments are illustrated by way of example. It is to be expressly understood that the description and drawings are only for the purpose of illustration and as an aid to understanding, and are not intended to describe the accurate performance and behavior of the embodiments or to define the limits of the invention.
The present disclosure relates to systems and methods for building and using learning machines to understand and explain learning machines.
It will be appreciated that for simplicity and clarity of illustration, where considered appropriate, reference numerals may be repeated among the figures to indicate corresponding or analogous elements or steps. In addition, numerous specific details are set forth in order to provide a thorough understanding of the embodiments described herein. However, it will be understood by those of ordinary skill in the art that the embodiments described herein may be practiced without these specific details. In other instances, well-known methods, procedures and components have not been described in detail so as not to obscure the embodiments described herein. Furthermore, this description is not to be considered as limiting the scope of the embodiments described herein in any way, but rather as merely describing the implementation of the various embodiments described herein.
In an aspect, with reference to
In some embodiments, the reference learning machine 101 and the explainer learning machine 102 may be embodied in hardware in the form of an integrated circuit chip, a digital signal processor chip, or on a computer. Learning machines may also be embodied in hardware in the form of an integrated circuit chip or on a computer.
With reference to
With reference to
In some embodiments, to respond to the query for the degree of importance of each node of the reference graph-based learning machine to the reference graph-based learning machine's generation of outputs for input signals given the possible states, the reference learning machine component state importance assignment module 304 may retrieve the parameters from the set of parameter matrices 306 and return them as the query result.
In some embodiments, to respond to the query for the set of nodes in the reference learning machine that have high degrees of importance to the reference learning machine's generation of outputs for only a small number of states, the reference learning machine component state importance classification module 305 may analyze the statistical properties of the set of parameters P_{node i,state 1}, P_{node i,state 2}, . . . , P_{node i,state k} (where k is the number of possible states) in the set of parameter matrices corresponding to each node i in the reference learning machine, and return the set of nodes that have aggregated parameter values (A) above a given classification threshold T1 and parameter value variances (S) below another classification threshold T2, where the classification thresholds are either learned or predefined. In some embodiments, the aggregated parameter value A_i for node i may be determined as the average parameter value across all states for node i and the parameter value variance S_i for node i may be determined as the variance of parameter values across all states for node i. Note that this is an illustrative embodiment for determining these statistical properties, and the disclosure is not limited to the particular embodiments described. The dominant state for a given node, which is determined by the reference learning machine component state importance classification module as the state associated with the highest degree of importance amongst the possible states for a given node, may also be returned as part of the query result.
In some embodiments, to respond to the query for the set of nodes in the reference learning machine that have high degrees of importance to the reference learning machine's generation of outputs for a large number of states, the reference learning machine component state importance classification module 305 may analyze the statistical properties of the set of parameters in the set of parameter matrices corresponding to each node in the reference learning machine, and return the set of nodes that have aggregated parameter values (A) above a given classification threshold T1 and parameter value variances (S) above another classification threshold T2, where the classification thresholds are either learned or predefined.
In some embodiments, to respond to the query for the set of nodes in the reference learning machine that have low degrees of importance to the reference learning machine when generating outputs, the reference learning machine component state importance classification module 305 may analyze the statistical properties of the set of parameters in the parameter matrices corresponding to each node in the reference learning machine, and return the set of nodes that would not be returned as query results for either the set of nodes in the reference learning machine that have high degrees of importance to the reference learning machine's generation of outputs for only a small number of states or the set of nodes in the reference learning machine that have high degrees of importance to the reference learning machine's generation of outputs for a large number of states.
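The three classification queries above can be sketched together as a single pass over the parameter matrices, using the average for A_i and the variance for S_i as in the illustrative embodiment; the thresholds `t1` and `t2`, the function name, and the return layout are hypothetical:

```python
import numpy as np

def classify_nodes(matrices, t1, t2):
    """Classify each node by the statistical properties of its
    importance parameters across states, with threshold t1 on the
    aggregated value A_i and t2 on the variance S_i."""
    few_state, many_state, low_importance, dominant = [], [], [], {}
    for node, params in matrices.items():
        a = params.mean()              # aggregated parameter value A_i
        s = params.var()               # parameter value variance S_i
        dominant[node] = int(params.argmax())
        if a > t1 and s < t2:
            # High importance for only a small number of states
            # (A above T1, S below T2, per the described embodiment).
            few_state.append(node)
        elif a > t1:
            # High importance for a large number of states
            # (A above T1, S above T2).
            many_state.append(node)
        else:
            # Returned by neither of the two preceding queries.
            low_importance.append(node)
    return few_state, many_state, low_importance, dominant
```

The `dominant` map corresponds to the dominant state returned as part of the query result: the state with the highest degree of importance for each node.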
In some embodiments, to respond to the query for the degree of importance of each component of an input signal to the reference learning machine's generation of outputs associated with the possible states, a test input signal 310 may be fed through the reference graph-based learning machine and the outputs at each node of the graph-based learning machine for that test input signal 310 are recorded. The recorded outputs at each node of the graph-based learning machine for that test input signal may then be fed into the input signal component state importance assignment module 303, which queries the reference learning machine component state importance classification module 305 for the set of nodes in the reference learning machine that have high degrees of importance to the reference learning machine's generation of outputs for only a small number of states, along with their dominant states. The input signal component state importance assignment module 303 may then project derived products of the recorded outputs of the aforementioned set of nodes returned as query result by the reference learning machine component state importance classification module to the input signal domain, and these projected outputs may then be aggregated based on the dominant states of their associated nodes to determine the degree of importance of each component of an input signal to the reference learning machine's generation of outputs associated with the possible states. In some embodiments, the input signal domain may be defined as the domain in which the input signal that is fed into the graph-based learning machine is expressed. For example, in the case where the input signal is a digital image, the input signal domain may be the spatial domain. In another example, in the case where the input signal is a digital audio signal, the input signal domain may be the time domain or the frequency domain.
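For the image case, where the input signal domain is the spatial domain, the projection-and-aggregation step may be sketched as follows. The nearest-neighbour upsampling stands in for whatever projection a given embodiment uses, and all names are hypothetical; the only assumption carried over from the text is that each contributing node has a recorded response map and a dominant state:

```python
import numpy as np

def project_to_input_domain(node_maps, dominant, input_shape, num_states):
    """Project recorded node output maps back to the input signal
    domain and aggregate them per dominant state, yielding one
    importance map per possible state.

    node_maps: dict of node id -> 2-D response map (lower resolution).
    dominant: dict of node id -> dominant state index.
    """
    maps = np.zeros((num_states,) + input_shape)
    for node, out in node_maps.items():
        # Nearest-neighbour upsampling as a simple projection of the
        # low-resolution node response map to the input resolution.
        ry = input_shape[0] // out.shape[0]
        rx = input_shape[1] // out.shape[1]
        projected = np.kron(out, np.ones((ry, rx)))
        # Aggregate by the dominant state of the contributing node.
        maps[dominant[node]] += projected
    return maps
```

Each entry of the resulting per-state map is the degree of importance of the corresponding input-signal component to the reference learning machine's generation of outputs for that state.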
In some other embodiments, the values of a component of an input signal 310 that has a degree of importance within a particular range may be replaced by alternative values to create an altered input signal. In an illustrative embodiment, the values of a component of an input signal 310 that has a degree of importance between a lower bound lb and an upper bound ub are set to zero to create an altered input signal. In another illustrative embodiment, the values of a component of an input signal that has a degree of importance between a lower bound lb and an upper bound ub are set to a random value U generated by a random number generator to create an altered input signal A. It is important to note that other alternative values may be used, and the above example embodiments are not meant to be limiting. It is also important to note that the lower bound lb and the upper bound ub may be set manually or determined in an automatic manner. The difference between the reference learning machine's outputs O_I associated with the possible states when the input signal 310 is fed through the reference learning machine and the reference learning machine's outputs O_A associated with the possible states when the altered signal A is fed through the reference learning machine is computed and aggregated to provide an additional metric M for the degree of importance of a component of an input signal to the reference learning machine's generation of outputs associated with the possible states.
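Both illustrative alterations (zeroing and random replacement of components whose importance falls between lb and ub) can be sketched in a few lines; the function name and the uniform random source are illustrative assumptions:

```python
import numpy as np

def alter_components(signal, importance, lb, ub, mode="zero", rng=None):
    """Replace the values of input components whose degree of importance
    lies between lb and ub, producing an altered input signal A."""
    altered = np.array(signal, dtype=float)
    mask = (importance >= lb) & (importance <= ub)
    if mode == "zero":
        altered[mask] = 0.0                       # set to zero
    elif mode == "random":
        rng = rng or np.random.default_rng(0)     # random value U
        altered[mask] = rng.uniform(size=int(mask.sum()))
    return altered

# Components with importance in [0.4, 1.0] are zeroed out.
altered = alter_components([1.0, 2.0, 3.0], np.array([0.1, 0.5, 0.9]), 0.4, 1.0)
```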
M = J(O_I − O_A), where J denotes the aggregation function.
In an illustrative embodiment, the metric M can be defined as the squared error between O_I and O_A:
M = (O_I − O_A)^2
In another illustrative embodiment, the metric M can be defined as the absolute error between O_I and O_A:
M=|O_I−O_A|
It is important to note that other alternative metrics may be used, and the above example embodiments are not meant to be limiting.
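The metric M can be sketched as below, with the aggregation J taken to be a sum over the possible states; that choice, and the function name, are illustrative assumptions:

```python
import numpy as np

def importance_metric(o_i, o_a, aggregate="squared"):
    """Aggregate the difference between the reference machine's outputs on
    the original input (O_I) and the altered input (O_A) into a scalar M."""
    d = np.asarray(o_i, dtype=float) - np.asarray(o_a, dtype=float)
    if aggregate == "squared":
        return float(np.sum(d ** 2))     # M = sum((O_I - O_A)^2)
    return float(np.sum(np.abs(d)))      # M = sum(|O_I - O_A|)
```

A large M indicates that altering the component changed the outputs substantially, i.e., the component has a high degree of importance.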
In some embodiments, to respond to the query for a description of why the reference learning machine generated particular outputs given an input signal, the input signal 310, the expected outputs 309, the outputs generated by the reference learning machine given the input signal, and the degree of importance of each component of an input signal to the reference learning machine's generation of outputs associated with the possible states, obtained by querying the explainer learning machine, may be fed into the description generator module 311. The description generator module 311 may construct a description (which can be in, but is not limited to, text, image, or audio format, or a combination of these) of why the reference learning machine generated particular outputs given an input signal 310 that comprises, but is not limited to, some combination of: the input signal 310, the expected outputs 309, the outputs generated by the reference learning machine given the input signal, the dominant state of each component of an input signal that has a degree of importance between a lower bound lb and upper bound ub, and the location of each component of an input signal within the input signal. In an illustrative example where the reference learning machine is a neural network for determining if a car should steer left or right, where the input signal 310 is an image captured from a camera on a car, the output is either a decision of ‘steer left’ or ‘steer right’, the expected output is either a decision of ‘steer left’ or ‘steer right’, and components of the input signal are objects (such as cars, posts, lane markings, trees, pedestrians, etc.) in the image, the description generator module 311 may construct a text description of why the reference learning machine generated particular outputs given an input signal in the following form:
In another illustrative example where the reference learning machine is a neural network for determining if a concentration of chlorine is problematic or not problematic, where the input signal 310 is the chlorine levels at different pipe junctions at different times, the output is either a decision of ‘problematic’ or ‘not problematic’, the expected output is either a decision of ‘problematic’ or ‘not problematic’, and components of the input signal are chlorine levels at different pipe junctions at different times, the description generator module 311 may construct a text description of why the reference learning machine generated particular outputs given an input signal in the following form:
In yet another illustrative example where the reference learning machine is a neural network for determining if a stock is a ‘buy’ or a ‘sell’, where the input signal 310 is the closing stock prices at different times, the output is either a decision of ‘buy’ or ‘sell’, the expected output is either a decision of ‘buy’ or ‘sell’, and components of the input signal are closing stock prices, open stock prices, and trade volumes at different times, the description generator module 311 may construct a text description of why the reference learning machine generated particular outputs given an input signal in the following form:
It is important to note that other alternative description formats may be used, and the above example embodiments are not meant to be limiting.
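The text-format description construction discussed above can be sketched as a simple template. The tuple structure of the important components (name, dominant state, location) and the wording of the template are hypothetical choices for illustration:

```python
def build_description(output, expected, important_components):
    """Assemble a plain-text explanation from the query results.

    important_components: list of (component_name, dominant_state, location)
    tuples, an assumed structure for the components whose degree of
    importance falls between lb and ub.
    """
    lines = [f"The machine decided ‘{output}’ (expected: ‘{expected}’) because:"]
    for name, state, location in important_components:
        lines.append(f"- the {name} at the {location} contributed strongly to ‘{state}’")
    return "\n".join(lines)

# Hypothetical steering example drawn from the illustration above.
desc = build_description(
    "steer left", "steer left",
    [("pedestrian", "steer left", "right side of the image")],
)
```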
With reference to
With reference to
In some embodiments, to respond to the query for the degree of importance of each node of the reference graph-based learning machine to the reference graph-based learning machine's generation of outputs for input signals given the possible states, the reference learning machine component state importance assignment network 404 may be fed the derived products of the recorded outputs of each node along with the corresponding expected output 408 of the graph-based learning machine for each given test input signal 406, and returns the network output (which is the degree of importance for each possible state) as the query result.
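One way the reference learning machine component state importance assignment network 404 might be realized is as a small feed-forward network. The sketch below is an assumption for illustration only: the layer sizes, the ReLU hidden layer, and the softmax readout of per-state importance are hypothetical choices, not disclosed architecture:

```python
import numpy as np

def importance_assignment_network(features, w1, b1, w2, b2):
    """Map derived products of recorded node outputs (plus the expected
    output) to a degree of importance for each possible state."""
    h = np.maximum(0.0, features @ w1 + b1)   # ReLU hidden layer
    logits = h @ w2 + b2
    e = np.exp(logits - logits.max())         # numerically stable softmax
    return e / e.sum()                        # importance per state

# Hypothetical fixed weights: 4 input features, 8 hidden units, 3 states.
rng = np.random.default_rng(0)
w1, b1 = rng.normal(size=(4, 8)), np.zeros(8)
w2, b2 = rng.normal(size=(8, 3)), np.zeros(3)
scores = importance_assignment_network(rng.normal(size=4), w1, b1, w2, b2)
```

The network output `scores` would be returned as the query result, one importance value per possible state.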
In some embodiments, to respond to the query for the nodes in the reference learning machine that have high degrees of importance to the reference learning machine's generation of outputs for only a small number of states, the reference learning machine component state importance classification network 405 may be fed the degree of importance of each node of the reference graph-based learning machine to the reference graph-based learning machine's generation of outputs for input signals given the possible states from the reference learning machine component state importance assignment network 404, and the network outputs an indication of which states each node is associated with. In some embodiments, each node may be associated with three states (1. Node has high degree of importance to a large number of states, 2. Node has high degree of importance to a small number of states, and 3. Node has low degree of importance to the reference learning machine's output generation). The set of nodes classified as being nodes with high degree of importance to a small number of states may be returned as the query result. The dominant state for a given node, which is determined from the output of the reference learning machine component state importance assignment network 404 as the state in which a given node has the highest degree of importance amongst the possible states, may also be returned as part of the query result.
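The three-way categorization and the dominant-state determination described above can be sketched as follows; the thresholds are illustrative assumptions rather than disclosed values:

```python
import numpy as np

def classify_node(importances, high=0.5, many_frac=0.5):
    """Categorize one node from its per-state importance vector:
    'low', 'few_states', or 'many_states' (illustrative thresholds)."""
    high_states = importances >= high
    if not high_states.any():
        return "low"                      # low importance everywhere
    if high_states.mean() >= many_frac:
        return "many_states"              # important to many states
    return "few_states"                   # important to only a few states

def dominant_state(importances):
    """The state in which the node has the highest degree of importance."""
    return int(np.argmax(importances))
```

A "few states" query would return the nodes for which `classify_node` yields `"few_states"`, together with each node's `dominant_state`.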
In some embodiments, to respond to the query for the nodes in the reference learning machine that have high degrees of importance to the reference learning machine's generation of outputs for a large number of states, the reference learning machine component state importance classification network 405 may be fed the degree of importance of each node of the reference graph-based learning machine to the reference graph-based learning machine's generation of outputs for input signals given the possible states from the reference learning machine component state importance assignment network 404, and the network outputs an indication of which states each node is associated with. In some embodiments, each node may be associated with three states (1. Node has high degree of importance to a large number of states, 2. Node has high degree of importance to a small number of states, and 3. Node has low degree of importance to the reference learning machine's output generation). The set of nodes classified as being nodes with high degree of importance to a large number of states may be returned as the query result.
In some embodiments, to respond to the query for the nodes in the reference learning machine that have low degrees of importance to the reference learning machine when generating outputs, the reference learning machine component state importance classification network 405 may be fed the degree of importance of each node of the reference graph-based learning machine to the reference graph-based learning machine's generation of outputs for input signals given the possible states from the reference learning machine component state importance assignment network 404, and the network outputs an indication of which states each node is associated with. In some embodiments, each node may be associated with three states (1. Node has high degree of importance to a large number of states, 2. Node has high degree of importance to a small number of states, and 3. Node has low degree of importance to the reference learning machine's output generation). The set of nodes classified as being nodes with low degree of importance may be returned as the query result.
In some embodiments, to respond to the query for the degree of importance of each component of an input signal 409 to the reference learning machine's generation of outputs associated with the possible states, an input signal 409 may be fed through the reference graph-based learning machine 401 and the outputs at each node of the graph-based learning machine for that input signal 409 may be recorded. The recorded outputs at each node of the graph-based learning machine for that test input signal 409 may then be fed into the input signal component state importance assignment module 403, which queries the reference learning machine component state importance classification network 405 for the set of nodes in the reference learning machine that have high degrees of importance to the reference learning machine's generation of outputs for only a small number of states, along with their dominant states. The input signal component state importance assignment module 403 may then project derived products of the recorded outputs of the aforementioned set of nodes returned as the query result by the reference learning machine component state importance classification network 405 to the input signal domain, and these projected outputs may then be aggregated based on the dominant states of their associated nodes to determine the degree of importance of each component of an input signal to the reference learning machine's generation of outputs associated with the possible states.
In some other embodiments, the values of a component of an input signal 409 that has a degree of importance within a particular range may be replaced by alternative values to create an altered input signal. In an illustrative embodiment, the values of a component of an input signal 409 that has a degree of importance between a lower bound lb and an upper bound ub are set to zero to create an altered input signal. In another illustrative embodiment, the values of a component of an input signal that has a degree of importance between a lower bound lb and an upper bound ub are set to a random value U generated by a random number generator to create an altered input signal A. It is important to note that other alternative values may be used, and the above example embodiments are not meant to be limiting. It is also important to note that the lower bound lb and the upper bound ub may be set manually or determined in an automatic manner. The difference between the reference learning machine's outputs O_I associated with the possible states when the input signal 409 is fed through the reference learning machine and the reference learning machine's outputs O_A associated with the possible states when the altered signal A is fed through the reference learning machine is computed and aggregated to provide an additional metric M for the degree of importance of a component of an input signal to the reference learning machine's generation of outputs associated with the possible states.
M = J(O_I − O_A), where J denotes the aggregation function.
In an illustrative embodiment, the metric M can be defined as the squared error between O_I and O_A:
M = (O_I − O_A)^2
In another illustrative embodiment, the metric M can be defined as the absolute error between O_I and O_A:
M=|O_I−O_A|
It is important to note that other alternative metrics may be used, and the above example embodiments are not meant to be limiting.
In some embodiments, to respond to the query for a description of why the reference learning machine generated particular outputs given an input signal, the input signal 409, the expected outputs 408, the outputs generated by the reference learning machine given the input signal, and the degree of importance of each component of an input signal to the reference learning machine's generation of outputs associated with the possible states, obtained by querying the explainer learning machine, may be fed into the description generator module 410. The description generator module 410 may construct a description (which can be in, but is not limited to, text, image, or audio format, or a combination of these) of why the reference learning machine generated particular outputs given an input signal 409 that may comprise, but is not limited to, some combination of: the input signal 409, the expected outputs 408, the outputs generated by the reference learning machine given the input signal, the dominant state of each component of an input signal that has a degree of importance between a lower bound lb and upper bound ub, and the location of each component of an input signal within the input signal. In an illustrative example where the reference learning machine is a neural network for determining if a car should steer left or right, where the input signal 409 is an image captured from a camera on a car, the output is either a decision of ‘steer left’ or ‘steer right’, the expected output is either a decision of ‘steer left’ or ‘steer right’, and components of the input signal are objects (such as cars, posts, lane markings, trees, pedestrians, etc.) in the image, the description generator module 410 may construct a text description of why the reference learning machine generated particular outputs given an input signal in the following form:
In another illustrative example where the reference learning machine is a neural network for determining if a concentration of chlorine is problematic or not problematic, where the input signal 409 is the chlorine levels at different pipe junctions at different times, the output is either a decision of ‘problematic’ or ‘not problematic’, the expected output is either a decision of ‘problematic’ or ‘not problematic’, and components of the input signal are chlorine levels at different pipe junctions at different times, the description generator module 410 may construct a text description of why the reference learning machine generated particular outputs given an input signal in the following form:
In yet another illustrative example where the reference learning machine is a neural network for determining if a stock is a ‘buy’ or a ‘sell’, where the input signal 409 is the closing stock prices at different times, the output is either a decision of ‘buy’ or ‘sell’, the expected output is either a decision of ‘buy’ or ‘sell’, and components of the input signal are closing stock prices, open stock prices, and trade volumes at different times, the description generator module 410 may construct a text description of why the reference learning machine generated particular outputs given an input signal in the following form:
It is important to note that other alternative description formats may be used, and the above example embodiments are not meant to be limiting.
With reference to
Now referring to
Now referring to
It is important to note that other visualization methods of displaying the query results from the explainer learning machine such as hard masking and multiple overlays of degree of importance may be used in the present system and that these embodiments should not to be considered as limiting.
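The masking-based visualizations mentioned above might be sketched as follows, for an input signal expressed in the spatial domain; the function names, the normalization, and the blending factor are illustrative assumptions:

```python
import numpy as np

def soft_mask(image, importance, alpha=0.6):
    """Overlay a normalized importance map on a (grayscale) input signal."""
    imp = importance / max(float(importance.max()), 1e-12)
    return (1 - alpha) * image + alpha * imp   # alpha blends overlay strength

def hard_mask(image, importance, threshold=0.5):
    """Keep only the components whose importance exceeds the threshold."""
    return image * (importance >= threshold)

masked = hard_mask(np.array([1.0, 2.0]), np.array([0.2, 0.8]))
overlay = soft_mask(np.array([1.0, 0.0]), np.array([0.0, 1.0]))
```

Multiple overlays, one per possible state, could be produced by applying `soft_mask` to each row of the per-state importance maps.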
Once again, all systems described herein may utilize a computing device, such as a computing device as described with reference to
Now referring to
While illustrative embodiments have been described above by way of example, it will be appreciated that various changes and modifications may be made without departing from the scope of the invention, which is defined by the following claims.
One or more of the components, processes, features, and/or functions illustrated in the figures may be rearranged and/or combined into a single component, block, feature or function or embodied in several components, steps, or functions. Additional elements, components, processes, and/or functions may also be added without departing from the disclosure. The apparatus, devices, and/or components illustrated in the Figures may be configured to perform one or more of the methods, features, or processes described in the Figures. The algorithms described herein may also be efficiently implemented in software and/or embedded in hardware.
Note that the aspects of the present disclosure may be described herein as a process that is depicted as a flowchart, a flow diagram, a structure diagram, or a block diagram. Although a flowchart or diagram may describe the operations as a sequential process, many of the operations can be performed in parallel or concurrently. In addition, the order of the operations may be re-arranged. A process is terminated when its operations are completed. A process may correspond to a method, a function, a procedure, a subroutine, a subprogram, etc. When a process corresponds to a function, its termination corresponds to a return of the function to the calling function or the main function.
Those of skill in the art would further appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the aspects disclosed herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and processes have been described above generally in terms of their functionality. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the overall system. The enablements described above are considered novel over the prior art and are considered critical to the operation of at least one aspect of the disclosure and to the achievement of the above described objectives. The words used in this specification to describe the instant embodiments are to be understood not only in the sense of their commonly defined meanings, but to include by special definition in this specification: structure, material or acts beyond the scope of the commonly defined meanings. Thus, if an element can be understood in the context of this specification as including more than one meaning, then its use must be understood as being generic to all possible meanings supported by the specification and by the word or words describing the element.
The definitions of the words or drawing elements described above are meant to include not only the combination of elements which are literally set forth, but all equivalent structure, material or acts for performing substantially the same function in substantially the same way to obtain substantially the same result. In this sense it is therefore contemplated that an equivalent substitution of two or more elements may be made for any one of the elements described and its various embodiments or that a single element may be substituted for two or more elements in a claim.
Changes from the claimed subject matter as viewed by a person with ordinary skill in the art, now known or later devised, are expressly contemplated as being equivalents within the scope intended and its various embodiments. Therefore, obvious substitutions now or later known to one with ordinary skill in the art are defined to be within the scope of the defined elements. This disclosure is thus meant to be understood to include what is specifically illustrated and described above, what is conceptually equivalent, what can be obviously substituted, and also what incorporates the essential ideas.
In the foregoing description and in the figures, like elements are identified with like reference numerals. The use of “e.g.,” “etc.,” and “or” indicates non-exclusive alternatives without limitation, unless otherwise noted. The use of “including” or “includes” means “including, but not limited to,” or “includes, but not limited to,” unless otherwise noted.
As used above, the term “and/or” placed between a first entity and a second entity means one of (1) the first entity, (2) the second entity, and (3) the first entity and the second entity. Multiple entities listed with “and/or” should be construed in the same manner, i.e., “one or more” of the entities so conjoined. Other entities may optionally be present other than the entities specifically identified by the “and/or” clause, whether related or unrelated to those entities specifically identified. Thus, as a non-limiting example, a reference to “A and/or B”, when used in conjunction with open-ended language such as “comprising” can refer, in one embodiment, to A only (optionally including entities other than B); in another embodiment, to B only (optionally including entities other than A); in yet another embodiment, to both A and B (optionally including other entities). These entities may refer to elements, actions, structures, processes, operations, values, and the like.
The present application is a continuation of International Application No. PCT/CA2019/050377, filed Mar. 27, 2019, which claims priority to U.S. Provisional Application Ser. No. 62/724,566, filed Aug. 29, 2018, the disclosures of both of which are incorporated herein by reference in their entireties.
Provisional Application: No. 62/724,566, filed Aug. 2018 (US).
Parent Application: PCT/CA2019/050377, filed Mar. 2019 (US); Child Application: U.S. Ser. No. 17/187,743.