Claims
- 1. A method of computing a posterior probability estimate for a sequential detector system comprising:
selecting samples of a data set sequentially, wherein each selected sample is processed comprising:
performing a likelihood computation based upon said sample; accumulating said likelihood computation with likelihood computations from previously processed samples; and, computing said posterior probability estimate based upon the accumulation of said likelihood computations.
- 2. The method of computing a posterior probability estimate for a sequential detector system according to claim 1, wherein said posterior probability estimate defines a measure of the likelihood that a source phenomenon of interest being tested belongs to a particular class.
- 3. The method of computing a posterior probability estimate for a sequential detector system according to claim 1, wherein said posterior probability estimate is used to discriminate between at least two classes.
- 4. The method of computing a posterior probability estimate for a sequential detector system according to claim 1, wherein said posterior probability estimate is used to perform a feature selection.
- 5. The method of computing a posterior probability estimate for a sequential detector system according to claim 1, wherein said likelihood computation is expressed as $z_k$ and the accumulation of said likelihood computations is expressed as Σ
- 6. The method of computing a posterior probability estimate for a sequential detector system according to claim 1, wherein said posterior probability estimate is computed by implementing a neural network configured to approximate Bayes optimal discriminant functions.
- 7. The method of computing a posterior probability estimate for a sequential detector system according to claim 1, wherein said posterior probability estimate is computed by constructing a first neural network implemented as a feedforward neural network having at least one input, at least one hidden layer that utilizes a hyperbolic tangent activation, and an output.
- 8. The method of computing a posterior probability estimate for a sequential detector system according to claim 1, wherein said posterior probability estimate is computed by constructing a first neural network comprising accumulating said likelihood computations into a linear output and transforming said linear output into a sigmoid output.
- 9. The method of computing a posterior probability estimate for a sequential detector system according to claim 1, wherein said posterior probability estimate is denoted $\hat{\pi}$ and is given by the formula
- 10. The method of computing a posterior probability estimate for a sequential detector system according to claim 1, wherein each likelihood computation comprises a log-likelihood computation expressed as
- 11. The method of computing a posterior probability estimate for a sequential detector system according to claim 10, wherein said log-likelihood computation is implemented as the natural log.
- 12. The method of computing a posterior probability estimate for a sequential detector system according to claim 1, wherein said posterior probability estimate accounts for a prior bias in the source data by expressing said posterior probability estimate as a soft-max function based upon the accumulation of said likelihood computations.
- 13. The method of computing a posterior probability estimate for a sequential detector system according to claim 1, wherein said posterior probability estimate is denoted $\hat{\pi}$ and is given by the formula
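By way of non-limiting illustration only, the per-sample likelihood computation, accumulation, and sigmoid transformation recited in claims 1, 8, and 10 to 12 may be sketched as follows. The sketch assumes a two-class problem in which each likelihood computation is a natural-log likelihood ratio, and it substitutes a closed-form Gaussian ratio for the neural network of claims 6 to 8 purely so that the example runs on its own. The function names, the Gaussian parameters, and the prior handling are hypothetical and do not appear in the claims.

```python
import math

def log_likelihood_ratio(x, mean0=0.0, mean1=1.0, sigma=1.0):
    """Hypothetical per-sample likelihood computation.

    Returns ln p(x | class 1) - ln p(x | class 0) for two Gaussian classes
    with a shared sigma (the normalizing constants cancel).
    """
    return ((x - mean0) ** 2 - (x - mean1) ** 2) / (2.0 * sigma ** 2)

def sequential_posterior(samples, prior1=0.5):
    """Yield a running posterior estimate for class 1 after each sample."""
    # Assumption: a prior bias enters as log-odds added to the accumulator.
    acc = math.log(prior1 / (1.0 - prior1))
    for x in samples:
        acc += log_likelihood_ratio(x)      # accumulate with earlier samples
        yield 1.0 / (1.0 + math.exp(-acc))  # sigmoid of the linear accumulation

posteriors = list(sequential_posterior([0.9, 1.2, 0.7, 1.1]))
```

Under these assumptions each sample lying closer to the class 1 mean raises the accumulated value, so the estimate moves toward 1 as samples are processed.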
- 14. A method of performing adaptive sequential data analysis on a labeled data set comprising:
sequentially accessing a labeled data sample from said labeled data set; computing for each labeled data sample, a posterior probability estimate comprising: performing a likelihood computation for said labeled data sample;
accumulating said likelihood computation with likelihood computations from previously considered samples; and computing said posterior probability estimate based upon the accumulation of likelihood computations; determining a first cost associated with making a classification decision in view of the risk of an error in classification given said posterior probability estimate; determining a second cost associated with collecting another labeled data sample before making a classification decision, said second cost based at least in part upon said posterior probability estimate; comparing said first and second costs against a predetermined stopping criterion; automatically repeating each of the above steps if the results of the comparison suggest taking another labeled data sample; and performing a predetermined action if the results of the comparison suggest stopping.
- 15. The method of performing adaptive sequential data analysis according to claim 14, wherein said first cost is denoted $U(\pi,\hat{\theta})$, and is expressed by $U(\pi_k,\hat{\theta}) = (1-\gamma_U)\,U(\pi_k,\hat{\theta}) + \gamma_U\,L(\hat{\theta},\theta)$ where $L(\hat{\theta},\theta)$ denotes a loss function and the term $\gamma_U$ is a measure of how fast the sequential data analysis process is trying to learn as compared with the amount of information already learned.
- 16. The method of performing adaptive sequential data analysis according to claim 14, wherein said first cost is expressed as the expected decision cost of deciding in favor of a specific class given a specific value for said posterior probability estimate.
- 17. The method of performing adaptive sequential data analysis according to claim 14, wherein said first cost is computed by multiplying a probability that the sequential data analysis process will improperly classify the data by a weighting factor.
- 18. The method of performing adaptive sequential data analysis according to claim 14, wherein said first cost is determined by a neural network operating as a universal approximator, said neural network designed using a reinforcement learning algorithm that implements an on-policy version of the Q-learning algorithm.
- 19. The method of performing adaptive sequential data analysis according to claim 14, wherein said second cost is denoted $V(\pi)$ and is expressed by $V(\pi_k) = (1-\gamma_V)\,V(\pi_k) + \gamma_V \min\{c + V(\pi_{k+1}),\, U(\pi_{k+1},\hat{\theta}^*)\}$.
- 20. The method of performing adaptive sequential data analysis according to claim 14, wherein said second cost is determined by a neural network operating as a universal approximator, said neural network designed using a reinforcement learning algorithm that implements an on-policy version of the Q-learning algorithm.
- 21. The method of performing adaptive sequential data analysis according to claim 14, wherein a decision is made to stop sampling and make a classification decision when said second cost is greater than said first cost.
- 22. The method of performing adaptive sequential data analysis according to claim 14, wherein at least one of said first and second costs is updated when a decision is made to stop collecting samples and make a classification decision.
- 23. The method of performing adaptive sequential data analysis according to claim 14, wherein said predetermined stopping criterion is determined by:
identifying a greedy function wherein said second cost is greater than said first cost, said greedy function representing a first stopping criterion; occasionally selecting a random function to test the hypothesis that said greedy function made a good choice in representing said stopping criterion; updating said first and second costs based upon said random function; and using the updates to said first and second cost functions to determine the accurateness of said greedy function.
- 24. The method of performing adaptive sequential data analysis according to claim 14, wherein said predetermined stopping criterion is determined by:
identifying a greedy function wherein said second cost is greater than said first cost, said greedy function representing a first stopping criterion; choosing a greedy action with probability 1−η; employing a random exploration that deviates from the greedy policy with a positive probability η to test the hypothesis that said greedy policy made a good choice in representing said stopping criterion; updating said first and second costs based upon said random exploration; and using the updates to said first and second cost functions to determine the accurateness of said greedy function.
- 25. The method of performing adaptive sequential data analysis according to claim 24, wherein the probability of said random explorations to check the greedy policy diminishes as confidence in the first and second costs is developed and increases as the first and second costs close in value.
- 26. The method of performing adaptive sequential data analysis according to claim 14, wherein said posterior probability estimate is computed without reliance on a predetermined statistical distribution of said source phenomenon of interest.
- 27. The method of performing adaptive sequential data analysis according to claim 14, wherein said posterior probability estimate is determined for each sample by performing a likelihood computation.
- 28. The method of performing adaptive sequential data analysis according to claim 14, wherein said posterior probability estimate defines a conditional density function derived from an accumulation of said log-likelihoods.
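By way of non-limiting illustration only, the cost updates of claims 15 and 19 and the stopping comparison of claim 21 may be sketched as follows. The sketch assumes lookup tables indexed by a discretized posterior in place of the neural networks of claims 18 and 20, and it assumes that the decision $\hat{\theta}^*$ is the one with the smallest estimated first cost. The function names, the binning, and the default learning rates are hypothetical.

```python
def posterior_bin(pi, n_bins=10):
    """Map a posterior in [0, 1] to a table index (hypothetical discretization)."""
    return min(int(pi * n_bins), n_bins - 1)

def update_decision_cost(U, state, decision, loss, gamma_u=0.1):
    """U(pi_k, d) <- (1 - gamma_U) * U(pi_k, d) + gamma_U * L(d, theta)."""
    U[state][decision] = (1.0 - gamma_u) * U[state][decision] + gamma_u * loss
    return U[state][decision]

def update_cost_to_go(V, U, state, next_state, sample_cost, gamma_v=0.1):
    """V(pi_k) <- (1 - gamma_V) * V(pi_k)
                  + gamma_V * min{c + V(pi_{k+1}), U(pi_{k+1}, best decision)}."""
    best_next_u = min(U[next_state])  # assumed meaning of the starred decision
    target = min(sample_cost + V[next_state], best_next_u)
    V[state] = (1.0 - gamma_v) * V[state] + gamma_v * target
    return V[state]

def should_stop(V, U, state):
    """Stop sampling when the second cost exceeds the (best) first cost."""
    return V[state] > min(U[state])
```

In use, `U` would be a list of per-state lists holding one first-cost estimate per candidate decision, and `V` a list holding one second-cost estimate per state; both tables would be refined as labeled samples are processed.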
- 29. A method of automatically making a decision on the order of sampling from a given set of data streams comprising:
sequentially accessing a labeled data sample; computing a posterior probability for said labeled data sample; determining a first cost associated with making a classification decision in view of the risk of an error in classification given said posterior probability for each feature of a plurality of features; determining a second cost associated with collecting another labeled data sample before making a classification decision, said second cost based at least in part upon said posterior probability; choosing a data stream by comparing at least two of said first costs associated with respective features and selecting one stream associated with a selected one of said features based upon the comparison of said at least two of said first costs; comparing said first cost associated with said stream and said second cost against a predetermined stopping criterion; automatically repeating each of the above steps if the results of the comparison suggest taking another labeled data sample; and performing a predetermined action if the results of the comparison suggest stopping.
- 30. The method of automatically making a decision on the order of sampling according to claim 29, wherein said first cost associated with each of said plurality of features may be calculated using a different weight value.
- 31. The method of automatically making a decision on the order of sampling according to claim 29, wherein said predetermined stopping criterion is determined by:
- 32. The method of automatically making a decision on the order of sampling according to claim 29, wherein said data stream is chosen by comparing said first costs associated with each of said plurality of features and selecting the data stream associated with the minimum one of said first costs.
- 33. The method of automatically making a decision on the order of sampling according to claim 29, wherein said posterior probability of each of said first costs is determined by a unique neural network.
- 34. The method of automatically making a decision on the order of sampling according to claim 29, wherein said posterior probability is determined by an accumulation of likelihoods without a need to comprehend underlying source statistics.
- 35. The method of automatically making a decision on the order of sampling according to claim 29, wherein a log-likelihood is computed for each feature.
- 36. The method of automatically making a decision on the order of sampling according to claim 35, wherein a soft-max function is used to fuse accumulations of each of said log-likelihood determinations.
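By way of non-limiting illustration only, the stream selection of claim 32 and the soft-max fusion of claim 36 may be sketched as follows. The claims do not specify how the per-feature accumulations are combined before the soft-max, so the sketch assumes that the accumulated log-likelihoods for each class are summed across features and then normalized over classes. The function names and the data layout are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable soft-max over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def fuse_posteriors(acc_by_feature):
    """Fuse per-feature accumulations into one posterior estimate per class.

    acc_by_feature[f][c] is the accumulated log-likelihood of class c
    computed from feature (data stream) f. Assumption: sum over features.
    """
    n_classes = len(acc_by_feature[0])
    fused = [sum(acc[c] for acc in acc_by_feature) for c in range(n_classes)]
    return softmax(fused)

def choose_stream(first_costs):
    """Return the index of the stream whose first cost is the minimum."""
    return min(range(len(first_costs)), key=first_costs.__getitem__)
```

For example, `choose_stream([0.3, 0.1, 0.2])` selects the stream at index 1, since that stream carries the smallest first cost.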
- 37. A detector for sequential data analysis systems comprising:
a posterior probability estimator arranged to analyze samples from a data set in a sequential manner, and generate an estimated posterior probability based upon an accumulation of log-likelihood determinations computed for each sample considered.
- 38. The detector according to claim 37, wherein said accumulation of log-likelihoods defines a probability estimate that said sample belongs to a predetermined class.
- 39. The detector according to claim 37, wherein said accumulation of log-likelihoods defines a probability estimate that is used to perform a feature selection operation.
- 40. The detector according to claim 37, wherein each log-likelihood is expressed by the equation
- 41. The detector according to claim 37, wherein said accumulation of log-likelihoods is transformed into a conditional density distribution expressed by the equation:
- 42. The detector according to claim 37, wherein said posterior probability estimator comprises a universal approximator having:
at least one input; at least one nonlinear hidden layer that utilizes a hyperbolic tangent activation communicably coupled to said at least one input; at least one linear output communicably coupled to said at least one hidden layer; and, a logistic output communicably coupled to said at least one linear output arranged to transform an accumulation of linear output computations into at least one logistic output.
- 43. The detector according to claim 37, wherein said posterior probability estimate is denoted $\hat{\pi}$ and is given by the formula
- 44. A detector for sequential data analysis systems comprising:
a posteriori probability estimator arranged to analyze labeled data samples sequentially and compute an estimated posterior probability by computing for each labeled data sample received, a probability that a source phenomenon of interest described by said labeled data samples belongs to a first class, said probability computed without reliance on a predetermined statistical distribution of said source phenomenon of interest.
- 45. An adaptive sequential data analysis system comprising:
a posterior probability estimator arranged to access a labeled data sample from a labeled data set sequentially and compute therefrom an estimated posterior probability, wherein said posterior probability estimator: performs a likelihood computation for said labeled data sample;
accumulates said likelihood computation with likelihood computations from previously considered samples; and computes said posterior probability based upon the accumulation of likelihood computations; a cost of decision estimator communicably coupled to said posterior probability estimator, said cost of decision estimator arranged to determine a first cost associated with making a classification decision in view of the risk of an error in classification given said posterior probability; a cost to go estimator communicably coupled to said posterior probability estimator, said cost to go estimator arranged to determine a second cost associated with collecting another labeled data sample before making a classification decision, said second cost based at least in part upon said posterior probability; and, a decision processor communicably coupled to said cost of decision estimator and said cost to go estimator, said decision processor arranged to compare said first and second costs against a predetermined stopping criterion, wherein said decision processor is configured to trigger a predetermined action based upon the comparison.
- 46. The adaptive sequential data analysis system according to claim 45, wherein said decision processor is configured to decide whether to collect another sample automatically based upon the comparison between said first and second costs.
- 47. The adaptive sequential data analysis system according to claim 45, wherein said cost of decision estimator computes said first cost denoted $U(\pi,\hat{\theta})$ by implementing the equation $U(\pi_k,\hat{\theta}) = (1-\gamma_U)\,U(\pi_k,\hat{\theta}) + \gamma_U\,L(\hat{\theta},\theta)$ where $L(\hat{\theta},\theta)$ denotes a loss function and the term $\gamma_U$ is a measure of how fast the sequential data analysis process is trying to learn as compared with the amount of information already learned.
- 48. The adaptive sequential data analysis system according to claim 45, wherein said first cost is expressed as the expected decision cost of deciding in favor of a specific class given a specific value for said posterior probability.
- 49. The adaptive sequential data analysis system according to claim 45, wherein said cost of decision estimator is configured to compute said first cost by multiplying a probability that the sequential data analysis process will improperly classify the data by a weighting factor.
- 50. The adaptive sequential data analysis system according to claim 45, wherein said cost of decision estimator comprises a neural network operating as a universal approximator, said neural network designed using a reinforcement learning algorithm that implements an on-policy version of the Q-learning algorithm.
- 51. The adaptive sequential data analysis system according to claim 45, wherein said cost to go estimator computes said second cost, denoted $V(\pi)$ and computed by implementing the equation $V(\pi_k) = (1-\gamma_V)\,V(\pi_k) + \gamma_V \min\{c + V(\pi_{k+1}),\, U(\pi_{k+1},\hat{\theta}^*)\}$.
- 52. The adaptive sequential data analysis system according to claim 45, wherein said cost to go estimator comprises a neural network operating as a universal approximator, said neural network designed using a reinforcement learning algorithm that implements an on-policy version of the Q-learning algorithm.
- 53. The adaptive sequential data analysis system according to claim 45, wherein said decision processor is configured to stop sampling and make a classification decision when said second cost is greater than said first cost.
- 54. The adaptive sequential data analysis system according to claim 45, wherein the system is configured to update at least one of said first and second costs when said decision processor decides to stop collecting samples and make a classification decision.
- 55. The adaptive sequential data analysis system according to claim 45, wherein said decision processor is configured to:
identify a greedy function wherein said second cost is greater than said first cost, said greedy function representing a first stopping criterion; occasionally select a random function to test the hypothesis that said greedy function made a good choice in representing said stopping criterion; update said first and second costs based upon said random function; and use the updates to said first and second cost functions to determine the accurateness of said greedy function, in order to determine said predetermined stopping criterion.
- 56. The adaptive sequential data analysis system according to claim 45, wherein said decision processor is configured to:
identify a greedy function wherein said second cost is greater than said first cost, said greedy function representing a first stopping criterion; choose a greedy action with probability 1−η; employ a random exploration that deviates from the greedy policy with a positive probability η to test the hypothesis that said greedy policy made a good choice in representing said stopping criterion; update said first and second costs based upon said random exploration; and use the updates to said first and second cost functions to determine the accurateness of said greedy function, in order to determine said stopping criterion.
- 57. The adaptive sequential data analysis system according to claim 56, wherein said decision processor is configured to diminish the probability of said random explorations to check the greedy policy as confidence in the first and second costs is developed.
- 58. The adaptive sequential data analysis system according to claim 56, wherein said decision processor is configured to increase the probability of said random explorations if the first and second costs are close in value.
- 59. The adaptive sequential data analysis system according to claim 45, wherein said posterior probability estimator is configured to compute said posterior probability without reliance on a predetermined statistical distribution of said source phenomenon of interest.
- 60. The adaptive sequential data analysis system according to claim 59, wherein said posterior probability estimator is configured to define said posterior probability as a conditional density function derived from an accumulation of said log-likelihoods.
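By way of non-limiting illustration only, the exploration behavior of claims 56 to 58 may be sketched as follows. The claims state only that the exploration probability η diminishes as confidence develops and increases when the first and second costs are close in value; the particular schedule below, including the visit counter used as a stand-in for confidence and the `eta_max` and `scale` parameters, is a hypothetical choice made so the example runs on its own.

```python
import random

def exploration_probability(first_cost, second_cost, visits,
                            eta_max=0.5, scale=0.1):
    """Hypothetical schedule for the exploration probability eta."""
    confidence_term = 1.0 / (1.0 + visits)  # shrinks as experience accumulates
    # Approaches 1 when the two costs are nearly equal, 0 when far apart.
    closeness_term = scale / (scale + abs(first_cost - second_cost))
    return eta_max * max(confidence_term, closeness_term)

def choose_action(first_cost, second_cost, eta, rng=random):
    """Take the greedy action with probability 1 - eta, else deviate from it."""
    greedy = "stop" if second_cost > first_cost else "sample"
    if rng.random() < eta:
        # Random exploration: take the non-greedy action to test the policy.
        return "sample" if greedy == "stop" else "stop"
    return greedy
```

The outcome of each explored action would then feed the updates to the first and second costs, which is how the accurateness of the greedy choice is checked over time.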
- 61. A sequential detector capable of analyzing multiple streams comprising:
a posterior probability estimator arranged to access a labeled data set sequentially and compute therefrom an estimated posterior probability; a plurality of cost of decision estimators each communicably coupled to said posterior probability estimator, each of said cost of decision estimators arranged to determine a first cost associated with making a classification decision in view of the risk of an error in classification given said posterior probability for a select one of a plurality of features; a cost to go estimator communicably coupled to said posterior probability estimator, said cost to go estimator arranged to determine a second cost associated with collecting another labeled data sample before making a classification decision, said second cost based at least in part upon said posterior probability; and a decision processor communicably coupled to each of said cost of decision estimators and said cost to go estimator, said decision processor arranged to:
choose a data stream by comparing at least two of said first costs associated with respective features and selecting one stream associated with a selected one of said features based upon the comparison of said at least two of said first costs; and compare said first cost associated with said stream and said second cost against a predetermined stopping criterion.
- 62. The sequential detector according to claim 61, wherein said posterior probability estimator continues to collect new data samples sequentially until said predetermined stopping criterion is met.
- 63. The sequential detector according to claim 61, wherein each of said cost of decision estimators computes said first cost associated with each of said plurality of features using a different weight value.
- 64. The sequential detector according to claim 61, wherein said decision processor is configured to determine said predetermined stopping criterion when the minimum one of said first costs is greater than said second cost.
- 65. The sequential detector according to claim 61, wherein said decision processor is configured to determine said predetermined stopping criterion according to the equation $\min(V(\pi_1), V(\pi_2), \ldots, V(\pi_{N-1}), V(\pi_N)) > U(\pi,\hat{\theta})$.
- 66. The sequential detector according to claim 61, wherein said decision processor is configured to select a data stream by comparing said first costs associated with each of said plurality of features and selecting the data stream associated with the minimum one of said first costs.
- 67. The sequential detector according to claim 61, wherein said posterior probability estimator comprises a plurality of neural networks, each neural network configured to compute the posterior probability for a respective feature.
- 68. The sequential detector according to claim 61, wherein said posterior probability estimator is configured to determine said posterior probability by an accumulation of likelihoods without a need to comprehend underlying source statistics.
- 69. The sequential detector according to claim 61, wherein said posterior probability estimator is configured to determine a log-likelihood for each feature.
- 70. The sequential detector according to claim 69, wherein said posterior probability estimator is configured to utilize a soft-max to fuse accumulations of each of said log-likelihood determinations.
CROSS REFERENCE TO RELATED APPLICATIONS
[0001] This application claims priority to U.S. Provisional Patent Application Serial No. 60/368,947, filed Mar. 29, 2002, the disclosure of which is hereby incorporated by reference.
Provisional Applications (1)
| Number | Date | Country |
| --- | --- | --- |
| 60368947 | Mar 2002 | US |