Claims
- 1. An information classification system combining training data and test data to classify a source which comprises:
- a vector processor formulating a plurality of feature vectors depicting characteristics of said test data and for receiving said plurality of feature vectors wherein each of said plurality of feature vectors is quantized to one of M symbols and for generating a quantized test vector y having as components the number of occurrences of each of the M symbols in said plurality of quantized feature vectors received;
- means for storing quantized training data, said quantized training data having one quantized training vector x.sub.c for each class of a plurality of output classes; and
- a classification processor, responsive to said quantized test vector and said quantized training data, for estimating symbol probabilities for each output class and for classifying the quantized test vector y into one of said plurality of output classes, wherein said classification processor estimates the symbol probabilities for each output class and classifies the quantized test vector using a combined Bayes test given by ##EQU15## wherein x.sub.k,i is the number of occurrences of the i.sup.th symbol in a quantized training vector for class k; x.sub.l,i is the number of occurrences of the i.sup.th symbol in a quantized training vector for class l; y.sub.i is the number of occurrences of the i.sup.th symbol in the quantized test vector; ##EQU16## is the total number of occurrences of the M symbols in the training data for class k; ##EQU17## is the total number of occurrences of the M symbols in the training data for class l; and ##EQU18## is the total number of occurrences of the M symbols in the quantized feature vectors.
- 2. An information classification system combining training data and test data to classify a source which comprises:
- a vector processor formulating a plurality of feature vectors depicting characteristics of said test data and for receiving said plurality of feature vectors wherein each of said plurality of feature vectors is quantized to one of M symbols and for generating a quantized test vector y having as components the number of occurrences of each of the M symbols in said plurality of quantized feature vectors received;
- means for storing quantized training data, said quantized training data having one quantized training vector x.sub.c for each class of a plurality of output classes; and
- a classification processor, responsive to said quantized test vector and said quantized training data, for estimating symbol probabilities for each output class and for classifying the quantized test vector y into one of said plurality of output classes, wherein said classification processor estimates the symbol probabilities for each one of C output classes and classifies the quantized test vector into one of the C output classes using a combined Bayes test given by ##EQU19## in which x.sub.k,i is the number of occurrences of the i.sup.th symbol in a quantized training vector for class k, y.sub.i is the number of occurrences of the i.sup.th symbol in the quantized test vector, N.sub.k is the total number of occurrences of the M symbols in the training data for class k, N.sub.l is the total number of occurrences of the M symbols in the training data for class l, and is the total number of occurrences of the M symbols in the quantized feature vectors.
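The data flow recited in the claims above (quantized symbols, a count vector y, one training count vector per class, and a per-class score) can be illustrated with a minimal Python sketch. The equations of the claimed combined Bayes test are not reproduced in this text, so the score below is a stand-in and an assumption: the standard Dirichlet-multinomial predictive likelihood under a uniform prior, with the factor that depends only on y dropped. The function names `count_vector`, `log_score`, and `classify` are hypothetical and do not appear in the claims.

```python
from collections import Counter
from math import lgamma


def count_vector(symbols, M):
    """Build the quantized vector: entry i is the number of times symbol i (0..M-1) occurs."""
    counts = Counter(symbols)
    return [counts.get(i, 0) for i in range(M)]


def log_score(x_k, y):
    """Log of an ASSUMED per-class score (not taken from the claims):
    (N_k + M - 1)! / (N_k + N_y + M - 1)! * prod_i (x_k[i] + y[i])! / x_k[i]!
    computed with lgamma, where lgamma(n + 1) == log(n!).
    """
    M = len(x_k)
    N_k, N_y = sum(x_k), sum(y)  # total symbol counts in training and test data
    score = lgamma(N_k + M) - lgamma(N_k + N_y + M)
    for x_i, y_i in zip(x_k, y):
        score += lgamma(x_i + y_i + 1) - lgamma(x_i + 1)
    return score


def classify(training, y):
    """Return the class label whose training count vector gives the largest score."""
    return max(training, key=lambda label: log_score(training[label], y))


# Illustrative data only: M = 3 symbols, two classes.
y = count_vector([0, 2, 2, 1, 2], M=3)
training = {"a": [8, 1, 1], "b": [1, 1, 8]}
label = classify(training, y)
```

Working in log space avoids overflow from the factorials when the counts are large; the comparison between classes is unchanged because the logarithm is monotonic.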
STATEMENT OF GOVERNMENT INTEREST
The invention described herein may be manufactured and used by or for the Government of the United States of America for governmental purposes without the payment of any royalties thereon or therefor.
US Referenced Citations (7)