The invention relates in general to document classification, and in particular to classification of document weight or thickness based on sound captured by an audio transducer. Knowledge of document characteristics such as weight or thickness can be used by other scanner systems.
In a document transport system, documents of different thicknesses are scanned and passed through the transport. When a document moves through a document transport, a sound is associated with the movement of the document. This sound can be characterized by its spectral features. The sound characteristics of the document moving through the transport vary with the thickness of the document. These features can be used to classify documents.
In a document scanner, the weight of the document can translate to thickness and is related to the translucence of the document. Document scanners are often used in such a way that documents of many different weights are scanned within the same batch. These attributes of a document can require specific treatment by other systems such as an ultrasonic document detection system (UDDS), described in U.S. Pat. No. 6,511,064, wherein a heavier or thicker document attenuates the ultrasonic signal more than a lighter weight or thinner document. Knowing the weight or thickness of a document can enable system parameters to be adjusted to better meet the machine processing requirements of a given document.
Ultrasonic document detection can provide other useful information about a document that is being transported through a scanner. For example, the detector can determine whether multiple documents are being fed at once, a condition that may result in loss of information from the scanning process because some documents will not be scanned. A problem, however, is that the detector can often confuse a thick document with a multi-fed document. There is, therefore, a need for an improved determination of the thickness of a document, whether a document is wrinkled, and whether multiple documents are stapled together.
Briefly, according to one aspect of the present invention, a method for classifying documents based on sound includes feeding a document to a document transport; detecting a sonic profile produced by the document as it is transported; and determining document characteristics based on the sonic profile.
In one embodiment, a document scanner uses an audio transducer to capture an audio signal of a document entering the scanner transport. The audio signal is then conditioned, digitized, and processed to provide spectral information about the signal. The spectral information, sometimes referred to as a sonic profile, is then compared to known spectral attributes of documents of different weights in order to classify the document.
As shown in
As shown in
When feeding a document 75 into the scanner 4, the audio signal generated by the document is captured 80. Features are extracted from the audio signal 85 and compared to a feature set in memory 90. Based on the compared features of the captured audio signal and features in the feature set, the document is classified as a certain weight or thickness of document 95.
The document classification system consists of two phases, an audio phase and a classification phase. In the audio phase, various spectral features, collectively the sonic profile, such as pitch, spectral centroid, or amplitude, are determined from the audio signal for different thicknesses of paper. Features selected for learning purposes are those that distinguish well between documents of different thickness. To generate the audio feature descriptors, a windowed scan over the audio samples is used. The windowed scan includes sliding a window over the audio data in fixed increments, wherein each window represents an interval of time. Spectral features are extracted from each window position using short-time Fourier transform (STFT) techniques. STFT provides a rich representation that is capable of modeling a variety of perceptual characteristics such as pitch, loudness, and amplitude. These sets of feature vectors, corresponding to different document thicknesses, are then stored in memory.
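By way of illustration only, the windowed scan described above can be sketched in Python with NumPy. The function name `sonic_profile`, the window size, the hop size, the Hann taper, and the choice of two features (spectral centroid and RMS amplitude) are assumptions made for this sketch and are not specified by the disclosure.

```python
import numpy as np

def sonic_profile(samples, sample_rate, window_size=1024, hop=512):
    """Slide a fixed-size window over the samples in fixed increments.

    Returns one row per window: [spectral centroid in Hz, RMS amplitude].
    Window size, hop, and feature choice are illustrative assumptions.
    """
    samples = np.asarray(samples, dtype=float)
    taper = np.hanning(window_size)  # reduces spectral leakage at frame edges
    freqs = np.fft.rfftfreq(window_size, d=1.0 / sample_rate)
    rows = []
    for start in range(0, len(samples) - window_size + 1, hop):
        frame = samples[start:start + window_size]
        # Magnitude spectrum of this window (one STFT column)
        magnitude = np.abs(np.fft.rfft(frame * taper))
        total = magnitude.sum()
        # Centroid is the magnitude-weighted mean frequency
        centroid = float((freqs * magnitude).sum() / total) if total > 0 else 0.0
        rms = float(np.sqrt(np.mean(frame ** 2)))
        rows.append((centroid, rms))
    return np.array(rows)

# Synthetic stand-in for a captured signal: a 1 kHz tone, one second long
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000.0 * t)
profile = sonic_profile(tone, sample_rate)
```

For the synthetic tone, the centroid column should sit close to 1000 Hz. In practice, each row of the profile, or a summary over the rows, could serve as a feature vector to be stored in memory for a given document thickness.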
In the classification phase, the goal is to assign a new document that is currently entering the scanner to a particular thickness category based on its audio signal. The first step in classification is to extract the same spectral features as were determined in the audio (learning) phase. The document is classified to a certain thickness by comparing these extracted features with the feature sets stored in the memory 51. Support vector machines (SVM) may be used for this comparison.
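As a non-limiting sketch of how a support vector machine might perform this comparison, the following Python code trains a minimal linear SVM by sub-gradient descent on the regularized hinge loss. The two-class setup ("thin" versus "thick"), the function names, the hyperparameters, and the feature values are all hypothetical; the numbers are standardized toy values chosen for illustration, not measurements.

```python
import numpy as np

def train_linear_svm(features, labels, reg=0.01, epochs=200, lr=0.1):
    """Fit weights w and bias b by batch sub-gradient descent on the
    regularized hinge loss. Labels must be -1 or +1."""
    X = np.asarray(features, dtype=float)
    y = np.asarray(labels, dtype=float)
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        margins = y * (X @ w + b)
        active = margins < 1.0  # samples inside or beyond the margin
        grad_w = reg * w - (y[active][:, None] * X[active]).sum(axis=0) / len(X)
        grad_b = -y[active].sum() / len(X)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def classify(w, b, feature_vector):
    """Return the hypothetical class label for one feature vector."""
    return "thick" if np.dot(w, feature_vector) + b >= 0 else "thin"

# Hypothetical stored feature set (toy values, already standardized)
stored_features = [
    [1.0, -0.5], [1.2, -0.4], [0.9, -0.6],    # labeled thin (-1)
    [-1.0, 0.5], [-1.1, 0.6], [-0.8, 0.4],    # labeled thick (+1)
]
stored_labels = [-1, -1, -1, 1, 1, 1]

w, b = train_linear_svm(stored_features, stored_labels)
label = classify(w, b, [1.1, -0.5])  # features from a newly fed document
```

A practical system would likely distinguish more than two thickness categories, for example by combining several binary classifiers, and would scale the features extracted in the audio phase before training.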
While the audio signal is processed in the processor 50, the document continues moving through the transport 30. Processor 50 and memory 51 may be internal or external to scanner 4. Document thickness is determined and classified before the document reaches the ultrasonic sensor 25. The document continues through the transport 30 to the upper imaging area 40, lower imaging area 45, out of the transport 30, and into the document output area 35.
The invention has been described in detail with particular reference to certain preferred embodiments thereof, but it will be understood that variations and modifications can be effected within the scope of the invention.
Reference is made to commonly-assigned copending U.S. patent application Ser. No. ______ (Attorney Docket No. 96155/NAB), filed herewith, entitled SONIC DOCUMENT CLASSIFICATION, by Schaertel et al., the disclosure of which is incorporated herein.