This invention relates to computational methods for discriminating graphite pencil marks from pre-printed background text and graphics, regardless of color, in an optical scan of a response form.
Written surveys and assessments are routinely administered by providing users (respondents) with response sheets including pre-printed artwork and locations for the user to enter responses to survey questions or items presented in the assessment. Pre-printed artwork on a response sheet comprises matter printed on each sheet prior to distribution to the respondents and includes information not typically needed to evaluate the user's response(s). Pre-printed artwork may include alphanumeric text, such as survey or test identification information, response instructions, or survey or test questions, graphics (including charts, tables, graphs, and diagrams), graphic features (such as lines, borders, shading, and bubble targets (circles, ellipses, ovals, boxes)), and letters or numbers within bubble targets. In the context of the present disclosure, a “bubble target” is defined as the printed circle (or ellipse, oval, box, or other closed shape), whether marked or not, on a response sheet, and “response bubble” is defined as the mark made by a person to fill the bubble target. The user enters marks by filling in, e.g., with a no. 2 lead (graphite) pencil, appropriate bubble targets (i.e., by making response bubbles) corresponding to a letter or number entered in a response box above the bubble target or corresponding to answer choices for one or more questions presented in the survey or test. The test sheets are then scanned, and the responses are evaluated automatically by applying a template or optical mark recognition (OMR) techniques to tally or assess each response based on the bubble target that has been filled in.
In the automated response evaluation process, the user marks must be separated from the pre-printed artwork on the response sheet, so the pre-printed artwork is “dropped out” to create a document including only the user response marks, and possibly other necessary information typically used in OMR processing (e.g., sheet identification marks, timing marks, and skew marks).
Conventionally, separating user response marks from background material, such as the pre-printed artwork, requires the use of an infrared scanner and non-carbon-bearing inks, so that the pre-printed artwork is not included in the scanned document image. Most colored inks used for the pre-printed background artwork are transparent to the wavelength of infrared scanners, while carbon absorbs the light. As a result, the background artwork vanishes, leaving only the graphite responses and other carbon-bearing marks. Thus, a test response sheet including pre-printed artwork and user response marks scanned with an infrared scanner will produce a document that includes only the user response marks (due to infrared light being sensitive only to carbon-bearing inks) and necessary pre-printed marks configured to be detected by an infrared scanner (e.g., sheet identification marks and timing marks). Then OMR is employed to evaluate only the user response marks without interference from the pre-printed artwork.
The necessity of using infrared scanners to separate user response marks from pre-printed artwork creates a number of disadvantages due in part to the high cost of infrared scanners. Infrared scanners are typically much more costly than conventional visible-light scanners (≈$100,000). Exemplary OMR infrared scanners include the iNSIGHT™ 150 by Scantron. Further, the maintenance costs of infrared scanners are high. Infrared scanners are not commonly owned by scanning companies, schools, or other businesses that require scanning of response sheets. This limits or prevents collaborations with groups that do not own infrared scanners, prevents the use of hand-held scanning devices at the survey room or classroom level, and prevents local scanning at the response-collection sites using ordinary, inexpensive scanners.
Software configured to numerically drop out a specific background color or set of colors from a scanned document has existed for some time. For example, Dunord Technologies does this in its product called ColErase. Other companies, such as Hewlett-Packard and Kofax, also offer background drop-out as a way to increase accuracy of data capture from scanned forms. Existing color drop-out technology, however, requires that the pre-printed content to be dropped out be a different color than the response marks, and often it requires advance knowledge of which color(s) is (are) to be dropped out.
Accordingly, it would be desirable to have systems and processes for processing response forms in a manner that distinguishes the response marks from the pre-printed text and graphics (i.e., background) using standard scanners that are less expensive to purchase and maintain, and that does not require advance knowledge of the colors or shapes of the background elements to be dropped out.
The present invention systematically detects and isolates the graphite elements in scanned test documents with varying background colors and artwork. Aspects of the invention relate to methods, systems, apparatus, and software to detect graphite responses in scanned response forms, such as test documents, using inexpensive scanners. The graphite responses of interest are the penciled bubbles and the alphanumeric characters written by the respondent in OMR documents.
A test image is modeled as a superposition of a graphite image and the background image. The background image is further modeled as a superposition of several constituent components. The process is formulated in terms of background removal, and a sequence of operations is used to systematically identify and remove the background elements. This methodology can be applied to a wide range of response documents scanned using an ordinary visible-light scanner with results that are visually similar to those obtained using an infrared scanner.
In addition to dropping out the backgrounds for the purposes of OMR scanning, other uses for this invention may include the following. It will be apparent to one of ordinary skill in the art that these uses represent some, but not all, potential uses for this invention.
The detection of graphite responses through alternative methods is quite complex because there are hundreds of different types of optical mark recognition sheets which vary in background colors and artwork. Furthermore, optical mark recognition sheets may also be customized for specific needs. The graphite detection method described herein significantly reduces the cost by using an inexpensive color scanner (≈$75) in conjunction with a sequence of image processing operations and a pattern classification algorithm. The outputs of this method are similar to those produced by the infrared scanners typically used.
The accompanying drawings, which are incorporated herein and form part of the specification, illustrate various embodiments of the present invention and, together with the description, further serve to explain the principles of the invention and to enable a person skilled in the pertinent art to make and use the invention. In the drawings, like reference numbers indicate identical or functionally similar elements.
Image Model
The output of the inexpensive color scanner in response to a test optical mark recognition document is referred to as the “response image” and comprises a matrix of pixel values represented by the function F(x,y). The response image can be modeled as the sum of two constituent components: (1) the scanner response to the graphite, and (2) the scanner response to the blank optical mark recognition document which will be referred to as the background. If G(x,y) and N(x,y) are the graphite and background pixel elements, respectively, the response image is given by
F(x,y)=G(x,y)+N(x,y). (1)
Using this model, the objective is to determine G(x,y) given F(x,y). If N(x,y) were known and F(x,y) and N(x,y) could be perfectly aligned, this problem could be solved easily by taking the difference between F(x,y) and N(x,y). To make the detection invariant to the numerous optical mark recognition backgrounds, it is assumed that the blank optical mark recognition documents are not available; N(x,y) is therefore assumed unknown. Furthermore, perfect alignment is impossible in practice and is therefore not assumed. The present invention systematically removes the background using a sequence of image processing and classification algorithms, i.e., algorithms for classifying elements of a response sheet as handwritten elements or preprinted elements. The formulations of these algorithms are described in the following sections.
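The idealized case mentioned above, in which the blank sheet is known and perfectly aligned, can be sketched in a few lines of Python. This is only an illustration of the model of Equation (1): the tiny images, their values, and the function name `subtract_background` are hypothetical, and pixel values are treated as additive ink contributions.

```python
def subtract_background(response, background):
    """Return F - N pixel by pixel for two equally sized images."""
    return [
        [f - n for f, n in zip(f_row, n_row)]
        for f_row, n_row in zip(response, background)
    ]

# Hypothetical 2x3 images on an additive scale (0 = blank, larger = more ink).
N = [[10, 0, 10],
     [10, 0, 10]]
G_true = [[0, 50, 0],
          [0, 0, 40]]

# Response image built from the model F = G + N.
F = [[g + n for g, n in zip(g_row, n_row)] for g_row, n_row in zip(G_true, N)]

# With a known, aligned background the graphite image is recovered exactly.
G_est = subtract_background(F, N)

# Shifting the background by one column leaves residue, which is why the
# method does not rely on subtraction of a blank sheet.
N_shifted = [[row[-1]] + row[:-1] for row in N]
residue = subtract_background(F, N_shifted)
```

The second call shows how even a one-pixel misalignment corrupts the difference image.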
Image Processing of the Response Image
Referring to
N(x,y)=H(x,y)+V(x,y)+A(x,y)+S(x,y)+T(x,y)+K(x,y) (2)
It will be apparent that the image may be a color image or a grayscale image. In the case of a grayscale image, the process of Color-based Segmentation described below cannot be done, but the other processes described below accomplish the background dropout.
Horizontal and Vertical Line Removal
The background elements vary from document to document depending on the colors used to design the document. In order to make the detection of these background elements invariant to background color, the color response image F(x,y) is converted into a grayscale image f(x,y) according to
where $F_R(x,y)$, $F_G(x,y)$, and $F_B(x,y)$ are the red, green, and blue components of $F(x,y)$, respectively. The pixels of the solid vertical and horizontal lines in $f(x,y)$ can be detected by the collinearity Hough transform. Moreover, because the pixels of the dot screening patterns also form horizontal and vertical lines, they may also be detected by the Hough transform and then removed from the background. In order to apply the Hough transform, the grayscale image is converted into a binary image $g(x,y)$ using the following thresholding rule:
The factor $\delta$, $0<\delta<1$, can be determined empirically across a large number of response images. It is important to note that the segmentation threshold can, therefore, be determined autonomously because it is a function of the intensities of a given response. Let $A(\rho,\theta)$ be the collinearity Hough transform of $g(x,y)$ and let $\{A_{t_1}(\rho,0)\}$ and $\{A_{t_2}(\rho,\pm 90)\}$ be the sets of accumulator cells with counts exceeding $t_1$ and $t_2$, respectively. If $f_t(x,y)$ is the image formed by the pixels that map to these accumulator cells, then

$$\tilde{g}(x,y)=g(x,y)-f_t(x,y) \quad (5)$$

will be free of horizontal and vertical line segments with more than $t_1$ and $t_2$ pixels, respectively. The reason for imposing thresholds $t_1$ and $t_2$ is to make sure that short horizontal and vertical line segments which may be parts of graphite characters are not removed. The values of $t_1$ and $t_2$ can be determined empirically across a large number of response images.
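The binarization and line-removal steps can be sketched as follows. Two simplifications are assumed here and are not taken from the source: the threshold is taken to be δ times the maximum intensity of the image, with darker pixels treated as foreground, and the Hough transform is restricted to the horizontal and vertical orientations, where each accumulator cell reduces to the number of foreground pixels in one row or one column. The function names are hypothetical.

```python
def binarize(gray, delta):
    """Mark dark pixels as foreground (1); threshold = delta * max intensity (assumed rule)."""
    threshold = delta * max(max(row) for row in gray)
    return [[1 if v < threshold else 0 for v in row] for row in gray]

def remove_long_lines(binary, t1, t2):
    """Clear rows with more than t1 and columns with more than t2 foreground pixels.

    For axis-aligned lines, a Hough accumulator cell is just a row or column
    count, so counting replaces the full transform in this sketch.
    """
    rows, cols = len(binary), len(binary[0])
    row_counts = [sum(row) for row in binary]
    col_counts = [sum(binary[y][x] for y in range(rows)) for x in range(cols)]
    return [
        [
            0 if row_counts[y] > t1 or col_counts[x] > t2 else binary[y][x]
            for x in range(cols)
        ]
        for y in range(rows)
    ]

# Hypothetical 5x6 grayscale image: a printed line in row 1, a short stroke in row 3.
gray = [
    [255] * 6,
    [40] * 6,
    [255] * 6,
    [255, 255, 90, 90, 255, 255],
    [255] * 6,
]
binary = binarize(gray, delta=0.5)
cleaned = remove_long_lines(binary, t1=4, t2=4)
```

The short two-pixel stroke survives because its row count stays below the threshold, which is the stated purpose of the thresholds on line length.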
Color-Based Segmentation
In the next step, $\tilde{g}(x,y)$ is used as a mask for removing the horizontal and vertical lines in the original color image. This is accomplished by multiplying the color image $F(x,y)$ by the mask $\tilde{g}(x,y)$. That is, the resulting color image is given by

$$\tilde{F}(x,y)=F(x,y)\,\tilde{g}(x,y)=A(x,y)+T(x,y)+K(x,y)+G(x,y) \quad (6)$$
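The masking of Equation (6) amounts to multiplying every color plane by the binary mask. A minimal sketch, assuming the color image is stored as rows of (R, G, B) tuples and the mask as rows of 0/1 values; the function name `apply_mask` and the sample values are hypothetical.

```python
def apply_mask(color, mask):
    """Multiply each (R, G, B) pixel by the binary mask value at the same location."""
    return [
        [tuple(channel * m for channel in pixel) for pixel, m in zip(c_row, m_row)]
        for c_row, m_row in zip(color, mask)
    ]

# Hypothetical 2x2 color image and mask (1 = pixel retained after line removal).
color = [
    [(30, 30, 200), (90, 92, 88)],
    [(255, 255, 255), (40, 40, 40)],
]
mask = [
    [0, 1],
    [0, 1],
]
masked = apply_mask(color, mask)
```

Pixels where the mask is 0 become (0, 0, 0) in all three planes; the others keep their original color.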
As noted earlier, optical mark recognition sheets come in numerous colors. Therefore, the color clusters of the background images vary widely in the RGB color space. However, the cluster of the graphite element is relatively invariant in the RGB color space. To some degree (depending on the background colors), the removal of the background elements can be made invariant to color by segmenting the graphite cluster in the RGB color space. The graphite pixels can be modeled as a (3×1) Gaussian random vector $Y$ whose density function is given by

$$p(Y)=\frac{1}{(2\pi)^{3/2}\,|\Omega|^{1/2}}\exp\left[-\tfrac{1}{2}(Y-m)^T\Omega^{-1}(Y-m)\right] \quad (7)$$

where $T$ denotes the transpose of a vector and $m$ and $\Omega$ are the (3×1) mean vector and (3×3) covariance matrix of $Y$, respectively. That is, the cluster is a hyper-ellipsoid. For this case, the major principal axis will be along the diagonal of the color cube, which corresponds to the line connecting the coordinates (0, 0, 0) to the coordinates (255, 255, 255). The segmentation of the graphite cluster can, therefore, be expressed as
where
$$r^2=(Y-m)^T\,\Omega^{-1}\,(Y-m) \quad (9)$$
is the squared Mahalanobis distance between Y and m. Therefore, non-gray pixels are identified and set to 0, using Equation (8). The colored art elements and the black text elements of the background may or may not be partially removed in this step. In order for the formulation to be general, it is assumed that the segmented image is now
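The color-based segmentation can be illustrated with a short sketch that computes the squared Mahalanobis distance of Equation (9) and zeroes pixels that fall outside the graphite cluster. The cut-off `r2_max`, the mean, and the covariance below are hypothetical values chosen for illustration (the covariance is elongated along the gray diagonal of the color cube), and the function names are not from the source.

```python
def inverse_3x3(a):
    """Invert a 3x3 matrix using the adjugate (cofactor) formula."""
    (a00, a01, a02), (a10, a11, a12), (a20, a21, a22) = a
    c00 = a11 * a22 - a12 * a21
    c01 = a12 * a20 - a10 * a22
    c02 = a10 * a21 - a11 * a20
    det = a00 * c00 + a01 * c01 + a02 * c02
    if det == 0:
        raise ValueError("covariance matrix is singular")
    return [
        [c00 / det, (a02 * a21 - a01 * a22) / det, (a01 * a12 - a02 * a11) / det],
        [c01 / det, (a00 * a22 - a02 * a20) / det, (a02 * a10 - a00 * a12) / det],
        [c02 / det, (a01 * a20 - a00 * a21) / det, (a00 * a11 - a01 * a10) / det],
    ]

def mahalanobis_sq(y, mean, cov_inv):
    """Squared Mahalanobis distance (Y - m)^T Omega^-1 (Y - m)."""
    d = [yi - mi for yi, mi in zip(y, mean)]
    return sum(d[i] * cov_inv[i][j] * d[j] for i in range(3) for j in range(3))

def keep_graphite_colored(pixels, mean, cov, r2_max):
    """Keep pixels inside the graphite cluster; set all others to (0, 0, 0)."""
    cov_inv = inverse_3x3(cov)
    return [
        p if mahalanobis_sq(p, mean, cov_inv) <= r2_max else (0, 0, 0)
        for p in pixels
    ]

# Hypothetical graphite cluster: gray mean, large variance along the gray axis.
mean = (100, 100, 100)
cov = [[400, 380, 380], [380, 400, 380], [380, 380, 400]]

pixels = [(120, 120, 120), (60, 90, 200)]  # a gray pixel and a bluish pixel
segmented = keep_graphite_colored(pixels, mean, cov, r2_max=9.0)
```

Because the cluster is stretched along the gray diagonal, a gray pixel that is brighter than the mean still has a small distance, while a saturated color of similar brightness is far from the cluster and is removed.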
$$\hat{F}(x,y)=\hat{A}(x,y)+\hat{T}(x,y)+\hat{K}(x,y)+G(x,y) \quad (10)$$

where $\hat{A}(x,y)$, $\hat{T}(x,y)$, and $\hat{K}(x,y)$ are the modified elements of $A(x,y)$, $T(x,y)$, and $K(x,y)$, respectively, due to color segmentation. The modified black machine text element $\hat{K}(x,y)$ is relatively dark and can be removed using intensity-based segmentation. In order to do so, $\hat{F}(x,y)$ is converted into a grayscale image $\hat{g}(x,y)$ using Equation (3) and segmented according to

$$h(x,y)=\begin{cases}\hat{g}(x,y), & \hat{g}(x,y)>\beta\\ 0, & \text{otherwise}\end{cases} \quad (11)$$

where $\beta$ is an empirically determined threshold, thereby generating a grayscale image $h(x,y)$ with non-gray and near-black pixels set to zero. The grayscale image $h(x,y)$ may now be expressed as
$$h(x,y)=h_A(x,y)+h_T(x,y)+h_G(x,y) \quad (12)$$

where $h_A(x,y)$, $h_T(x,y)$, and $h_G(x,y)$ are the grayscale components corresponding to $A(x,y)$, $T(x,y)$, and $G(x,y)$, respectively.
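The intensity-based segmentation step can be sketched as a grayscale conversion followed by a threshold. Two assumptions are made loudly here: the grayscale value is taken as the unweighted mean of the three color planes (the source's conversion weights are not reproduced in this text), and pixels at or below the threshold are the ones set to zero. The function names and the value of `beta` are hypothetical.

```python
def to_gray(color):
    """Convert (R, G, B) pixels to grayscale using an unweighted mean (assumed weights)."""
    return [[sum(pixel) / 3 for pixel in row] for row in color]

def drop_near_black(gray, beta):
    """Set pixels with intensity at or below beta to zero; keep the rest unchanged."""
    return [[v if v > beta else 0 for v in row] for row in gray]

# Hypothetical row: an already zeroed pixel, near-black machine text, and a pencil mark.
color = [[(0, 0, 0), (20, 20, 20), (110, 112, 108)]]
gray = to_gray(color)
h = drop_near_black(gray, beta=40)
```

Only the mid-gray pixel survives, which matches the intent of removing relatively dark machine-printed text while keeping lighter graphite.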
Classification of Elements
Due to the overlap of the clusters in $h(x,y)$, the removal of $h_A(x,y)$ and $h_T(x,y)$ directly using image processing algorithms is not possible. Therefore, the elements of $h(x,y)$ are separated using a texture recognition approach. Texture is chosen because the graphite element is expected to have a coarser texture when compared with the other two machine-printed elements. Furthermore, texture carries over from the color domain to the grayscale domain, thus simplifying the detection of texture to one grayscale image rather than three (red, green, and blue) images. Therefore, $h(x,y)$ is considered to be an image containing objects belonging to two classes, $\omega_1$ (graphite) and $\omega_2$ (non-graphite). The texture features are selected according to the following criteria: they must be invariant to the object (a) shape, because the objects within each class have varying shapes, and (b) size, because the objects within a class have varying dimensions. Before classifying the elements, a zero matrix $M(x,y)$ of the same size as the response image is created:
M(x,y)=0 (12)
In order to satisfy these requirements, texture features are derived from the normalized histograms of the objects because the normalized histogram, represented by p(zi), i=1, 2, . . . , 255, is invariant to the shape and size of the objects. The following features are selected:
Entropy

$$e(z)=-\sum_{i=0}^{L-1}p(z_i)\log_2 p(z_i) \quad (13)$$

Skewness

$$\mu_3(z)=\sum_{i=0}^{L-1}(z_i-m)^3\,p(z_i) \quad (14)$$

Uniformity

$$U(z)=\sum_{i=0}^{L-1}p^2(z_i) \quad (15)$$
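The three histogram features of Equations (13) through (15) can be computed as in the following sketch. It assumes that an object is given as a flat list of integer intensities in the range 0 to 255, that L is the number of intensity levels, and that m in Equation (14) is the mean intensity of the object; the function names and sample objects are hypothetical.

```python
import math

def normalized_histogram(pixels, levels=256):
    """Return p(z_i): the fraction of pixels at each intensity level."""
    counts = [0] * levels
    for v in pixels:
        counts[v] += 1
    total = len(pixels)
    return [c / total for c in counts]

def texture_features(pixels, levels=256):
    """Return (entropy, skewness, uniformity) of one object's intensities."""
    p = normalized_histogram(pixels, levels)
    mean = sum(z * pz for z, pz in enumerate(p))
    # Terms with p(z_i) = 0 contribute nothing to the entropy and are skipped.
    entropy = -sum(pz * math.log2(pz) for pz in p if pz > 0)
    skewness = sum((z - mean) ** 3 * pz for z, pz in enumerate(p))
    uniformity = sum(pz ** 2 for pz in p)
    return entropy, skewness, uniformity

# Hypothetical objects: a flat printed region and a more varied pencil stroke.
flat = [100] * 8
rough = [90, 100, 110, 120] * 2

flat_features = texture_features(flat)
rough_features = texture_features(rough)
```

Because the features depend only on the normalized histogram, repeating the same intensities over a larger or differently shaped object leaves them unchanged, which is the invariance the text asks for.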
Using the above feature set, a three-dimensional multivariate classifier whose discriminant function is given by

$$d_i(Z)=\ln P(\omega_i)-\tfrac{1}{2}\ln|\varepsilon_i|-\tfrac{1}{2}(Z-\mu_i)^T\varepsilon_i^{-1}(Z-\mu_i) \quad (16)$$

is designed to separate the two classes. The parameters $\mu_i$ and $\varepsilon_i$ are the (3×1) mean vector and (3×3) covariance matrix, respectively, of class $\omega_i$, $i=1,2$, and $P(\omega_i)$ is the prior probability of class $\omega_i$. A test object represented by a vector $Z^*$ is assigned to class $\omega^*$ given by

$$\omega^*=\arg\max_i\,[d_i(Z^*)]. \quad (17)$$
If ω* is identified as a graphite class for a given test element feature set Z, the corresponding pixel locations in M(x,y) are changed to intensity value 1. This creates a binary image where all the pencil marks that were identified from the response sheet using the developed process are set to 1 and the rest of the pixels are set to 0.
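A two-class decision of this kind can be sketched with the standard Gaussian discriminant, ln P(ω) minus half the log-determinant of the covariance minus half the squared Mahalanobis distance. That functional form is an assumption of this sketch, as are all class parameters below, which are invented purely for illustration and are not values from the source.

```python
import math

def det_and_inverse_3x3(a):
    """Return (determinant, inverse) of a 3x3 matrix via cofactors."""
    (a00, a01, a02), (a10, a11, a12), (a20, a21, a22) = a
    c00 = a11 * a22 - a12 * a21
    c01 = a12 * a20 - a10 * a22
    c02 = a10 * a21 - a11 * a20
    det = a00 * c00 + a01 * c01 + a02 * c02
    if det == 0:
        raise ValueError("covariance matrix is singular")
    inv = [
        [c00 / det, (a02 * a21 - a01 * a22) / det, (a01 * a12 - a02 * a11) / det],
        [c01 / det, (a00 * a22 - a02 * a20) / det, (a02 * a10 - a00 * a12) / det],
        [c02 / det, (a01 * a20 - a00 * a21) / det, (a00 * a11 - a01 * a10) / det],
    ]
    return det, inv

def discriminant(z, prior, mean, cov):
    """Gaussian discriminant: ln P - 0.5 ln|cov| - 0.5 (z - mean)^T cov^-1 (z - mean)."""
    det, inv = det_and_inverse_3x3(cov)
    d = [zi - mi for zi, mi in zip(z, mean)]
    quad = sum(d[i] * inv[i][j] * d[j] for i in range(3) for j in range(3))
    return math.log(prior) - 0.5 * math.log(det) - 0.5 * quad

def classify(z, classes):
    """Return the class name with the largest discriminant value."""
    return max(classes, key=lambda name: discriminant(z, *classes[name]))

# Hypothetical parameters: name -> (prior, mean, covariance) over the
# feature vector (entropy, skewness, uniformity).
classes = {
    "graphite": (0.5, (5.0, 200.0, 0.05),
                 [[0.5, 0, 0], [0, 10000.0, 0], [0, 0, 0.001]]),
    "non-graphite": (0.5, (2.0, -50.0, 0.4),
                     [[0.5, 0, 0], [0, 10000.0, 0], [0, 0, 0.01]]),
}

label_a = classify((4.6, 150.0, 0.08), classes)
label_b = classify((2.1, -40.0, 0.38), classes)
```

In practice the priors, means, and covariances would be estimated from the classification sets described later in the text rather than chosen by hand.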
Once all the elements are tested, each pixel of the binary mask M(x,y) is multiplied with the corresponding pixel of the grayscale image of the response sheet f(x,y), thus creating a grayscale image GS(x,y) with all the background as intensity 0 and identified pencil marks (i.e., graphite elements) with their own intensities.
GS(x,y)=M(x,y)f(x,y) (18)
All the 0 intensities in GS(x,y) (which are labeled as background) are changed to intensity 220, a light gray similar to the color of white paper. Note that the intensity value 220 is arbitrary; any intensity value can be used to label the background.
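The final masking of Equation (18) and the background relabeling can be sketched together. The function name `render_output` and the sample images are hypothetical; the default background value of 220 follows the text.

```python
def render_output(gray, mask, background=220):
    """Multiply the grayscale image by the binary mask, then relabel zeros as background."""
    product = [
        [g * m for g, m in zip(g_row, m_row)]
        for g_row, m_row in zip(gray, mask)
    ]
    return [[background if v == 0 else v for v in row] for row in product]

# Hypothetical 2x3 grayscale response image and graphite mask.
f = [
    [250, 80, 30],
    [245, 95, 240],
]
M = [
    [0, 1, 0],
    [0, 1, 0],
]
output = render_output(f, M)
```

As in the text, every zero in the product is relabeled, so a graphite pixel whose own intensity happened to be exactly 0 would also be shown at the background value.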
It will be apparent to one of ordinary skill in the art that other features may be more useful in different circumstances. Suitable features may include, but are not limited to:
Mean
Standard Deviation
Smoothness
Kurtosis
Root Mean Square (RMS)
Variance
Median Absolute Deviation
$$\mathrm{MAD}=\operatorname{median}_i\left(\left|X_i-\operatorname{median}_j(X_j)\right|\right)$$
Range
R(X)=Max(X)−Min(X)
Interquartile Range
IQR=Q3−Q1
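The three alternative features for which formulas are given above (median absolute deviation, range, and interquartile range) can be computed with Python's standard `statistics` module, as in this sketch. The sample values are hypothetical, and quartile conventions differ between tools; the default method of `statistics.quantiles` is used here as an assumption.

```python
import statistics

def median_absolute_deviation(values):
    """MAD: median of absolute deviations from the median."""
    center = statistics.median(values)
    return statistics.median(abs(v - center) for v in values)

def value_range(values):
    """Range: maximum minus minimum."""
    return max(values) - min(values)

def interquartile_range(values):
    """IQR: Q3 - Q1, using the default quartile method of statistics.quantiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    return q3 - q1

# Hypothetical intensities of one connected pixel group.
X = [2, 4, 4, 5, 7, 9, 12]
mad = median_absolute_deviation(X)
rng_x = value_range(X)
iqr = interquartile_range(X)
```

All three are measures of spread that, like the histogram features, do not depend on the shape of the object.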
Validation of the Process
The graphite detection procedure can be used on a wide range of optical mark recognition sheets with different background colors and artwork. Each sheet can have penciled bubbles and penciled characters written by different individuals using different graphite pencils. In one example, the spatial resolution of the scanner is set to 300 dots per inch. The information required to estimate the color segmentation parameters ($m$ and $\Omega$) and the classifier parameters ($P(\omega_i)$, $\mu_i$, and $\varepsilon_i$) is extracted from representative classification sets of the graphite and non-graphite classes. In the present invention, computationally-detectable characteristics or features of markings on a scanned response sheet image are used to distinguish certain types of markings (e.g., hand-written graphite markings) from other types of markings (e.g., pre-printed background markings). Thus, sheets containing a set of markings known to be of one particular type (e.g., only hand-written graphite markings or only pre-printed background markings), referred to herein as “classification sets,” are scanned and the methodology of the present invention is used to identify the computationally-detectable characteristics or features of those markings so that those characteristics or features can later be recalled and used to identify a particular type of marking among other markings in a scanned image of a response sheet that includes more than one type of marking.
The software creates a set of characteristics (also called features, benchmarks, or sample sets) against which each document is automatically evaluated. A graphite classification set is generated by a thresholding process that rejects the background color of the sheet and keeps only the graphite marks, based on predetermined thresholds of red, green, and blue (RGB) pixel values, from a scanned image of a white sheet 60 with penciled characters (see
The classification sets used for the texture classifier may be the entropy, skewness, and uniformity of graphite and non-graphite image elements, from which mean and covariance are computed. In this example, pixels of the scanned optical mark recognition sheet 62, shown in
A process for implementing the invention is represented by flow chart 10 in
In Step 20, the grayscale image 18 is converted to a binary image using Equation (4), thereby forming a binary image 22 of the OMR sheet.
In Step 24, the horizontal and vertical lines of the binary image 22 are detected and removed using the Hough transform, Equation (5), thereby generating a binary image with horizontal and vertical lines removed 26.
In Step 28, the binary image 26 is multiplied with each of the three color planes of the color image, thereby generating a color image 30 with all horizontal and vertical lines removed, as described above in the discussion regarding Equation (6).
In Step 32, non-gray pixels are identified, and are set to zero, thereby generating a color image with non-gray pixels set to zero 34, as described above in the discussion regarding Equations (7) and (8).
In Step 36, the color image with non-gray pixels set to zero 34 is converted to grayscale using Equation (3), thereby generating the grayscale image with non-gray pixels set to zero 38.
In Step 40 all pixels that are near zero are set to zero using Equation (11), thereby generating a grayscale image with non-gray and near-black pixels set to zero 42, as described above in the discussion regarding Equations (10) through (12).
Step 44 includes four sub-steps. In a first sub-step, all connected pixel groups are identified. In a second sub-step, textural features of each connected pixel group are computed using, for example, Equations (13), (14), and (15)—that is, using the same textural features that are computed for the classification sets. In a third sub-step, a two-class Gaussian classifier is implemented comparing the textural features computed for the OMR sheet with the pencil and artwork classification sets at 68 (see also
In Step 48, a binary image is generated in which the pencil (graphite) pixels are set to one, as described above.
In Step 50, the binary image is multiplied with the grayscale image 1, as described above in the discussion regarding Equation (18). This results in a grayscale image 52 showing only pencil marks on a black background.
In Step 54 all zero (black) pixels are replaced with the value 220 to mimic paper color, as described above. This results in grayscale image 56 that is similar to an image generated by an infrared scanner. The image file can be utilized in computer memory during the same process, or it can be saved as an image file for later use and/or processing.
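The first sub-step of Step 44, identifying connected pixel groups, can be sketched with a breadth-first search. The source does not state which connectivity is used; 8-connectivity is assumed here, and the function name and sample image are hypothetical.

```python
from collections import deque

def connected_groups(image):
    """Return one list of (row, col) coordinates per 8-connected group of nonzero pixels."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    groups = []
    for y in range(rows):
        for x in range(cols):
            if image[y][x] == 0 or seen[y][x]:
                continue
            seen[y][x] = True
            queue, group = deque([(y, x)]), []
            while queue:
                cy, cx = queue.popleft()
                group.append((cy, cx))
                # Visit the eight neighbors (the center is already marked as seen).
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and image[ny][nx] != 0 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            groups.append(group)
    return groups

# Hypothetical grayscale image with two separate nonzero groups.
h = [
    [90, 95, 0, 0],
    [0, 0, 0, 120],
    [0, 0, 0, 130],
]
groups = connected_groups(h)
intensities = [[h[y][x] for y, x in group] for group in groups]
```

Each entry of `intensities` is the list of pixel values for one group, which is the input from which the textural features of a group would be computed.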
The results clearly show that the output images created by this process and output images created by an infrared scanner are quite similar (see
These results are typical of the results obtained for other test sheets. Note that text printed in carbon-bearing ink remains in the infrared-scanned image, as shown in
Objective Evaluations
For this case, the data set consisted of simulated response images generated by the following model:
F(x,y)=G(x,y)+N(x,y)+η(x,y)
where G(x,y) is the graphite image, N(x,y) is the background (blank optical mark recognition) image, and η(x,y) is the noise introduced during scanning. The scanning noise component is included to account for the typical variations that occur when a given image is scanned repeatedly. That is, each pixel at a given location varies randomly from scan to scan. The scanning noise is assumed to be zero-mean Gaussian with variance $\sigma^2$. A specific response $F_i(x,y)$ is given by
$$F_i(x,y)=G(x,y)+\alpha_i F_R(x,y)+\beta_i F_G(x,y)+\gamma_i F_B(x,y)+\eta(x,y)$$
where
are the coefficients that determine the color of the resulting background image. The maximum values of the red, green, and blue component images of N(x,y) are represented by mr, mg, and mb, respectively.
For the experiments, the simulated responses were generated using the following sequence of steps:
(i) One blank optical mark recognition sheet was selected and scanned. This gave the image N(x,y).
(ii) A small smooth region of constant intensity in N(x,y) was chosen and the variance of the pixels in this region was estimated. This estimate was used for the scanning noise variance $\sigma^2$.
(iii) Ten individuals with different pencils wrote and filled bubbles in various parts of the selected optical mark recognition sheet and the resulting penciled sheet was scanned. The image N(x,y) was manually aligned and subtracted from the penciled sheet image to obtain G(x,y). The image G(x,y) was examined and any errors due to misalignment were corrected manually.
(iv) A simulated response image was obtained by changing the color coefficients of N(x,y) randomly, adding Gaussian noise with zero mean and variance $\sigma^2$, and adding the graphite image G(x,y) to the result.
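Step (iv) can be sketched as follows. Several details are assumptions of this sketch rather than statements from the source: the model is applied per color channel, the graphite image holds signed differences (negative where pencil darkens the sheet), and each color coefficient is drawn uniformly between 0 and 255 divided by the channel maximum. The function names and sample values are hypothetical.

```python
import random

def random_coeffs(background, rng):
    """Draw one coefficient per channel from [0, 255 / channel maximum] (assumed range)."""
    maxima = [max(px[c] for row in background for px in row) for c in range(3)]
    return tuple(rng.uniform(0.0, 255.0 / m) if m else 0.0 for m in maxima)

def simulate_response(graphite, background, coeffs, sigma, rng):
    """Per channel: F = G + coeff * N + noise, with zero-mean Gaussian noise."""
    return [
        [
            tuple(
                g + c * n + rng.gauss(0.0, sigma)
                for g, n, c in zip(g_px, n_px, coeffs)
            )
            for g_px, n_px in zip(g_row, n_row)
        ]
        for g_row, n_row in zip(graphite, background)
    ]

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical 1x2 background and graphite images.
N = [[(200, 180, 160), (200, 180, 160)]]
G = [[(-60, -60, -60), (0, 0, 0)]]

# Noise-free response with hand-picked coefficients, then a random noisy one.
fixed = simulate_response(G, N, (1.0, 0.5, 0.5), 0.0, rng)
coeffs = random_coeffs(N, rng)
noisy = simulate_response(G, N, coeffs, 2.0, rng)
```

Calling `random_coeffs` repeatedly yields differently colored backgrounds from the same blank sheet, which is how many test images can be generated from one scan.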
It should be noted that numerous response images with a multitude of background colors can be generated using the method outlined above. Another advantage is the fact that a single image G(x,y) encompasses the writings of several individuals using different pencils. The noise introduced by different scanners can also be accommodated by changing the variance. Most importantly, the performance of the graphite detection methodology can be evaluated objectively over a large number of response images.
The two types of detection errors that can occur are (a) (b/g): a graphite pixel misclassified as a background pixel, and (b) (g/b): a background pixel misclassified as a graphite pixel. Therefore, for a given response image, the detection error probability can be determined from
P(E)=P(b/g)P(g)+P(g/b)P(b).
The prior probabilities P(g) and P(b) of the graphite and background pixels can be estimated from a response image as
P(b)=1−P(g)
The detection results can be used to estimate the error probabilities as
For the detection experiments, the graphite classification set, the segmentation factor, and the Hough transform thresholds were the same as those used in the subjective evaluation experiments. Error probabilities of such magnitudes show very little visual difference between the detected outputs and the graphite image.
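The error probability P(E) = P(b/g)P(g) + P(g/b)P(b) can be estimated by counting pixels when the true graphite image is known, as it is for simulated responses. The counting estimators below are an assumption of this sketch, since the source's expressions are not reproduced in this text; masks use 1 for graphite and 0 for background, and the function name is hypothetical.

```python
def detection_error(truth, detected):
    """Estimate P(E) = P(b|g) P(g) + P(g|b) P(b) from two binary masks (1 = graphite)."""
    pairs = [
        (t, d)
        for t_row, d_row in zip(truth, detected)
        for t, d in zip(t_row, d_row)
    ]
    n = len(pairs)
    n_g = sum(t for t, _ in pairs)          # true graphite pixels
    n_b = n - n_g                           # true background pixels
    missed = sum(1 for t, d in pairs if t == 1 and d == 0)
    false_alarm = sum(1 for t, d in pairs if t == 0 and d == 1)
    p_g, p_b = n_g / n, n_b / n
    p_b_given_g = missed / n_g if n_g else 0.0
    p_g_given_b = false_alarm / n_b if n_b else 0.0
    return p_b_given_g * p_g + p_g_given_b * p_b

# Hypothetical masks: one graphite pixel missed and one background pixel mislabeled.
truth = [[1, 1, 0, 0],
         [0, 0, 0, 0]]
detected = [[1, 0, 0, 1],
            [0, 0, 0, 0]]
p_error = detection_error(truth, detected)
```

With these estimators the result equals the fraction of misclassified pixels, here two out of eight.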
Aspects of the invention are implemented via computing hardware components, user-created software, data storage components, data input components, and data output components. As shown in
Software comprises instructions stored on non-transitory computer-readable media, such as instructions implementing process 10 of
While the present invention has been described and shown in considerable detail with reference to certain illustrative embodiments, including various combinations and sub-combinations of features, those skilled in the art will readily appreciate other embodiments and variations and modifications thereof as encompassed within the scope of the present invention. Moreover, the descriptions of such embodiments, combinations, and sub-combinations are not intended to convey that the invention requires features or combinations of features other than those expressly recited in the claims. Accordingly, the present invention is deemed to include all modifications and variations encompassed within the spirit and scope of the following appended claims.