The present invention relates generally to methods for identifying user-made marks on a sheet of paper, and more specifically, to methods for automatically identifying circled answers made by a test-taker on a multiple choice examination sheet.
The automatic grading of multiple choice tests has traditionally been accomplished through the use of bubble sheets. The process involves providing test-takers with test question sheets, usually containing multiple-choice questions, and a corresponding bubble sheet for recording their answers. Each bubble sheet contains several pre-printed hollow bubbles for each question number, and the pre-printed hollow bubbles correspond to answer choices for each test question. Test-takers generally designate their answer choices by filling in the pre-printed hollow bubbles that correspond to desired answer choices of the test questions. To be graded, the filled-in or marked bubble sheets have to be fed into specialized bubble sheet reading machines. Such a process is often cumbersome, expensive, and restrictive due to the use of specialized bubble sheet reading machines. Further, because bubble sheets have to be designed, printed, and distributed in addition to the test question sheets, additional costs and efforts are incurred. As a result, oftentimes only formal or standardized tests are conducted using this process.
The use of bubble sheets can be a source of grief for test-takers as well. Test-takers must take special care to bubble in the correct answer choice for the correct test question. For instance, if a test-taker were to accidentally skip a question when bubbling in answers on the bubble sheet, then a series of answer choices may be marked incorrect. Test-takers who attempt to avoid such errors by marking their answer choices on their question sheets first, and intending to bubble in the answers later, may run out of time to transfer the answers onto their bubble sheets. These problems place additional stress on test-takers.
Thus, there remains an unsatisfied need in the industry for a simple, reliable, and efficient method for automatically grading tests without the use of separate bubble sheets.
Methods for automatically grading tests using test question sheets marked by a test-taker utilize image processing algorithms to automatically recognize the circled answer selections on the test question sheets. Marked test question sheets may be scanned by an optical scanner and graded automatically without the use of bubble sheets, thereby simplifying testing while significantly reducing the cost of tests.
According to one embodiment of the invention, there is disclosed a method of identifying a user-selected answer. The method includes scanning a marked copy of an answer sheet, where the marked copy includes a marking corresponding to a user-selected answer of a plurality of answer choices, and comparing at least one portion of the marked copy to a corresponding at least one portion of the unmarked version of the answer sheet. The method also includes identifying differences between the at least one portion of the marked copy and the corresponding at least one portion of the unmarked version, and based on the identified differences, determining the user-selected answer.
According to one aspect of the invention, the method also includes the step of generating a digital pixel map of the unmarked answer sheet, and generating a digital pixel map of the marked copy. According to another aspect of the invention, the step of comparing may further include the step of comparing at least one answer region of the marked copy to a corresponding answer region of the unmarked version, where the respective answer regions of the marked copy and unmarked version encompass at least one of the plurality of answer choices. According to yet another aspect of the invention, the step of comparing may include the step of comparing a digital pixel map of the answer region of the marked copy to a digital pixel map of the corresponding answer region of the unmarked version.
The method may also include the step of creating a difference map, where the difference map shows at least some of the differences between the marked copy and the unmarked version. According to one aspect of the invention, the step of creating a difference map may include creating a digital difference map that identifies at least some of the pixel differences between the digital pixel map of the marked copy and the digital pixel map of the unmarked version. Further, the step of creating a difference map may include creating a digital difference map that identifies the pixel differences between an answer region in the digital pixel map of the marked copy and a corresponding answer region in the digital pixel map of the unmarked version.
According to another aspect of the invention, the method may also include the step of determining the number of pixels that are different in the answer region of the marked copy compared to the corresponding answer region of the unmarked version. According to yet another aspect of the invention, the step of identifying differences may also include the step of measuring the similarity between the at least one portion of the marked copy and the corresponding at least one portion of the unmarked version, where the similarity measurement is based on a correlation computation.
According to another embodiment of the invention, there is disclosed a method of identifying user-selected answers. The method includes scanning an answer sheet, where the answer sheet includes at least one marking corresponding to a user-selected answer, comparing a first region of the scanned answer sheet to a corresponding first region of an unmarked version of the answer sheet, identifying the differences between the first region of the scanned answer sheet and the corresponding first region of the unmarked version of the answer sheet, and determining the user-selected answer based on the identified differences.
According to one aspect of the invention, the method may further include the step of comparing a second region of the scanned answer sheet to a corresponding second region of the unmarked version of the answer sheet. The method may also include the steps of establishing a first rank based on the identified differences between the first region of the scanned answer sheet and the corresponding first region of the unmarked version of the answer sheet, and establishing a second rank based on the identified differences between the second region of the scanned answer sheet and the corresponding second region of the unmarked version of the answer sheet. According to one aspect of the invention, the step of determining the user-selected answer may include the step of determining the user-selected answer by comparing the first rank and the second rank. The step of identifying the differences may also include the step of generating a difference map based on the identified differences between the first region of the scanned answer sheet and the corresponding first region of the unmarked version of the answer sheet.
According to another aspect of the invention, the step of identifying the differences may also include the step of determining the number of pixels that are different in the first region of the scanned answer sheet from the corresponding first region of the unmarked version of the answer sheet. According to yet another aspect of the invention, the method includes comparing a second region of the scanned answer sheet to a corresponding second region of the unmarked version of the answer sheet, and determining the number of pixels that are different in the second region of the scanned answer sheet from the corresponding second region of the unmarked version of the answer sheet. The number of pixels that are different in the first region of the scanned answer sheet may also be compared to the number of pixels that are different in the second region of the scanned answer sheet.
According to another aspect of the invention, the method may include the step of storing the location of an answer on the answer sheet. The location of the corresponding first region of the unmarked version of the answer sheet may also be stored. According to yet another aspect of the invention, the method may include the step of increasing the size of the first region of the scanned answer sheet and the corresponding first region of the unmarked version of the answer sheet.
Having thus described the invention in general terms, reference will now be made to the accompanying drawings, which are not necessarily drawn to scale, and wherein:
The present inventions now will be described more fully hereinafter with reference to the accompanying drawings, in which some, but not all embodiments of the invention are shown. Indeed, these inventions may be embodied in many different forms and should not be construed as limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will satisfy applicable legal requirements. Like numbers refer to like elements throughout.
Referring now to
According to one aspect of the invention, the test-taker's answer selection may be automatically identified by scanning the completed multiple choice problem 10 and comparing the completed multiple choice problem 10 to an unmarked version of the problem. The unmarked version of the problem is a clean copy of the problem prior to receiving the test-taker's marking 14. The comparison may be performed using digital copies of both the completed multiple choice problem (i.e., the test-taker's marked copy) and an unmarked version. Although both the marked copy and the unmarked version may be converted into digital form by a scanner, the unmarked version may be generated in digital form and stored as a master copy of the test so that subsequent scanning of the document is unnecessary to place it in digital form.
As is well known in the art, a scanner may capture images of the marked copy and convert it into a digital pixel map, which allows for computer processing of the digital pixel map. A digital pixel map of the unmarked version is also used for processing, which, as noted above, may be generated by a scanner or may be the digitally generated master copy. According to one aspect of the invention, a direct pixel map comparison of the marked copy and the unmarked version may be made to identify the changes the test-taker has made to the test. As will be discussed with respect to
Although a direct pixel map comparison may be made to the entire digital versions of the unmarked version and marked copy, only corresponding portions of each are preferably compared. For instance, only one or more answer regions around the answer choices may be compared on the unmarked version and the marked copy. According to one aspect of the invention, a square region around each answer alternative on the marked copy may be compared to the same square region around each answer choice on the unmarked version. Based on a comparison of the pixel maps of each copy, the differences between the marked copy and unmarked version, or between the respective portions or answer regions thereof, may be identified. Based on these identified differences, the user-selected answer may be determined.
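The region-by-region comparison described above can be sketched as follows. This is a minimal illustration, assuming binary pixel maps stored as nested lists (1 for a dark pixel, 0 for background) and perfectly aligned regions; the names `difference_map` and `count_differences` are hypothetical and do not come from the disclosure.

```python
from typing import List

PixelMap = List[List[int]]  # assumption: 1 = dark (ink) pixel, 0 = background


def difference_map(marked: PixelMap, unmarked: PixelMap) -> PixelMap:
    """Return a map holding 1 wherever two equally sized pixel maps differ."""
    if len(marked) != len(unmarked) or any(
        len(m_row) != len(u_row) for m_row, u_row in zip(marked, unmarked)
    ):
        raise ValueError("pixel maps must have identical dimensions")
    return [
        [1 if m != u else 0 for m, u in zip(m_row, u_row)]
        for m_row, u_row in zip(marked, unmarked)
    ]


def count_differences(diff: PixelMap) -> int:
    """Count the pixels that differ between the two compared regions."""
    return sum(sum(row) for row in diff)


# Toy 5 x 5 answer region: the unmarked version holds only the "letter"
# (one pixel in the center); the marked copy adds a ring drawn around it.
unmarked_region = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
marked_region = [
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 1, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
]

diff = difference_map(marked_region, unmarked_region)
print(count_differences(diff))  # 16: only the ring differs, the letter cancels out
```

The printed letter appears in both maps and therefore drops out of the difference map, leaving only the test-taker's marking.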
As described above, a direct pixel map comparison may be made only to selected regions of a marked copy of a test and an unmarked version of the same test. Reducing the size of the direct pixel map comparison maximizes speed of the comparison and minimizes the memory and computing power required to execute the comparison. This may be particularly important where a test includes a large number of multiple choice problems on the same or multiple pages. The illustrative embodiment of
Defining answer regions, such as a square region around each answer choice, also enables a comparison between the regions to identify the answer region that has changed the most from the unmarked version. According to one aspect of the invention, the number of difference pixels within each answer region 20, 22, 24, 26 of the difference map may be used as an indicator of which of the four answer choices is marked and selected. According to one aspect of the invention, the number of pixels within each answer region in the difference map may be counted to determine the answer region containing the greatest number of pixel differences from the unmarked version. The greater the number of pixel differences, the greater the changes within the region. Because a test-taker's marking results in pixel differences in the difference map, the answer region in which the marking was made may be identified. In the illustrative example of
It will be appreciated that the answer regions 20, 22, 24, 26 may be defined around each answer choice during the creation of a problem or test. According to one aspect of the present invention, when a digital copy of a problem or test is generated, the location of each answer choice letter is stored in a database. As noted above, the unmarked version may be formed digitally as a master copy such that scanning of an unmarked version is not required. One or more tests may be digitally created from a collection of stored test questions, such that the location of each answer choice letter used is stored automatically for each test.
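One way to picture the stored answer letter locations is sketched below. The record layout and the in-memory `LayoutStore` class are hypothetical stand-ins for the database; the disclosure does not specify a schema.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class AnswerLetterLocation:
    """Hypothetical record: where one answer letter sits on the master copy."""
    question: int
    letter: str
    row: int     # pixel row of the letter's center
    column: int  # pixel column of the letter's center


class LayoutStore:
    """In-memory stand-in for the database of answer letter locations."""

    def __init__(self) -> None:
        self._records: Dict[Tuple[int, str], AnswerLetterLocation] = {}

    def save(self, location: AnswerLetterLocation) -> None:
        self._records[(location.question, location.letter)] = location

    def letters_for(self, question: int) -> List[AnswerLetterLocation]:
        """Return the stored letter locations for one question, by letter."""
        found = [loc for loc in self._records.values() if loc.question == question]
        return sorted(found, key=lambda loc: loc.letter)


store = LayoutStore()
store.save(AnswerLetterLocation(question=1, letter="B", row=120, column=300))
store.save(AnswerLetterLocation(question=1, letter="A", row=120, column=100))
store.save(AnswerLetterLocation(question=2, letter="A", row=220, column=100))

print([loc.letter for loc in store.letters_for(1)])  # ['A', 'B']
```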
A base answer region covering each answer letter may also be stored. According to one aspect of the invention, the answer region may be based on the location of each answer choice letter, for instance, the letters ‘A’, ‘B’, ‘C’, or ‘D’ in
Using a direct pixel map comparison and a difference map to identify answers is effective when a test-taker marks answers using circles or equivalent marks that are large enough to encircle a chosen answer, but not so large that the marking impinges on other answer choices. However, test-takers do not mark each answer choice consistently, as a test-taker's circular markings may differ in length, width, shape, and orientation from question to question. Furthermore, some test-takers mark answers differently from other test-takers. These problems result in some instances where the method described above with respect to
If only the square regions 40, 42, 44, 46 defined by the boxes in
To overcome the problem presented by an over-drawn answer circle, such as the problem presented by the marking 34 illustrated in
As shown in
Next, using the known location of the answer regions, the answer regions corresponding to a single test question are extracted from a digital pixel map of a marked copy of a test sheet (block 52). Although it is preferred that the answer regions be extracted on a question by question basis, answer regions for multiple questions marked by a test-taker may also be extracted at once. It will be appreciated that extraction of a greater number of answer regions would require more computing power as location information for each answer region must be retrieved and used to extract the answer regions from a digital pixel map. For illustrative purposes, the remaining discussion will be with reference to the extraction of answer regions corresponding to a single multiple choice question.
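Extraction of the answer regions for a single question can be sketched as a simple crop of the page's pixel map. The `(top, left, height, width)` box layout and the function names are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

PixelMap = List[List[int]]
Box = Tuple[int, int, int, int]  # assumed layout: (top, left, height, width)


def extract_region(page: PixelMap, box: Box) -> PixelMap:
    """Crop one answer region out of a full-page pixel map."""
    top, left, height, width = box
    if top < 0 or left < 0 or top + height > len(page) or left + width > len(page[0]):
        raise ValueError("answer region falls outside the page")
    return [row[left:left + width] for row in page[top:top + height]]


def extract_question_regions(page: PixelMap, boxes: Dict[str, Box]) -> Dict[str, PixelMap]:
    """Crop every answer region belonging to a single question."""
    return {choice: extract_region(page, box) for choice, box in boxes.items()}


# Toy 4 x 8 page: the left half is blank, the right half is fully dark.
page = [[1 if column >= 4 else 0 for column in range(8)] for _ in range(4)]
question_boxes = {"A": (0, 0, 4, 4), "B": (0, 4, 4, 4)}

regions = extract_question_regions(page, question_boxes)
print(sum(map(sum, regions["A"])), sum(map(sum, regions["B"])))  # 0 16
```

The same boxes can be applied to the unmarked version so that each extracted region is compared with its counterpart.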
Next, a digital pixel map of the unmarked version of the test, which may be the master copy, is accessed (block 54). The answer regions of the marked copy are then compared to the corresponding answer regions of the unmarked answer sheet (block 56). A difference map is generated (not illustrated), as described in detail with respect to
As shown in
To illustrate the above computations, in the illustrative example of
The difference condition computation may also be illustrated with respect to the illustrative example of
It will be appreciated that although the above process is described with reference to a test question having four answer choices, the above process may also be implemented with questions having a greater or fewer number of answer choices. Regardless of the number of choices, each answer region may be ranked and then compared with other regions in the manner described above. Where an odd number of answer choices exists, this may require multiple comparisons to occur, each of which may have to satisfy a difference condition. Additionally, it will be appreciated that because ratios are taken, it may be presumed that the lowest number of pixel differences within any region is 1, to avoid ratios having a denominator of 0.
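The ranking step and the floor of 1 on pixel counts can be sketched as follows. The sketch only ranks the regions and forms a ratio between the two highest-ranked ones; which regions are compared and what threshold the ratio must meet are not restated here, so the pairing shown is a hypothetical choice.

```python
from typing import Dict, List, Tuple


def rank_regions(diff_counts: Dict[str, int]) -> List[Tuple[str, int]]:
    """Rank answer regions by difference-pixel count, largest first.

    Counts are floored at 1 so that a later ratio never divides by zero.
    """
    floored = {choice: max(1, count) for choice, count in diff_counts.items()}
    return sorted(floored.items(), key=lambda item: item[1], reverse=True)


def top_to_runner_up_ratio(ranked: List[Tuple[str, int]]) -> float:
    """Ratio of the highest count to the second highest (assumed pairing)."""
    if len(ranked) < 2:
        raise ValueError("at least two answer regions are required")
    return ranked[0][1] / ranked[1][1]


# Hypothetical difference-pixel counts for one four-choice question.
counts = {"A": 3, "B": 120, "C": 0, "D": 5}
ranked = rank_regions(counts)

print(ranked)                          # [('B', 120), ('D', 5), ('A', 3), ('C', 1)]
print(top_to_runner_up_ratio(ranked))  # 24.0
```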
If the difference condition is satisfied (block 58), the test-taker's answer for the question is stored (block 66). Alternatively, if the difference condition is not satisfied, the answer recognition method will then determine if a ratio condition is met (block 60). Like the difference condition computation, the ratio condition is dependent upon the number of difference pixels within each answer region, as provided by a difference map after a comparison of pixel maps of a marked copy and an unmarked version of the same question or test. Because the ratio condition is considered only if the difference condition is not satisfied, and the difference condition is dependent upon difference map computation, the difference map does not have to be regenerated to determine if the ratio condition is satisfied.
It will be appreciated by those of ordinary skill in the art that an area of a digital region, such as an answer region, may be defined as the number of pixels within the region. Therefore, in a difference map, an answer region containing a test-taker's marking should have a larger area than an answer region that does not contain the test-taker's marking. The number of pixels that make up the user's (typically circular) mark should normally be greater than the number of pixels that make up an answer letter, because the mark must encircle the answer letter. This relationship may be affected by the font type and size used for the answer choices, and by the thickness of the user's marking. According to one aspect of the invention, the ratio between the total number of pixels used for a test-taker's marking and the total number of pixels used in generating an answer letter may be used to identify a test-taker's answer selection. According to one aspect of the invention, this ratio should be equal to or greater than 2. When such a condition is met, the ratio condition is satisfied, and the test-taker's answer is presumed identified. As with the difference threshold, the threshold for the ratio does not have to be exactly two, although it will be appreciated that the greater the ratio, the more likely it is that a test-taker's answer is accurately identified.
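The ratio condition can be illustrated with a short check. This is a sketch under the assumption that the marking's pixel count is taken from the difference map of an answer region and the letter's pixel count from the unmarked version; the function name and the example counts are hypothetical.

```python
def ratio_condition(marking_pixels: int, letter_pixels: int, threshold: float = 2.0) -> bool:
    """Return True when the marking uses at least `threshold` times as many
    pixels as the printed answer letter."""
    if letter_pixels <= 0:
        raise ValueError("the answer letter must contain at least one pixel")
    return marking_pixels / letter_pixels >= threshold


# Hypothetical counts: a letter drawn with 30 pixels.
print(ratio_condition(marking_pixels=75, letter_pixels=30))  # True  (ratio 2.5)
print(ratio_condition(marking_pixels=40, letter_pixels=30))  # False (ratio about 1.33)
```

The threshold is left as a parameter because, as noted above, it does not have to be exactly two.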
It will be appreciated that the ratio condition is unlike the difference condition, and unlike the method described with respect to
If the ratio condition is satisfied (block 60), the test-taker's answer for the question is stored (block 66). Alternatively, if the ratio condition is not satisfied, the answer recognition method will next determine if a similarity condition is satisfied (block 62). According to one aspect of the invention, the similarity condition uses a correlation equation to compare two digital images. More specifically, a correlation equation is used to compare answer regions in a test-taker's marked copy to corresponding answer regions of an unmarked version of a problem or test. Because the comparison of two answer regions, one scanned from a test-taker's copy and one from an unmarked version, such as a master copy, may not be perfectly aligned, a difference map generated from a comparison of the two may carry misalignment information instead of the marks made by the test-taker. A similarity measure tiled over the answer region can reduce such errors, including noise. Although it will be appreciated that many alternative similarity measures may be used, according to one aspect of the invention, a similarity measure may be achieved via the following correlation equation:
where,
A correlation equation like the one provided above, as is known in the art, indicates how similar one pixel region is to another. Such an equation therefore does not execute a pixel-by-pixel comparison of two images, but considers the pixels immediately around each pixel. In the above equations, x_i and y_i are the i-th pixels from the respective answer regions of the marked copy and unmarked version. For a pixel position x,y, λ gives an indication as to the similarity of two different answer regions for that pixel position, where the similarity considers adjacent pixels within a predetermined block N. Thus, as used above, N is a block size (e.g., 25 pixels for a 5 by 5 block) that defines the area considered around each pixel.
Using the above equations, if two compared answer regions are identical, λ, which may be referred to as the similarity index, would be equal to 1. On the other hand, if two answer regions are dissimilar, the similarity index would be less than 1. These values are used to generate a similarity map that illustrates how similar an answer region of the marked copy is to a corresponding answer region on the unmarked version. Once λ is calculated for each pixel position in the similarity map, the complement of λ, (1−λ), is taken, which provides a weighted value for each pixel x,y. Therefore, for a given pixel position, a weighted value of 0 would indicate that the position is identical in the two compared answer regions.
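The block-wise similarity map and its (1−λ) weights can be sketched as follows. Because the correlation equation itself is not reproduced in this text, the sketch substitutes a plain uncentered normalized correlation as a stand-in; it shares the property that identical blocks give λ = 1, but it should not be read as the equation of the disclosure. All function names are hypothetical.

```python
import math
from typing import List

PixelMap = List[List[float]]


def block_values(image: PixelMap, row: int, col: int, half: int) -> List[float]:
    """Collect the pixels of the square block centered on (row, col),
    clipped to the image bounds."""
    values = []
    for r in range(row - half, row + half + 1):
        for c in range(col - half, col + half + 1):
            if 0 <= r < len(image) and 0 <= c < len(image[0]):
                values.append(image[r][c])
    return values


def similarity_index(xs: List[float], ys: List[float]) -> float:
    """Stand-in similarity: sum(x*y) / sqrt(sum(x^2) * sum(y^2))."""
    energy_x = sum(x * x for x in xs)
    energy_y = sum(y * y for y in ys)
    if energy_x == 0 and energy_y == 0:
        return 1.0  # two blank blocks are treated as identical
    if energy_x == 0 or energy_y == 0:
        return 0.0  # one blank block, one inked block
    return sum(x * y for x, y in zip(xs, ys)) / math.sqrt(energy_x * energy_y)


def dissimilarity_map(marked: PixelMap, unmarked: PixelMap, block_side: int = 5) -> PixelMap:
    """Return the (1 - similarity) weight for every pixel position."""
    half = block_side // 2
    return [
        [
            1.0 - similarity_index(
                block_values(marked, r, c, half), block_values(unmarked, r, c, half)
            )
            for c in range(len(marked[0]))
        ]
        for r in range(len(marked))
    ]


unmarked = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
marked = [[1, 0, 0], [0, 1, 0], [0, 0, 0]]  # one extra pixel added by the test-taker

identical = dissimilarity_map(unmarked, unmarked, block_side=3)
changed = dissimilarity_map(marked, unmarked, block_side=3)
print(max(max(row) for row in identical))  # 0.0
print(round(changed[1][1], 3))             # 0.293
```

Because each weight is computed from a block rather than a single pixel, a small misalignment between the two regions changes the weights gradually instead of producing isolated difference pixels.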
Next, a center of gravity of each answer region in the similarity map may be computed, as is well known in the art. For each multiple choice question, the answer recognition method will then determine the center of gravity that is closest to the location of an answer letter. According to one aspect of the invention, the center of gravity that is closest to an answer letter may be used to identify the test-taker's selection. According to one aspect of the invention, the center of gravity has to be within a certain distance from the location of an answer letter, or the similarity condition will not be met. If the similarity condition is satisfied (block 62), the test-taker's answer for the question is stored (block 66).
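The center-of-gravity test can be sketched as below, under one reading of the passage: a center of gravity is computed from the weights of each answer region, its distance to that region's answer letter is measured, and the region with the smallest distance is selected, provided the distance is within a limit. The function names, the coordinate convention, and the `max_distance` parameter are assumptions.

```python
import math
from typing import Dict, List, Optional, Tuple

WeightMap = List[List[float]]
Point = Tuple[float, float]  # (row, column) within an answer region


def center_of_gravity(weights: WeightMap) -> Optional[Point]:
    """Weighted mean position of a weight map, or None if it is all zero."""
    total = sum(sum(row) for row in weights)
    if total == 0:
        return None
    row_center = sum(r * w for r, row in enumerate(weights) for w in row) / total
    col_center = sum(c * w for row in weights for c, w in enumerate(row)) / total
    return row_center, col_center


def select_by_center_of_gravity(
    regions: Dict[str, Tuple[WeightMap, Point]], max_distance: float
) -> Optional[str]:
    """Pick the choice whose center of gravity lies closest to its letter."""
    best_choice: Optional[str] = None
    best_distance: Optional[float] = None
    for choice, (weights, letter_at) in regions.items():
        center = center_of_gravity(weights)
        if center is None:
            continue  # nothing changed in this region
        distance = math.hypot(center[0] - letter_at[0], center[1] - letter_at[1])
        if best_distance is None or distance < best_distance:
            best_choice, best_distance = choice, distance
    if best_distance is not None and best_distance <= max_distance:
        return best_choice
    return None  # similarity condition not met


ring = [[1.0 if r in (0, 4) or c in (0, 4) else 0.0 for c in range(5)] for r in range(5)]
spill = [[1.0 if c == 0 else 0.0 for c in range(5)] for r in range(5)]
blank = [[0.0] * 5 for _ in range(5)]

question = {
    "A": (ring, (2.0, 2.0)),   # circle centered on the letter
    "B": (spill, (2.0, 2.0)),  # stray ink along one edge only
    "C": (blank, (2.0, 2.0)),
}
print(select_by_center_of_gravity(question, max_distance=1.0))  # A
```

A circle drawn around a letter has its center of gravity near that letter, whereas ink spilling in from a neighboring circle pulls the center of gravity toward the edge of the region.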
If the similarity condition is not satisfied, the answer recognition method will repeat each of the three conditions, but only after the answer region is increased in size (block 64). As noted above, if the answer regions become too large, it may be difficult to perfectly align the marked copy with the unmarked version to execute a comparison between answer regions. However, incremental increases in size, in combination with the above conditions, can effectively identify a test-taker's answer without negative impact from scanning distortion, discrepancies between the marked copy and unmarked version, and the like. According to one aspect of the invention, the answer region is progressively dilated until one of the conditions is met, where the answer region is enlarged by a small percentage each time. For instance, an increase of five pixels may be implemented.
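The overall control flow (try each condition in turn, enlarge the answer regions, and repeat) can be sketched as follows. The conditions are passed in as callables so the sketch stays independent of how each one is computed. The `max_rounds` limit, the choice to grow each region by `step` pixels on every side, and all names are assumptions added so the example terminates and runs.

```python
from typing import Callable, Dict, Optional, Sequence, Tuple

Box = Tuple[int, int, int, int]  # assumed layout: (top, left, height, width)
Regions = Dict[str, Box]
Condition = Callable[[Regions], Optional[str]]  # returns the answer or None


def enlarge(box: Box, step: int = 5) -> Box:
    """Grow a region by `step` pixels on every side (clipping to the page
    is left to the caller)."""
    top, left, height, width = box
    return (top - step, left - step, height + 2 * step, width + 2 * step)


def identify_answer(
    regions: Regions,
    conditions: Sequence[Condition],
    step: int = 5,
    max_rounds: int = 4,
) -> Optional[str]:
    """Apply the conditions in order; if none is met, enlarge and retry."""
    current = dict(regions)
    for _ in range(max_rounds):
        for condition in conditions:
            answer = condition(current)
            if answer is not None:
                return answer  # this is where the answer would be stored
        current = {choice: enlarge(box, step) for choice, box in current.items()}
    return None  # no condition met within the allowed number of rounds


# Stub conditions for demonstration only.
def never_met(regions: Regions) -> Optional[str]:
    return None


def met_once_wide_enough(regions: Regions) -> Optional[str]:
    return "B" if regions["B"][3] >= 30 else None


start = {"A": (100, 40, 20, 20), "B": (100, 100, 20, 20)}
print(identify_answer(start, [never_met, never_met, met_once_wide_enough]))  # B
```

In this toy run no condition is met at the original region size, and the third stub succeeds after one enlargement from 20 to 30 pixels wide.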
It will be appreciated by one of ordinary skill in the art that the difference, ratio and similarity conditions may be processed in a different order than is presented in
According to another embodiment of the present invention, a test sheet may be created with circling guides in an effort to avoid the problem presented by an over-drawn answer circle, as in the illustrative example of
According to one aspect of the invention, the circling guides may be printed on a test sheet as a thin or faint circular line that is not distracting to a test-taker. The circling guides may also be printed using a dashed or dotted line, or the like, such that they are not continuous around each answer choice. It will also be appreciated that the circling guides may take the shape of a square, rectangle, oval, or the like, or portions thereof. Furthermore, the circling guides may identify the boundaries of each answer region to encourage test-takers to mark answers entirely within the answer region.
The circling guides will preferably appear on both the unmarked version of a test and the test-taker's marked copy of the test. Because the circling guides appear on both versions, they will generally not appear in a difference map generated by a direct pixel map comparison of an unmarked test and a test-taker's marked copy. Therefore, circling guides may be used in conjunction with the methods described above with respect to
Next, it will be appreciated that each of the methods described above with respect to
It will be appreciated that the memory 76 in which the answer identification tool 78 resides may include random access memory, read-only memory, a hard disk drive, a floppy disk drive, a CD-ROM drive, or optical disk drive, for storing information on various computer-readable media, such as a hard disk, a removable magnetic disk, or a CD-ROM disk. Generally, the answer identification tool 78 receives information input or received by the answer recognition module 70, including digital versions of the marked and unmarked answer sheets. The answer identification tool 78 also receives answer letter and answer region data 86, which identifies the location of the answer letters and answer regions for each answer choice for each multiple choice test question. According to one aspect of the invention, the answer letter and answer region data may be stored local to the answer recognition module 70, such as in the database 84, although the data may also be received from one or more remote sources via the I/O interface 82. Using information it receives, the answer identification tool 78 effects the methods described in detail above with respect to
Referring again to
The database 84 of the answer recognition module 70, which is connected to the bus 80 by an appropriate interface, may include random access memory, read-only memory, a hard disk drive, a floppy disk drive, a CD-ROM drive, or optical disk drive, for storing information on various computer-readable media, such as a hard disk, a removable magnetic disk, or a CD-ROM disk. In general, the purpose of the database 84 is to provide non-volatile storage to the answer recognition module 70. As shown in
It is important to note that the computer-readable media described above with respect to the memory 76 and database 84 could be replaced by any other type of computer-readable media known in the art. Such media include, for example, magnetic cassettes, flash memory cards, digital video disks, and Bernoulli cartridges. It will also be appreciated by one of ordinary skill in the art that one or more of the answer recognition module 70 components may be located geographically remotely from other answer recognition module 70 components. For instance, the answer letter and answer region data 86 may be located geographically remote from the answer recognition module 70, such that historical data and lookup tables are accessed or retrieved from a remote source in communication with the answer recognition module 70 via the I/O interface 82.
It should also be appreciated that the components illustrated in
Many modifications and other embodiments of the inventions set forth herein will come to mind to one skilled in the art to which these inventions pertain having the benefit of the teachings presented in the foregoing descriptions and the associated drawings. Therefore, it is to be understood that the inventions are not to be limited to the specific embodiments disclosed and that modifications and other embodiments are intended to be included within the scope of the appended claims. Although specific terms are employed herein, they are used in a generic and descriptive sense only and not for purposes of limitation.