The present invention relates to a system for processing answers to test questions.
The scoring of test answer sheets involves complex problems. These test answer sheets typically include a series of response positions such as, for example, “bubbles,” ovals, or rectangles. A person taking a test would, for example, darken in an appropriate oval with a pencil to answer a multiple choice question. These test answer sheets may also include handwritten answers, such as essay or short answer questions. Systems for scanning and scoring the bubbles on such answer sheets are known in the art. Increased difficulties are encountered, however, when such answer sheets either include other types of answers, such as handwritten answers, or cannot be machine graded. For example, if the student has failed to include his or her name on the test answer sheet, the system may be unable to machine score the test answer.
The goals in scoring test answers that cannot be machine scored include efficiency and consistency. These test answer sheets are typically scored by test resolvers either by manually scoring the physical test answer sheet or by scoring an electronic representation of the test answer sheet on a computer. Ideally, the scores provided by the various test resolvers for a particular test question should be consistent, since the scores are used in comparing performance of the students against one another. In addition, a test resolver should ideally work efficiently so as to maintain consistently high scoring rates. The test resolver should not have such a high scoring rate that the consistency or quality of scoring significantly declines; likewise, the test resolver should not have such a low scoring rate that too few answer sheets are being scored. This manual scoring of test answer sheets, however, makes it difficult to monitor the consistency of scoring among the various test resolvers.
In many situations, test resolvers actually travel to a particular location so that all test resolvers may simultaneously score test answer sheets. Requiring the test resolvers to travel to a given location is inconvenient for the resolvers and expensive for those who administer the tests. Furthermore, tracking the performance of test resolvers against both their own performance and the performance of other resolvers can be very difficult with a manual scoring environment.
The process of resolving test questions is currently done manually, and this presents problems. A resolver is manually presented with the actual test answer sheets for scoring. This process is relatively inefficient, since the resolvers must score the answer sheets one at a time and in the order in which they are presented. Also, manual scoring systems do not have the capability to efficiently gather and categorize the test answers for subsequent analysis. Therefore, with a manual system it is very difficult to determine how teaching methods should be changed to decrease, for example, the number of incorrect answers.
A need thus exists for a system that promotes and achieves consistency and efficiency in scoring or resolving of tests.
The present categorized data item reporting method and system groups data items into predefined categories for analysis and review. In the method, a plurality of data items are received. The data items comprise an electronic representation of at least a portion of a person's work product. The data items are divided according to predefined categories. The divided data items are organized into separate groupings. Finally, the data items in the groupings are reported.
In the following detailed description of the preferred embodiment, reference is made to the accompanying drawings which form a part hereof and in which is shown by way of illustration a specific embodiment in which the invention may be practiced. This embodiment is described in sufficient detail to enable those skilled in the art to practice the invention, and it is to be understood that other embodiments may be utilized and that structural or logical changes may be made without departing from the scope of the present invention. The following detailed description is, therefore, not to be taken in a limiting sense, and the scope of the present invention is defined by the appended claims.
The system uses the scanners for reading in test answer sheets. These test answer sheets may comprise, for example, test forms with “bubbles” or ovals representing possible answers, handwritten essays, or other various types of written or printed information. After receiving the scanned test data, the system within the RISC servers can process those scanned test answer sheets to generate test items of interest from the answer sheets. A test item is, therefore, an electronic representation of at least a portion of a test answer sheet. The system may distribute these test items to the work stations for on-line scoring. A test scorer at a work station can then score the test item and enter a test score. The system receives the test scores via the network and the RISC servers and distributes the scores to an appropriate computer for subsequent printing and reporting; the appropriate computer may include, for example, the mainframe computer 20 or a server. The system may also transmit the test scores to, for example, a disk or telephone line.
The computer 24, preferably implemented with an HP 1000, is interfaced to the scanner 25 and PC 26 for controlling the operation of the scanning unit. The computer 24 is optional; the system may alternatively be configured such that all of the functionality of the computer 24 is within the PC 26. The computer 24 controls the scanner via the OMR logic 32 and thus controls when image data is scanned in and subsequently transferred to the PC 26. The PC 26 essentially acts as a buffer for holding the image data. The computer 24 further controls when the PC 26 will interrogate the image data for transmission to a server 27 for subsequent processing and scoring. The PC 26 can also electronically remove or “clip” an area of interest from the image data, which represents at least a portion of the scanned test answer sheets.
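The clipping operation can be illustrated with a minimal sketch. It assumes, purely for illustration, that the scanned image is held as a list of pixel rows and that the area of interest is a rectangle; the function name `clip_region` and its parameters are hypothetical and are not part of the described system.

```python
def clip_region(image, top, left, height, width):
    """Return the rectangular area of interest cut from a scanned image.

    `image` is assumed to be a list of equal-length rows of pixel values.
    """
    if top < 0 or left < 0 or height < 0 or width < 0:
        raise ValueError("region must not use negative coordinates or sizes")
    # Slice the wanted rows first, then the wanted columns of each row.
    return [row[left:left + width] for row in image[top:top + height]]


# A 4x4 "scan" whose pixel values encode their row and column.
scan = [[10 * r + c for c in range(4)] for r in range(4)]
essay_box = clip_region(scan, top=1, left=2, height=2, width=2)
```

Here `essay_box` holds only the two-by-two region starting at row 1, column 2, which is the kind of test item that could then be forwarded for scoring.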
Examples of two systems for storing and extracting information from scanned images of test answer sheets are shown in U.S. Pat. Nos. 5,134,669 and 5,103,490, both of which are assigned to National Computer Systems, Inc. and are incorporated herein by reference as if fully set forth.
The server 27 receives the image data, which includes test items, and provides for processing and control of the image data. This portion, which may be a test item, is then distributed to the work stations 28, 29 and 30 for subsequent scoring. A test resolver (scorer) at the work station typically receives the test item, performs the scoring, and transmits the score to the receiving computer.
The main processing module 45 controls the processing of test items. It controls the transmission of test items to the work stations for scoring and the transmission of scores to the mainframe computer 20. The main processing module 45 also monitors the performance of the test resolvers to maintain consistent and efficient resolving of test items, as is explained below.
The main processing module 45 typically contains the following basic functions, which are controlled by system management module 32. A work flow module 38 receives image data from the database 36 and controls the flow of data into an edit process module 39. The edit process module 39 may perform machine scoring of the test items. For those test items which cannot be machine scored, or possibly for other test items, the system transmits such test items to the job build function 40. The job build function 40 determines what type of subsequent scoring is required for the test item and, for example, which work station will receive the test item. A job send module 41 receives the test item and transmits it to a router 42, which in turn transmits the test item to a send/receive communication module 43. Edit work module 34 and edit server module 35 control the flow of test items into and out of server 27. Incoming data, such as test answers from the work station, are transmitted through modules 34 and 35 to a job receive module 44. The job receive module transmits the data to the edit process module 39 for subsequent storage within the database 36.
The system waits at step 55 until it determines that a test item is ready to be resolved or scored. If multiple resolution items are present within the image data, as determined at step 59, then the system sends the test item to multiple item processing at step 63. Otherwise, the system performs other resolution processes on the data at step 60 and stores the result in work-in-process storage 55 at step 61. Other resolution processes may include, for example, machine scoring, raw key entry, and analytic resolving.
Analytic resolving or scoring may include, for example, map comparisons such as bit-mapped comparisons between two test items. The map comparisons allow a test resolver to compare, for example, the answers of a respondent over time to track the respondent's progress. For example, the analytic scoring may involve comparing two hand-drawn circles by the respondent to determine if the respondent's accuracy in drawing circles has improved over time. Analytic scoring may also include, for example, circling or electronically indicating misspelled words and punctuation errors in an answer such as an essay.
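One simple form of bit-mapped comparison is sketched below. It assumes that each test item is available as a two-dimensional list of 0/1 pixels of identical size and reports the fraction of positions that agree; the measure and the name `bitmap_similarity` are illustrative assumptions, and the described system may compare bitmaps in a different way.

```python
def bitmap_similarity(first, second):
    """Fraction of pixel positions at which two equal-sized bitmaps agree."""
    if len(first) != len(second) or any(
        len(row_a) != len(row_b) for row_a, row_b in zip(first, second)
    ):
        raise ValueError("bitmaps must have identical dimensions")
    total = sum(len(row) for row in first)
    if total == 0:
        return 1.0  # two empty bitmaps are treated as identical
    matches = sum(
        1
        for row_a, row_b in zip(first, second)
        for pixel_a, pixel_b in zip(row_a, row_b)
        if pixel_a == pixel_b
    )
    return matches / total


# Two hypothetical drawings by the same respondent at different times.
earlier = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
later = [[0, 1, 0], [1, 1, 1], [0, 1, 0]]
similarity = bitmap_similarity(earlier, later)  # 8 of 9 positions agree
```

A resolver could view such a figure alongside the two images when judging whether the respondent's drawing has changed over time.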
The system typically transmits test items to a particular resolver based upon the resolver's resolution expertise. For example, a certain resolver may be assigned to score all of the test items relating to science questions. Resolution expertise may also comprise, for example, math, English, history, geography, foreign languages, or other subjects.
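Expertise-based distribution can be sketched as a lookup from subject to resolver. The data layout (a `subject` field on each test item, a set of subjects per resolver) and the name `route_by_expertise` are assumptions made for this example only.

```python
def route_by_expertise(test_items, resolvers):
    """Map each resolver name to the ids of the items matching its expertise.

    Items whose subject no resolver covers are collected under the key None.
    """
    subject_to_resolver = {}
    for name, subjects in resolvers.items():
        for subject in subjects:
            # The first resolver listed for a subject receives its items.
            subject_to_resolver.setdefault(subject, name)

    assignments = {}
    for item in test_items:
        resolver = subject_to_resolver.get(item["subject"])
        assignments.setdefault(resolver, []).append(item["id"])
    return assignments


items = [
    {"id": 1, "subject": "science"},
    {"id": 2, "subject": "math"},
    {"id": 3, "subject": "science"},
    {"id": 4, "subject": "history"},
]
resolvers = {"resolver_a": {"science"}, "resolver_b": {"math", "english"}}
assignments = route_by_expertise(items, resolvers)
```

In this sketch the history item has no qualified resolver and is held under `None`, so it could be queued until a suitable resolver is available.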
An example of an interface on the resolver display is shown in FIG. 14. The interface typically comprises a plurality of cells 74, with each cell containing one test item to be resolved. After displaying the multiple items in the cells of the resolver display, the system allows the resolver at step 72 to score the multiple items. A test resolver would typically indicate the score of the answers by using a “mouse,” light pen, touch screen, voice input, or some other type of cursor control or input device.
In the example shown in
After scoring or resolving, the system receives the results at step 73 for subsequent storage in work-in-process storage 55. A test resolver typically transmits the results of resolving all displayed test items in the cells as a single unit for batch processing.
In addition, the system may merge an image of a test item with the corresponding score. In order to facilitate teaching of material to which the test relates, the system typically merges a test item representing an incorrect answer with the corresponding score. By reporting the actual test item, an instructor may gain insight into a thought process used by the student in arriving at the incorrect answer. Therefore, by having some knowledge of why a student answered a test question incorrectly, an instructor can take measures to change or modify teaching strategies to correct the situation.
The categorized item reporting normally comprises the following functions. The system at step 75 scans the work-in-process storage for items that are ready to be reported. If test items are ready for reporting, as determined at step 76, the system processes the data at step 77 for generating an appropriate report of the data. At step 78, the system scans the central application repository for definitions of categorized (special) items. If special items are available for reporting, as determined at step 79, the system retrieves the special items at step 80 and can merge them at step 81 with other report information, such as the corresponding test items, as explained above. The system then distributes a report at step 82, which can be a printed report.
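The grouping and merging performed in this reporting flow can be sketched as follows. The field names (`category`, `score`, `image`) and the function `build_category_report` are hypothetical; the sketch only shows items being divided by predefined categories and each score being merged with a reference to its test item.

```python
def build_category_report(scored_items, categories):
    """Group scored items by predefined category and render report lines."""
    groups = {category: [] for category in categories}
    for item in scored_items:
        # Items outside the predefined categories are left out of the report.
        if item["category"] in groups:
            groups[item["category"]].append(item)

    lines = []
    for category, members in groups.items():
        lines.append(f"{category}: {len(members)} item(s)")
        for item in members:
            # Merge the score with a reference to the clipped answer image.
            lines.append(
                f"  item {item['id']}: score {item['score']}, image {item['image']}"
            )
    return groups, lines


ready_items = [
    {"id": 1, "category": "incorrect", "score": 0, "image": "item1.img"},
    {"id": 2, "category": "correct", "score": 1, "image": "item2.img"},
    {"id": 3, "category": "incorrect", "score": 0, "image": "item3.img"},
]
groups, report_lines = build_category_report(ready_items, ["incorrect", "correct"])
```

Listing the incorrect answers together with their images is what would let an instructor review the actual responses behind a category.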
The system at steps 83 and 84 determines if items are available for scoring. At step 85, the system receives collaborative scoring requirements from the database and determines at step 86 if collaborative scoring is required. Examples of collaborative scoring requirements are illustrated below. If collaborative scoring has been specified, the system retrieves the item to be scored from the work-in-process database at step 87 and sends the item to resolvers 1 and 2 at steps 88 and 91. At step 140, the system can prevent resolver 1 and resolver 2 from scoring the same test item if they have average scores within predefined scoring criteria.
The system is further able to choose resolvers according to selection criteria at steps 89 and 90. The selection criteria of the resolvers for scoring answers may include, for example, race, gender, or geographic location. The ability of the system to assign test resolvers to score particular test items provides the basis for increased fairness and consistency in the scoring of tests. For example, the system may assign test resolvers to test items based on the same racial classification, meaning that the test resolver has the same racial classification as the student or respondent whose test the resolver is scoring. The system may also assign test resolvers to test items based on a different, forced different, or preferred blend of classifications. The system monitors consistency in scoring based on the selection criteria and, more importantly, can change the selection criteria to ensure consistent and fair scoring of test items.
The system records the scores from resolvers 1 and 2 at steps 94 and 95, respectively, and stores such scores in a temporary storage 96. At step 97, the system compares the scores according to criteria specified in the central application repository. Such criteria may include, for example, requiring that the scores be within a predefined percentage of each other. If the scores meet the criteria as determined at step 98, the system records the score in the work-in-process database at step 46. Otherwise, if the scores do not meet the criteria, the system determines at step 99 if the scores of the resolvers must agree. If the first two resolvers' scores do not need to agree, then the system preferably transmits the test item to a third resolver to “cure” the discrepancy in the first two scores. At step 100, the system determines if the third resolver should see the first scores.
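The percentage criterion can be sketched as below. The text does not say which score the percentage is taken of, so this example assumes it is measured against the larger of the two scores; that choice, the name `scores_agree`, and the parameter `tolerance_pct` are all assumptions.

```python
def scores_agree(first, second, tolerance_pct):
    """True when two scores lie within tolerance_pct percent of each other.

    Assumption: the percentage is taken relative to the larger score.
    """
    larger = max(abs(first), abs(second))
    # Cross-multiplied form avoids division, so two zero scores also agree.
    return abs(first - second) * 100 <= tolerance_pct * larger


close_enough = scores_agree(4, 5, tolerance_pct=20)   # difference is 20% of 5
too_far = scores_agree(3, 5, tolerance_pct=20)        # difference is 40% of 5
```

When the check fails, the item would proceed to the discrepancy handling described above, for example referral to a third resolver.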
Instead of allowing the resolvers to work together to record an agreed-upon score, the system may optionally record either a greater value of the first and second test scores, a lower value of the first and second test scores, or an average value of the first and second test scores.
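The three alternatives named here can be sketched as a small policy function. The policy labels `"greater"`, `"lower"`, and `"average"` and the name `fallback_score` are hypothetical and chosen only for this illustration.

```python
def fallback_score(first, second, policy):
    """Pick the recorded score from two resolver scores under a given policy."""
    if policy == "greater":
        return max(first, second)
    if policy == "lower":
        return min(first, second)
    if policy == "average":
        return (first + second) / 2
    raise ValueError(f"unknown policy: {policy!r}")


recorded = fallback_score(3, 5, "average")
```

The policy itself would presumably be read from the stored collaborative scoring requirements rather than hard-coded.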
If the collaborative scoring criteria specify that the third resolver should arbitrate the discrepancy and determine a score, then the system displays scores from the resolvers 1 and 2 at step 106 for resolver 3. The third resolver (resolver 3) then typically enters a score for the test item at step 107, and the system records the score in the work-in-process database at step 108.
If the collaborative scoring requirement specifies that the third resolver should not see the first two scores, then the system executes steps 109-111. At step 109, the system displays the test item for the third resolver. The third resolver then typically enters a score at step 110, and the system records the score in the work-in-process database at step 111.
The system then waits for a scheduled quality check at step 113. At the quality check, the system, at step 114, sends the known quality item to the scheduled resolver. At step 116, the system updates the resolver's quality profile based on the evaluation at step 115. If the resolver should receive a quality result, as determined at step 117, the system displays the quality profile to the resolver at step 118. At step 119, the system sends the quality profile to a manager for subsequent review. At step 120, the system takes action required to assure scoring accuracy.
Validity is typically measured by determining if a particular resolver is applying the scoring key correctly to test items or, in other words, scoring test items as an expert would score the same items. Reliability is typically measured by determining if a particular resolver will resolve the same test item the same way over time (providing consistent scoring). Speed is typically measured by comparing a resolver's scoring rate with past scoring rates of the resolver or other scoring rates, such as average scoring rates or benchmark scoring rates.
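Validity and reliability can each be expressed as a simple proportion, as in the sketch below. Exact-match agreement, the dictionary layouts, and the function names are assumptions; a real deployment might instead allow a tolerance or use a statistical agreement measure.

```python
def validity(resolver_scores, expert_scores):
    """Share of quality items on which the resolver matched the expert score."""
    shared = [item for item in resolver_scores if item in expert_scores]
    if not shared:
        return None  # nothing to compare yet
    hits = sum(1 for item in shared if resolver_scores[item] == expert_scores[item])
    return hits / len(shared)


def reliability(repeat_scores):
    """Share of repeated items that the resolver scored identically each time.

    `repeat_scores` maps an item id to the scores one resolver gave it over time.
    """
    repeated = [scores for scores in repeat_scores.values() if len(scores) > 1]
    if not repeated:
        return None  # no item has been scored more than once
    consistent = sum(1 for scores in repeated if len(set(scores)) == 1)
    return consistent / len(repeated)


resolver = {"q1": 4, "q2": 3, "q3": 5, "q4": 2}
expert = {"q1": 4, "q2": 2, "q3": 5, "q4": 2}
validity_rate = validity(resolver, expert)                              # 3 of 4 match
reliability_rate = reliability({"a": [3, 3], "b": [4, 5], "c": [2]})    # 1 of 2 repeated
```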
At step 121, the system typically monitors the resolver's performance continually and updates the stored performance values. Monitoring the resolver's performance may include, as explained above, monitoring the resolver's validity, reliability, and speed in resolving test items. The system periodically, according to predefined criteria, performs performance checks of the test resolvers. Predefined criteria may include, for example: a time period; recalls (how often a resolver evaluates his or her own work); requesting help; the number of agreements among multiple resolvers; the amount of deviation between the resolver's score and a known score, which may be determined using quality items; the frequency of these deviations; the speed at which a resolver enters a response during resolving of test items; the length of time between scores entered by a test resolver; a test resolver's previous scoring rate; an average scoring rate of a test resolver; average scoring rates of other test resolvers; or some predetermined benchmark scoring rate.
At step 122, the system determines whether it is time for a scheduled performance check according to the predetermined criteria. If it is time for a performance check, the system at step 123 compares the resolvers' current performance, as determined at step 121, with the stored performance criteria. At step 124, the system determines if there is a discrepancy in the resolver's performance according to the predetermined criteria. For example, the system may determine if the resolver's current scoring rate is within a predefined percentage of the average scoring rate in order to ensure efficient scoring by the test resolver. If there is no discrepancy, the system returns to monitoring the resolver's performance. In addition, the system may store the resolver's current performance values for later processing. Otherwise, the system reports the discrepancy at step 125.
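The scoring-rate example can be sketched as a single comparison. The sketch assumes rates are expressed as items per hour and flags deviations in either direction, since both excessively high and excessively low rates are of concern; the name `rate_discrepancy` and its parameters are hypothetical.

```python
def rate_discrepancy(current_rate, average_rate, tolerance_pct):
    """True when current_rate deviates from average_rate by more than
    tolerance_pct percent of the average, in either direction."""
    return abs(current_rate - average_rate) * 100 > tolerance_pct * average_rate


# Hypothetical rates in items per hour, checked against a 20% tolerance.
slow_flagged = rate_discrepancy(30, 40, tolerance_pct=20)   # 25% below average
normal_flagged = rate_discrepancy(36, 40, tolerance_pct=20) # 10% below average
```

A flagged result corresponds to the discrepancy that would be reported at step 125.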
At step 126, the system determines if it should recommend a break in scoring to the resolver. If, according to predetermined performance criteria, the system should recommend a break in scoring, then the system signals the resolver at step 128 to halt scoring. Predefined performance criteria may include, for example, deviations in the resolver's validity, reliability, or speed of resolving test items. Examples of predefined performance criteria are provided above with respect to the monitoring of resolvers' performance.
When the resolver stops scoring, the system may provide the resolver with the option of requesting diversionary activities. Diversionary activities are designed to provide the test resolver with a rest period and “break” from scoring to increase efficiency. Examples of diversionary activities include computer games and crossword puzzles. If the resolver has requested such diversionary activities, as determined at step 129, then the system transmits a diversionary activity to the resolver at step 130. Otherwise, the system returns to monitoring the resolver's scoring rate when the resolver resumes the scoring.
If the system at step 126 does not recommend a break in scoring based on the discrepancy, then the system may optionally provide the resolver with diversionary activities as determined at step 127. If the resolver should receive the diversionary activities, then the system sends such activities to the resolver at step 130. Otherwise the system returns to monitoring the resolver's scoring rate.
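The branching at steps 126 through 130 can be summarized in a short sketch. The boolean inputs and the action labels returned by `next_actions` are hypothetical names introduced for illustration; they stand in for the determinations the system makes at each step.

```python
def next_actions(recommend_break, resolver_requested_diversion, offer_diversion_anyway):
    """Return the ordered actions taken after a performance discrepancy."""
    if recommend_break:
        # Step 128: signal the resolver to halt scoring.
        actions = ["halt_scoring"]
        # Step 129: send an activity only if the resolver asked for one.
        if resolver_requested_diversion:
            actions.append("send_diversionary_activity")  # step 130
        else:
            actions.append("resume_monitoring")
        return actions
    # Step 127: no break recommended, but an activity may still be provided.
    if offer_diversion_anyway:
        return ["send_diversionary_activity"]  # step 130
    return ["resume_monitoring"]


after_break = next_actions(True, True, False)
no_break = next_actions(False, False, False)
```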
At step 131, the system sends a test item to a resolver for scoring and displays the test item at step 132. If the resolver has requested scoring rules, as determined at step 133, then the system interrogates a stored scoring guide to locate scoring rules that correspond to a test question for the test item currently displayed to the resolver. The system retrieves those particular scoring rules at step 135 and displays them to the resolver at step 136. The system preferably uses a multi-tasking environment in order to simultaneously display the scoring rules and the test item. At step 134, the system waits for the resolver to score the test item. At step 137, the system stores the test score entered by the resolver into the work-in-process storage.
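Retrieval of scoring rules for the displayed item can be sketched as a keyed lookup. The scoring guide contents, the `question_id` field, and the function `rules_for` are invented for this example and do not reflect any particular stored format.

```python
# Hypothetical scoring guide: question id -> list of scoring rules.
SCORING_GUIDE = {
    "essay-1": [
        "4: thesis stated and supported with two examples",
        "2: thesis stated without support",
        "0: off topic or blank",
    ],
    "short-3": ["1: names the correct element", "0: any other answer"],
}


def rules_for(test_item, scoring_guide):
    """Return the scoring rules for the question behind a displayed test item.

    An empty list is returned when the guide has no entry for the question.
    """
    return scoring_guide.get(test_item["question_id"], [])


displayed_item = {"item_id": 17, "question_id": "short-3"}
rules = rules_for(displayed_item, SCORING_GUIDE)
```

The returned rules would then be shown next to the test item while the system waits for the resolver's score.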
As described above, the present invention is a system that processes test items. The various functions used in processing the test items promote efficient, high quality, and consistent scoring of test items.
While the present invention has been described in connection with the preferred embodiment thereof, it will be understood that many modifications will be readily apparent to those skilled in the art, and this application is intended to cover any adaptations or variations thereof. For example, a different hardware configuration may be used without departing from the scope of the invention and many variations of the processes described may be used. It is manifestly intended that this invention be limited only by the claims and equivalents thereof.
This is a continuation of application Ser. No. 10/425,775 now U.S. Pat. No. 6,749,435, filed Apr. 29, 2003, which is a continuation of application Ser. No. 09/660,204, filed Sep. 12, 2000, now U.S. Pat. No. 6,558,166 B1, which is a continuation of application Ser. No. 09/141,804, filed on Aug. 28, 1998, now U.S. Pat. No. 6,168,440 B1, which is a continuation of application Ser. No. 09/003,979, filed on Jan. 7, 1998, now abandoned, which is a continuation of application Ser. No. 08/561,081, filed Nov. 20, 1995, now U.S. Pat. No. 5,735,694, which is a continuation of application Ser. No. 08/290,014, filed Aug. 12, 1994, now U.S. Pat. No. 5,558,521, which is a division of application Ser. No. 08/014,176, filed Feb. 5, 1993, now U.S. Pat. No. 5,437,554; and application Ser. No. 09/143,682, filed on Aug. 28, 1998, now U.S. Pat. No. 6,159,018, which is a continuation of application Ser. No. 09/003,979, filed on Jan. 7, 1998, now abandoned, which is a continuation of application Ser. No. 08/561,081, filed on Nov. 20, 1995, now U.S. Pat. No. 5,735,694, which is a continuation of application Ser. No. 08/290,014, filed Aug. 12, 1994, now U.S. Pat. No. 5,558,521, which is a division of application Ser. No. 08/014,176, filed Feb. 5, 1993, now U.S. Pat. No. 5,437,554, are hereby incorporated by reference in their entirety.
Number | Name | Date | Kind |
---|---|---|---|
3405457 | Bitzer | Oct 1968 | A |
3538626 | Frank | Nov 1970 | A |
3762072 | From | Oct 1973 | A |
3932948 | Goddard et al. | Jan 1976 | A |
4004354 | Yamauchi | Jan 1977 | A |
4151659 | Lien et al. | May 1979 | A |
4205780 | Burns et al. | Jun 1980 | A |
4478584 | Kaney | Oct 1984 | A |
4518267 | Hepp | May 1985 | A |
4518361 | Conway | May 1985 | A |
4553261 | Froessl | Nov 1985 | A |
4627818 | Von Fellenberg | Dec 1986 | A |
4648062 | Johnson et al. | Mar 1987 | A |
4671772 | Slade et al. | Jun 1987 | A |
4694352 | Ina et al. | Sep 1987 | A |
4705479 | Maron | Nov 1987 | A |
4708503 | Poor | Nov 1987 | A |
4715818 | Shapiro et al. | Dec 1987 | A |
4741047 | Sharpe, II | Apr 1988 | A |
4760246 | Shepard | Jul 1988 | A |
4764120 | Griffin et al. | Aug 1988 | A |
4785472 | Shapiro | Nov 1988 | A |
4789543 | Linder | Dec 1988 | A |
4798543 | Spiece | Jan 1989 | A |
4845739 | Katz | Jul 1989 | A |
4867685 | Brush et al. | Sep 1989 | A |
4878175 | Norden-Paul et al. | Oct 1989 | A |
4895518 | Arnold et al. | Jan 1990 | A |
4908759 | Alexander, Jr. et al. | Mar 1990 | A |
4930077 | Fan | May 1990 | A |
4937439 | Wanninger et al. | Jun 1990 | A |
4958284 | Bishop et al. | Sep 1990 | A |
4978305 | Kraft | Dec 1990 | A |
4996642 | Hey | Feb 1991 | A |
5002491 | Abrahamson et al. | Mar 1991 | A |
5003613 | Lovelady et al. | Mar 1991 | A |
5011413 | Ferris et al. | Apr 1991 | A |
5023435 | Deniger | Jun 1991 | A |
5035625 | Munson et al. | Jul 1991 | A |
5038392 | Morris et al. | Aug 1991 | A |
5054096 | Beizer | Oct 1991 | A |
5058185 | Morris et al. | Oct 1991 | A |
5059127 | Lewis et al. | Oct 1991 | A |
5072383 | Brimm et al. | Dec 1991 | A |
5086385 | Launey et al. | Feb 1992 | A |
5100329 | Deesen et al. | Mar 1992 | A |
5101447 | Sokoloff et al. | Mar 1992 | A |
5103490 | McMillin | Apr 1992 | A |
5105354 | Nishimura | Apr 1992 | A |
5119433 | Will | Jun 1992 | A |
5134669 | Keogh et al. | Jul 1992 | A |
5140650 | Casey et al. | Aug 1992 | A |
5147205 | Gross et al. | Sep 1992 | A |
5151948 | Lyke et al. | Sep 1992 | A |
5176520 | Hamilton | Jan 1993 | A |
5180309 | Egnor | Jan 1993 | A |
5195033 | Samph et al. | Mar 1993 | A |
5204813 | Samph et al. | Apr 1993 | A |
5211564 | Martinez et al. | May 1993 | A |
5258855 | Lech et al. | Nov 1993 | A |
5259766 | Sack et al. | Nov 1993 | A |
5261823 | Kurokawa | Nov 1993 | A |
RE34476 | Norwood | Dec 1993 | E |
5267865 | Lee et al. | Dec 1993 | A |
5294229 | Hartzell et al. | Mar 1994 | A |
5302132 | Corder | Apr 1994 | A |
5310349 | Daniels et al. | May 1994 | A |
5318450 | Carver | Jun 1994 | A |
5321611 | Clark et al. | Jun 1994 | A |
5344132 | LeBrun et al. | Sep 1994 | A |
5376007 | Zirm | Dec 1994 | A |
5379213 | Derks | Jan 1995 | A |
5387104 | Corder | Feb 1995 | A |
5418865 | Bloomberg | May 1995 | A |
5433615 | Clark | Jul 1995 | A |
5437554 | Clark et al. | Aug 1995 | A |
5437555 | Ziv-El | Aug 1995 | A |
5452379 | Poor | Sep 1995 | A |
5458493 | Clark et al. | Oct 1995 | A |
5466159 | Clark et al. | Nov 1995 | A |
5496175 | Oyama et al. | Mar 1996 | A |
5544255 | Smithies et al. | Aug 1996 | A |
5558521 | Clark et al. | Sep 1996 | A |
5565316 | Kershaw et al. | Oct 1996 | A |
5596698 | Morgan | Jan 1997 | A |
5634101 | Blau | May 1997 | A |
5647017 | Smithies et al. | Jul 1997 | A |
5672060 | Poor | Sep 1997 | A |
5690497 | Clark et al. | Nov 1997 | A |
5691895 | Kurtzberg et al. | Nov 1997 | A |
5709551 | Clark et al. | Jan 1998 | A |
5716213 | Clark et al. | Feb 1998 | A |
5718591 | Clark et al. | Feb 1998 | A |
5735694 | Clark et al. | Apr 1998 | A |
5752836 | Clark et al. | May 1998 | A |
5987149 | Poor | Nov 1999 | A |
5987302 | Driscoll et al. | Nov 1999 | A |
5991595 | Romano et al. | Nov 1999 | A |
5999908 | Abelow | Dec 1999 | A |
6155839 | Clark et al. | Dec 2000 | A |
6157921 | Barnhill | Dec 2000 | A |
6159018 | Clark et al. | Dec 2000 | A |
6168440 | Clark et al. | Jan 2001 | B1 |
6183261 | Clark et al. | Feb 2001 | B1 |
6193521 | Clark et al. | Feb 2001 | B1 |
6558166 | Clark et al. | May 2003 | B1 |
Number | Date | Country |
---|---|---|
0 171 663 | Feb 1986 | EP |
2 274 932 | Aug 1994 | GB |
56-10634 | Feb 1981 | JP |
62-75578 | Apr 1987 | JP |
3-1709 | Jan 1991 | JP |
4-147288 | May 1992 | JP |
5-74825 | Oct 1993 | JP |
WO 9005970 | May 1990 | WO |
WO 9906930 | Feb 1999 | WO |
Number | Date | Country |
---|---|---|
20040086841 A1 | May 2004 | US |
Relation | Number | Date | Country |
---|---|---|---|
Parent | 08014176 | Feb 1993 | US |
Child | 08290014 | | US |
Relation | Number | Date | Country |
---|---|---|---|
Parent | 10425775 | Apr 2003 | US |
Child | 10690335 | | US |
Parent | 09660204 | Sep 2000 | US |
Child | 10425775 | | US |
Parent | 09141804 | Aug 1998 | US |
Child | 09660204 | | US |
Parent | 09143682 | Aug 1998 | US |
Child | 09141804 | | US |
Parent | 09003979 | Jan 1998 | US |
Child | 09143682 | | US |
Parent | 08561081 | Nov 1995 | US |
Child | 09003979 | | US |
Parent | 08290014 | Aug 1994 | US |
Child | 08561081 | | US |