Claims
- 1. A method for automated analysis of an essay, the method comprising:
accepting an essay; determining whether each of a predetermined set of features is present or absent in each sentence of the essay; for each sentence in the essay, calculating a probability that the sentence is a member of a certain discourse element category, wherein the probability is based on the determinations of whether each feature in the set of features is present or absent; and choosing a sentence as the choice for the discourse element category, based on the calculated probabilities.
- 2. The method of claim 1 wherein the discourse element category is thesis statement.
- 3. The method of claim 1 wherein the essay is in an electronic form.
- 4. The method of claim 3 wherein the essay is an ASCII file.
- 5. The method of claim 1 wherein the accepting step comprises:
scanning a paper form of the essay; and performing optical character recognition on the scanned paper essay.
- 6. The method of claim 1 wherein the predetermined set of features comprises:
a feature based on position within the essay.
- 7. The method of claim 1 wherein the predetermined set of features comprises:
a feature based on presence or absence of certain words.
- 8. The method of claim 7 wherein the certain words comprise words empirically associated with thesis statements.
- 9. The method of claim 7 wherein the certain words comprise words of belief.
- 10. The method of claim 1 wherein the predetermined set of features comprises:
a feature based on rhetorical relation.
- 11. The method of claim 10 wherein the determining step comprises:
parsing the essay using a rhetorical structure parser.
- 12. The method of claim 1 wherein the calculating step comprises:
utilizing a multivariate Bernoulli model.
- 13. The method of claim 12 wherein the calculating step calculates the following quantity for each sentence:
- 14. The method of claim 13 wherein the choosing step comprises:
choosing the sentence for which the quantity is the largest.
- 15. The method of claim 1 wherein the calculating step comprises:
utilizing a Laplace estimator.
- 16. The method of claim 1 further comprising:
providing an essay question, the essay being an answer to the essay question.
- 17. The method of claim 1 further comprising:
repeating the calculating and choosing steps for one or more different discourse element categories.
- 18. The method of claim 1 further comprising:
outputting the choice.
- 19. The method of claim 1 further comprising:
outputting a revision checklist.
- 20. A process of training an automated essay analysis method, the process comprising:
accepting a plurality of essays; accepting manual annotations demarking discourse elements in each of the plurality of essays; accepting a set of features that purportedly correlate with whether a sentence in an essay is a particular type of discourse element; calculating empirical probabilities relating to the frequency of the features; and calculating empirical probabilities relating features in the set of features to discourse elements.
- 21. The process of claim 20 further comprising:
performing the method of claim 1 on each of the plurality of essays; judging the performance of the method of claim 1 as compared to the manual annotations; and if the performance of the method of claim 1 is inadequate, modifying the set of features and repeating the method of claim 1.
- 22. A computer readable medium on which is embedded a computer program, the computer program performing the method of claim 1.
- 23. A computer readable medium on which is embedded a computer program, the computer program performing the process of claim 20.
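By way of illustration only, the training process of claim 20 and the calculating and choosing steps of claims 1, 12, 14 and 15 might be sketched as follows. The sketch assumes a conventional multivariate Bernoulli log score, log P(T) plus, for each feature, log[P(A|T)/P(A)] when the feature is present and log[(1 - P(A|T))/(1 - P(A))] when it is absent; this form is an assumption and is not taken from the claims. The function names (`train`, `log_score`, `choose_sentence`), the feature names and the toy data are all hypothetical, and feature determination (for example by a rhetorical structure parser) is assumed to have been done already.

```python
import math
from collections import Counter


def train(sentences, features):
    """Estimate probabilities from manually annotated sentences.

    sentences: list of (present_features, is_member) pairs, where
    present_features is a set of feature names and is_member says whether
    the annotator marked the sentence as the discourse element category.
    """
    n = len(sentences)
    n_t = sum(1 for _, is_member in sentences if is_member)
    count_all = Counter()  # sentences containing each feature
    count_t = Counter()    # category sentences containing each feature
    for present, is_member in sentences:
        for f in features:
            if f in present:
                count_all[f] += 1
                if is_member:
                    count_t[f] += 1
    # Laplace (add-one) estimates: each Bernoulli variable has two outcomes,
    # so 1 is added to the count and 2 to the total (an assumed convention).
    return {
        "prior": (n_t + 1) / (n + 2),
        "p_f_given_t": {f: (count_t[f] + 1) / (n_t + 2) for f in features},
        "p_f": {f: (count_all[f] + 1) / (n + 2) for f in features},
    }


def log_score(present, model):
    """Assumed multivariate Bernoulli log score for one sentence."""
    score = math.log(model["prior"])
    for f, p_ft in model["p_f_given_t"].items():
        p_f = model["p_f"][f]
        if f in present:
            score += math.log(p_ft / p_f)
        else:
            score += math.log((1 - p_ft) / (1 - p_f))
    return score


def choose_sentence(essay, model):
    """essay: list of feature sets, one per sentence. Returns the index of
    the sentence with the largest score."""
    scores = [log_score(present, model) for present in essay]
    return max(range(len(essay)), key=scores.__getitem__)


# Toy, hand-made data (hypothetical feature names and annotations).
FEATURES = ["first_paragraph", "belief_word"]
TRAINING = [
    ({"first_paragraph", "belief_word"}, True),
    ({"first_paragraph"}, False),
    (set(), False),
    ({"belief_word"}, True),
    (set(), False),
    (set(), False),
]
model = train(TRAINING, FEATURES)
essay = [set(), {"belief_word"}, {"first_paragraph", "belief_word"}]
choice = choose_sentence(essay, model)  # index 2 for this toy data
```

Working in log space keeps the per-feature terms additive and avoids underflow when the feature set is large; the smoothed estimates always lie strictly between 0 and 1, so every logarithm is defined. Repeating `train` and `choose_sentence` with annotations for a different category corresponds to the repetition recited in claim 17.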
Parent Case Info
[0001] This application claims priority to U.S. Provisional Patent Application No. 60/263,223, filed Jan. 23, 2001, which is incorporated herein by reference.
Provisional Applications (1)
| Number | Date | Country |
| 60263223 | Jan 2001 | US |