Claims
- 1. A method of grammar checking an input string of text, the method comprising:
generating alternative strings of text; generating separate stochastic parse scores for the input string of text and for each of the alternative strings of text; determining which one string of text, out of the input string of text and the alternative strings of text, has the highest stochastic parse score; and selecting the one string of text which has the highest stochastic parse score.
- 2. The method of claim 1, wherein generating alternative strings of text further comprises generating the alternative strings of text such that each of the alternative strings of text corresponds to a different possible grammatical correction of the input string of text.
- 3. The method of claim 2, and further comprising recommending to a user the selected one string of text which has the highest stochastic parse score.
- 4. The method of claim 3, wherein if the selected one string of text is the input string of text, then recommending to the user the selected one string of text further comprises not recommending a grammar correction for the input string of text.
- 5. The method of claim 1, wherein generating separate stochastic parse scores for the input string of text and for each of the alternative strings of text further comprises:
producing parse trees for the input string of text and for each of the alternative strings of text; and calculating separate parse scores for each of the parse trees generated for the input string of text and for each of the alternative strings of text.
- 6. The method of claim 5, wherein calculating separate parse scores for each of the parse trees further comprises generating a statistical goodness measure for each of the parse trees.
- 7. The method of claim 6, wherein the statistical goodness measure for each of the parse trees is an indicator of a likelihood that the particular parse tree represents the intended meaning of a human originating the text corresponding to the input string of text.
- 8. The method of claim 7, wherein calculating the separate parse scores for each of the parse trees further comprises calculating the parse scores as functions of probabilities determined using a training corpus.
- 9. The method of claim 6, wherein generating a statistical goodness measure for each of the parse trees comprises combining probabilities of each node in a particular parse tree, wherein the probabilities of each node are determined by the steps comprising:
receiving language-usage probabilities based upon appearances of instances of combinations of linguistic features within a training corpus; and calculating the probabilities of each node based upon linguistic features of each node and the language-usage probabilities.
- 10. A computer-readable medium having computer-executable instructions for performing the grammar checking steps comprising:
receiving an input string of text; generating alternative strings of text; generating separate stochastic parse scores for the input string of text and for each of the alternative strings of text; determining which one string of text, out of the input string of text and the alternative strings of text, has the highest stochastic parse score; and selecting the one string of text which has the highest stochastic parse score.
- 11. The computer readable medium of claim 10, wherein generating alternative strings of text further comprises generating the alternative strings of text such that each of the alternative strings of text corresponds to a different possible grammatical correction of the input string of text.
- 12. The computer readable medium of claim 11, and further having computer-executable instructions for performing the grammar checking step of recommending to a user the selected one string of text which has the highest stochastic parse score.
- 13. The computer readable medium of claim 12, wherein if the selected one string of text is the input string of text, then recommending to the user the selected one string of text further comprises not recommending a grammar correction for the input string of text.
- 14. The computer readable medium of claim 10, wherein generating separate stochastic parse scores for the input string of text and for each of the alternative strings of text further comprises the steps of:
producing parse trees for the input string of text and for each of the alternative strings of text; and calculating separate parse scores for each of the parse trees generated for the input string of text and for each of the alternative strings of text.
- 15. The computer readable medium of claim 14, wherein calculating separate parse scores for each of the parse trees further comprises generating a statistical goodness measure for each of the parse trees.
- 16. The computer readable medium of claim 15, wherein the statistical goodness measure for each of the parse trees is an indicator of a likelihood that the particular parse tree represents the intended meaning of a human originating the text corresponding to the input string of text.
- 17. The computer readable medium of claim 16, wherein calculating the separate parse scores for each of the parse trees further comprises calculating the parse scores as functions of probabilities determined using a training corpus.
- 18. The computer readable medium of claim 17, wherein generating a statistical goodness measure for each of the parse trees comprises combining probabilities of each node in a particular parse tree, wherein the probabilities of each node are determined by the steps comprising:
receiving language-usage probabilities based upon appearances of instances of combinations of linguistic features within the training corpus; and calculating the probabilities of each node based upon linguistic features of each node and the language-usage probabilities.
- 19. A grammar checking system comprising:
an alternative generator configured to receive an input string of text, and in response, to generate alternative strings of text corresponding to different possible grammatical corrections of the input string of text; a parse tree producer configured to generate parse trees for the input string of text and for each of the alternative strings of text; a stochastic score generator configured to receive the parse trees for the input string of text and for each of the alternative strings of text and to generate separate parse scores for each of the strings of text from the corresponding parse tree; and a string selector configured to determine which string of text, out of the input string of text and the alternative strings of text, has a highest parse score and to select the string of text having the highest parse score.
- 20. The grammar checking system of claim 19, wherein the stochastic score generator is configured to generate the separate parse scores for each of the strings of text by generating a separate statistical goodness measure for each of the strings of text using the corresponding parse tree.
- 21. The grammar checking system of claim 20, wherein the statistical goodness measure for each of the parse trees is an indicator of a likelihood that the particular parse tree represents the intended meaning of a human originating the text corresponding to the input string of text.
- 22. The grammar checking system of claim 21, wherein the stochastic score generator is configured to generate the separate statistical goodness measure for each of the strings of text as functions of probabilities determined using a training corpus.
- 23. The grammar checking system of claim 22, wherein the stochastic score generator is configured to generate the separate statistical goodness measure for each of the strings of text by combining probabilities of each node in the corresponding parse tree, wherein the probabilities of each node are determined by the steps comprising:
receiving language-usage probabilities based upon appearances of instances of combinations of linguistic features within the training corpus; and calculating the probabilities of each node based upon linguistic features of each node and the language-usage probabilities.
- 24. The grammar checking system of claim 21, and further comprising a grammar checker including the alternative generator and the string selector, the grammar checker further comprising:
string storage coupled to the alternative generator and to the stochastic score generator, and configured to store the input string of text and each of the alternative strings of text, and configured to store the statistical goodness measure generated for each string of text; and a parceler coupled to the string storage and to the parse tree producer, and configured to call the parse tree producer and the stochastic score generator in order to produce a parse tree and a statistical goodness measure for each string of text.
- 25. The grammar checking system of claim 24, and further comprising a stochastic ranking parser including the parse tree producer and the stochastic score generator.
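By way of illustration only, the steps recited in claims 1 to 9 can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the names `Node`, `goodness`, `select_best`, `recommend`, `toy_parse`, and `toy_alternatives` are hypothetical, the node probabilities are invented stand-ins for values that would be derived from a training corpus, and summing log probabilities is only one possible way of combining the probabilities of each node.

```python
import math
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class Node:
    """A parse-tree node. `prob` is assumed to come from a training corpus."""
    label: str
    prob: float
    children: List["Node"] = field(default_factory=list)


def goodness(tree: Node) -> float:
    """Combine the probabilities of every node by summing their logs (an assumption)."""
    return math.log(tree.prob) + sum(goodness(child) for child in tree.children)


def select_best(
    input_text: str,
    generate_alternatives: Callable[[str], List[str]],
    parse: Callable[[str], Node],
) -> Tuple[str, float]:
    """Score the input and each alternative; return the highest-scoring string."""
    # The input string comes first, so it wins any tie with an alternative.
    candidates = [input_text] + [
        alt for alt in generate_alternatives(input_text) if alt != input_text
    ]
    scored = [(goodness(parse(text)), text) for text in candidates]
    best_score, best_text = max(scored, key=lambda pair: pair[0])
    return best_text, best_score


def recommend(input_text: str, best_text: str) -> Optional[str]:
    """Return a correction, or None when the input itself scored highest."""
    return None if best_text == input_text else best_text


# Hypothetical toy data standing in for a real parser and training corpus.
TOY_TREES = {
    "He go home": Node("S", 0.9, [Node("NP", 0.8), Node("VP", 0.05)]),
    "He goes home": Node("S", 0.9, [Node("NP", 0.8), Node("VP", 0.6)]),
    "He going home": Node("S", 0.9, [Node("NP", 0.8), Node("VP", 0.1)]),
}


def toy_parse(text: str) -> Node:
    """Look up a pre-built parse tree for the given string."""
    return TOY_TREES[text]


def toy_alternatives(text: str) -> List[str]:
    """Offer every other known string as a possible grammatical correction."""
    return [candidate for candidate in TOY_TREES if candidate != text]


if __name__ == "__main__":
    best, score = select_best("He go home", toy_alternatives, toy_parse)
    print(recommend("He go home", best), round(score, 3))
```

In this sketch the input string is scored alongside its alternatives, so a well-formed input can score highest, in which case `recommend` returns `None` and no correction is suggested, consistent with claim 4.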
CROSS-REFERENCE TO RELATED APPLICATION
[0001] Cross reference is made to U.S. patent application Ser. No. 09/620,745, entitled “RANKING PARSER FOR A NATURAL LANGUAGE PROCESSING SYSTEM,” filed Jul. 20, 2000.