Claims
- 1. A method of correcting a misspelled word in input text, the method comprising steps of:
detecting a misspelled word in the input text, wherein the detecting comprises comparing each word in the input text to a dictionary database and characterizing a word as misspelled when the word does not match any words in the dictionary database; determining a list of alternative words for the misspelled word; and ranking the list of alternative words based on a context of the input text, wherein the alternative words yield correct parts of speech sequences according to the context.
- 2. The method of claim 1, wherein the list of alternative words includes a best alternative word, the method further comprising the step of:
replacing the misspelled word with the best alternative word.
- 3. The method of claim 1, further comprising the step of:
determining whether an alternative word is an element of a compound word or lexical phrase; and wherein the ranking step comprises modifying the rank of one or more of the alternative words if the alternative word is an element of a compound word or lexical phrase.
- 4. A method according to claim 1, 2, or 3, wherein the ranking step comprises:
generating a finite state machine (“FSM”) for the input text, the FSM having a plurality of arcs which include the alternative words and weights associated therewith, where a weight of each alternative word corresponds to a likelihood that the alternative word, taken out of grammatical context, comprises a correctly-spelled version of the misspelled word; and modifying the FSM in accordance with one or more of a plurality of predetermined grammatical rules.
- 5. A method according to claim 1, 2 or 3, wherein the ranking step comprises:
generating a first finite state machine (“FSM”) for the input text, the first FSM having a plurality of arcs which include the alternative words and weights associated therewith, where a weight of each alternative word corresponds to a likelihood that the alternative word, taken out of grammatical context, comprises a correctly-spelled version of the misspelled word; applying a second FSM to the first FSM, wherein the second FSM encodes a set of grammatically correct sequences of words.
- 6. The method of claim 5, wherein the applying step comprises:
generating a third FSM.
- 7. The method of claim 6, wherein the applying step comprises modifying the first FSM.
- 8. A method of retrieving text from a source, the method comprising the steps of:
inputting a search word; correcting a spelling of the search word by comparing the search word to a dictionary database and characterizing the search word as a misspelled search word when the search word does not match any words in the dictionary database; determining a list of alternative words including a best alternative word for the misspelled search word; ranking the list of alternative words based on a context of the input text; replacing the misspelled search word with the best alternative word; and retrieving text from the source that includes the corrected search word.
- 9. The method of claim 8, wherein the source comprises a pre-stored database.
- 10. The method of claim 8, wherein the source comprises a remote network location; and wherein the method further comprises the step of displaying the retrieved text on a local display screen.
- 11. The method of claim 8, wherein the correcting step further comprises the step of:
displaying one or more corrected search words and manually selecting one of plural corrected search words.
- 12. The method of claim 8, wherein the correcting step further comprises the step of:
automatically selecting one of plural corrected search words.
- 13. A method of retrieving text from a source, the method comprising the steps of:
inputting a search phrase comprised of a plurality of words, at least one of the plurality of words being a misspelled word; replacing the misspelled word in the search phrase with a corrected word in order to produce a corrected search phrase by
comparing each word in the search phrase to a dictionary database and characterizing a word as a misspelled word when the word does not match any words in the dictionary database, determining a list of alternative words including a best alternative word for the misspelled word, ranking the list of alternative words based on a context of the input text, and replacing the misspelled word with the best alternative word; and retrieving text from the source based on the corrected search phrase.
- 14. The method of claim 13, wherein the source comprises a pre-stored database.
- 15. The method of claim 13, wherein the source comprises a remote network location; and wherein the method further comprises the step of displaying the retrieved text on a local display screen.
- 16. The method of claim 13, wherein the correcting step comprises displaying one or more corrected search words and manually selecting one of plural corrected search words.
- 17. The method of claim 13, wherein the correcting step comprises automatically selecting one of plural corrected search words.
- 18. A method of correcting misspelled words in input text sequences received from a plurality of different clients, the method comprising the steps of:
storing, in a memory on a server, a single shared lexicon comprised of a plurality of reference words; receiving the input text sequences from the plurality of different clients; spell-checking the input text sequences using the reference words in the single shared lexicon; and outputting spell-checked text sequences to the plurality of different clients.
- 19. The method of claim 18,
wherein the single shared lexicon comprises one or more lexicon finite state machines (“FSM”), each of the lexicon FSMs representing plural reference words, wherein a representation of a reference word comprises one or more states and one or more arcs, each arc comprising a character in the reference word; and wherein the spell-checking step comprises a correcting step for correcting misspelled words in each of the input text sequences substantially in parallel using the single shared lexicon comprised of one or more lexicon FSMs.
- 20. The method of claim 19, wherein, for each text sequence, the correcting step comprises:
generating an additional FSM comprising a plurality of states, each state including information identifying a state of a lexicon FSM and a position in the input word and a cost, wherein the cost is used to select states of the additional FSM that are to be expanded; selecting one or more reference words from the lexicon FSMs based on the additional FSM; and replacing the misspelled word in the text sequence with a selected one of the one or more reference words.
- 21. The method of claim 19, further comprising the step of:
generating an input FSM for a misspelled word in the text sequence, wherein the input FSM comprises one or more states and one or more arcs, each arc comprising a character in the reference word.
- 22. The method of claim 19, further comprising the step of:
generating an input FSM for a misspelled word in the text sequence, wherein the input FSM comprises one or more states and one or more arcs, each arc comprising a pair of characters, one of which is a character in the reference word and the other of which is a phonetic representation thereof.
- 23. The method of claim 18, further comprising, after the outputting step, the step of retrieving a document from a source using one of the spell-checked text sequences.
- 24. A method of selecting a replacement word for an input word in a phrase, the method comprising the steps of:
determining alternative words for the input word, the alternative words including at least one compound word which is comprised of two or more separate words, each alternative word having a rank associated therewith; and selecting, as the replacement word, an alternative word having a highest rank.
- 25. The method of claim 24, further comprising, between the determining and selecting steps, the step of generating a finite state machine (“FSM”) comprised of two or more arcs which include the alternatives to the input words.
- 26. The method of claim 24, wherein, in a case that the selecting step selects a compound word as the replacement word, the method further comprises the step of replacing the input word and at least one other word in the phrase with the compound word.
- 27. The method of claim 24, further comprising, between the determining and selecting steps, the step of adjusting the rank of each alternative based on a grammatical context of the input word in the phrase.
- 28. The method of claim 27,
wherein each word in the phrase has a part of speech associated therewith, and each of the alternative words has a part of speech associated therewith; and wherein the adjusting step adjusts the rank of each alternative word based on whether a part of speech of the alternative word fits with a part of speech of at least one word adjacent to the input word.
- 29. The method of claim 27, wherein each compound word in the phrase has a single part of speech associated therewith.
- 30. The method of claim 24, further comprising, between the determining and selecting steps, the step of displaying the alternative words ranked in order; and wherein the selecting step is performed manually.
- 31. The method of claim 24, wherein at least one of the alternative words comprises a word having an accent mark and/or a diacritic which is different from, and/or missing from, the input word.
- 32. A method of correcting a grammatical error in input text, the method comprising the steps of:
detecting a grammatical error in input text; generating a first finite state machine (“FSM”) for the input text, the first finite state machine including alternative words for at least one word that comprises a grammatical error in the input text and a rank associated with each alternative word; adjusting the ranks in the first FSM in accordance with one or more of a plurality of predetermined grammatical rules; determining which of the alternative words is grammatically correct based on the ranks associated with the alternative words; and replacing the at least one word that comprises a grammatical error in the input text with a grammatically correct alternative word determined in the determining step.
- 33. The method of claim 32, wherein a grammatical error comprises a word that is spelled correctly but is grammatically incorrect in the context in which it is found in the input text.
- 34. The method of claim 32, wherein a grammatical error comprises a word that is spelled correctly but corresponds to one of a plurality of words which is substantially similar in spelling or pronunciation.
- 35. The method of claim 32, wherein the first FSM also includes one or more parts-of-speech for the alternative words; and
wherein the determining step determines which of the alternative words is grammatically correct based in addition on the parts-of-speech of the alternative words.
- 36. The method of claim 35, wherein the adjusting step comprises:
applying a second FSM to the first FSM, wherein the second FSM encodes a set of grammatically correct sequences of words, wherein the applying step comprises either generating a third FSM for the input text or modifying the second FSM to create a third FSM, the third FSM including the alternative words and ranks associated with each alternative word; and combining the ranks in the third FSM with the ranks in the first FSM.
- 37. The method of claim 35 or 36, wherein the generating step comprises performing a morphological analysis on each word in the input text in order to provide the parts-of-speech and ranks.
- 38. A word processing method for creating and editing text documents, the word processing method comprising the steps of:
inputting text into a text document; checking the document for grammatically-incorrect words; replacing grammatically-incorrect words in the document with grammatically-correct words; and outputting the document; wherein the checking step comprises (i) generating a finite state machine (“FSM”) for text in the text document, the finite state machine including alternative words for at least one word in the text and a rank associated with each alternative word, (ii) adjusting the ranks in the FSM in accordance with one or more of a plurality of predetermined grammatical rules, and (iii) determining which of the alternative words is grammatically correct based on ranks for the alternative words.
- 39. A method of retrieving text from a source, the method comprising the steps of:
inputting a search phrase comprised of a plurality of words, at least one of the plurality of words being a grammatically incorrect word; replacing the grammatically incorrect word in the search phrase with a grammatically correct word in order to produce a corrected search phrase; and retrieving text from the source based on the corrected search phrase.
- 40. The method of claim 39, wherein the source comprises a remote network location; and wherein the method further comprises the step of displaying the retrieved text on a local display screen.
- 41. The method of claim 39, wherein the correcting step comprises displaying one or more corrected search words and manually selecting one of plural corrected search words.
- 42. The method of claim 39, wherein the correcting step comprises automatically selecting one of plural corrected search words.
- 43. A method of retrieving text from a source, the method comprising the steps of:
inputting a search phrase comprised of a plurality of words, at least one of the plurality of words being a grammatically incorrect word; replacing the grammatically incorrect word in the search phrase with a grammatically correct word in order to produce a corrected search phrase, wherein the replacing step comprises the method of any of claims 241, 249, or 253; and retrieving text from the source based on the corrected search phrase.
- 44. A method of spell-checking input text, the method comprising the steps of:
detecting a misspelled word in the input text; storing one or more lexicon finite state machines (“FSM”) in a memory, each of the lexicon FSMs including plural reference words; generating an input FSM for the misspelled word; selecting one or more reference words from the lexicon FSMs based on the input FSM, the one or more reference words substantially corresponding to a spelling of the misspelled word; and outputting selected ones of the one or more reference words.
- 45. The method of claim 44,
wherein each of the lexicon FSMs also includes a phonetic representation of each reference word; wherein the input FSM also includes a phonetic representation of the misspelled word; and wherein the selecting step selects reference words from the lexicon FSMs which also substantially correspond to the phonetic representation of the misspelled word.
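The lexicon FSM of claims 19, 20, and 44 can be illustrated with a minimal sketch, which is not the claimed implementation. It assumes the lexicon FSM is a simple character trie, that each search state pairs a lexicon state with a position in the input word and a cost, and that the cost is a unit edit cost (insertion, deletion, or substitution each cost 1). The names `LexiconNode`, `build_lexicon`, `suggest`, and `max_cost` are hypothetical and do not appear in the claims.

```python
import heapq
from typing import Dict, List, Tuple


class LexiconNode:
    """One state of the lexicon FSM; each outgoing arc is labeled with a character."""

    def __init__(self) -> None:
        self.arcs: Dict[str, "LexiconNode"] = {}
        self.is_final = False  # True when the path from the root spells a reference word


def build_lexicon(words: List[str]) -> LexiconNode:
    """Build a trie-shaped lexicon FSM representing all reference words."""
    root = LexiconNode()
    for word in words:
        node = root
        for ch in word:
            node = node.arcs.setdefault(ch, LexiconNode())
        node.is_final = True
    return root


def suggest(root: LexiconNode, word: str, max_cost: int = 2) -> List[Tuple[int, str]]:
    """Return (cost, reference_word) pairs within max_cost edits, cheapest first.

    Each search state holds a lexicon state, a position in the input word,
    and a cost; the lowest-cost state is always expanded next.
    """
    counter = 0  # tie-breaker so the heap never has to compare nodes
    heap = [(0, counter, 0, root, "")]
    seen = set()
    results: List[Tuple[int, str]] = []
    while heap:
        cost, _, pos, node, prefix = heapq.heappop(heap)
        if (id(node), pos) in seen:
            continue  # already expanded at an equal or lower cost
        seen.add((id(node), pos))
        if pos == len(word) and node.is_final:
            results.append((cost, prefix))
        moves = []
        if pos < len(word):
            # Skip an input character that has no counterpart in the reference word.
            moves.append((cost + 1, pos + 1, node, prefix))
        for ch, child in node.arcs.items():
            # Follow an arc without consuming input (character missing from the input).
            moves.append((cost + 1, pos, child, prefix + ch))
            if pos < len(word):
                # Follow an arc and consume input: free on a match, 1 on a mismatch.
                step = 0 if ch == word[pos] else 1
                moves.append((cost + step, pos + 1, child, prefix + ch))
        for new_cost, new_pos, new_node, new_prefix in moves:
            if new_cost <= max_cost:
                counter += 1
                heapq.heappush(heap, (new_cost, counter, new_pos, new_node, new_prefix))
    return results


lexicon = build_lexicon(["spell", "spelling", "smell", "shell", "spill"])
print(suggest(lexicon, "spel", max_cost=1))
```

Because the lexicon is read-only during the search, a single shared instance could in principle serve several input sequences at once, which is the arrangement claim 18 describes. A phonetic variant along the lines of claims 22 and 45 would need arcs carrying character and phonetic pairs, which this sketch omits.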
Parent Case Info
[0001] This application is a continuation of and claims priority to U.S. patent application Ser. No. 09/084,535, filed May 26, 1998, the contents of which are hereby incorporated by reference in their entirety.
Continuations (1)
| | Number | Date | Country |
|---|---|---|---|
| Parent | 09084535 | May 1998 | US |
| Child | 10153460 | May 2002 | US |