Claims
- 1. A natural language information retrieval method, comprising:
receiving a query; tokenizing the query; selecting a query template based on a match between the tokenized query and one or more sequence patterns associated with a first portion of the query template; initiating an information retrieval command associated with a second portion of the selected query template; receiving results based on the initiated information retrieval command; and returning at least a portion of the received results.
- 2. The method of claim 1, wherein the act of tokenizing comprises:
identifying semantic units in the query; associating a token with each uniquely identified semantic unit; and representing the query as an ordered combination of the tokens.
- 3. The method of claim 2, wherein the act of tokenizing further comprises:
identifying stems for one or more semantic units; and associating the identified stems with the token associated with the semantic unit.
- 4. The method of claim 3, wherein the act of tokenizing further comprises:
identifying lexical equivalents for a semantic unit; and associating the identified lexical equivalents with the token associated with the semantic unit.
- 5. The method of claim 4, wherein the act of identifying lexical equivalents comprises identifying misspellings, stems, synonyms, or hyponyms for the identified semantic units.
- 6. The method of claim 2, wherein the act of identifying a semantic unit comprises identifying single words, multiple word phrases, and arbitrary but predefined character strings in the query.
- 7. The method of claim 6, wherein the act of identifying arbitrary but predefined character strings comprises identifying email addresses, telephone numbers, acronyms, time indications, state designations, addresses or uniform resource locators (URLs).
- 8. The method of claim 1, wherein the act of tokenizing comprises representing the query as an ordered sequence of one or more required tokens and zero or more optional tokens.
- 9. The method of claim 1, wherein the act of selecting comprises:
ranking a plurality of query templates based on a match between the tokenized query and one or more sequence patterns associated with a first portion of each of the plurality of query templates; and selecting a query template having the best match between the tokenized query and the one or more sequence patterns associated with the first portion of the query templates.
- 10. The method of claim 9, wherein the act of ranking comprises assigning a value to each query template in accordance with any desired relevancy scheme.
- 11. The method of claim 1, wherein the act of initiating comprises initiating a uniform resource locator display command, a SQL database query, or a text-based query.
- 12. The method of claim 1, wherein the act of initiating comprises initiating more than one information retrieval command for a selected query template.
- 13. The method of claim 1, wherein the act of initiating comprises initiating one or more information retrieval commands for one or more selected query templates.
- 14. The method of claim 1, wherein the act of initiating comprises initiating an operating system call that returns data.
- 15. The method of claim 1, wherein the act of receiving results further comprises sorting the results in accordance with any desired relevancy scheme.
- 16. The method of claim 15, wherein the act of returning comprises returning only the ‘N’ highest ranked received results.
- 17. The method of claim 16, wherein the act of returning only the ‘N’ highest ranked received results comprises returning a predetermined number of received results or a predetermined range of the received results.
- 18. The method of claim 17, wherein the act of returning a predetermined range of the received results comprises returning those results that are in the top ten percent (10%) of the sorted results.
- 19. The method of claim 1, wherein the act of initiating comprises initiating a standard text-based search based on the query if a relevant query template is not available.
- 20. The method of claim 19, further comprising recording those queries that are not matched to a relevant query template.
- 21. The method of claim 20, further comprising alerting an administrator after a specified number of received queries are not matched to relevant query templates.
- 22. The method of claim 1, wherein the query template comprises one or more static sequence patterns.
- 23. The method of claim 1, wherein the query template comprises one or more dynamic sequence patterns.
- 24. A natural language method to tokenize a query, comprising:
identifying semantic units in a query; associating a token with each uniquely identified semantic unit; identifying stems for one or more of the tokens; identifying lexical equivalents for one or more of the tokens; and representing the query as an ordered combination of the identified stems and tokens.
- 25. The method of claim 24, wherein the act of identifying a semantic unit comprises identifying single words, multiple word phrases, and arbitrary but predefined character strings.
- 26. The method of claim 25, wherein the act of identifying arbitrary but predefined character strings comprises identifying email addresses, telephone numbers, acronyms, time indications, state designations, addresses or uniform resource locators (URLs).
- 27. The method of claim 24, wherein the act of associating comprises using one or more dictionaries.
- 28. The method of claim 24, wherein the act of identifying lexical equivalents comprises identifying misspellings, synonyms or hyponyms for at least one of the identified semantic units.
- 29. The method of claim 28, wherein the act of identifying lexical equivalents further comprises associating each of the identified lexical equivalents for a semantic unit with that semantic unit's token.
- 30. The method of claim 24, wherein the act of representing comprises combining two or more of the identified tokens in an ordered sequence to represent a meaning of the query.
- 31. The method of claim 30, wherein the act of combining two or more identified tokens in an ordered sequence comprises one or more logical operators.
- 32. The method of claim 30, wherein the act of combining two or more identified tokens in an ordered sequence comprises a proximity connector.
- 33. A natural language query template accessible by a program being executed on a programmable control device, comprising:
a first portion having a sequence pattern representing a query, wherein the sequence pattern includes an ordered sequence of one or more required elements and zero or more optional elements; and a second portion having a command sequence for generating a response to the query.
- 34. The natural language query template of claim 33, wherein the sequence pattern comprises one or more tokens, each of said one or more tokens associated with a semantic unit of the query.
- 35. The natural language query template of claim 34, wherein the one or more tokens represent a single word, a multiple word phrase or an arbitrary but predefined character string in the query.
- 36. The natural language query template of claim 35, wherein an arbitrary but predefined character string comprises an email address, a telephone number, an acronym, a time indication, a state designation, an address or a uniform resource locator (URL).
- 37. The natural language query template of claim 33, wherein the first portion comprises two or more sequence patterns.
- 38. The natural language query template of claim 33, wherein the command sequence comprises a computer executable instruction.
- 39. The natural language query template of claim 38, wherein the computer executable instruction comprises a BAG OF WORDS search command.
- 40. The natural language query template of claim 38, wherein the computer executable instruction comprises a SOFT AND search command.
- 41. The natural language query template of claim 38, wherein the computer executable instruction comprises a SOFT NOT search command.
- 42. The natural language query template of claim 38, wherein the computer executable instruction comprises a search command, a SQL database query, a uniform resource locator (URL) display command or an operating system call.
- 43. The natural language query template of claim 42, wherein the computer executable instruction initiates a communication between a user submitting the query and a service provider organization.
- 44. The natural language query template of claim 43, wherein the communication comprises an Internet chat communication.
- 45. The natural language query template of claim 43, wherein the communication comprises an email communication.
- 46. The natural language query template of claim 38, wherein the command sequence comprises a static command sequence.
- 47. The natural language query template of claim 38, wherein the command sequence comprises a dynamic command sequence.
- 48. The natural language query template of claim 47, wherein the command sequence comprises a slot that is filled-in with a token from the sequence pattern from the first portion.
- 49. The natural language query template of claim 33, wherein the second portion comprises two or more command sequences.
- 50. The natural language query template of claim 49, wherein each of the two or more command sequences is selected for execution.
- 51. The natural language query template of claim 49, wherein one of the two or more command sequences is selected for execution.
- 52. A method to index data, comprising:
identifying semantic units in each of a plurality of data units; associating an identified semantic unit with a location in each of the data units in which the identified semantic unit is found; associating tokens with each uniquely identified semantic unit; and indexing the tokens.
- 53. The method of claim 52, wherein the act of identifying semantic units comprises identifying single words, multiple word phrases, and arbitrary but predefined character strings.
- 54. The method of claim 53, wherein the act of identifying arbitrary but predefined character strings comprises identifying email addresses, telephone numbers, acronyms, time indications, state designations, addresses or uniform resource locators (URLs).
- 55. The method of claim 53, wherein the act of identifying comprises using a dictionary.
- 56. The method of claim 53, wherein the act of identifying semantic units further comprises identifying misspellings, stems, synonyms, or hyponyms for at least one of the identified semantic units.
- 57. The method of claim 53, wherein the act of identifying further comprises identifying syntactic elements in at least one of the data units.
- 58. The method of claim 57, wherein the act of identifying syntactic elements comprises identifying line breaks, paragraph breaks, or sentence breaks.
- 59. The method of claim 52, wherein the act of identifying data units comprises identifying text documents, web pages, uniform resource locators (URLs) or database tables.
- 60. The method of claim 52, wherein the act of indexing comprises organizing the identified tokens in a predetermined lexical ordering.
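
As an illustration only, the flow recited in claims 1, 9, 19, 33 and 48 (tokenize the query, select the best-matching query template, fill a command slot, fall back to a text search when no template matches) might be sketched as follows. Everything in the sketch is an assumption made for the example and is not taken from the claims: the names `EQUIVALENTS`, `Element`, `Template`, `tokenize`, `match` and `answer`, the sample templates, the command strings, the greedy left-to-right matching strategy, and the "count of matched elements" relevancy score.

```python
from dataclasses import dataclass

# Hypothetical lexical-equivalent table: surface form -> canonical token.
EQUIVALENTS = {
    "passwrd": "password",    # misspelling
    "passwords": "password",  # stem
    "change": "reset",        # synonym
}

def tokenize(query):
    """Represent the query as an ordered list of canonical tokens."""
    tokens = []
    for word in query.lower().split():
        word = word.strip("?!,.")
        if word:
            tokens.append(EQUIVALENTS.get(word, word))
    return tokens

@dataclass
class Element:
    tokens: frozenset      # tokens accepted at this position of the pattern
    required: bool = True  # optional elements only raise the score
    slot: str = ""         # if set, the matched token fills this command slot

@dataclass
class Template:
    pattern: list  # first portion: ordered sequence pattern
    command: str   # second portion: command, may contain {slot} fields

def match(template, tokens):
    """Greedy in-order scan. Returns (score, slots), or None when a
    required element is missing. A deliberately simple strategy: an
    optional element matched late can hide a later required element."""
    pos, score, slots = 0, 0, {}
    for el in template.pattern:
        hit = next((i for i in range(pos, len(tokens))
                    if tokens[i] in el.tokens), None)
        if hit is None:
            if el.required:
                return None
            continue
        pos, score = hit + 1, score + 1
        if el.slot:
            slots[el.slot] = tokens[hit]
    return score, slots

def answer(query, templates):
    """Select the highest-scoring template and build its command."""
    tokens = tokenize(query)
    best = None
    for t in templates:
        m = match(t, tokens)
        if m and (best is None or m[0] > best[0]):
            best = (m[0], m[1], t)
    if best is None:
        # No relevant template: fall back to a plain text-based search.
        return "TEXT_SEARCH " + " ".join(tokens)
    _, slots, chosen = best
    # Slots are declared only on required elements, so every field is set.
    return chosen.command.format(**slots)

# Hypothetical templates; the command strings are placeholders.
TEMPLATES = [
    Template(
        pattern=[Element(frozenset({"how"}), required=False),
                 Element(frozenset({"reset"})),
                 Element(frozenset({"password", "username"}), slot="field")],
        command="SELECT answer FROM faq WHERE topic = 'reset_{field}'",
    ),
    Template(
        pattern=[Element(frozenset({"store"})), Element(frozenset({"hours"}))],
        command="SHOW_URL /hours",
    ),
]

print(answer("How do I change my passwrd?", TEMPLATES))
```

In this sketch the misspelling and the synonym are both normalized during tokenization, so the first template matches with its optional element present and the slot `field` is filled with the token `password`. A real system would execute the resulting command rather than return it as a string, and would apply whatever relevancy scheme is desired in place of the simple element count.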
Parent Case Info
[0001] This application claims priority on the U.S. Provisional application entitled “Automated Customer Communication and Answering System” by T. Harrison, M. Barrett, S. Reddi and J. Lowe, filed Sep. 24, 2001 (Serial No. 60/324,726).
Provisional Applications (1)
| Number | Date | Country |
| --- | --- | --- |
| 60324726 | Sep 2001 | US |