Claims
- 1. A software tool for training and testing a knowledge base of a computerized customer relationship management system, the software tool comprising:
corpus editing processes for displaying and editing corpus items belonging to a corpus, and for assigning a suitable category from a set of predefined categories to individual corpus items;
knowledge base building processes for building a knowledge base by analyzing a first subset of the corpus items;
knowledge base testing processes for testing the knowledge base on a second subset of the corpus items by classifying each corpus item of the second subset into at least one of the predefined categories using information contained in the knowledge base; and
reporting processes for generating reports based on results produced by the knowledge base testing processes and causing the reports to be displayed to a user.
- 2. The software tool of claim 1, wherein the knowledge base testing processes calculate a set of scores for each corpus item in the second subset, each score from the calculated set of scores being associated with a corresponding category and being representative of a confidence that the corpus item belongs to the corresponding category.
- 3. The software tool of claim 1, wherein the reporting processes generate a report relating to a single selected category.
- 4. The software tool of claim 1, wherein the reporting processes generate a cumulative report relating to a plurality of categories.
- 5. The software tool of claim 1, wherein the reporting processes calculate and display, for a selected category, a threshold match value based on user input consisting of one of a precision value, a recall value, a false positive rate, a false negative rate, an automation ratio, or a cost ratio.
- 6. The software tool of claim 1, wherein the reporting processes calculate and display, for a selected category, a precision value and a recall value based on a threshold match value input by the user.
- 7. The software tool of claim 1, wherein the reporting processes calculate precision as a function of recall and cause a graph to be displayed depicting the relationship between precision and recall.
- 8. The software tool of claim 1, wherein the reporting processes generate and display a graph depicting cumulative success over time, the graph showing, for a plurality of groups of corpus items each having a common time parameter, the fraction of corpus items in the group that were appropriately classified.
- 9. The software tool of claim 1, wherein the reporting processes generate and display a report showing, for each of a plurality of pairs of categories, a percentage of corpus items initially assigned to a first category of the pair of categories that were erroneously classified into a second category of the pair of categories.
- 10. The software tool of claim 1, wherein the reporting processes generate and display a scoring report showing, for a selected category, match values for each corpus item in the second subset, the match values being representative of the relevance of the selected category to the corpus item.
- 11. The software tool of claim 1, wherein the first and second subsets of corpus items are selected in accordance with user input.
- 12. The software tool of claim 1, wherein the knowledge base building processes and the knowledge base testing processes use a modeling engine to analyze and classify corpus items.
- 13. The software tool of claim 12, wherein the modeling engine includes a natural language processing engine and a semantic modeling engine.
- 14. The software tool of claim 1, wherein the reporting processes are configured to allow a user to select a report to be generated from a plurality of available reports.
- 15. The software tool of claim 1, wherein the corpus items comprise customer communications.
- 16. A method for training and testing a knowledge base of a computerized customer relationship management system, comprising steps of:
collecting corpus items into a corpus;
assigning a category from a set of predefined categories to individual corpus items;
building a knowledge base by analyzing a first subset of corpus items;
testing the knowledge base on a second subset of corpus items by classifying each corpus item of the second subset into at least one of the predefined categories using information contained in the knowledge base; and
generating and displaying a report based on results produced by the testing step.
- 17. The method of claim 16, wherein the step of testing the knowledge base includes calculating a set of scores for each corpus item in the second subset, each score from the calculated set of scores being associated with a corresponding category and being representative of a confidence that the corpus item belongs to the corresponding category.
- 18. The method of claim 16, wherein the step of generating and displaying a report includes generating a report relating to a single selected category.
- 19. The method of claim 16, wherein the step of generating and displaying a report includes generating a cumulative report relating to a plurality of categories.
- 20. The method of claim 16, wherein the step of generating and displaying a report includes:
receiving user input specifying one of a precision value, a recall value, a false positive rate, a false negative rate, an automation ratio, or a cost ratio; and
calculating and displaying, for a selected category, a threshold match value based on the user input.
- 21. The method of claim 16, wherein the step of generating and displaying a report includes:
receiving user input specifying a threshold match value; and
calculating and displaying, for a selected category, a precision value and a recall value based on the user input.
- 22. The method of claim 16, wherein the step of generating and displaying a report includes calculating precision as a function of recall and causing a graph to be displayed depicting the relationship between precision and recall.
- 23. The method of claim 16, wherein the step of generating and displaying a report includes generating and displaying a graph depicting cumulative success over time, the graph showing, for a plurality of groups of corpus items each having a common time parameter, the fraction of corpus items in the group that were appropriately classified.
- 24. The method of claim 16, wherein the step of generating and displaying a report includes generating and displaying a report showing, for each of a plurality of pairs of categories, a percentage of corpus items initially assigned to a first category of the pair of categories that were erroneously classified into a second category of the pair of categories.
- 25. The method of claim 16, wherein the step of generating and displaying a report includes generating and displaying a scoring report showing, for a selected category, match values for each corpus item in the second subset, the match values being representative of the relevance of the selected category to the corpus item.
- 26. The method of claim 16, wherein the first and second subsets of corpus items are selected in accordance with user input.
- 27. The method of claim 16, wherein the steps of building and testing the knowledge base include using a modeling engine to analyze and classify corpus items.
- 28. The method of claim 16, wherein the step of generating and displaying a report includes selecting a report from a plurality of available reports in response to user input.
- 29. The method of claim 16, wherein the corpus items comprise customer communications.
- 30. The method of claim 16, wherein the corpus items include structured and unstructured information.
- 31. The software tool of claim 1, wherein the corpus items include structured and unstructured information.
- 32. A computer-readable medium embodying instructions executable by a computer for performing the steps of:
collecting corpus items into a corpus;
assigning a category from a set of predefined categories to individual corpus items;
building a knowledge base by analyzing a first subset of corpus items;
testing the knowledge base on a second subset of corpus items by classifying each corpus item of the second subset into at least one of the predefined categories using information contained in the knowledge base; and
generating and displaying a report based on results produced by the testing step.
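
For illustration only, the build, test, and report flow recited in the claims above (a first subset used to build the knowledge base, a second subset scored per category, precision and recall at a threshold match value, and a report of category pairs that were confused) might be sketched in Python as follows. Every name here (`build_knowledge_base`, `score_item`, `precision_recall`, `confusion_pairs`), the sample corpus, and the simple word-overlap scoring are assumptions made for the sketch; they stand in for the modeling engine and are not the claimed implementation.

```python
from collections import Counter, defaultdict


def build_knowledge_base(items):
    """Toy knowledge base: per-category word counts from (text, category) pairs."""
    kb = defaultdict(Counter)
    for text, category in items:
        kb[category].update(text.lower().split())
    return kb


def score_item(kb, text):
    """Return {category: score}; score is the fraction of the item's words
    that were seen in that category during building (an assumed stand-in
    for a confidence value)."""
    words = text.lower().split()
    if not words:
        return {category: 0.0 for category in kb}
    return {
        category: sum(1 for w in words if w in counts) / len(words)
        for category, counts in kb.items()
    }


def precision_recall(scored, category, threshold):
    """Precision and recall for one category at a given threshold match value.
    `scored` is a list of (scores, assigned_category) pairs."""
    tp = fp = fn = 0
    for scores, truth in scored:
        predicted = scores.get(category, 0.0) >= threshold
        if predicted and truth == category:
            tp += 1
        elif predicted:
            fp += 1
        elif truth == category:
            fn += 1
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def confusion_pairs(scored):
    """For each (assigned, classified) pair of distinct categories, the
    percentage of items assigned to the first that scored highest in the second."""
    totals, errors = Counter(), Counter()
    for scores, truth in scored:
        totals[truth] += 1
        best = max(scores, key=scores.get)
        if best != truth:
            errors[(truth, best)] += 1
    return {pair: 100.0 * n / totals[pair[0]] for pair, n in errors.items()}


# Hypothetical corpus of categorized customer communications.
corpus = [
    ("please reset my password", "account"),
    ("cannot log in to my account", "account"),
    ("refund for my last invoice", "billing"),
    ("charged twice on my invoice", "billing"),
    ("forgot my password again", "account"),
    ("refund the duplicate invoice charge", "billing"),
]
training, testing = corpus[:4], corpus[4:]  # first and second subsets

kb = build_knowledge_base(training)
scored = [(score_item(kb, text), category) for text, category in testing]

precision, recall = precision_recall(scored, "account", threshold=0.5)
print(f"account: precision={precision:.2f} recall={recall:.2f}")
print("confused category pairs (%):", confusion_pairs(scored))
```

Sweeping `threshold` over a range of values and recording each resulting precision and recall pair would give the data for a precision-versus-recall graph of the kind recited in claims 7 and 22.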
CROSS-REFERENCE TO RELATED APPLICATION
[0001] This application claims the benefit of U.S. Provisional Application No. 60/468,493, filed May 6, 2003. The disclosure of the foregoing application is incorporated herein by reference.
Provisional Applications (1)
| Number | Date | Country |
|---|---|---|
| 60468493 | May 2003 | US |