Claims
- 1. A computer implemented method for accessing information from a plurality of searchable information sources comprising the steps of:
analyzing a user's search query to determine a subject matter of the query; and selecting a subset of information sources from the plurality of information sources based upon the determined subject matter of the query; wherein the analyzing step combines at least two different methods of deriving a subject matter from the query.
- 2. The computer implemented method of claim 1, further comprising the step of searching at least one information source in the subset of information sources for documents relevant to the search query.
- 3. The computer implemented method of claim 1, wherein:
one deriving method of the analyzing step includes the step of comparing at least a portion of the search query against a plurality of entity lists; each entity list includes a list of phrases, each of the phrases corresponding with one or more subject matters; and the comparing step including the step of matching a phrase in an entity list against at least a portion of the search query, and upon such match, returning a subject matter corresponding to the matched phrase in the entity list.
- 4. The computer implemented method of claim 1, wherein:
one deriving method of the analyzing step includes the step of comparing the search query against a knowledge-base; the knowledge base includes a taxonomy of subject matters and a set of terms for at least some of the respective subject matters, the set of terms representing information likely to be found in the respective subject matters; and the comparing step compares at least portions of the search query against the sets of terms in the knowledge base to determine the respective subject matters of matching terms.
- 5. The computer implemented method of claim 4, further comprising the step of building the knowledge base, wherein the building step includes the steps of:
defining the taxonomy of subject matters; for at least some of the subject matters in the taxonomy, providing at least one example document that represents content typically found for the respective subject matter; generating a set of terms from the example document; and linking the set of terms to the respective subject matter.
- 6. The computer implemented method of claim 5, wherein the taxonomy is structured as a multi-tier hierarchy.
- 7. The computer implemented method of claim 6, wherein the multi-tier hierarchy includes a top-level tier of subject matter areas, at least one mid-level tier of subject matter categories for the respective subject matter areas, and at least one lower-level tier of one or more elements taken from a group consisting of: (a) example documents for the respective subject matter categories, (b) a set of terms representing content typically found for the respective subject matter categories, and (c) entity lists corresponding to the respective subject matter categories.
- 8. The computer implemented method of claim 4, wherein the step of comparing the search query against a knowledge-base further includes a step of assigning a score to the determined subject matter based upon a confidence level of the comparison.
- 9. The computer implemented method of claim 8, wherein the step of determining a subject matter of the query further includes the steps of:
displaying one or more of the subject matters having a score greater than a predetermined threshold; and selecting, by a user, at least one of the displayed subject matters.
- 10. The computer implemented method of claim 8, wherein the analyzing step determines a plurality of subject matters, and the method further includes a step of organizing the determined plurality of subject matters according, at least in part, to the scores assigned to the plurality of subject matters.
- 11. The computer implemented method of claim 4, wherein:
another deriving method of the analyzing step includes the step of comparing at least a portion of the search query against a plurality of entity lists; each entity list including a list of phrases corresponding with one or more subject matters; and the comparing step including the step of matching a phrase in an entity list against at least a portion of the search query, and upon such match, returning a subject matter corresponding to the matched phrase in the entity list.
- 12. The computer implemented method of claim 11, wherein:
the step of comparing at least a portion of the search query against a knowledge-base further includes a step of assigning a score to the subject matter determined by this comparison based upon a confidence level of this comparison; and the step of comparing at least a portion of the search query against a plurality of entity lists further includes a step of assigning a score to the subject matter determined by this comparison based upon a confidence of this comparison.
- 13. The computer implemented method of claim 12, wherein the analyzing step determines a plurality of subject matters, and the method further includes a step of organizing the determined plurality of subject matters according, at least in part, to the scores assigned to the plurality of subject matters.
- 14. The computer implemented method of claim 13, wherein the step of assigning a score to the subject matter determined by the step of comparing at least a portion of the search query against a plurality of entity lists includes the steps of:
assigning a confidence score to each entity list based upon a level of specificity of the respective entity list; and assigning a score to the subject matter corresponding to the confidence score of the entity list from which the subject matter was returned.
- 15. The computer implemented method of claim 12, wherein the scores resulting from the step of comparing at least a portion of the search query against a knowledge-base are lower than the scores resulting from the step of comparing at least a portion of the search query against a plurality of entity lists.
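As an illustration of claims 1 through 15, the following is a minimal sketch of an analyzing step that combines two deriving methods: phrase matching against entity lists and term matching against a knowledge base. The entity lists, knowledge-base terms, confidence values, and the scoring formula are all hypothetical assumptions chosen for the example; the claims do not prescribe any of them.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical entity lists. Each list carries a confidence reflecting its
# specificity (cf. claim 14) and maps phrases to one or more subject matters.
ENTITY_LISTS = [
    {"confidence": 0.9,
     "phrases": {"aspirin": ["pharmaceuticals"]}},
    {"confidence": 0.7,
     "phrases": {"apple": ["technology companies", "agriculture"]}},
]

# Hypothetical knowledge base: subject matter -> set of representative terms.
KNOWLEDGE_BASE = {
    "pharmaceuticals": {"dosage", "drug", "side", "effects", "clinical"},
    "agriculture": {"crop", "harvest", "orchard", "soil"},
}

# Assumed cap that keeps knowledge-base scores below entity-list scores
# (cf. claim 15).
KB_MAX_SCORE = 0.6


def _normalize(query: str) -> str:
    """Lowercase the query and reduce it to space-separated word tokens."""
    return " ".join(re.findall(r"[a-z0-9]+", query.lower()))


def match_entity_lists(query: str) -> Dict[str, float]:
    """Return subject matters whose entity-list phrases occur in the query."""
    padded = f" {_normalize(query)} "
    scores: Dict[str, float] = {}
    for entity_list in ENTITY_LISTS:
        for phrase, subjects in entity_list["phrases"].items():
            if f" {phrase} " in padded:
                for subject in subjects:
                    scores[subject] = max(scores.get(subject, 0.0),
                                          entity_list["confidence"])
    return scores


def match_knowledge_base(query: str) -> Dict[str, float]:
    """Score subject matters by the share of query tokens found in their terms."""
    tokens = set(_normalize(query).split())
    scores: Dict[str, float] = {}
    for subject, terms in KNOWLEDGE_BASE.items():
        overlap = tokens & terms
        if overlap:
            scores[subject] = KB_MAX_SCORE * len(overlap) / len(tokens)
    return scores


def analyze(query: str) -> List[Tuple[str, float]]:
    """Combine both deriving methods and order subject matters by score."""
    combined = match_knowledge_base(query)
    for subject, score in match_entity_lists(query).items():
        combined[subject] = max(combined.get(subject, 0.0), score)
    return sorted(combined.items(), key=lambda kv: (-kv[1], kv[0]))
```

In this sketch a subject matter found by both methods keeps the higher of its two scores; other ways of combining the scores would fit the claim language equally well.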
- 16. A computer implemented method for accessing information from a plurality of searchable information sources comprising the steps of:
analyzing a user's search query to determine a subject matter of the query; providing a category-to-source map that includes a plurality of categories, the categories having at least one information source linked thereto; obtaining at least one category pertaining to the subject matter of the query; and adding the information source linked to the category in the category-to-source map to a subset of information sources.
- 17. The computer implemented method of claim 16, wherein each information source is assigned a performance score pertaining to at least one performance quality of the information source.
- 18. The computer implemented method of claim 17, further comprising the steps of searching at least one information source in the subset of information sources for documents relevant to the search query and displaying search results from the output of the searching step, wherein the displaying step displays the search results in an order based upon, at least in part, the performance scores of the information sources from which the search results are obtained.
- 19. The computer implemented method of claim 17, wherein the performance quality is taken from a group consisting of:
the frequency that the respective information source is accessed; the amount of time spent accessing the respective information source; the frequency of problems accessing the respective information source; and feedback provided by users of the respective information source.
- 20. The computer implemented method of claim 17, further comprising the step of eliminating from the subset of information sources any information source having a performance score lower than a predetermined threshold.
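The category-to-source map of claims 16 through 20 might be sketched as a simple lookup followed by a threshold filter. The category names, source names, performance scores, and the default threshold below are hypothetical placeholders, not values taken from the claims.

```python
from typing import Dict, Iterable, List

# Hypothetical category-to-source map: category -> linked information sources.
CATEGORY_TO_SOURCES: Dict[str, List[str]] = {
    "pharmaceuticals": ["medical-journal-index", "drug-label-db"],
    "finance": ["filings-db"],
}

# Hypothetical performance score per information source (cf. claim 17).
PERFORMANCE: Dict[str, float] = {
    "medical-journal-index": 0.8,
    "drug-label-db": 0.3,
    "filings-db": 0.6,
}


def select_sources(categories: Iterable[str], min_score: float = 0.5) -> List[str]:
    """Build the subset of sources linked to the given categories.

    Sources whose performance score is lower than min_score are
    eliminated (cf. claim 20). Order of first appearance is preserved.
    """
    subset: List[str] = []
    for category in categories:
        for source in CATEGORY_TO_SOURCES.get(category, []):
            if source in subset:
                continue
            if PERFORMANCE.get(source, 0.0) >= min_score:
                subset.append(source)
    return subset
```

For example, `select_sources(["pharmaceuticals", "finance"])` keeps the two sources scoring at or above the assumed threshold of 0.5 and drops `drug-label-db`.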
- 21. A computer implemented method for accessing information from a plurality of searchable information sources comprising the steps of:
analyzing a user's search query to determine a subject matter of the query; selecting a subset of information sources from the plurality of information sources based upon the determined subject matter of the query; assigning each information source in the subset of information sources a performance score pertaining to performance qualities of the information source; searching the information sources in the subset of information sources for documents relevant to the search query; and displaying search results from the output of the searching step, wherein the search results are ordered based upon, at least in part, the performance scores of the information sources from which the search results are obtained.
- 22. The computer implemented method of claim 21, wherein the performance scores are calculated based, at least in part, upon the number of times the respective information source is accessed by a community of users.
- 23. The computer implemented method of claim 22, wherein a weighting factor is used for calculating the performance score for a particular information source, and the method includes the steps of:
selecting, by a user, an item in the search results; and updating the weighting factor for the information source corresponding to the selected item in the search results.
- 24. The computer implemented method of claim 23, further comprising the step of adjusting the weighting factor based upon an amount of time the user spends in the information source after the selecting step.
- 25. The computer implemented method of claim 23, further comprising the step of adjusting the performance score based upon a confidence score resulting from a comparison of at least portions of the search query against a set of terms representing content typically found for the respective subject matter.
- 26. The computer implemented method of claim 23, further comprising the step of adjusting the performance score based upon a confidence score resulting from a comparison of at least portions of the search query against at least one example document representing content typically found for the respective subject matter.
- 27. The computer implemented method of claim 23, further comprising the step of adjusting the performance score based upon a confidence score resulting from a comparison of at least portions of the search query against a list of names or symbols typically associated with the respective subject matter.
- 28. The computer implemented method of claim 23, further comprising the step of adjusting the weighting factor depending upon the access performance of the information source.
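One possible reading of claims 21 through 24 is sketched below: each source's performance score is the product of an access count and a weighting factor, the weighting factor is updated when a user selects a result, and results are ordered by the score of their source. The product formula, the 30-second dwell threshold, and the 0.1 increment are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class SourceStats:
    """Usage statistics for one information source (hypothetical structure)."""
    access_count: int = 0   # times the source was accessed by the user community
    weight: float = 1.0     # weighting factor applied to the access count

    def score(self) -> float:
        # Assumed formula: performance score = weight * access count.
        return self.weight * self.access_count


def record_selection(stats: Dict[str, SourceStats], source: str,
                     dwell_seconds: float = 0.0) -> None:
    """Update a source's statistics after a user selects one of its results."""
    entry = stats.setdefault(source, SourceStats())
    entry.access_count += 1
    # Assumed rule: a long visit nudges the weighting factor upward (cf. claim 24).
    if dwell_seconds >= 30:
        entry.weight += 0.1


def order_results(results: List[Tuple[str, str]],
                  stats: Dict[str, SourceStats]) -> List[Tuple[str, str]]:
    """Order (source, title) results by the performance score of their source."""
    return sorted(results,
                  key=lambda r: stats.get(r[0], SourceStats()).score(),
                  reverse=True)
```

Because the sort is stable, results from sources with equal scores keep their original relative order.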
- 29. A computer implemented method for accessing information from a plurality of searchable information sources comprising the steps of:
analyzing a user's search query to determine a subject matter of the query; selecting a subset of information sources from the plurality of information sources based upon the determined subject matter of the query; searching the information sources in the subset of information sources for documents relevant to the search query; and displaying the search results from the output of the searching step, wherein the search results are segregated for each of the information sources in the subset of information sources.
- 30. The computer implemented method of claim 29, wherein the searching step searches the information sources in the subset of information sources in parallel and wherein the displaying step displays the segregated search results in parallel.
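The parallel searching and segregated display of claims 29 and 30 could be approximated as follows. The in-memory `SOURCES` dictionary and the `search_source` helper are hypothetical stand-ins for real searchable information sources; only the thread-pool usage reflects the parallel aspect of the claims.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List

# Hypothetical in-memory sources standing in for remote searchable sources.
SOURCES: Dict[str, List[str]] = {
    "source-a": ["aspirin dosage guide", "ibuprofen overview"],
    "source-b": ["aspirin clinical trial", "crop rotation"],
}


def search_source(name: str, query: str) -> List[str]:
    """Return documents of one source that contain every query term."""
    terms = query.lower().split()
    return [doc for doc in SOURCES[name] if all(t in doc for t in terms)]


def federated_search(names: List[str], query: str) -> Dict[str, List[str]]:
    """Search the selected sources in parallel.

    The result is keyed by source name, so the search results stay
    segregated per information source for display.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(names))) as pool:
        futures = {name: pool.submit(search_source, name, query) for name in names}
        return {name: future.result() for name, future in futures.items()}
```

Keying the output by source name is one simple way to keep each source's results separate until the display step.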
- 31. A computer implemented method for accessing information from a plurality of searchable information sources comprising the steps of:
analyzing a user's search query to determine a subject matter of the query; selecting a subset of information sources from the plurality of information sources based upon the determined subject matter of the query; searching a standard information source for documents relevant to the search query; and displaying the result of the step of searching the standard information source along with an option, selectable by the user, for searching the subset of information sources for documents relevant to the search query upon selection of the option by the user.
- 32. The computer implemented method of claim 31, wherein the standard information source is the World Wide Web.
- 33. The computer implemented method of claim 32, wherein the subset of information sources is maintained on a private computer network.
- 34. The computer implemented method of claim 31, wherein:
the analyzing step determines a plurality of subject matters from the query; the selecting step selects a subset of information sources for each of the plurality of subject matters determined in the analyzing step; the displaying step displays a plurality of options for each subject matter determined in the analyzing step; each option is identified by its respective subject matter in the displaying step; and each option is provided for searching the subset of information sources associated therewith for documents relevant to the search query upon selection of the option by the user.
- 35. A computer implemented method for accessing information from a plurality of searchable information sources comprising the steps of:
analyzing a user's search query to determine a subject matter of the query; selecting a subset of information sources from the plurality of information sources based upon the determined subject matter of the query; searching a standard information source for documents relevant to the search query; searching the subset of information sources for documents relevant to the search query; and simultaneously displaying the results of the step of searching the standard information source and the step of searching the subset of information sources.
- 36. The computer implemented method of claim 35, wherein the displaying step segregates the results of the step of searching the standard information source from the results of the step of searching the subset of information sources.
- 37. A computer implemented method for accessing information from a plurality of searchable information sources comprising the steps of:
analyzing a user's search query to determine a plurality of subject matters of the query; for each of the determined subject matters of the query, selecting a subset of information sources from the plurality of information sources associated with the respective determined subject matter; and automatically searching the subset of information sources associated with the subject matter having the closest match to the search query for documents relevant to the search query.
- 38. A computer implemented method for searching a plurality of searchable information sources, the information sources including at least one secure source and at least one non-secure source, the method comprising the steps of:
storing security credentials necessary for accessing the secure source; accessing the secure source utilizing the stored security credentials; accessing the non-secure source; searching the accessed sources for documents relevant to a search query; and displaying results of the searching step.
- 39. The computer implemented method of claim 38, wherein the searching step searches the accessed sources substantially in parallel.
- 40. The computer implemented method of claim 39, wherein:
the plurality of information sources includes a plurality of secure sources; the step of storing security credentials includes the step of storing respective security credentials necessary for accessing each secure source; and the step of accessing the secure source involves the step of accessing the plurality of secure sources, substantially in parallel, utilizing the respective stored security credentials.
- 41. The computer implemented method of claim 40, wherein:
the method operates on a computer network system having a plurality of users; and the step of storing security credentials includes the step of storing respective security credentials for accessing each secure server by each user of the computer network system.
- 42. The computer implemented method of claim 41, wherein the security credentials are stored in a database, wherein the database includes a table for each user, and wherein each table includes the set of respective security credentials for accessing each secure source by the respective user.
- 43. The computer implemented method of claim 41, wherein at least certain of the security credentials are shared by certain groups of users during at least one of the accessing and searching steps.
- 44. The computer implemented method of claim 39, wherein the step of storing security credentials includes the steps of recording a user's security credentials as the user preliminarily enters the secure source and storing the recorded user's security credentials for the subsequent step of accessing the secure server.
- 45. The computer implemented method of claim 44, wherein the stored user's security credentials are reusable for multiple steps of accessing the secure server.
- 46. The computer implemented method of claim 44, wherein the security credentials are used substantially transparently to the user during the step of accessing the secure server.
- 47. The computer implemented method of claim 44, wherein:
the step of accessing the secure source further includes the step of storing session cookies set by the source for use during the searching step.
- 48. The computer implemented method of claim 44, wherein the steps of recording a user's security credentials as the user preliminarily enters the secure source and storing the recorded user's security credentials for the subsequent step of accessing the secure server include the steps of providing a visual tool to the user that displays a log-in page for the secure source and has the user perform the step of logging into the secure source provided by the visual tool, wherein the visual tool records the user's security credentials during the log-in step.
- 49. A computer implemented method for searching a plurality of searchable information sources by a plurality of users of a computer network system, the information sources including at least one secure source, the method comprising the steps of:
for each user, storing security credentials necessary for accessing the secure source; accessing, by each user, the secure source utilizing the stored security credentials for each user; and searching the accessed secure source, by the plurality of users for documents relevant to one or more search queries.
- 50. The computer implemented method of claim 49, wherein the searching step includes the step of searching the accessed secure source, by the plurality of users substantially in parallel.
- 51. The computer implemented method of claim 49, further comprising the step of creating a session file for each user accessing the secure source.
- 52. The computer implemented method of claim 51, wherein the session file includes one or more elements taken from a group consisting of:
cookies; session parameters; session ids; and a session state.
- 53. The computer implemented method of claim 49, wherein:
the information sources include a plurality of secure sources; the storing step includes the step of storing, for each user, security credentials necessary for accessing one or more of the plurality of secure sources; the accessing step includes the step of accessing, by each user, one or more of the plurality of secure sources utilizing the stored security credentials for each user; and the searching step includes the step of searching the accessed secure sources, by the plurality of users, for documents relevant to one or more search queries.
- 54. The computer implemented method of claim 53, wherein, for each user, a session file is created for each secure source accessed by that user.
- 55. The computer implemented method of claim 49, wherein:
the information sources include a plurality of secure sources; the storing step includes the step of storing, for each user, security credentials necessary for accessing one or more of the plurality of secure sources; the accessing step includes the step of accessing, by each user, one or more of the plurality of secure sources utilizing the stored security credentials for each user; and the searching step includes the step of searching the accessed secure sources, by the plurality of users, substantially in parallel, for documents relevant to one or more search queries.
- 56. The computer implemented method of claim 50, wherein at least certain of the security credentials are shared by certain groups of users during at least one of the accessing and searching steps.
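The per-user credential storage and session files of claims 49 through 56 might be modeled as shown below. This is a minimal in-memory sketch: the `CredentialStore` and `Session` names, the `login` callback, and the plaintext storage are all assumptions for illustration, and a real system would need to protect the stored credentials.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Tuple


@dataclass
class Session:
    """Hypothetical session file: cookies plus a session id (cf. claim 52)."""
    session_id: str = ""
    cookies: Dict[str, str] = field(default_factory=dict)


class CredentialStore:
    """Stores credentials and sessions per (user, secure source) pair."""

    def __init__(self) -> None:
        self._credentials: Dict[Tuple[str, str], Tuple[str, str]] = {}
        self._sessions: Dict[Tuple[str, str], Session] = {}

    def store(self, user: str, source: str, username: str, password: str) -> None:
        """Record the credentials one user needs for one secure source."""
        self._credentials[(user, source)] = (username, password)

    def open_session(self, user: str, source: str,
                     login: Callable[[str, str], Session]) -> Session:
        """Access a secure source with stored credentials and keep the session.

        The login callback stands in for the source-specific log-in step.
        Raises KeyError if no credentials were stored for this pair.
        """
        username, password = self._credentials[(user, source)]
        session = login(username, password)
        self._sessions[(user, source)] = session
        return session

    def session_for(self, user: str, source: str) -> Session:
        """Return the session previously created for this user and source."""
        return self._sessions[(user, source)]
```

Keying both tables by the (user, source) pair gives each user a separate session for each secure source, in the manner of claim 54.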
- 57. A computer implemented method for generating a searchable source broker for defining patterns of search-result information specific to a searchable source, the method comprising the steps of:
accessing a given searchable source; performing an example search on the given searchable source to produce search results by that searchable source; and identifying regular expressions from the search results.
- 58. The computer implemented method of claim 57, further comprising the step of storing the regular expressions for the given searchable source for subsequent reuse by a federated search system.
- 59. The computer implemented method of claim 58, wherein:
the step of identifying regular expressions is performed substantially automatically; the method further comprises the step of reviewing, by a user, output of applying the regular expressions to search results produced by the given searchable source; and the method further comprises the step of approving by the user the regular expressions based upon the reviewing step.
- 60. The computer implemented method of claim 59, wherein the method further includes a step of modifying the regular expressions by the user before the approving step, if the user determines the modifying step is necessary based upon the reviewing step.
- 61. The computer implemented method of claim 59, wherein the reviewing step involves the step of simultaneously displaying to the user search results produced by the given search and the output of applying the regular expressions to the search results.
- 62. The computer implemented method of claim 57, wherein the step of identifying regular expressions includes the steps of:
distilling a structure of the search results; parsing the search results to distill a structure of the search results; identifying repeating blocks of information from the parsed search results; identifying essential search-result elements from the repeating blocks of information; and generating a regular expression for each identified essential search-result element and a regular expression for the repeating block.
- 63. The computer implemented method of claim 62, wherein the essential search-result elements include at least one element taken from a group consisting of:
a title; a URL; a date; a key-word; a summary; a passage; and a score.
- 64. The computer implemented method of claim 58, wherein the accessing step includes the steps of:
providing a log-in form, for the searchable source; logging into the searchable source by entering the appropriate log-in information to the log-in form by the user; recording security credential information provided by the user during the logging step; and storing the security credential information with the searchable source broker for re-use by the searchable source broker in the federated search system.
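To illustrate how a stored source broker of claims 57 through 64 might be applied, the sketch below uses one regular expression for the repeating block and one per essential search-result element. The patterns are hand-written for a made-up HTML layout; the automatic identification of regular expressions recited in the claims is not implemented here, and the `BROKER` structure and `EXAMPLE_HTML` are hypothetical.

```python
import re
from typing import Dict, List

# Hypothetical broker for one searchable source: a pattern for the repeating
# result block and a pattern for each essential element inside a block.
BROKER = {
    "block": re.compile(r'<li class="result">(.*?)</li>', re.S),
    "elements": {
        "url": re.compile(r'<a href="([^"]+)">'),
        "title": re.compile(r'<a href="[^"]+">(.*?)</a>', re.S),
        "summary": re.compile(r"<p>(.*?)</p>", re.S),
    },
}

# Made-up output of an example search on the source.
EXAMPLE_HTML = """
<ul>
<li class="result"><a href="http://example.com/1">First hit</a><p>Summary one.</p></li>
<li class="result"><a href="http://example.com/2">Second hit</a><p>Summary two.</p></li>
</ul>
"""


def extract_results(html: str, broker: dict) -> List[Dict[str, str]]:
    """Apply a broker's regular expressions to a page of search results."""
    results: List[Dict[str, str]] = []
    for block in broker["block"].findall(html):
        item: Dict[str, str] = {}
        for name, pattern in broker["elements"].items():
            match = pattern.search(block)
            if match:
                item[name] = match.group(1).strip()
        results.append(item)
    return results
```

Showing the output of `extract_results(EXAMPLE_HTML, BROKER)` next to the raw page is one way a user could review and approve the expressions, as in claims 59 through 61.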
- 65. A computer implemented method for accessing information from a plurality of searchable information sources comprising the steps of:
analyzing a user's search query to determine a subject matter of the query; and selecting a subset of information sources from the plurality of information sources based upon the determined subject matter of the query, wherein at least one of the subset of information sources is a secure information source; accessing the secure information source utilizing stored security credentials for the information source; and searching the information sources in the subset of information sources for documents relevant to the search query.
- 66. The computer implemented method of claim 65, wherein the searching step involves the step of searching the information sources in the subset of information sources, substantially in parallel, for documents relevant to the query.
- 67. The computer implemented method of claim 65, wherein the step of accessing the secure information source utilizes the stored security credentials substantially automatically and substantially transparently to the user.
- 68. The computer implemented method of claim 65, wherein the step of searching the information sources in the subset of information sources utilizes source brokers for each of the information sources in the subset of information sources, wherein the source brokers define patterns of search-result information specific to their respective information source.
- 69. The computer implemented method of claim 68, wherein the source broker for the secure information source includes the stored security credentials utilized in the accessing step.
- 70. The computer implemented method of claim 68, further comprising the step of defining the source broker for each of the information sources in the subset of information sources.
- 71. The computer implemented method of claim 70, wherein the defining step includes the steps of:
preliminarily accessing the respective information source; preliminarily performing an example search on the respective information source to produce example search results; identifying regular expressions from the example search results; and storing the regular expressions as at least part of the source broker.
- 72. The computer implemented method of claim 71, wherein the defining step further includes the steps of:
detecting whether the respective information source is a secure information source; and if the detecting step determines that the respective information source is a secure information source, performing the additional steps of:
providing a log-in form for the secure information source; logging into the secure information source by entering the appropriate log-in information to the log-in form by the user; recording security credential information provided by the user during the logging step; and storing the security credential information with the respective source broker.
CROSS REFERENCE TO RELATED APPLICATIONS
[0001] The present application claims the benefit of U.S. Provisional Patent Application Serial No. 60/360,754, filed Mar. 1, 2002, the contents of which are incorporated herein by reference.
Provisional Applications (1)
| Number | Date | Country |
| --- | --- | --- |
| 60360754 | Mar 2002 | US |