Claims
- 1. A computer method of gathering and summarizing information available through a network, the method comprising:
collecting information from a plurality of network sites according to respective maps of the network sites;
converting the collected information from HTML-language web pages to XML-language documents and storing the XML-language documents in a storage medium;
searching for documents according to a search query having at least one term and identifying the documents found in the search; and
displaying the documents so as to indicate similarity of the documents to each other.
- 2. The method of claim 1, wherein said information is collected from said plurality of network sites at a predefined time interval.
- 3. The method of claim 2, wherein the method is carried out by a software agent computer program.
- 4. The method of claim 3, wherein the software agent computer program is originated in the JAVA computer language.
- 5. The method of claim 3, said software agent computer program residing in a computer also operating an agent hosting program; and wherein the software agent computer program is a client program in relation to the agent hosting program.
- 6. The method of claim 5, wherein the method is carried out by a plurality of software agent programs residing on a corresponding plurality of computers having agent-hosting programs, said software agent programs communicating with each other through the agent hosting programs.
- 7. The method of claim 1, further comprising comparing a similarity of a plurality of documents by calculating a similarity function for the plurality of documents.
- 8. The method of claim 7, wherein the similarity of an additional document added to the plurality of documents is calculated by comparing the additional document to a portion of a similarity matrix for the plurality of documents and without recalculating the entire similarity matrix for the plurality of documents.
- 9. The method of claim 1, wherein the documents are displayed as nodes of a tree structure having links and nodes in which similarity of documents is indicated by proximity of nodes to each other and by a length of links connecting the nodes to a common vertex.
- 10. The method of claim 1, wherein the documents are displayed in a hierarchical folder organization.
- 11. The method of claim 1, wherein the network is the Internet.
- 12. The method of claim 1, wherein the storage medium is a computer memory.
- 13. A computer system for gathering and summarizing information available through a network, the computer system being operable on at least one computer having a software operating system, the computer system comprising:
an agent hosting program for running under said software operating system;
a plurality of agent programs operating with said agent hosting program, said plurality of agent programs including programs for collecting documents from respective network sites;
wherein said agent program operates according to a stored search ontology providing a map of each respective network site and a time interval between search updates for the network site.
- 14. The computer system of claim 13, further comprising a second host computer having a software operating system and further comprising:
an agent hosting program for running under said software operating system;
a plurality of agent programs operating with said agent hosting program, said plurality of agent programs including programs for collecting documents from respective network sites;
wherein said plurality of agent programs operate according to a stored search ontology providing a map of each respective network site and a time interval between search updates for the network site.
- 15. The computer system of claim 13, wherein said at least one of said agent programs is relocatable from one of said host computers to the other of said host computers and is operable on said other one of said host computers.
- 16. The computer system of claim 13, wherein the network is the Internet and the network sites are Internet web sites.
- 17. A computer system for gathering and summarizing information available through a network, the computer system being operable on at least one computer having a software operating system, the computer system comprising:
an agent hosting program for running under said software operating system;
a plurality of agent programs operating with said agent hosting program, said plurality of agent programs including programs for collecting documents from respective network sites;
wherein said plurality of agent programs operate according to a stored search ontology providing a map of each respective network site and a time interval between search updates for the network site;
and further comprising an agent for applying a similarity algorithm to documents found in the search of the network sites;
and a user interface agent for providing a display of the results of the search and the results of applying the similarity algorithm;
and an agent program for interfacing said user interface agent, said clustering agent and said plurality of agent programs for collecting documents from respective network sites.
- 18. The computer system of claim 17, wherein the network is the Internet and the network sites are Internet web sites.
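As an illustration only of the idea recited in claims 7 and 8, the following Python sketch shows one way a similarity matrix might be extended when a single document is added, without recomputing the entries for documents already compared. The claims do not specify a similarity function or data layout; the cosine similarity over term counts, the whitespace tokenization, and the names `term_vector`, `cosine_similarity`, and `SimilarityMatrix` are all assumptions made for this example.

```python
import math
from collections import Counter


def term_vector(text):
    """Build a term-frequency vector from whitespace-split text (assumed tokenization)."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine similarity of two term-frequency vectors; 0.0 if either is empty."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


class SimilarityMatrix:
    """Hypothetical symmetric similarity matrix that grows one document at a time."""

    def __init__(self):
        self.vectors = []
        self.matrix = []  # matrix[i][j] = similarity of documents i and j

    def add_document(self, text):
        """Add one document, computing only its new row and column."""
        vec = term_vector(text)
        new_row = [cosine_similarity(vec, other) for other in self.vectors]
        # Append one entry to each existing row; earlier entries are not recomputed.
        for row, sim in zip(self.matrix, new_row):
            row.append(sim)
        self.vectors.append(vec)
        # The diagonal entry is set to 1.0 by convention.
        self.matrix.append(new_row + [1.0])
        return len(self.vectors) - 1


m = SimilarityMatrix()
m.add_document("agent collects web pages")
m.add_document("agent collects xml documents")
m.add_document("tree display of clusters")
```

In this sketch, adding a document to a collection of n documents costs n comparisons, whereas rebuilding the full matrix would cost on the order of n squared.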
CROSS REFERENCE TO RELATED APPLICATIONS
[0001] The benefit of priority is claimed herein based on U.S. Provisional Appl. No. 60/341,755 filed Dec. 21, 2001.
STATEMENT REGARDING FEDERALLY SPONSORED RESEARCH
[0002] This invention was made with Government support under Interagency Agreement No. 2302-Q326-A1 with the Office of Naval Research. Additional support has been provided under Contract No. DE-AC05-00OR22725 awarded to UT-Battelle, LLC, by the U.S. Department of Energy. The Government has certain rights in this invention.
Provisional Applications (1)
| Number | Date | Country |
| --- | --- | --- |
| 60341755 | Dec 2001 | US |