Claims
- 1. A computer-implemented citation indexing system comprising:
means for locating and acquiring publications in electronic format; document parser for extracting semantic features, including citations, from acquired publications; and means for identifying citations to the same publication.
- 2. A computer-implemented citation indexing system as set forth in claim 1, where said means for locating and acquiring publications comprises means selected from the group consisting of web search engines, crawling robot, usenet newsgroups, and internet or intranet mailing lists.
- 3. A computer-implemented citation indexing system as set forth in claim 1, where said means for locating and acquiring publications comprises crawler means.
- 4. A computer-implemented citation indexing system as set forth in claim 1, where said means for identifying citations to the same publication normalizes citations, sorts citations by length and processes citations in order beginning with the longest length citation, computes distance measures to all previously identified groups of citations for each citation, and either adds the citation to an existing group or creates a new group, where said distance measure comprises word matching.
- 5. A computer-implemented citation indexing system as set forth in claim 1, where said means for identifying citations to the same publication normalizes citations, sorts citations by length and processes citations in order beginning with the longest length citation, computes distance measures to all previously identified groups of citations for each citation, and either adds the citation to an existing group or creates a new group, where said distance measure comprises word and phrase matching.
- 6. A computer-implemented citation indexing system as set forth in claim 1, where said means for identifying citations to the same publication comprises applying a subfield detection algorithm.
- 7. A computer-implemented citation indexing system as set forth in claim 1, where said means for identifying citations to the same publication comprises using learning techniques and known bibliographic information.
- 8. A computer-implemented citation indexing system as set forth in claim 1, where said publications in electronic format are printed publications converted to electronic form using optical character recognition.
- 9. A computer-implemented citation indexing system as set forth in claim 1, where said publications in electronic format are publications on the world wide web.
- 10. A computer-implemented citation indexing system as set forth in claim 9 further comprising query processing means for searching the world wide web.
- 11. A computer-implemented citation indexing system as set forth in claim 9, where said means for locating and acquiring publications comprises web search engines.
- 12. A computer-implemented citation indexing system as set forth in claim 9, where said means for locating and acquiring publications comprises crawler means.
- 13. A computer-implemented citation indexing system as set forth in claim 9, where said means for identifying citations to the same publication normalizes citations, sorts citations by length and processes citations in order beginning with the longest length citation, computes distance measures to all previously identified groups of citations for each citation, and either adds the citation to an existing group or creates a new group, where said distance measure comprises word matching.
- 14. A computer-implemented citation indexing system as set forth in claim 9, where said means for identifying citations to the same publication normalizes citations, sorts citations by length and processes citations in order beginning with the longest length citation, computes distance measures to all previously identified groups of citations for each citation, and either adds the citation to an existing group or creates a new group, where said distance measure comprises word and phrase matching.
- 15. A computer-implemented citation indexing system as set forth in claim 9, where said means for identifying citations to the same publication comprises applying a subfield detection algorithm.
- 16. A computer-implemented citation indexing system as set forth in claim 9, where said means for identifying citations to the same publication comprises using learning techniques and known bibliographic information.
- 17. A computer-implemented citation indexing system for providing context of a citation in a publication comprising:
means for locating and acquiring publications in electronic format; and document parser for extracting semantic features, including citations, from acquired publications; where said parser provides the citations and text surrounding the citations in the publication.
- 18. A computer-implemented citation indexing system as set forth in claim 17, where said means for locating and acquiring publications comprises means selected from the group consisting of web search engines, crawling robot, usenet newsgroups, and internet or intranet mailing lists.
- 19. A computer-implemented citation indexing system as set forth in claim 17, where said means for locating and acquiring publications comprises crawler means.
- 20. A computer-implemented citation indexing system as set forth in claim 17, where said publications in electronic format are printed publications converted to electronic form using optical character recognition.
- 21. A computer-implemented citation indexing system as set forth in claim 17, where said publications in electronic format are publications on the world wide web.
- 22. A computer-implemented citation indexing system as set forth in claim 21, where said means for locating and acquiring publications comprises web search engines.
- 23. A computer-implemented citation indexing system as set forth in claim 21, where said means for locating and acquiring publications comprises crawler means.
- 24. A computer-implemented citation indexing system comprising:
query processing means for searching for a keyword in electronic publications; means for locating and acquiring publications having the keyword; and document parser for extracting citations from acquired publications.
- 25. A computer-implemented citation indexing system as set forth in claim 24, where said document parser provides a citation including at least a portion of the text containing the citation.
- 26. A computer-implemented citation indexing system as set forth in claim 24, further comprising acquiring publications having at least one of similar keywords or similar extracted citations.
- 27. A computer-implemented citation indexing system as set forth in claim 26, where said similar publications are determined from a CCIDF algorithm.
- 28. A computer-implemented citation indexing system as set forth in claim 24, where said means for locating and acquiring publications comprises crawler means.
- 29. A computer-implemented citation indexing system as set forth in claim 24, where said electronic publications are on the world wide web.
- 30. A computer-implemented citation indexing system as set forth in claim 29, where said document parser provides a citation including at least a portion of the text containing the citation.
- 31. A computer-implemented citation indexing system as set forth in claim 29, further comprising acquiring publications having at least one of similar keywords or similar extracted citations.
- 32. A computer-implemented citation indexing system as set forth in claim 31, where said similar publications are determined from a CCIDF algorithm.
- 33. A computer-implemented citation indexing system as set forth in claim 29, where said means for locating and acquiring publications comprises crawler means.
- 34. A method of computer-implemented citation indexing for identifying different forms of citations to the same publication comprising the steps of:
searching for desired publications in electronic format; locating and acquiring the desired publications; parsing the acquired publications for extracting and storing semantic features, including citations, from the acquired publications; and identifying citations to the same publication.
- 35. A method of computer-implemented citation indexing as set forth in claim 34, said identifying citations to the same publication comprising normalizing citations, sorting citations by length and processing citations in order beginning with the longest length citation, computing for each citation, distance measures to all previously identified groups of citations, and either adding the citation to an existing group or creating a new group, where said computing distance measures comprises word matching.
- 36. A method of computer-implemented citation indexing as set forth in claim 34, said identifying citations to the same publication comprising normalizing citations, sorting citations by length and processing citations in order beginning with the longest length, computing for each citation, distance measures to all previously identified groups of citations, and either adding the citation to an existing group or creating a new group, where said computing distance measures comprises word and phrase matching.
- 37. A computer-implemented citation indexing system as set forth in claim 34, where said identifying citations to the same publication comprises applying a subfield detection algorithm.
- 38. A computer-implemented citation indexing system as set forth in claim 34, where said means for identifying citations to the same publication comprises using learning techniques and known bibliographic information.
- 39. A method of computer-implemented citation indexing as set forth in claim 34, where said means for locating and acquiring publications comprises crawler means.
- 40. A method of computer-implemented citation indexing as set forth in claim 34, where said publications in electronic format are publications on the world wide web.
- 41. A method of computer-implemented citation indexing as set forth in claim 40, said identifying citations to the same publication comprising normalizing citations, sorting citations by length and processing citations in order beginning with the longest length citation, computing for each citation, distance measures to all previously identified groups of citations, and either adding the citation to an existing group or creating a new group, where said computing distance measures comprises word matching.
- 42. A method of computer-implemented citation indexing as set forth in claim 40, said identifying citations to the same publication comprising normalizing citations, sorting citations by length and processing citations in order beginning with the longest length, computing for each citation, distance measures to all previously identified groups of citations, and either adding the citation to an existing group or creating a new group, where said computing distance measures comprises word and phrase matching.
- 43. A computer-implemented citation indexing system as set forth in claim 40, where said identifying citations to the same publication comprises applying a subfield detection algorithm.
- 44. A computer-implemented citation indexing system as set forth in claim 40, where said means for identifying citations to the same publication comprises using learning techniques and known bibliographic information.
- 45. A method of computer-implemented citation indexing as set forth in claim 40, where said locating and acquiring publications comprises using crawler means.
- 46. A computer-implemented citation indexing system comprising:
query processor; web browser interface for interfacing with said query processor; crawler for locating and acquiring publications on the world wide web; text extracting means for converting the acquired publications into text documents; text document database means for storing text documents to be searched by said query processor; document parsing and database creation means for extracting semantic features, including citations, from text documents; database means for storing the extracted features, and means for identifying citations to the same publication.
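The grouping procedure recited in claims 4, 13, 35 and 41 (normalize the citations, sort them by length, process them longest first, compute a word-matching distance to every previously identified group, then either join a group or start a new one) can be illustrated by the following minimal sketch. It is an illustration only, not the claimed implementation: the normalization rules, the particular distance formula (fraction of a citation's words that are absent from a group), the `threshold` value, and all function names are assumptions introduced here, and the sample citations are invented.

```python
import re


def normalize(citation: str) -> str:
    """Assumed normalization: lowercase, replace punctuation with spaces, collapse whitespace."""
    text = re.sub(r"[^a-z0-9\s]", " ", citation.lower())
    return " ".join(text.split())


def word_distance(citation: str, group: list[str]) -> float:
    """Assumed word-matching distance: fraction of the citation's words not found in the group."""
    words = set(citation.split())
    if not words:
        return 1.0
    group_words = set()
    for member in group:
        group_words.update(member.split())
    return 1.0 - len(words & group_words) / len(words)


def group_citations(citations: list[str], threshold: float = 0.34) -> list[list[str]]:
    """Greedy grouping: longest citations first, join the nearest group or create a new one."""
    ordered = sorted((normalize(c) for c in citations), key=len, reverse=True)
    groups: list[list[str]] = []
    for cit in ordered:
        # Distance from this citation to every previously identified group.
        distances = [word_distance(cit, g) for g in groups]
        if distances and min(distances) <= threshold:
            groups[distances.index(min(distances))].append(cit)
        else:
            groups.append([cit])
    return groups


# Invented sample citations: two forms of one reference and one unrelated reference.
sample = [
    "Smith, J. and Doe, A. (1995). Pruning neural networks. Journal of Examples, 12(3), 45-67.",
    "J. Smith, A. Doe. Pruning neural networks. J. of Examples, 1995.",
    "Brown, K. (1990). Indexing scientific literature. Example Press.",
]
groups = group_citations(sample)
```

Processing the longest citation first means that the most complete form of a reference tends to seed each group, so shorter variants are compared against a richer set of words. The word and phrase matching variant of claims 5, 14, 36 and 42 would add a phrase-overlap term to the distance; that term is not shown here.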
CROSS-REFERENCE TO RELATED APPLICATIONS
[0001] This is a conversion of provisional application Ser. No. 60/070,489, filed Jan. 5, 1998.
Provisional Applications (1)
| Number | Date | Country |
| 60070489 | Jan 1998 | US |
Continuations (1)
| | Number | Date | Country |
| Parent | 09082071 | May 1998 | US |
| Child | 09859031 | May 2001 | US |