Claims
- 1. A method for processing information, comprising:
receiving a segmented judgment matrix, the segmented judgment matrix being a numerical matrix pairing each of a set of terms to each of a set of classifications, each term being a word or phrase, the segmented judgment matrix having a plurality of information submatrices, each element of each information submatrix representing a rating of a relevance of the term of the element to the classification of the element, each information submatrix being a numerical matrix representing the relevance of each of a subset of the set of terms to each of a subset of the set of classifications; and using the segmented judgment matrix to calculate an information spectrum.
- 2. The method of claim 1, wherein at least some of the elements of the information submatrices represent ratings of relevance made by a human being.
- 3. The method of claim 1 wherein the segmented judgment matrix has rows and columns and each column of the segmented judgment matrix represents a classification and each row of the segmented judgment matrix represents a term.
- 4. The method of claim 1, further comprising:
receiving a search request; using the segmented judgment matrix to calculate an information spectrum of the search request; using the segmented judgment matrix to calculate an information spectrum for each of a plurality of documents; and identifying at least some documents of the plurality of documents as relevant to the search request based upon a comparison of the calculated information spectrums.
- 5. The method of claim 4 wherein:
each information submatrix has a plurality of classifications and a plurality of terms relevant to each classification; and using the segmented judgment matrix to calculate an information spectrum for each of a plurality of documents comprises calculating an information spectrum for each of the plurality of documents based upon at least some of the plurality of terms; the method further comprising: selecting the plurality of terms based upon a relevance of each term of the plurality of terms to at least some of the classifications of the information submatrices.
- 6. The method of claim 4 wherein the step of calculating an information spectrum for each document and for the search request further comprises determining a log average among the ratings of relevance of the terms for each classification.
- 7. The method of claim 4 wherein the step of identifying at least some documents further comprises determining a distance between the information spectrum of the at least some documents and the information spectrum of the search request.
- 8. The method of claim 4 further comprising:
selecting a document of the identified documents as definitely relevant to the search request including calculating an information spectrum of the selected document; and using the calculated information spectrum of the selected document as a new search request.
- 9. The method of claim 4 further comprising:
zooming in on a portion of a document information spectrum; and determining that a document and request have a wide spectrum with significant content in a field F of a term and measuring the request and document using a subengine for field F.
- 10. A computer program product comprising instructions operable to cause data processing apparatus to:
receive a segmented judgment matrix, the segmented judgment matrix being a numerical matrix pairing each of a set of terms to each of a set of classifications, each term being a word or phrase, the segmented judgment matrix having a plurality of information submatrices, each element of each information submatrix representing a rating of a relevance of the term of the element to the classification of the element, each information submatrix being a numerical matrix representing the relevance of each of a subset of the set of terms to each of a subset of the set of classifications; and use the segmented judgment matrix to calculate an information spectrum.
- 11. The product of claim 10 wherein at least some of the elements of the information submatrices represent ratings of relevance made by a human being.
- 12. The product of claim 10 wherein the segmented judgment matrix has rows and columns and each column of the segmented judgment matrix represents a classification and each row of the segmented judgment matrix represents a term.
- 13. The product of claim 10 further comprising instructions to:
receive a search request; use the segmented judgment matrix to calculate an information spectrum of the search request; use the segmented judgment matrix to calculate an information spectrum for each of a plurality of documents; and identify at least some documents of the plurality of documents as relevant to the search request based upon a comparison of the calculated information spectrums.
- 14. The product of claim 13 wherein:
each information submatrix has a plurality of classifications and a plurality of terms relevant to each classification; and the instructions to use the segmented judgment matrix to calculate an information spectrum for each of a plurality of documents comprise instructions to calculate an information spectrum for each of the plurality of documents based upon at least some of the plurality of terms; the product further comprising instructions to: select the plurality of terms based upon a relevance of each term of the plurality of terms to at least some of the classifications of the information submatrices.
- 15. The product of claim 13 wherein the instructions to calculate an information spectrum for each document and for the search request further comprise instructions to determine a log average among the ratings of relevance of the terms for each classification.
- 16. The product of claim 13 wherein the instructions to identify at least some documents further comprise instructions to determine a distance between the information spectrum of the at least some documents and the information spectrum of the search request.
- 17. The product of claim 13 further comprising instructions to:
select a document of the identified documents as definitely relevant to the search request including instructions to calculate an information spectrum of the selected document; and use the calculated information spectrum of the selected document as a new search request.
- 18. The product of claim 13 further comprising instructions to:
zoom in on a portion of a document information spectrum; and determine that a document and request have a wide spectrum with significant content in a field F of a term and measure the request and document using a subengine for field F.
- 19. A computer program product for processing text information, the product comprising instructions operable to cause data processing apparatus to perform the operations of:
receiving a judgment matrix that is segmented into a plurality of information submatrices where each submatrix has a plurality of classifications and a plurality of terms relevant to each classification; evaluating a relevance of each term of the plurality of terms with respect to each classification of each information submatrix of the information submatrices; calculating an information spectrum for each of a plurality of documents based upon at least some of the plurality of terms; receiving a search request; calculating an information spectrum of the search request based upon at least some of the plurality of terms; and identifying at least some documents of the plurality of documents as relevant to the request based upon a comparison of the calculated information spectrums.
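The steps recited in claims 1, 4, 6, 7 and 8 can be illustrated with a minimal Python sketch. It is not the claimed implementation. The terms, classifications, ratings, documents and function names are all hypothetical. The claims do not define the "log average" or the distance measure, so the sketch assumes the mean of log(1 + rating) over matched terms and a Euclidean distance, and it matches single-word terms only.

```python
import math

# Hypothetical segmented judgment matrix: a list of information submatrices,
# each mapping term -> {classification: rating}. Every term, classification,
# and rating below is invented for illustration.
SUBMATRICES = [
    {"heart": {"cardiology": 9, "oncology": 1},
     "tumor": {"cardiology": 1, "oncology": 9}},
    {"compiler": {"software": 9, "hardware": 2},
     "transistor": {"software": 1, "hardware": 9}},
]


def information_spectrum(text, submatrices):
    """Return {classification: score} for a text.

    Assumption: the "log average" is taken as the mean of log(1 + rating)
    over every occurrence of a rated term; classifications with no matched
    term score 0. Only single-word terms are matched in this sketch.
    """
    words = text.lower().split()
    logs = {}
    for sub in submatrices:
        for term, ratings in sub.items():
            count = words.count(term)
            for c, rating in ratings.items():
                logs.setdefault(c, []).extend([math.log1p(rating)] * count)
    return {c: (sum(v) / len(v) if v else 0.0) for c, v in logs.items()}


def distance(a, b):
    """Euclidean distance between two spectra (an assumed choice of metric)."""
    return math.sqrt(sum((a[c] - b.get(c, 0.0)) ** 2 for c in a))


def rank(query_spectrum, doc_spectra, top_k=1):
    """Return the names of the top_k documents closest to the query spectrum."""
    ordered = sorted(doc_spectra,
                     key=lambda name: distance(query_spectrum, doc_spectra[name]))
    return ordered[:top_k]


# Hypothetical document collection.
DOCUMENTS = {
    "doc_a": "the heart pumps blood",
    "doc_b": "the compiler emits code",
    "doc_c": "a transistor switches current",
}
DOC_SPECTRA = {name: information_spectrum(text, SUBMATRICES)
               for name, text in DOCUMENTS.items()}

# Claim 4: compare the request spectrum with each document spectrum.
request = information_spectrum("heart surgery", SUBMATRICES)
hits = rank(request, DOC_SPECTRA)

# Claim 8: reuse the spectrum of a selected document as a new search request.
more = rank(DOC_SPECTRA[hits[0]], DOC_SPECTRA, top_k=2)
```

Because `rank` accepts a spectrum instead of raw text, the feedback step of claim 8 needs no extra code path: the spectrum of a selected document is passed back in as the query.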
CROSS REFERENCE TO RELATED APPLICATIONS
[0001] This application is a continuation of U.S. application Ser. No. 10/315,059, filed Dec. 10, 2002, which is a continuation of U.S. application Ser. No. 09/305,583, filed May 5, 1999, both titled WIDE-SPECTRUM INFORMATION SEARCH ENGINE, both of which are hereby incorporated by reference in their entirety for all purposes.
Continuations (2)
| | Number | Date | Country |
| --- | --- | --- | --- |
| Parent | 10315059 | Dec 2002 | US |
| Child | 10800217 | Mar 2004 | US |
| Parent | 09305583 | May 1999 | US |
| Child | 10315059 | Dec 2002 | US |