Claims
- 1. A method of determining interactive topic-based text summaries comprising the steps of:
determining a text to summarize; determining at least one text segment of the determined text; determining at least one sentence constituent based summary type; determining at least one discrete interactive topic-based text summary for a determined segment based on the determined sentence constituent based summary type and a statistical measure.
- 2. The method of claim 1, wherein the statistical measure is a measure of determined segment characterization and differentiation from at least one other segment.
- 3. The method of claim 2, wherein the at least one other segment is at least one of the n-preceding and m-following segments and where n+m>0.
- 4. The method of claim 2, wherein the sentence constituent based summary type is at least one of a keyword, a key-phrase, an n-gram, a sentence and a noun phrase.
- 5. The method of claim 2, wherein the measure of text region characterization and other region differentiation is at least one of mutual information, log odds ratio, χ², G² distributions.
- 6. The method of claim 1, wherein the sentence constituent based summary type is determined based on at least one of a device characteristic, a user preference and a user selection.
- 7. The method of claim 5, further comprising the steps of:
dynamically determining at least one text segment to summarize based on dynamic user selections; dynamically displaying at least one discrete interactive topic-based text summary based on the at least one dynamically determined text segment and wherein a human sensible display characteristic indicates omitted text.
- 8. The method of claim 7, wherein the human sensible display characteristic is at least one of a visual, auditory, tactile, olfactory and taste characteristic.
- 9. The method of claim 8, wherein the at least one visual characteristic is at least one of a text font, color, italics, bolding and placement.
- 10. The method of claim 1, wherein the statistical measure is at least one of four components of mutual information based on presence in a segment and presence in other segments.
- 11. The method of claim 1, wherein the statistical measure is at least two of four components of mutual information based on presence in a segment and presence in other segments.
- 12. The method of claim 1, further comprising the step of normalizing key-phrase probabilities based on key-phrase length.
- 13. A system for determining interactive topic-based text summaries comprising:
an input/output circuit for receiving a selected sentence constituent based summary type and a selected text; a processor that retrieves the selected text from the input/output circuit, segments the text and determines a text segment to summarize; a probability and latent class determining circuit for determining distributions of the selected sentence constituent type in the segmented text and in at least one text segment; a statistical measure determining circuit for determining a statistical measure of at least one text segment; and wherein the processor determines a discrete interactive topic-based summary for at least one text segment based on the received sentence constituent based summary type, the determined distributions and a statistical measure of the segment.
- 14. The system of claim 13, wherein the statistical measure is a measure of text segment characterization and other segment differentiation.
- 15. The system of claim 14, wherein other segments are at least one of n-preceding text segments of a determined text segment and m-following text segments and where n+m>0.
- 16. The system of claim 14, wherein the sentence constituent based summary type is at least one of a keyword, a key-phrase, an n-gram, a sentence and a noun phrase.
- 17. The system of claim 14, wherein the measure of text segment characterization and other segment differentiation is at least one of mutual information, log odds ratio, χ², G² distributions.
- 18. The system of claim 13, wherein the sentence constituent based summary type is determined based on at least one of a device characteristic, a user preference and a user selection.
- 19. The system of claim 17, further comprising:
a display circuit that displays at least one discrete interactive topic-based text summary based on a dynamic selection of the selected text segment and using a human sensible display characteristic to indicate omitted text.
- 20. The system of claim 19, wherein the human sensible display characteristic is at least one of a visual, auditory, tactile, olfactory and taste characteristic.
- 21. The system of claim 20, wherein the visual characteristic is at least one of text font, text color, text italics, text bolding and placement.
- 22. The system of claim 13, wherein the statistical measure is at least one of four components of mutual information based on presence in a segment and presence in other segments.
- 23. The system of claim 13, wherein the statistical measure is at least two of four components of mutual information based on presence in a segment and presence in other segments.
- 24. The system of claim 13, further comprising the step of: normalizing key-phrase probabilities based on key-phrase length.
- 25. Computer readable storage medium comprising: computer readable program code embodied on the computer readable storage medium, the computer readable program code usable to program a computer to perform a method of determining interactive topic-based text summaries comprising the steps of:
determining a text to summarize; determining at least one text segment of the determined text; determining at least one sentence constituent based summary type; determining at least one discrete interactive topic-based text summary based on the determined sentence constituent based summary type and a statistical measure.
- 26. A carrier wave encoded to transmit a control program, useable for determining an interactive topic-based text summary, to a device for executing the program, the control program comprising:
instructions for determining a text to summarize; instructions for determining at least one text segment of the determined text; instructions for determining at least one sentence constituent based summary type; instructions for determining at least one discrete interactive topic-based text summary based on the determined sentence constituent based summary type and a statistical measure.
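By way of illustration only, and not as a limitation of the claims, the following Python sketch shows one way the statistical measure recited above might be computed for keyword summaries. It assumes that the "four components of mutual information" correspond to the four cells of a 2x2 contingency table (term present or absent, target segment or other segments), and that a keyword summary keeps the terms with the largest (term present, target segment) component. The function names `mi_components`, `summarize_segment` and `length_normalized` are hypothetical, and the geometric-mean form of key-phrase length normalization is likewise an assumption rather than a method stated in the claims.

```python
import math
from collections import Counter


def mi_components(a, b, c, d):
    """Four additive terms of mutual information for a 2x2 table (assumed layout).

    a: count of the term inside the target segment
    b: count of the term in the other segments
    c: count of all other terms inside the target segment
    d: count of all other terms in the other segments
    """
    n = a + b + c + d
    # Each cell maps to (joint count, term-marginal count, segment-marginal count).
    cells = {
        ("term", "segment"): (a, a + b, a + c),
        ("term", "other"): (b, a + b, b + d),
        ("no_term", "segment"): (c, c + d, a + c),
        ("no_term", "other"): (d, c + d, b + d),
    }
    components = {}
    for key, (joint, row, col) in cells.items():
        if joint == 0:
            # An empty cell contributes nothing (limit of p * log p as p -> 0).
            components[key] = 0.0
        else:
            components[key] = (joint / n) * math.log2(joint * n / (row * col))
    return components


def summarize_segment(segments, index, top_k=3):
    """Return up to top_k keywords that characterize segments[index].

    segments is a list of token lists. Terms are ranked by the
    (term present, target segment) component only, which is an assumption
    made for this sketch.
    """
    target = Counter(segments[index])
    others = Counter()
    for i, seg in enumerate(segments):
        if i != index:
            others.update(seg)
    n_target = sum(target.values())
    n_others = sum(others.values())
    scores = {}
    for term, a in target.items():
        b = others[term]
        comps = mi_components(a, b, n_target - a, n_others - b)
        scores[term] = comps[("term", "segment")]
    # Highest score first; ties broken alphabetically for a stable result.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_k]


def length_normalized(probability, length):
    """Geometric-mean normalization of a key-phrase probability (assumed form)."""
    return probability ** (1.0 / length)


if __name__ == "__main__":
    segments = [
        ["cat", "cat", "dog", "the"],
        ["stock", "market", "the", "the"],
        ["stock", "price", "the", "dog"],
    ]
    print(summarize_segment(segments, 0, top_k=2))  # ['cat', 'dog']
```

In this sketch a term that is frequent in the target segment but rare elsewhere receives a large positive score, while a term spread evenly across segments, such as "the" in the sample data, scores at or below zero and is omitted from the summary. Summing all four components recovers the ordinary mutual information between term presence and segment membership.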
INCORPORATION BY REFERENCE
[0001] This Application incorporates by reference:
[0002] Attorney Docket No. D/A1708, entitled “SYSTEMS AND METHODS FOR DETERMINING THE TOPIC STRUCTURE OF A PORTION OF TEXT” by I. Tsochantaridis et al., filed Mar. 22, 2002 as U.S. patent application Ser. No. 10/103,053;
[0003] Attorney Docket No. D/A2523Q1, entitled “SYSTEMS AND METHODS FOR DISPLAYING INTERACTIVE TOPIC BASED TEXT SUMMARIES” by F. Chen et al., filed Dec. 16, 2002, as U.S. patent application Ser. No. XX/XXX,XXX;
[0004] Attorney Docket No. D/A2523Q, entitled “SYSTEMS AND METHODS FOR SENTENCE BASED INTERACTIVE TOPIC BASED TEXT SUMMARIZATION” by F. Chen et al., filed Dec. 16, 2002, as U.S. patent application Ser. No. XX/XXX,XXX; each in its entirety.