Claims
- 1) A method of forming an overview for hierarchically related information, where the information can be represented as a set of nodes wherein each node is associated with a portion of the information, comprising:
a) forming a set of lexically central nodes and a remainder set, b) forming a set of auxiliary nodes from the remainder set, c) combining the set of lexically central nodes and the set of auxiliary nodes to form an extraction node set, d) selecting an overview string from the information associated with each node in the extraction node set, and e) combining the overview strings to form an overview for the information.
- 2) The method of claim 1 wherein determining the lexically central nodes comprises:
a) determining a word vector for each node, b) determining a lexical centroid of the word vectors, and c) selecting a given number of nodes having word vectors closest to the lexical centroid.
- 3) The method of claim 1 wherein the step of forming a set of auxiliary nodes from the remainder set comprises selecting nodes which are parents of central nodes.
- 4) The method of claim 1 wherein the step of forming a set of auxiliary nodes from the remainder set comprises selecting nodes which are parents of more than a specified number of nodes.
- 5) The method of claim 1 wherein the step of forming a set of auxiliary nodes from the remainder set comprises selecting nodes which are parents of central nodes and selecting nodes which are parents of more than a specified number of nodes.
- 6) The method of claim 1 wherein selecting an overview string from the information associated with each node in the extraction node set comprises:
a) determining whether the node is a root node, and b) responsive to the determination in step a, selecting an initial string from the information.
- 7) The method of claim 1 wherein selecting an overview string from the information associated with each node in the extraction node set comprises:
a) determining whether the node is in the set of lexically central nodes and whether the information associated with the node comprises one or more quoting portions, and b) responsive to the determination in step a, selecting at least one string from the information following a quoting portion.
- 8) The method of claim 7 wherein an initial string is selected from the information.
- 9) The method of claim 1 wherein selecting an overview string from the information associated with each node in the extraction node set comprises:
a) determining whether the information associated with the node comprises one or more quoted portions, and b) responsive to the determination in step a, selecting at least one string from the information comprising part of at least one quoted portion.
- 10) The method of claim 1 wherein selecting an overview string from the information associated with each node in the extraction node set comprises:
a) determining whether the node is an auxiliary node and the information associated with the node does not contain any quoted portions, and b) responsive to the determination in step a, selecting at least one string from the final part of the information associated with the node.
- 11) The method of claim 7 wherein selecting at least one string from the information comprises selecting the most lexically central string.
- 12) The method of claim 2 wherein determining the lexically central nodes further comprises determining the lexical centroid of selected groups of adjacent nodes and selecting the node closest to the lexical centroid of each such group.
- 13) The method of claim 2 wherein determining the lexically central nodes further comprises:
a) determining whether the nodes include the root node of a tree representing a stored conversation, and b) responsive to the determination in step a, selecting the root node and a proportion of its children.
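By way of illustration only, the node-selection steps of claims 1(a) through 1(c) and claims 2 through 5 can be sketched as follows. This is a minimal sketch under stated assumptions, not the claimed implementation: the claims do not specify a word-vector representation or a closeness measure, so bag-of-words counts and cosine similarity are assumed here, and all function names, parameter names, and sample messages are hypothetical.

```python
from collections import Counter
import math

def word_vector(text):
    # Assumed representation: lowercase bag-of-words term counts.
    return Counter(text.lower().split())

def centroid(vectors):
    # Component-wise mean of the word vectors (the "lexical centroid").
    total = Counter()
    for v in vectors:
        total.update(v)
    n = len(vectors)
    return {word: count / n for word, count in total.items()}

def cosine(a, b):
    # Assumed closeness measure: cosine similarity between sparse vectors.
    dot = sum(a[w] * b.get(w, 0.0) for w in a)
    norm_a = math.sqrt(sum(x * x for x in a.values()))
    norm_b = math.sqrt(sum(x * x for x in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def central_nodes(texts, k):
    # Claim 2: the k nodes whose word vectors are closest to the centroid.
    vectors = {node: word_vector(t) for node, t in texts.items()}
    c = centroid(list(vectors.values()))
    ranked = sorted(vectors, key=lambda n: cosine(vectors[n], c), reverse=True)
    return set(ranked[:k])

def auxiliary_nodes(parent, central, min_children):
    # Claims 3-5: remainder nodes that are parents of central nodes, or
    # parents of more than min_children nodes.
    child_count = Counter(p for p in parent.values() if p is not None)
    aux = set()
    for node in parent:
        if node in central:
            continue  # auxiliary nodes come from the remainder set only
        parent_of_central = any(parent[c] == node for c in central)
        if parent_of_central or child_count[node] > min_children:
            aux.add(node)
    return aux

def extraction_set(texts, parent, k, min_children):
    # Claim 1(c): union of the central and auxiliary node sets.
    central = central_nodes(texts, k)
    return central | auxiliary_nodes(parent, central, min_children)

# Hypothetical five-message thread: node id -> text, node id -> parent id.
texts = {
    "root": "printer driver install fails",
    "a": "driver install fails on windows",
    "b": "try the new driver install",
    "c": "lunch on friday anyone",
    "d": "yes lunch",
}
parent = {"root": None, "a": "root", "b": "root", "c": "root", "d": "c"}

print(sorted(extraction_set(texts, parent, k=1, min_children=2)))
# prints ['a', 'root']
```

In this example node "a" is the single most central node, and "root" joins the extraction node set both as the parent of a central node and as a parent of more than two nodes. Selecting overview strings from the resulting nodes (claims 6 through 11) is not covered by the sketch.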
CROSS REFERENCE TO RELATED APPLICATIONS
[0001] This patent application is related to:
[0002] U.S. patent application Ser. No. 10/AAA,AAA, titled “A Method and Apparatus for Normalizing Quoting Styles in Electronic Mail”, by Newman, filed concurrently herewith,
[0003] U.S. patent application Ser. No. 10/BBB,BBB, titled “A Method and Apparatus for Clustering Hierarchically Related Information”, by Newman et al., filed concurrently herewith,
[0004] U.S. patent application Ser. No. 10/DDD,DDD, titled “A Method and Apparatus for Generating Summary Information for Hierarchically Related Information”, by Blitzer, filed concurrently herewith,
[0005] U.S. patent application Ser. No. 10/EEE,EEE, titled “Method and Apparatus for Displaying Hierarchical Information”, by Newman, filed concurrently herewith, and
[0006] U.S. patent application Ser. No. 10/FFF,FFF, titled “Method and Apparatus for Segmenting Hierarchical Information for Display Purposes”, by Newman, filed concurrently herewith.
[0007] The following patents and/or patent applications are herein incorporated by reference:
[0008] U.S. patent application Ser. No. 09/732,024, titled “Method and System for Presenting Email Threads as Semi-connected Text by Removing Redundant Material”, by Paula Newman and Michelle Baldonado, filed Dec. 8, 2000.
[0009] U.S. patent application Ser. No. 09/732,029, titled “Method and System for Display of Electronic Mail”, by Paula Newman, filed Dec. 8, 2000.
[0010] U.S. patent application Ser. No. 09/954,388, titled “Method and Apparatus for the Construction and use of Table-like visualizations of Hierarchic Material”, by Paula Newman and Stuart Card, filed Sep. 10, 2001.
[0011] U.S. patent application Ser. No. 09/954,530, titled “Method and Apparatus for the Viewing and Exploration of the Content of Hierarchical Information”, by Paula Newman and Stuart Card, filed Sep. 10, 2001.
[0012] U.S. patent application Ser. No. 09/717,278, titled “Systems and Methods for Performing Sender-Independent Managing of Electronic Messages”, by Michelle Baldonado, Paula Newman, and William Janssen, filed Nov. 22, 2000.
[0013] U.S. patent application Ser. No. 09/732,028, titled “Method and System for presenting semi-linear hierarchy displays”, by Paula Newman, filed Dec. 8, 2000.
[0014] U.S. patent application Ser. No. 09/747,634, titled “System and Method for Browsing Node-Link Structures Based on Estimated Degree of Interest”, filed on Dec. 21, 2000 by Stuart Card.
[0015] U.S. patent application Ser. No. 10/103,053, titled “Systems and Methods for Determining the Topic Structure of a Portion of a Text”, by Ioannis Tsochantaridis, Thorsten Brants, and Francine Chen, filed Mar. 2, 2002.
[0016] U.S. patent application Ser. No. 10/164,587, titled “Authoring Tools, Including Content-Driven Treetables for Fluid Text”, by Polle Zellweger, Paula Newman, and Maribeth Back (D/A2017).