The subject regarded as the invention is particularly pointed out and distinctly claimed in the claims at the conclusion of the specification. The foregoing and other objects, features, and advantages of the invention are apparent from the following detailed description taken in conjunction with the accompanying drawing in which:
The detailed description explains an exemplary embodiment of the invention, together with advantages and features, by way of example with reference to the drawings.
Referring to
A network 40 interconnects a client node 30 and the server node 20. The client node 30 includes an electronic receptacle 32 and a dynamic interest profile member (DIP) 34. The client node 30 is configured to receive the transmitted electronic mail via the electronic receptacle 32. The DIP 34 changes as the user's interests evolve over a period of time. In order to reflect these changes during text indexing, two tasks must be performed: (i) adding new documents to reflect recent interests, and (ii) removing documents to reflect topics the user is no longer interested in.
The DIP 34 is a mechanism for ranking the importance of a piece of electronic mail. The ranking is generated based on the content of the electronic mail, as well as the sender of the electronic mail. The DIP 34 keeps track of the topics most important to the user by extracting keywords the user has shown interest in. Through a user interface at the client node 30, the user may define (i) the identity of the senders that increase the DIP ranking and (ii) the keywords listed in the contents of the electronic mail that increase the DIP ranking. The user may also assign varying weights to the senders and keywords so that the effect of each sender or keyword on the DIP ranking can vary. For example, an email from a high level manager may contribute more to the DIP ranking than an email from a friend. As interests change, the user may periodically adjust these weights, including eliminating senders or keywords so that they no longer add to the DIP ranking.
The DIP 34 is configured to assign a DIP ranking to each piece of received electronic mail. The ranking is predicated upon at least one of (i) the identity of the sender, and (ii) the keywords listed in the contents of the electronic mail. As noted above, the user may associate emails from a particular sender, or emails containing certain keywords, with a higher DIP ranking.
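By way of illustration only, the weighted ranking described above may be sketched as follows. The class name `InterestProfile`, the additive scoring rule, and the example addresses and weights are assumptions made for this sketch; the description does not specify how sender and keyword weights are combined.

```python
import re
from dataclasses import dataclass, field


@dataclass
class InterestProfile:
    """Hypothetical stand-in for the DIP 34: user-assigned weights."""
    sender_weights: dict = field(default_factory=dict)   # sender address -> weight
    keyword_weights: dict = field(default_factory=dict)  # lowercase keyword -> weight

    def rank(self, sender: str, body: str) -> float:
        # Assumption: the ranking is the sender weight plus the weight of
        # every profile keyword that appears at least once in the body.
        score = self.sender_weights.get(sender.lower(), 0.0)
        words = set(re.findall(r"[a-z0-9]+", body.lower()))
        for keyword, weight in self.keyword_weights.items():
            if keyword in words:
                score += weight
        return score


profile = InterestProfile(
    sender_weights={"manager@example.com": 5.0, "friend@example.com": 1.0},
    keyword_weights={"budget": 2.0},
)
manager_rank = profile.rank("manager@example.com", "Budget review on Friday")
friend_rank = profile.rank("friend@example.com", "Lunch?")
```

In this sketch the email from the manager ranks higher than the email from the friend, and setting a weight to zero (or deleting the entry) stops that sender or keyword from contributing to the ranking.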
The DIP ranking of the electronic mail is compared to a DIP threshold. The electronic mail is added to a full text index 36 located in the client node 30 when the DIP ranking exceeds the DIP threshold. The full text index 36 is operably associated with the electronic receptacle 32 and the DIP 34. The DIP 34 may also be utilized by the user to index a web document, a web page, a document library or a personal search index. The same principle applies: only the important web documents, web pages, library documents and personal search index documents are indexed. If applied to the web, the DIP 34 may rely only on keywords, or may replace the sender with the source web site. The optimization based on DIP rankings is not restricted to mail, and may be applied to desktop indexing applications, which also index documents and web pages.
The DIP 34 is configured to change as the user's interests evolve over a period of time, these changes being reflected in text indexing by at least one of (i) adding new documents to reflect recent interests, and (ii) removing documents to reflect topics the user is no longer interested in. Furthermore, the DIP 34 is configured to enhance recall type searches rather than research type searches.
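One possible way to reflect an evolved profile in the index is sketched below. The description does not state how documents are selected for removal; re-ranking every indexed document against the updated profile, and the names `refresh_index`, `indexed` and `rank`, are assumptions made for this sketch.

```python
def refresh_index(indexed, rank, threshold):
    """Re-rank indexed documents and drop those no longer above the threshold.

    indexed   -- dict mapping document id to a (sender, body) tuple
    rank      -- callable (sender, body) -> ranking under the current profile
    threshold -- ranking a document must exceed to stay indexed
    Returns (kept, removed): the surviving documents and the removed ids.
    """
    kept = {
        doc_id: doc
        for doc_id, doc in indexed.items()
        if rank(*doc) > threshold
    }
    removed = sorted(set(indexed) - set(kept))
    return kept, removed
```

Newly received documents would be added through the ordinary threshold comparison, so only the removal task needs a separate pass of this kind.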
An indicator 50 may be included in the client node 30 for indicating that the document has been automatically added to the full text index 36. Furthermore, the client node 30 allows the user to manually add the electronic mail, the web document and the web page to the full text index 36 despite the DIP 34 ranking of the electronic mail, the web document or the web page.
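The threshold comparison, the indicator 50 and the manual override described above may be sketched as follows. The minimal inverted index, the function name `index_if_important`, the `manual` flag and the returned strings are assumptions made for this sketch and are not taken from the description.

```python
from collections import defaultdict


class FullTextIndex:
    """Toy inverted index standing in for the full text index 36."""

    def __init__(self):
        self.postings = defaultdict(set)  # token -> ids of documents containing it

    def add(self, doc_id, text):
        for token in text.lower().split():
            self.postings[token].add(doc_id)

    def search(self, term):
        return set(self.postings.get(term.lower(), set()))


def index_if_important(index, doc_id, text, dip_ranking, dip_threshold, manual=False):
    """Add the document when its ranking exceeds the threshold.

    Returns "auto" when the document was added automatically (the value an
    indicator could display), "manual" when the user forced the addition
    despite the ranking, and None when the document was not indexed.
    """
    if dip_ranking > dip_threshold:
        index.add(doc_id, text)
        return "auto"
    if manual:
        index.add(doc_id, text)
        return "manual"
    return None
```

A document whose ranking equals the threshold is not added automatically in this sketch, since the description requires the ranking to exceed the threshold.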
In order for the electronic mail to be added to the full text index 36, the original markup of the transmitted document must be converted to a plain text format, and the document must undergo language detection. Afterwards, the plain text format of the electronic mail is tokenized prior to being indexed in the full text index 36.
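The preparation steps above (markup conversion, language detection, tokenization) may be sketched as follows. The sketch assumes HTML markup and uses a deliberately crude stopword count as a placeholder for language detection; the function names, the stopword lists and the regular-expression tokenizer are assumptions, not part of the description.

```python
import re
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Collects the character data between tags (tags themselves are dropped)."""

    def __init__(self):
        super().__init__()
        self.parts = []

    def handle_data(self, data):
        self.parts.append(data)


def to_plain_text(markup):
    parser = _TextExtractor()
    parser.feed(markup)
    parser.close()  # flush any buffered text
    return " ".join(" ".join(parser.parts).split())  # normalize whitespace


# Placeholder detector: counts a few common stopwords per language.
STOPWORDS = {
    "en": {"the", "and", "is", "of"},
    "de": {"der", "und", "ist", "das"},
}


def detect_language(text):
    words = re.findall(r"\w+", text.lower())
    counts = {lang: sum(w in stop for w in words) for lang, stop in STOPWORDS.items()}
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else "unknown"


def tokenize(text):
    return re.findall(r"\w+", text.lower())


def prepare_for_index(markup):
    """Return (language, tokens) for a piece of marked-up mail."""
    text = to_plain_text(markup)
    return detect_language(text), tokenize(text)
```

Detecting the language before tokenizing would allow a language-specific tokenizer to be selected; this sketch uses the same tokenizer for every language.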
Referring to
Next, at step 120, a DIP ranking is assigned to each piece of electronic mail predicated upon at least one of (i) the identity of the sender, and (ii) the keywords listed in the contents of the electronic mail. Finally, at step 130, the DIP ranking of the electronic mail is compared to a DIP threshold, and the electronic mail is added to a full text index located in the client node when the DIP ranking exceeds the DIP threshold.
Optionally, at step 140, an indication that the indexed document has been automatically added to the full text index may be invoked.
While the preferred embodiment of the invention has been described, it will be understood that those skilled in the art, both now and in the future, may make various improvements and enhancements which fall within the scope of the claims which follow. These claims should be construed to maintain the proper protection for the invention first described.