This application relates to searching, in particular searching web-searchable documents.
As the size of the documents posted on the Internet and transmittable via the Internet continues to grow, so does the amount of useful information stored and organized within user files. There are many data collections stored on servers and associated with one or more individuals. Examples of such data collections include online notes (such as Google Notebook), annotated albums of images (such as Flickr), and blogs.
Much of this data is used for collaboration, but access to the data is restricted by rudimentary access control lists. Often, users wish to share this information in a collaborative manner but still want some level of control over its distribution. For example, a user may have an online notebook storing thoughts/opinions with regard to a particular website. The user may be willing to share this information with someone who finds it via a web search but may wish to have discrete control of its dissemination to others.
According to exemplary embodiments, methods for accessing web-searchable documents are provided. According to one embodiment, an Internet search query is received from a user, the query including at least one search term. A document in a search index of documents is analyzed, wherein keywords within the document are assigned group priority ratings. The user's relation rating to an owner of the document is determined, and the search term in the query is compared only to those indexed keywords within the document that have a group priority rating that is less than or equal to the user's relation rating to the owner of the document. An overall document ranking may be determined based on the comparison of the search term to the indexed keywords. The steps of analyzing a document, determining a user's relation rating, comparing the search term, and determining an overall document ranking may be repeated as long as there are documents in the search index. An abstract is constructed including keywords with a group priority rating less than or equal to the user's relation rating and presented to the user. The abstract may include documents with the highest document rankings. A request may be received from the user for a document based on the abstract, either for a private document or a public document. If the request is for a public document, the document is presented to the user. If the request is for a private document, it may be presented to the user if the user has been granted viewing rights. If the user has not been granted viewing rights, the user may be redirected to submit a document request form.
According to another embodiment, a method is provided for controlling document access. Keywords are parsed from a web-searchable document context to create a keyword list. For each keyword in the keyword list, a group priority rating is determined and assigned. For example, high group priority ratings are assigned to keywords that are sensitive or personal in nature, and low group priority ratings are assigned to keywords that are common and not sensitive or personal in nature. The group priority rating is indicative of a group of users that the document owner is willing to share the document with. The keywords with the associated group priority ratings are added to a search index. The group priority ratings control access to the documents in response to search queries from users.
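The indexing method above can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and does not appear in the specification: the numeric scale (0 for keywords anyone may match, higher values for keywords requiring a closer relation to the owner), the owner-supplied `sensitive_terms` mapping, and the helper names `parse_keywords`, `assign_group_priority`, and `index_document`. How ratings are actually determined is not specified; a simple lookup stands in for that step.

```python
import re

DEFAULT_RATING = 0  # assumed rating for common, non-sensitive keywords


def parse_keywords(text):
    """Parse a de-duplicated, lower-cased keyword list from document text."""
    return sorted(set(re.findall(r"[a-z']+", text.lower())))


def assign_group_priority(keyword, sensitive_terms):
    """Return the group priority rating for one keyword.

    Hypothetical rule: the owner lists sensitive or personal words with a
    high rating; every other word receives the low default rating.
    """
    return sensitive_terms.get(keyword, DEFAULT_RATING)


def index_document(doc_id, text, sensitive_terms, search_index):
    """Add each keyword with its group priority rating to the search index."""
    entry = search_index.setdefault(doc_id, {})
    for keyword in parse_keywords(text):
        entry[keyword] = assign_group_priority(keyword, sensitive_terms)
    return search_index


index = index_document(
    "doc1", "Vacation with Kevin and the dog", {"kevin": 3}, {}
)
```

In this sketch the personal name "kevin" is stored with rating 3, while "vacation" and "dog" are stored with rating 0, so the rating travels with each keyword in the index rather than with the document as a whole.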
Additional features and advantages are realized through the techniques of the present invention. Other embodiments and aspects of the invention are described in detail herein and are considered a part of the claimed subject matter. For a better understanding of the invention with advantages and features, refer to the description and to the drawings.
The subject matter which is regarded as the invention is particularly pointed out and distinctly claimed in the claims at the conclusion of the specification. The foregoing and other objects, features, and advantages of the invention are apparent from the following detailed description taken in conjunction with the accompanying drawings in which:
FIGS. 3a and 3b illustrate an exemplary method for search and retrieval according to an exemplary embodiment; and
The detailed description explains exemplary embodiments, together with advantages and features, by way of example with reference to the drawings.
According to an exemplary embodiment, a web-searchable document is analyzed for keywords. The keywords are assigned group priority level rights. Common words within the document (e.g., vacation, dog, etc.) may be assigned low priority group ratings, while less common, more sensitive, and personal words (e.g., a person's name) may be assigned higher priority group ratings. When a user of a search engine performs a search, and a document (webpage) is found in response to the search, that user's relation rating to the document's owner is determined, and the terms in the search query are compared to those keywords within the document that have a priority rating that is less than or equal to the user's relation rating with respect to the document owner. In this way, users have different search capabilities based on their relation to the owner of each document.
According to an exemplary embodiment, keywords are indexed differently than in typical search engines. Keywords are identified and parsed, and a group priority level or rating is determined for each keyword. The group priority level indicates how close a user must be to the document owner in order for that user's query to be compared with each keyword in the search index, i.e., what relation rating the user must have to the document owner in order to be presented with search results based on each keyword. Ideally, this will result in minimizing rejections of document requests in order to maximize the delivery of positive results. Therefore, the closer that user is to the document owner, the more keywords from the document will be available to match the user's search query (i.e., there is less “scrubbing” done by the system).
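The effect of the relation rating on keyword visibility can be sketched as follows. This is an illustrative Python example under stated assumptions, not the claimed implementation: the integer scale, the sample keyword ratings, and the function names `visible_keywords` and `matched_terms` are all hypothetical.

```python
def visible_keywords(indexed_keywords, relation_rating):
    """Keep only keywords whose group priority rating does not exceed
    the searching user's relation rating to the document owner."""
    return {
        keyword
        for keyword, rating in indexed_keywords.items()
        if rating <= relation_rating
    }


def matched_terms(query_terms, indexed_keywords, relation_rating):
    """Compare query terms only against the keywords visible to this user."""
    visible = visible_keywords(indexed_keywords, relation_rating)
    return [term for term in query_terms if term.lower() in visible]


# Assumed sample index entry: keyword -> group priority rating.
document_keywords = {"vacation": 0, "dog": 0, "hawaii": 1, "kevin": 3}

stranger_view = visible_keywords(document_keywords, 0)
friend_view = visible_keywords(document_keywords, 3)
```

With these assumed ratings, a stranger (relation rating 0) can match only "vacation" and "dog", while a user with relation rating 3 can match all four keywords, which is the reduced "scrubbing" the paragraph describes.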
Referring to
When a user performs a web search, that user's relation rating to each private document owner is determined, and the terms in the search query are compared to the keywords from the documents' index that have a priority rating that is less than or equal to the user's relation rating with regard to the document owner. In this way, users have different search capabilities based on their relation to the owner of each document. So, for example, a buddy “Kevin” may be able to find a document owner's Flickr vacation image in a search, whereas a complete stranger may not.
FIGS. 3a and 3b illustrate a search and retrieval process according to an exemplary embodiment. The process begins at step 305 at which a user enters a search query. At step 310, a search engine analyzes a document in a search index. At step 315, the user's relation to the document owner is determined. At step 320, a search engine compares the search terms to the document's indexed keywords, where the keywords have a group priority level less than or equal to the user's relation level. At step 325, the search engine determines an overall page (document) rating. At step 330, a determination is made whether there are more documents in the index. If so, the process returns to step 310. Otherwise, the process continues to step 335 at which the search engine constructs an abstract for documents with the highest document rating. The abstract only includes keywords with a group priority level less than or equal to the user's relation level. At step 340, the results and abstract are presented to the user. From step 340, the process continues to step 345 at which a determination is made whether the user has requested a private document from the abstract results. If so, a determination is made at step 350 whether the user is granted viewing rights in the document's access list. If not, the user is redirected to the document request form submission process (
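Steps 305 through 350 can be sketched end to end in Python. This is a minimal illustration under several assumptions that are not part of the specification: the `Document` class and its fields, the `relations` mapping keyed by (owner, user), a default relation rating of 0 for strangers, a document rating equal to the count of matched query terms, and the return values "present" and "request_form".

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    doc_id: str
    owner: str
    keywords: dict  # keyword -> group priority rating
    is_private: bool = False
    access_list: set = field(default_factory=set)  # users with viewing rights


def relation_rating(user, owner, relations):
    """Step 315: look up the user's relation rating (assumed 0 if unknown)."""
    return relations.get((owner, user), 0)


def search(query, user, index, relations, top_n=3):
    terms = query.lower().split()                       # step 305
    scored = []
    for doc in index:                                   # steps 310 and 330
        level = relation_rating(user, doc.owner, relations)
        visible = {k for k, r in doc.keywords.items() if r <= level}  # step 320
        score = sum(1 for term in terms if term in visible)           # step 325
        if score:
            scored.append((score, doc, sorted(visible)))
    scored.sort(key=lambda item: item[0], reverse=True)  # step 335
    # Step 340: each result carries an abstract built from visible keywords only.
    return [(doc.doc_id, score, abstract) for score, doc, abstract in scored[:top_n]]


def request_document(user, doc):
    """Steps 345 and 350: decide how to handle a document request."""
    if not doc.is_private or user in doc.access_list:
        return "present"
    return "request_form"


album = Document("album", "owner", {"vacation": 0, "kevin": 3},
                 is_private=True, access_list={"kevin"})
blog = Document("blog", "owner", {"vacation": 0, "dog": 0})
relations = {("owner", "kevin"): 3}

kevin_results = search("kevin vacation", "kevin", [album, blog], relations)
stranger_results = search("kevin vacation", "stranger", [album, blog], relations)
```

In this sketch the same query ranks the album first for "kevin" because both terms match, whereas for a stranger only "vacation" is compared and the abstract for the album omits the keyword "kevin". A stranger who requests the private album is sent to the request form rather than shown the document.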
As described above, embodiments can be embodied in the form of computer-implemented processes and apparatuses for practicing those processes. In exemplary embodiments, the invention is embodied in computer program code executed by one or more network elements. Embodiments include computer program code containing instructions embodied in tangible media, such as floppy diskettes, CD-ROMs, hard drives, or any other computer-readable storage medium, wherein, when the computer program code is loaded into and executed by a computer, the computer becomes an apparatus for practicing the invention. Embodiments include computer program code, for example, whether stored in a storage medium, loaded into and/or executed by a computer, or transmitted over some transmission medium, such as over electrical wiring or cabling, through fiber optics, or via electromagnetic radiation, wherein, when the computer program code is loaded into and executed by a computer, the computer becomes an apparatus for practicing the invention. When implemented on a general-purpose microprocessor, the computer program code segments configure the microprocessor to create specific logic circuits.
While the invention has been described with reference to exemplary embodiments, it will be understood by those skilled in the art that various changes may be made and equivalents may be substituted for elements thereof without departing from the scope of the invention. In addition, many modifications may be made to adapt a particular situation or material to the teachings of the invention without departing from the essential scope thereof. Therefore, it is intended that the invention not be limited to the particular embodiment disclosed as the best mode contemplated for carrying out this invention, but that the invention will include all embodiments falling within the scope of the appended claims. Moreover, the use of the terms first, second, etc. does not denote any order or importance, but rather the terms first, second, etc., are used to distinguish one element from another. Furthermore, the use of the terms a, an, etc., does not denote a limitation of quantity, but rather denotes the presence of at least one of the referenced item.