1. Field of Invention
This invention relates to automatic information retrieval based on the content of a user's context.
2. Description of Related Art
Contact information retrieval systems, such as rolodexes, address lists, directories, contact databases, relationship management software and the like, commonly require a user to interrupt the current task to look up contact information, such as a contact name or address. A typical contact directory or database will be accessed by a user if the user suspects that an organization or a person is related to the task the user is currently performing. The user will then interrupt the user's task and use the contact directory to access the needed information.
Furthermore, the user may not think of other contacts because the user doesn't realize or remember that the user has a contact that is relevant, or that a colleague of the user has a relevant contact. Prior work related to this invention can be broadly categorized into three categories: contact systems, content-based inference engines, and just-in-time retrieval agents.
Contact systems include Ricoh's Innovation™ guestbook kiosk, business card scanners and electronic rolodexes, such as CardScan™, BizCard Reader™, and Microsoft Outlook™, as well as contact/relationship management tools such as GoldMine™, TeleMagic™, Maximizer™ and Act™. To obtain contact information using such contact systems, users must manually search lists of known contacts or enter names.
Content-based inference engines are commonly used in recommender systems, match-makers, and knowledge management systems. AnswerGarden2™ and ReferralWeb™ are systems that allow users to manually locate others with specific expertise, or to explore inter-connections between people in social networks. Match-maker systems, such as Yenta™, can help users locate potentially relevant contacts based on detailed profiles provided by users. The inference engines used in enterprise knowledge management systems, offered by companies such as Verity™ and Autonomy™, can use data about users' interests, assigned tasks, and expertise to locate potentially related information.
Just-in-time retrieval agents do not require the user to manually formulate queries. Remembrance Agent™, Margin Notes™, Watson™, Suitor™, Letizia™ and Xlibris™ are all examples of systems that include just-in-time retrieval agents that recommend content such as web pages or text documents. Jimminy™, a wearable information system, receives information from a sensor informing it that a certain person is in the same room. If this person happens to be in the user's personal rolodex, Jimminy™ displays their name proactively.
However, none of the systems described above retrieves contacts based on the user's context, and none proactively recommends contacts. Known contact retrieval systems result in a loss of time, because the user has to interrupt the task the user is performing and start querying information sources for possible contact information. Furthermore, since such contact retrieval systems also often do not produce desired results, a system which can produce reliable contact information without needing to interrupt the user's task would be highly useful. Similarly, the ability to acquire relevant contact information while the user is performing another task is highly desirable.
This invention provides systems and methods for analyzing a user's context to identify possible relevant contact information.
This invention separately provides systems and methods for identifying possible relevant contact information based on a user's context.
This invention separately provides systems, methods and graphical user interfaces for displaying identified possible relevant contacts to a user without interrupting the user's workflow.
In various exemplary embodiments of systems and methods according to this invention, an information retrieval system includes a database that stores contact information, a contact information retrieval circuit, routine or application that analyzes the user's context and that retrieves contact information, and a display circuit, routine or application that unobtrusively displays the retrieved contact information.
In various exemplary embodiments, contact information relevant to a user is automatically retrieved based on the user's context. This approach increases efficiency, collaboration, and productivity. Various exemplary embodiments of systems and methods according to this invention improve the user's ability to capture, reuse and share personal and organizational contacts that are directly relevant to the task that the user is performing. This has the added benefit of supporting serendipity, or promoting the discovery of new contacts that the user may not have thought of, based on the context of the document the user is currently viewing.
In various exemplary embodiments of systems and methods according to this invention, a database is queried based on the content of the document currently being viewed by the user, relevant contact information is retrieved, and these contacts are then presented unobtrusively, in a manner that does not disrupt the user's task but that allows the user to access the details of the contacts with, for example, a single interaction.
In various exemplary embodiments of systems and methods according to this invention, an information retrieval apparatus includes an information gathering circuit, routine or application that inputs contact information, a database that stores the contact information, an information monitoring and analysis circuit, routine or application and a data output circuit, routine or application that notifies the user of relevant contact information. The information gathering circuit, routine or application can be implemented using a computer with at least one of a scanner or a camera, and/or a keyboard, a touch-screen or other input circuit, routine or application that may serve as an audio/video guestbook. The information gathering circuit, routine or application may, for example, be located in a reception area. In various exemplary embodiments of systems and methods according to this invention, the reception area may be the reception area of a company building, a hotel or a convention hall. The information entered or recorded into the information gathering circuit, routine or application is then stored in a database. In various exemplary embodiments of systems and methods according to this invention, the database is located at a remote location and connected to the information gathering circuit, routine or application via a link or a network.
In various exemplary embodiments of systems and methods according to this invention, the information retrieval apparatus includes one or both of an information monitoring circuit, routine or application and an information analysis circuit, routine or application which allow the information contained in the user's context to be continuously monitored and analyzed. In various exemplary embodiments of systems and methods according to this invention, the information retrieval apparatus includes a data output circuit, routine or application that notifies the user of relevant information by unobtrusively bringing relevant contact information to the user's attention.
Various exemplary embodiments of systems and methods according to this invention include populating a contact database with contact information, analyzing a user's context, and displaying relevant contact information to the user. Populating the contact database is performed by capturing contact information, such as a business card, an email address, a telephone number, an address, an audio/video recording of a person's face and voice, and the like. The contact information is then transmitted to and stored in the contact database via a link or a network. In various exemplary embodiments, analyzing the user's context includes determining possible matches between the user's current context and the contact information stored in the contact information database. Once matches are determined, the relevant contact information is brought to the user's attention. In various exemplary embodiments, the relevant contact information is displayed as a supplementary toolbar on the user's screen. The contact information can then be readily accessed by the user simply by clicking on the toolbar.
These and other features and advantages of this invention are described in, or are apparent from, the following detailed description of various exemplary embodiments of the systems and methods according to this invention.
Various exemplary embodiments of systems and methods according to this invention will be described in detail with reference to the following figures, wherein:
In various exemplary embodiments according to this invention, contact information is collected from visitors who are asked to provide their name and organization and answer a few questions. For example, an operator may operate a computer, scan the visitor's business card into a scanner, make an audio and/or video recording of the visitor and/or record the visitor's answers to questions, as discussed below. Alternatively, the visitors may directly scan their business cards into the scanner, make an audio and/or video recording and/or answer various questions at a self-operated input station. In various exemplary embodiments of the systems and methods of this invention, the scanned information, audio/video recording and/or answers to the questions are automatically transmitted and stored in a memory or database connected to the computer or other input circuit, routine or application through a link or a network.
It should be noted that, in various exemplary embodiments of the systems and methods according to this invention, during step S1100, the email address and/or other electronic identification and/or location information, such as, for example, a uniform resource locator (URL) or a uniform resource identifier (URI) associated with the visitor or the visitor's organization, is recognized from the scanned business card and stored in a memory or database. One or more telephone numbers and/or postal addresses associated with the visitor are also recognized and stored in the memory or database. The telephone numbers can include one or more of a personal voice number, a work voice number, a personal facsimile number, a work facsimile number, a mobile number, a pager number or any other known or later-developed type of telephone number.
In various exemplary embodiments of systems and methods according to this invention, the contact information, initially stored in the computer or other input device, is transmitted, such as, for example, immediately or at the end of each day or work period, to the memory or database. This information corresponds to the contact information obtained from the visitor. Once transmitted and stored, this contact information, which comprises additional data, enriches the contact information in the contact information database. The contact information database is accessed, for example, by a computer that is connected to the database via a link or a network. Once a database containing contact information is constituted, then a method for retrieving contact information can be implemented.
It should be appreciated that, in step S2400, the identified contacts in the ranked list can be selected for display to the user in any of a number of ways. For example, in various exemplary embodiments, any identified contact having a sufficiently high score is selected for display, regardless of the number of such identified contacts. In various other exemplary embodiments, a given number n of identified contacts from the top of the ranked list are selected for display, regardless of the actual score associated with those n identified contacts. In still other exemplary embodiments, the top n identified contacts are initially selected for display from the top of the ranked list. Then, of the top n selected contacts, only those having a sufficiently high score are actually displayed. In various exemplary embodiments, this is done, for example, by removing from the list of the top n contacts those that do not have a sufficiently high score.
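The three selection strategies described for step S2400 can be illustrated with a minimal Python sketch. The function names, the tuple representation of a ranked contact, and the use of "greater than or equal to" for a "sufficiently high" score are assumptions made for illustration only and are not part of the described embodiments.

```python
from typing import List, Tuple

# A ranked list of (contact name, score) pairs, already sorted from
# highest to lowest score. This representation is an assumption.
Ranked = List[Tuple[str, float]]


def select_by_threshold(ranked: Ranked, threshold: float) -> Ranked:
    """Keep every contact whose score reaches the threshold."""
    return [(name, score) for name, score in ranked if score >= threshold]


def select_top_n(ranked: Ranked, n: int) -> Ranked:
    """Keep the first n contacts, whatever their scores are."""
    return ranked[:n]


def select_top_n_above_threshold(ranked: Ranked, n: int, threshold: float) -> Ranked:
    """Take the top n contacts, then drop those below the threshold."""
    return select_by_threshold(select_top_n(ranked, n), threshold)
```

The third strategy is written as a composition of the first two, which mirrors the description of removing low-scoring contacts from the list of the top n contacts.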
As shown in
Next, in step S2110, a first or next one of these groups is selected as a current group. Then in step S2115, a determination is made whether the current group is a postal code. In various exemplary embodiments of the systems and methods according to this invention, any five- or nine-digit number, ignoring the dash “-” that may be present in a nine-digit number, is presumed to be a potential postal code. If the current group corresponds to a postal code, operation continues to step S2120. Otherwise, operation jumps to step S2130.
In step S2120, one or more information groups that precede the current group that could be a postal code are analyzed. For example, up to three lines preceding a current group that could be a postal code are analyzed, since these lines may represent the remaining part of an address. In some exemplary embodiments, these preceding information groups are temporarily stored, along with the current group that could be a postal code, for example, in a memory of the computer. Next, in step S2125, a determination is made whether these preceding groups correspond to an address. If the number of words present within these preceding groups is greater than a defined limit, which may be related, for example, to an average number of words present in a typical address, then the five- or nine-digit number and the three preceding lines are deleted from memory and operation jumps to step S2145. Otherwise, if the number of words present within the lines is less than or equal to the defined limit, operation continues to step S2140.
It should be noted that, in various exemplary embodiments, to determine whether the groups preceding a potential postal code correspond to an address, the number of words contained in those lines is compared to a defined limit on the number of words, which may be related, for example, to the number of words present in a typical address, ignoring words such as “Avenue”, “Street”, and the like. The total number of words is counted because not every five- or nine-digit number is a postal code and not every postal code is necessarily preceded by the rest of the address.
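The postal code test of step S2115 and the word-count test of steps S2120 and S2125 might be sketched in Python as follows. The regular expression, the word limit of 12, the extra ignored contractions, and the function names are all assumptions chosen for illustration; the description only states that five- or nine-digit numbers are candidates and that up to three preceding lines are examined.

```python
import re
from typing import List, Optional

# Five digits, optionally followed by four more with an optional dash,
# and not embedded in a longer run of digits.
POSTAL_RE = re.compile(r"(?<!\d)\d{5}(?:-?\d{4})?(?!\d)")
# "Avenue" and "Street" come from the description; the rest are assumed.
IGNORED_WORDS = {"avenue", "street", "road", "ave", "st", "rd"}
MAX_ADDRESS_WORDS = 12  # assumed limit for a "typical" address


def is_potential_postal_code(group: str) -> bool:
    """Return True if the group contains a five- or nine-digit number."""
    return POSTAL_RE.search(group) is not None


def address_candidate(lines: List[str], index: int) -> Optional[List[str]]:
    """Return up to three preceding lines plus the postal code line,
    or None if the line has no postal code or the preceding lines
    contain more words than the assumed limit."""
    if not is_potential_postal_code(lines[index]):
        return None
    preceding = lines[max(0, index - 3):index]
    words = [
        word
        for line in preceding
        for word in re.findall(r"[A-Za-z]+", line)
        if word.lower() not in IGNORED_WORDS
    ]
    if len(words) > MAX_ADDRESS_WORDS:
        return None  # too wordy to be the rest of an address
    return preceding + [lines[index]]
```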
In contrast, in step S2130, a determination is made whether the current group of characters corresponds to an email address. If the current group of characters corresponds to an email address, operation jumps to step S2140. Otherwise, operation continues to step S2135.
In step S2135, a determination is made whether the current group corresponds to a telephone number. For example, a group in the parsed document that contains a ten-digit number, ignoring any parentheses or dashes that may be used to delineate area code from actual telephone number, is identified. Any group containing a ten-digit number is identified as a possible telephone number. If the current group of characters corresponds to a telephone number, operation continues to step S2140. Otherwise, operation jumps to step S2145.
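The determinations of steps S2130 and S2135 can be sketched as two simple predicates. This is only an illustration: the "@" test follows the description of potential email addresses given for the monitoring circuit, routine or application, while removing spaces in addition to parentheses and dashes, and the function names themselves, are assumptions.

```python
import re
from typing import Optional


def is_potential_email(group: str) -> bool:
    """Treat any group containing '@' as a potential email address."""
    return "@" in group


def is_potential_phone(group: str) -> bool:
    """Treat any group holding a run of exactly ten digits, once
    parentheses, dashes and spaces are removed, as a potential
    telephone number."""
    stripped = re.sub(r"[()\-\s]", "", group)
    return re.search(r"(?<!\d)\d{10}(?!\d)", stripped) is not None


def classify(group: str) -> Optional[str]:
    """Mirror the order of steps S2130 and S2135: email first, then phone."""
    if is_potential_email(group):
        return "email"
    if is_potential_phone(group):
        return "telephone"
    return None
```

Because spaces are removed before counting digits, two unrelated five-digit numbers separated only by a space would be read as one ten-digit number; a production implementation would likely need a stricter pattern.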
In step S2140, the current group is a postal code, an email address or a telephone number. Accordingly, if the current group, along with, in various exemplary embodiments, in one or more situations, one or more preceding and/or one or more following groups is a postal code, the set of these preceding groups and the current group, representing the entire address, including the identified postal code, is added to an analysis list for further analysis. In contrast, when the current group is an email address, it should be noted that, in various exemplary embodiments, in step S2140, the part of the email address immediately following the “@” character, i.e., the suffix, may represent a company or organization. In step S2140, this email address, or at least the suffix, is stored in the analysis list for further comparison with the contact information contained in the contact information database. In various exemplary embodiments, in contrast to both of these, when the current group is a telephone number, the telephone number is stored into the analysis list. This telephone number is stored for further comparison with the contact information contained in the contact information database. Operation then continues to step S2145.
In step S2145, the current group is compared to the list of contact person names to determine if it matches at least one contact name in the database. Next, in step S2150, a determination is made whether the current group matches a person's name in the contact information database. If a match is found, then operation continues to step S2155. Otherwise, operation jumps to step S2170. In step S2155, a score is assigned to the current group based on how well it matches one or more person names in the contact database. Then, in step S2160, a determination is made whether the score assigned to the match is high enough to be relevant to the user currently viewing the context. In various exemplary embodiments, the score is compared to a determined threshold score. If the score is higher than the threshold, the score is considered high enough. If the score is high enough, operation proceeds to step S2165. Otherwise, operation again jumps to step S2170. In step S2165, the matched person's name is stored in a display list.
In step S2170, the current group is compared to organization names in the contact database. Then, in step S2175, a determination is made whether the current group matches one or more organization names that are present in the contact information database. If the current group matches an organization's name in the contact database, operation continues to step S2180. Otherwise, operation jumps to step S2195.
In step S2180, a score is assigned to the current group based on how well it matches the organization's name. Then, in step S2185, a determination is made whether the score assigned to the match is high enough to be relevant to the user currently viewing the context. If the score is high enough, operation proceeds to step S2190. Otherwise, operation jumps to step S2195. In step S2190, the matched organization's name is stored in the display list.
In step S2195, a determination is made whether the current group being analyzed is the last group. If there are more groups left to be analyzed, operation returns to step S2110. Otherwise, operation continues to step S2199, where operation returns to step S2200.
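The loop of steps S2110 through S2195 that matches each group against person names and then organization names can be outlined as below. The description does not specify how a match is scored at this stage, so this sketch substitutes the similarity ratio from Python's standard difflib module purely as a stand-in; the threshold of 0.8 and the function names are likewise assumptions.

```python
from difflib import SequenceMatcher
from typing import Iterable, List, Tuple


def best_match(group: str, names: Iterable[str]) -> Tuple[str, float]:
    """Return the name most similar to the group and its similarity (0..1)."""
    best_name, best_score = "", 0.0
    for name in names:
        score = SequenceMatcher(None, group.lower(), name.lower()).ratio()
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score


def match_names_in_groups(
    groups: Iterable[str],
    person_names: List[str],
    organization_names: List[str],
    threshold: float = 0.8,
) -> List[str]:
    """Collect matched person and organization names for the display list."""
    display: List[str] = []
    for group in groups:  # S2110 selects a group, S2195 checks for more
        # Person names first (S2145 to S2165), then organizations (S2170 to S2190).
        for names in (person_names, organization_names):
            name, score = best_match(group, names)
            if name and score > threshold and name not in display:
                display.append(name)
    return display
```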
It should be appreciated that, in various other exemplary embodiments, in addition to or in place of step S2145, rather than comparing each group to the names of the contacts and/or their organizations that are present in the database, each contact (i.e., person) name and/or each organization name present in the database can be selected in turn and compared to the content of the user's current context to find matches between a contact name or an organization name present in the database and a text string in the user's current context. For example, one exemplary embodiment of comparing the names in the database to the user's current context includes creating a representation of a selected person or organization name that is present in the database and then querying the user's current context using that representation. In various exemplary embodiments, that representation is merely the selected person or organization name. In various other exemplary embodiments, that representation can be a regular expression derived from the selected person or organization name.
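The alternative of querying the user's current context with a representation derived from each known name might look like the following sketch. The particular regular expression (name parts separated by any whitespace, matched case-insensitively on word boundaries) is one assumed way of deriving such a representation, and the function names are hypothetical.

```python
import re
from typing import Iterable, List


def name_pattern(name: str) -> "re.Pattern[str]":
    """Build a regular expression from a person or organization name,
    tolerating any amount of whitespace between the words of the name."""
    parts = [re.escape(part) for part in name.split()]
    return re.compile(r"\b" + r"\s+".join(parts) + r"\b", re.IGNORECASE)


def find_names_in_context(names: Iterable[str], context: str) -> List[str]:
    """Return each known name whose pattern occurs in the context text."""
    return [
        name
        for name in names
        if name.split() and name_pattern(name).search(context)
    ]
```

For example, with the names "Jane Doe", "John Smith" and "Example Corp" and the context "Please forward this to Jane   Doe at Example Corp.", the sketch returns "Jane Doe" and "Example Corp".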
In other exemplary embodiments, since it is likely that explicitly searching for known names leads to more accurate results, and since the number of known names in the contact information database is limited, this approach is feasible and efficient enough for real-time matching. In various exemplary embodiments, matches of the form [FirstName LastName] or [LastName, FirstName] are scored highly. In other exemplary embodiments, the score is increased slightly based on the number of times the names are found in the document. Matches of the form [Initial LastName] or [LastName, Initial] receive lower scores. An organization match begins when the first word of an organization is found in the document. The score is based on the number of remaining words in the organization name that are found near the first word, ignoring common words, such as “Inc”, “Co”, and the like.
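The name and organization scoring rules just described can be illustrated as follows. The numeric weights, the five-word window used to interpret "near", the additional common words beyond "Inc" and "Co", and the function names are assumptions; the sketch also assumes that the first name is not empty.

```python
import re

HIGH, LOW, REPEAT_BONUS = 1.0, 0.5, 0.05  # assumed weights
COMMON_WORDS = {"inc", "co", "corp", "ltd", "llc"}  # "Inc", "Co" are from the text
WINDOW = 5  # assumed number of following words that count as "near"


def score_person(first: str, last: str, text: str) -> float:
    """Score full-name matches highly and initial-based matches lower,
    adding a small bonus for each repeated occurrence."""
    f, l, i = re.escape(first), re.escape(last), re.escape(first[0])
    full = re.findall(rf"\b{f}\s+{l}\b|\b{l},\s*{f}\b", text, re.IGNORECASE)
    if full:
        return HIGH + REPEAT_BONUS * (len(full) - 1)
    initial = re.findall(rf"\b{i}\.?\s+{l}\b|\b{l},\s*{i}\b", text, re.IGNORECASE)
    if initial:
        return LOW + REPEAT_BONUS * (len(initial) - 1)
    return 0.0


def score_organization(organization: str, text: str) -> float:
    """Start a match at the organization's first word and score by the
    fraction of its words found within the window after that word."""
    org_words = [w.lower() for w in re.findall(r"\w+", organization)]
    org_words = [w for w in org_words if w not in COMMON_WORDS]
    doc_words = [w.lower() for w in re.findall(r"\w+", text)]
    if not org_words:
        return 0.0
    first_word, remaining = org_words[0], org_words[1:]
    best = 0.0
    for position, word in enumerate(doc_words):
        if word != first_word:
            continue
        nearby = set(doc_words[position + 1:position + 1 + WINDOW])
        found = sum(1 for w in remaining if w in nearby)
        best = max(best, (1 + found) / len(org_words))
    return best
```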
It should be noted that, in various exemplary embodiments of systems and methods according to this invention, the score assigned to an analysis list element is based on the quality of match between the analysis list element and the contact information database elements. For instance, in the case of a postal code, a high score is given to the analysis list element if every word of the address in that analysis list element is present in a matching address from the contact information database, ignoring common words such as “Avenue”, “Street”, “Road”, or their contractions and the like. A partial score is given if only parts of the address in that analysis list element are present in a matching address from the contact information database or vice versa. A lower score is given to the analysis list element if only the postal code matches an address from the contact information database.
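The three tiers of address scoring described above can be sketched as follows. The score values, the list of ignored contractions, and the function names are assumptions, and for simplicity the sketch assumes that the postal code is passed as a plain five-digit string.

```python
import re

# "Avenue", "Street" and "Road" come from the description; the
# contractions listed here are assumed examples.
IGNORED_WORDS = {"avenue", "ave", "street", "st", "road", "rd"}
FULL, PARTIAL, POSTAL_ONLY = 1.0, 0.6, 0.3  # assumed score values


def _words(address: str) -> set:
    """Lower-cased alphanumeric words of an address, minus ignored words."""
    return {w.lower() for w in re.findall(r"[A-Za-z0-9]+", address)} - IGNORED_WORDS


def score_address(candidate: str, stored: str, postal_code: str) -> float:
    """Score a candidate address from the context against a stored address."""
    cand, known = _words(candidate), _words(stored)
    if postal_code not in cand or postal_code not in known:
        return 0.0  # the postal codes do not match at all
    cand_rest, known_rest = cand - {postal_code}, known - {postal_code}
    if cand_rest and cand_rest <= known_rest:
        return FULL  # every word of the candidate is in the stored address
    if cand_rest & known_rest:
        return PARTIAL  # only some words are shared
    return POSTAL_ONLY  # nothing matches except the postal code
```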
It should also be noted that, in various exemplary embodiments of systems and methods according to this invention, in the case of an analysis list element that is identified as an email address, the contact information database is analyzed for people with the same email or organizations with the same suffix as the email address present in that element of the analysis list. If a determination is made that the email address present in that element of the analysis list does not exactly match any email address in the contact information database, then a determination is also made whether the email address present in that element of the analysis list partially matches any email address in the contact information database. If a determination is made that the email address present in that element of the analysis list completely matches an email address from the contact information database, then the matched analysis list element is assigned a high score. A partial score is given to that matched analysis list element if only the suffix, or a portion of the suffix, matches one or several email addresses from the contact information database. If a determination is made that the email address present in that element of the analysis list does not match any email address from the contact information database, then a zero score is assigned to that email address.
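A minimal sketch of this email scoring follows. The score values and function names are assumptions, and the sketch compares only whole suffixes; matching a portion of the suffix, which the description also contemplates, is not implemented here.

```python
from typing import Iterable

FULL, PARTIAL = 1.0, 0.5  # assumed score values


def email_suffix(address: str) -> str:
    """Return the part of the address after the last '@', lower-cased."""
    return address.rsplit("@", 1)[-1].lower()


def score_email(candidate: str, known_emails: Iterable[str]) -> float:
    """High score for an exact match, partial score when only the
    suffix matches, zero when nothing matches."""
    cand = candidate.strip().lower()
    known = [email.strip().lower() for email in known_emails]
    if cand in known:
        return FULL
    suffix = email_suffix(cand)
    if any(email_suffix(email) == suffix for email in known):
        return PARTIAL
    return 0.0
```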
It should further be noted that, in various exemplary embodiments of systems and methods according to this invention, in the case of a telephone or facsimile number, the contact information database is analyzed for people or organizations with the same telephone or facsimile number as a telephone or facsimile number present in that element of the analysis list.
If a determination is made that a telephone number present in that element of the analysis list matches a telephone number from the contact information database, then a score is assigned to the matching analysis list element containing that telephone number. In various exemplary embodiments of systems and methods according to this invention, the score is based on whether that analysis list element matches a personal voice telephone number, a work voice telephone number, a mobile telephone number, a personal or work facsimile telephone number or a pager number or some other type of telephone number. In various exemplary embodiments, if the matched analysis list element containing that telephone number is matched to a voice number in the contact information database, a higher score is given to that analysis list element than if that analysis list element matches, for instance, a facsimile number.
If a determination is made that the telephone number present in that analysis list element does not match any telephone number in the contact information database, then a zero score is assigned to that analysis list element.
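The type-dependent telephone scoring can be sketched as below. The description states only that a match to a voice number can score higher than a match to, for instance, a facsimile number; the specific values, the type labels, the default of 0.4 for other types, and the function names are assumptions.

```python
import re
from typing import Dict

# Assumed score per number type; only the ordering voice > fax
# is taken from the description.
TYPE_SCORES = {
    "work_voice": 1.0,
    "personal_voice": 1.0,
    "mobile": 0.9,
    "pager": 0.6,
    "fax": 0.5,
}
OTHER_TYPE_SCORE = 0.4  # assumed score for any other type of number


def digits(number: str) -> str:
    """Strip everything except digits so formatting does not matter."""
    return re.sub(r"\D", "", number)


def score_phone(candidate: str, contact_numbers: Dict[str, str]) -> float:
    """Return the best score among the contact's numbers that match the
    candidate, or zero if none match."""
    target = digits(candidate)
    scores = [
        TYPE_SCORES.get(kind, OTHER_TYPE_SCORE)
        for kind, number in contact_numbers.items()
        if digits(number) == target
    ]
    return max(scores, default=0.0)
```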
In step S2230, for at least some analysis list elements that are related to each other, a combined score is generated for any such analysis list element. In various exemplary embodiments of systems and methods according to this invention, the combined score for such analysis list elements is based on the number of matches identified between such analysis list elements in the analysis list and the contact information database. For instance, if a postal code is present in an analysis list element and is matched to a particular record of a postal code in the contact information database, and an email address is also matched to an email address in the same or related record of the contact information database, then the score of these two elements (postal code and email address) is combined and associated with each such analysis list element. In various exemplary embodiments, the analysis list elements themselves can be combined.
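The combination of scores in step S2230 might be sketched as follows. The description does not fix the combining function, so simple addition is assumed here, as are the dictionary layout of an analysis list element and the hypothetical `record_id` key identifying the database record to which an element was matched.

```python
from collections import defaultdict
from typing import Dict, List


def combine_scores(elements: List[Dict]) -> List[Dict]:
    """Sum the scores of elements matched to the same database record and
    attach that combined score to each of those elements."""
    totals: Dict[object, float] = defaultdict(float)
    for element in elements:
        totals[element["record_id"]] += element["score"]
    return [
        {**element, "combined_score": totals[element["record_id"]]}
        for element in elements
    ]
```

With this sketch, a postal code scored 0.5 and an email address scored 0.25 that both match record 7 each receive a combined score of 0.75, while an element matched to a different record keeps its own score.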
Next, in step S2235, a first or next analysis list element is selected as a current analysis list element. Then, in step S2240, the combined score of the current analysis list element is compared to a defined score threshold. If the combined score is above the defined threshold, operation continues to step S2250. Otherwise, operation jumps to step S2260. In step S2250, the current analysis list element is analyzed to determine if the current analysis list element contains a person's or organization's name. If the scored analysis list element does not contain a person's or organization's name, operation again continues to step S2260. Otherwise, if the current analysis list element contains a person's or organization's name, operation jumps to step S2270.
In step S2260, the current analysis list element is discarded. Operation then jumps to step S2280. In contrast, in step S2270, the current analysis list element is added to a display list. The display list contains contact names to be displayed to the user. In various exemplary embodiments of the systems and methods according to this invention, the display list elements are ranked from most relevant entry, i.e., highest score, to least relevant entry, i.e., lowest score, of the elements added to the display list. Next, in step S2280, a determination is made whether there are any unselected elements in the analysis list to be analyzed. If there are any elements in the analysis list that have not been selected, operation returns to S2235. Otherwise, operation continues to step S2290, where operation returns to step S2300.
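Steps S2235 through S2280 can be summarized in a short sketch. The dictionary layout of an analysis list element, with hypothetical `combined_score` and `name` keys, and the function name are assumptions for illustration.

```python
from typing import Dict, List


def build_display_list(analysis_list: List[Dict], threshold: float) -> List[Dict]:
    """Keep elements whose combined score is above the threshold and that
    carry a person's or organization's name, ranked by descending score."""
    display: List[Dict] = []
    for element in analysis_list:  # S2235 selects, S2280 checks for more
        if element["combined_score"] <= threshold:  # S2240
            continue  # S2260: discard the element
        if not element.get("name"):  # S2250
            continue  # S2260: discard the element
        display.append(element)  # S2270
    # Rank from most relevant (highest score) to least relevant.
    display.sort(key=lambda element: element["combined_score"], reverse=True)
    return display
```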
In various exemplary embodiments, as long as the user is viewing a context, the context-based contact retrieval method performs its function. The method ends when the user shuts off the current context.
The display list element is displayed in a separate window, or as a graphical user interface widget, such as, for example, a toolbar, as illustrated, for instance, in
In response to selecting one of the contacts displayed in the contact information tool bar 140 or 240 shown in
The monitoring circuit, routine or application 440 is used to perform parsing of a context being viewed by the user into a searchable representation. The monitoring circuit, routine or application 440 is also used to compare each portion to contact information present in the contact information database, and to store matched contact information in the memory 430. The analyzing circuit, routine or application 450 is used to analyze the information monitored by the use of the monitoring circuit, routine or application 440 in assigning scores to the matched contact information.
The ranking circuit, routine or application 460 is used to rank the matched contact information to be displayed to the user. The display list portion that contains the list of contacts to be presented to the user is generated by the display list generating circuit, routine or application 470. The generated display list portion of the matched contact information is output to the application manager 480 to be displayed to the user.
In various exemplary embodiments of systems and methods according to this invention, the memory 430 comprises several memory portions dedicated to different activities of the context-based contact retrieval system 400. Memory portions are allocated in the memory 430, such as, for example, to store the current context in a memory portion 431, the parsed groups or portions in a memory portion 432, an analysis list compiled from the context in a memory portion 433, the scores given to each matched element of the analysis list in a memory portion 434, and the display list in a memory portion 435.
The memory 430 can be implemented using any appropriate combination of alterable, volatile or non-volatile memory or non-alterable, or fixed, memory. The alterable memory, whether volatile or non-volatile, can be implemented using any one or more of static or dynamic RAM, a floppy disk and disk drive, a writable or re-writeable optical disk and disk drive, a hard drive, flash memory or the like. Similarly, the non-alterable or fixed memory can be implemented using any one or more of ROM, PROM, EPROM, EEPROM, an optical ROM disk, such as a CD-ROM or DVD-ROM disk, and disk drive or the like.
As shown in
The link 510 between the context-based contact retrieval system 400 and the data source 500 can be implemented using any known or later-developed device or system for connecting the contact retrieval system 400 to the data source 500, including a direct cable connection, a connection over a wide area network or a local area network, a connection over an intranet, a connection over the Internet, or a connection over any other distributed processing network or system. In general, each of the connections can be any known or later developed connection system or structure usable to connect the context-based contact retrieval system 400 and the data source 500. Of course, it should be appreciated that the contact information database, rather than being stored in the data source 500, can be stored locally relative to the context-based contact retrieval system 400.
Once the contact information is output to the data source 500, when a user is viewing a context, such as, for instance, an email, a word document or a webpage, the monitoring circuit, routine or application 440, under control of the controller 420, monitors the context being viewed by the user that is stored in the memory portion 431, by breaking the context into separate groups of characters. In various exemplary embodiments, the monitoring circuit, routine or application 440 identifies several different types of portions, for example, postal address, email address and telephone number, that are relatable to specific contact information stored in the contact information database. In various exemplary embodiments, the monitoring circuit, routine or application 440 also updates the groups stored in the memory portion as the user's current context evolves, such as, for example, as the user authors a document or the like.
In various exemplary embodiments, the monitoring circuit, routine or application 440 searches the groups obtained from the current context and stored in the memory portion 431 for any five- or nine-digit number. The monitoring circuit, routine or application 440 associates any group containing a five- or nine-digit number, ignoring the dash “-” that may be present in a nine-digit number, with a potential postal code. The monitoring circuit, routine or application 440 also identifies groups that contain content appearing up to three lines preceding the potential postal code within the user's current context, since these lines may represent the remaining part of an address. The monitoring circuit, routine or application 440 also determines whether a given group of characters corresponds to an email address. Any group comprising the “@” character is associated with a potential email address. In various exemplary embodiments, a group that contains other electronic address type information, such as “www”, “http”, “html” or the like is associated with a potential electronic address. The monitoring circuit, routine or application 440 also associates any group containing a ten-digit number with a potential telephone number. The monitoring circuit, routine or application 440 then outputs all the groups that contain potential postal codes, potential email or other electronic addresses and/or potential telephone numbers to the memory portion 432 as an analysis list.
The analyzing circuit, routine or application 450, under control of the controller 420, inputs the potential postal codes, email or other electronic addresses or telephone numbers of the analysis list from the memory portion 432. Then, the analyzing circuit, routine or application 450 compares each group present in the analysis list, whether that group contains a postal code, an email or other electronic address and/or a telephone number, to the postal codes, email and other electronic addresses and telephone numbers, respectively, present in the contact information database, which is stored, for example, in the data source 500. The analyzing circuit, routine or application 450 then assigns scores to any matches identified between the postal codes, email and other electronic addresses and telephone numbers present in the data source 500 and the postal codes, email and other electronic addresses and telephone numbers present in the groups contained in the analysis list. The analyzing circuit, routine or application 450, if applicable, also combines the scores of postal codes, email and other electronic addresses or telephone numbers that are interrelated because, for example, they belong to the same person or entity and/or to a related person or entity. The analyzing circuit, routine or application 450, under control of the controller 420, stores the scores or combined scores for each match in the memory portion 434.
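By way of non-limiting illustration only, the comparison and score combination described above may be sketched as follows. The score values, the exact-match comparison after normalization, the representation of the contact information database as a list of dictionaries, and the combination of scores by summation per contact are all assumptions made for this sketch; the description does not prescribe any of them. The names MATCH_SCORES, normalize and score_contacts are hypothetical.

```python
from collections import defaultdict

# Hypothetical weights; no particular score values are specified.
MATCH_SCORES = {
    "postal_code": 1.0,
    "electronic_address": 1.0,
    "telephone": 2.0,
    "email": 3.0,
}


def normalize(kind, value):
    """Reduce a value to a canonical form so equivalent spellings compare equal."""
    if kind in ("postal_code", "telephone"):
        return "".join(ch for ch in value if ch.isdigit())
    return value.strip().lower()


def score_contacts(analysis_list, contacts):
    """Return {contact name: combined score} for contacts matched in the list."""
    scores = defaultdict(float)
    for entry in analysis_list:
        kind = entry["type"]
        wanted = normalize(kind, entry["value"])
        for contact in contacts:
            stored = contact.get(kind)
            if stored is not None and normalize(kind, stored) == wanted:
                # Matches belonging to the same contact are combined by summation.
                scores[contact["name"]] += MATCH_SCORES[kind]
    return dict(scores)
```

In this sketch, a contact whose email address and telephone number both appear in the current context receives a higher combined score than a contact matched on a postal code alone.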
The ranking circuit, routine or application 460, under control of the controller 420, sorts the elements of the analysis list with respect to their respective scores from most relevant contact information (i.e., highest score) to least relevant contact information (i.e., lowest score). In various exemplary embodiments, the ranking circuit, routine or application 460, under control of the controller 420, then outputs the contact information associated with at least the n highest ranked groups, where n is any desired value, to the memory portion 435 as a display list of contact information that is appropriate to display to the user in view of the current context. In various other exemplary embodiments, the ranking circuit, routine or application 460, under control of the controller 420, then outputs the contact information associated with any group having a score that is greater than or equal to a defined threshold score to the memory portion 435 as the display list. In still other exemplary embodiments, the ranking circuit, routine or application 460, under control of the controller 420, then outputs the contact information associated with at most the top n groups that have a score that is greater than or equal to the defined threshold score to the memory portion 435 as the display list.
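By way of non-limiting illustration only, the three selection variants described above (the n highest ranked entries, all entries at or above a threshold score, and at most the top n entries at or above the threshold score) may be sketched with a single function. The sketch assumes the scores are available as a mapping from a contact identifier to a numeric score and returns exactly the top n entries where n is given; the name build_display_list and its parameters are hypothetical.

```python
def build_display_list(scores, n=None, threshold=None):
    """Rank contacts from highest to lowest score and select the display list.

    n         -- keep at most this many entries (None means no limit)
    threshold -- keep only entries scoring at least this value (None means no cutoff)
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    if threshold is not None:
        ranked = [(name, score) for name, score in ranked if score >= threshold]
    if n is not None:
        ranked = ranked[:n]
    return [name for name, _ in ranked]
```

Supplying only n corresponds to the first variant, supplying only threshold to the second, and supplying both to the third. Because the sort is stable, entries with equal scores keep their original relative order.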
The list display portion generating circuit, routine or application 470, under control of the controller 420, inputs the display list from the memory portion 435. The list display portion generating circuit, routine or application 470 generates an unobtrusive list display portion structure or device, such as a toolbar or other graphical user interface widget, that contains selectable icons or some other appropriate elements for the contact information associated with the n highest ranked groups, and outputs the generated list display portion to the application manager 480. The application manager 480 adds the list display portion to the information displayed to the user by that application, so that the list display portion is displayed to the user.
While this invention has been described in conjunction with the exemplary embodiments outlined above, various alternatives, modifications, variations, improvements, and/or substantial equivalents, whether known or that are or may be presently unforeseen, may become apparent to those having at least ordinary skill in the art. Accordingly, the exemplary embodiments of the invention, as set forth above, are intended to be illustrative, not limiting. Various changes may be made without departing from the spirit and scope of the invention. Therefore, the claims as filed and as they may be amended are intended to embrace all known or later-developed alternatives, modifications, variations, improvements, and/or substantial equivalents.