1. Technical Field
The systems and methods disclosed herein relate to the field of synchronizing related entries in different electronically stored directories and, more specifically, to systems and methods of accommodating potential errors as changes are made to those directories.
2. Description of the Related Art
People today have different means of communication. Many of these communication methods are facilitated by computerized terminal devices, such as cellular telephones and personal computers. These so-called intelligent terminals often provide electronic directories, allowing users to store the names, profiles, various addresses, and phone numbers for people they contact regularly. Discrepancies can arise between these directories, where the information does not match. Such discrepancies can be the result of users making changes to the directory information in one device without making the corresponding changes to the directory information in the other devices.
There are systems that can provide synchronization between these disparate directories. Such systems can synchronize additions as well. Changes to existing records can be synchronized based on time stamps, assuming the last changed record is correct, or user configuration, wherein one directory is deemed to be the master directory. These mechanisms work well in cases where the data used to update the directories can be assumed to be correct.
The present inventors have devised automated means of updating these directories by parsing voicemail messages or real-time voice communications to extract the information. See, for example, the copending application entitled, “Automated Extraction of Information from Ongoing Voice Communications,” Ser. No. ______, filed ______, and the copending application entitled, “Automated Directory Updates from Voicemail,” Ser. No. ______, filed ______. The content of these two applications is hereby expressly incorporated by reference.
When directories are updated by automated means, based on information extracted using speech recognition or other means subject to errors, there is a possibility that the updates will insert errors into the directory. The existing synchronization mechanisms are likely to simply propagate such errors, potentially overwriting all copies of the correct information. Thus, there is a need for a system that can perform such synchronization in the face of errors in the directory data.
In one embodiment, a method for synchronizing related entries in different electronically stored directories is provided. This method includes the steps of storing first entries for a first one of the directories in a first memory, these first entries having first fields for different types of information, and each first field having a related stored confidence level indicating a degree of confidence of the accuracy of the data stored in each first field. This method further includes storing second entries for a second one of the directories in a second memory, these second entries corresponding to the stored first entries of the first directory, and these second entries having second fields for different types of information, each second field corresponding to a first field in a related first entry and each second field having a stored confidence level indicating a degree of confidence of the accuracy of the data stored in each second field.
In this one embodiment, the method further includes determining when a change has been made to a field of an entry and updating the corresponding field in the other directory when the confidence level for said field exceeds a threshold. This threshold may, for example, be pre-assigned independent of the stored confidence level of any field, or may, for example, comprise the confidence level of the field in the other directory to be changed.
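The two threshold variants described above can be illustrated with a minimal sketch. The function name `should_propagate` and its parameters are hypothetical and do not appear in the disclosure; confidence levels are assumed to be expressed as fractions between 0 and 1.

```python
from typing import Optional


def should_propagate(changed_conf: float,
                     target_conf: float,
                     fixed_threshold: Optional[float] = None) -> bool:
    """Decide whether a changed field may overwrite its counterpart.

    If fixed_threshold is given, it is used as a pre-assigned threshold that
    is independent of any stored confidence level. Otherwise the threshold is
    the confidence level of the field that would be overwritten.
    """
    threshold = fixed_threshold if fixed_threshold is not None else target_conf
    # "Exceeds" is read here as strictly greater than the threshold.
    return changed_conf > threshold
```

With a pre-assigned threshold of 0.80, a change held at 0.90 confidence would be propagated even over a field stored at 0.95; with the relative variant it would not.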
Consistent with another embodiment, a method of synchronizing related entries in different electronically stored directories is provided comprising the steps of receiving a proposed change in a field of one entry for one of the directories; determining the confidence level of the proposed change, this confidence level indicating a degree of confidence of the accuracy of the data in the proposed change; storing, in a memory device, the proposed field change and confidence level for the field of the one entry in the one directory; and synchronizing a corresponding field in at least one related entry in at least another one of the directories based on the confidence level for the one field.
In another embodiment, the method of synchronizing related entries in different electronically stored directories comprises the steps of periodically comparing hash entry values of the related entries in the different directories; identifying at least one changed field in one of the related entries in one of the directories when hash values for the related entries in the different directories are not the same; identifying a most recently changed field in the related entries corresponding to the one changed field when the hash entry values are different for related entries; identifying a confidence level stored in a memory for the identified most recent changed field, the confidence level indicating a degree of confidence of the accuracy of data stored in the most recent changed field; and synchronizing the fields in the related entries corresponding to the one changed field in at least another one of the directories based on the confidence level for the one field.
The present invention may also take the form of a system for synchronizing related entries in different electronically stored directories, comprising, in one embodiment, a first memory storing first entries for a first one of the directories, the first entries having first fields for different types of information, and each first field having a related stored confidence level; a second memory storing second entries for a second one of the directories corresponding to the stored first entries of the first directory, the second entries having second fields for different types of information, each second field corresponding to a first field in a related first entry and each second field having a stored confidence level; and a processor connected to communicate with the first and second memories, and programmed to determine when a change has been made to a field of an entry and to update the corresponding field in the other directory when the confidence level for the changed field exceeds a threshold.
Still further, the invention may take the form of a directory for use with a system for synchronizing related entries in different electronically stored directories, that directory comprising, for example, a memory storing first entries for that directory, these first entries having fields for different types of information, and each field having a related stored confidence level that indicates a degree of confidence of the accuracy of the data stored in that field.
It is important to understand that both the foregoing general description and the following detailed description are exemplary and explanatory only, and are not restrictive of the invention as claimed.
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate various embodiments. In the drawings:
In the following description, for purposes of explanation and not limitation, specific techniques and embodiments are set forth, such as particular sequences of steps, interfaces, and configurations, in order to provide a thorough understanding of the techniques presented here. While the techniques and embodiments will primarily be described in the context of the accompanying drawings, those skilled in the art will further appreciate that the techniques and embodiments can also be practiced in other electronic devices or systems.
Reference will now be made in detail to exemplary embodiments of the present invention, examples of which are illustrated in the accompanying drawings. Whenever possible, the same reference numbers will be used throughout the drawings to refer to the same or like parts.
In accordance with one aspect of the present invention, each field 202a-202f also has a related stored confidence level 206a-206f that indicates a degree of confidence in the accuracy of the data stored in that field. As is shown in
Each entry 200 also preferably includes a time stamp 208 and a hash entry 210. It should be apparent to one of ordinary skill in the art that time stamp 208 of each entry 200 indicates the date and time of the most recent update of entry 200. Hash entry 210 is developed according to conventional techniques to identify where there has been a change in the information or confidence level 206a-206f contained in any of the respective fields 202a-202f of entry 200.
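By way of illustration only, an entry of this kind might be modeled as in the following sketch. The class names `Field` and `Entry`, the use of SHA-256, and the choice to exclude the time stamp from the hash are assumptions made for the example; the description states only that the hash entry is developed according to conventional techniques.

```python
import hashlib
import json
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Dict


@dataclass
class Field:
    value: str
    confidence: float  # degree of confidence in the value, 0.0 to 1.0


@dataclass
class Entry:
    fields: Dict[str, Field]
    # Date and time of the most recent update of the entry.
    time_stamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))

    def hash_entry(self) -> str:
        """Digest that changes when any field value or confidence changes."""
        payload = json.dumps(
            {name: [f.value, f.confidence] for name, f in self.fields.items()},
            sort_keys=True,
        )
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

Because the digest covers both the values and the confidence levels, two related entries with identical content produce the same digest regardless of when each was last updated.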
Returning to
More specifically, the purpose of the synchronization function in processor 108 is to collect data from the various directories, reconcile that data, and then update those directories with the most recent and most accurate data. The synchronization function of processor 108 is also connected to one or more sources of external reference data 110, as is also shown in
The electronic directories contained in various devices, for example, personal computer 102, cell phone 104, and server 106, as noted above, contain entries representing user contacts. These entries contain relevant contact information, such as name, address, various phone numbers (e.g., home, work, cell), and e-mail addresses. To facilitate synchronization, the following information is also contained in these electronic directories:
As was noted above,
Each of these input mechanisms has its own characteristic error rates. Users entering data via a keypad on a cellular telephone, where they must press a key one or more times to get the correct letter, are more likely to make mistakes than users entering data via a standard QWERTY keyboard. In addition, people are more likely to catch an entry mistake when it is reviewed on a full-size computer screen than when seen on a two-inch-square cellular telephone screen. Speech-recognition systems, which attempt to translate audio to text through a matching process, cannot recognize spoken phrases accurately in all situations. In fact, such systems often report a confidence level ranking with the proposed text to indicate the degree to which the recognition engine expects that the text represents the spoken phrase.
The inventors have discovered that this concept of a confidence ranking can be applied with beneficial effect in the process of synchronizing electronic directories. It is possible to assign a confidence level to field entry data that has been input into a particular electronic directory through a specific methodology. The confidence level assigned based upon that methodology provides an indication of the overall expectation of error rate.
For example, typed entries might be assigned a 95% confidence level, touchscreen entries may be assigned a 90% confidence level, and keypad entries may be assigned an 85% confidence level.
The confidence level values could also be assigned adaptively. For example, the confidence level for a given field could be adjusted based on the number of characters contained in the information for that field. The greater the number of characters, the lower the confidence level. For example, with the addition of each new character, the previous confidence level could be reduced by multiplying by a factor of 99.9%.
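The assignment just described can be sketched as follows, using the example base levels (95%, 90%, 85%) and the example per-character factor (99.9%) given above. The names `BASE_CONFIDENCE`, `PER_CHARACTER_FACTOR`, and `initial_confidence` are hypothetical, and combining the two rules in a single function is an assumption made for the example.

```python
# Example base confidence levels per input method, taken from the text above.
BASE_CONFIDENCE = {"typed": 0.95, "touchscreen": 0.90, "keypad": 0.85}

# Example factor applied once for each character in the field.
PER_CHARACTER_FACTOR = 0.999


def initial_confidence(input_method: str, value: str) -> float:
    """Base level for the input method, reduced once per character entered."""
    base = BASE_CONFIDENCE[input_method]
    return base * PER_CHARACTER_FACTOR ** len(value)
```

Under these example numbers, a ten-character keypad entry would start at 0.85 multiplied by 0.999 ten times, or roughly 0.8416.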
The confidence level might also be adjusted based on the technology that was used to collect the data. If the system has performed a spell-checking operation, the confidence level might be increased. The magnitude of such an increase might vary, depending upon whether the user or the system performed the correction, thereby taking into account the fact that automated, system-performed corrections might make a mistake and the chance that such a mistake has in fact occurred.
The confidence level might also be varied with the type of entry being submitted. Given the need to include letters, numbers, and punctuation marks, the error rate for entering e-mail addresses could be higher than for other entries, and, thus, these might receive a lower confidence level for a given input type.
From these types of operations, it is possible to attain a measure of the confidence in the correctness of the value for each field in a directory and to include that information within the directory associated with each field. Although directory applications might not make this information visible to users, it should be appreciated by one of ordinary skill in the art, given the inventors' disclosure herein, that the information may be exploited as part of the synchronization operation. Alternatively, directory applications could make the confidence levels visible to the user and, therefore, make them user adjustable. This would be useful in cases where users are entering data that they are not sure is correct. The system can then handle this uncertainty, as described below.
One key to the invention, accordingly, is the use of confidence level information, in conjunction with the time stamps and/or hash information generally used for synchronization, to produce synchronized directories with a high probability of being correct, even when the accuracy of individual entries is questionable.
The synchronization function of the present invention can be triggered to execute in a variety of fashions. It might operate on a periodic basis, such as running every night at 2:00 a.m. or every Monday at 8:30 a.m. Alternatively, it might be triggered by the request from one of the electronic directories. The request might be triggered based upon a specific user request to synchronize or it might be an automated response to an update of the directory.
When the synchronization function is triggered, it collects the relevant directory entries from the electronic directories. If the synchronization function is triggered to perform a full comparison of all the electronic directories, it retrieves all the entries from each electronic directory. If the synchronization function is triggered to perform updates across all electronic directories as the result of an update to a single entry in one of the directories, it retrieves the corresponding entries from the other directories.
After the synchronization function has retrieved the relevant entries from the electronic directories, it compares the corresponding entries in the various electronic devices. Rather than simply assuming the latest entry is correct or assuming that one directory, designated the master directory, is consistently more reliable than another, the synchronization function of the present invention uses the confidence level information to ensure that the entries with high uncertainty are not used to overwrite other data.
One example of a flow diagram for an embodiment of the synchronization function of the present invention is shown in
The second step 304 inquires as to whether the hash entry values for these sets of related entries are different. Presumably, related entries will have identical hash entry values if the directories are perfectly synchronized, since the hash entry values are dependent upon the content of the entry. Conversely, if related entries have different content, their hash values will be different, necessitating, in a preferred embodiment, the undertaking of a synchronization procedure. Thus, as shown in
If, however, step 304 of determining if hash entry values are different results in a positive determination, step 308 is executed, which comprises the act of comparing corresponding fields in the related entries for which hash values have been determined to be different. By way of example, if a set of related entries comprises the entries of
The next step 310 of the
As is next shown in step 312 of
Assuming, for example, an approval threshold of 80%, in the case of the
If, however, only the confidence level of the changed field is tested against an absolute threshold, it is possible that the changed entry may have a confidence level sufficiently high to justify changing the corresponding fields, but lower than the confidence level of one or more of the field entries to be synchronized. In this event, there are two alternatives. One may choose to set the system so that the new confidence level is used in all corresponding entries, thereby allowing a higher confidence level to be lowered.
In the alternative, the system may be set to have the threshold level of step 312 of
Returning now to the procedure of
Based on the results of these verification and/or correction tests, the value of the confidence level may be increased in step 420 of
There are many reasons why a low confidence level may nevertheless be associated with a field value that justifies retention. For example, the changed field might be a friend's brand new telephone number that is not yet represented in an online directory, nor is it in the user's other directories. In this case, the synchronization function of the present invention can act on an entry to add an extra field as indicated, for example, by adding extra field 202g in
Alternatively, the synchronization function of the present invention could leave the questionable field data in the one directory without propagating it to any other directory, and provide notification to the user, via e-mail, Short Message Service (“SMS”), or phone, of the discrepancy, thereby providing a means to correct or verify the low confidence entry. In this instance, however, it may be important that the hash entry value utilized to determine if there has been a change not be affected by this unilateral entry that has not been synchronized with other directories.
Thus, the questionable data procedure may include a step 504, similar to step 408 in
Still another possibility indicated by step 506 of
The programming that provides some or all of the functionality of
In summary, such a program preferably includes the steps of storing first entries in the first of the directories, the first entries having first fields of different types of information, and each first field having a related stored confidence level that indicates a degree of confidence of the accuracy of said information. Given the processor 108 of
Thus, for example, if the first entries are stored in personal computer 102, the second entries may be stored in the directory of cell phone 104. A related one of these first and second entries may, for example, leave their respective entries of
The computer-readable storage media of the present invention may further include as part of its programming the step of determining when a change has been made to a field of an entry and the step of updating the corresponding field in the other directory when the confidence level for the changed field exceeds a threshold, as has been described in the illustrative examples set forth above.
Still further, the present invention may include the directory itself stored in any one of the electronic devices, for example, personal computer 102, cell phone 104, and server 106 of
In summarizing the embodiments of the invention disclosed above, the synchronization function of the present invention preferably first looks at a time stamp or hash entry to see if the entries are different. Note that the use of a time stamp or hash entry is an effective measure to allow rapid comparison to see which entries differ. The synchronization function of the present invention can compare all the entries upon each activation or selectively review many entries based, for example, on their time stamp information. When the synchronization function identifies records that have been separately modified, it takes a number of actions, as described above and summarized below.
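The rapid comparison mentioned above might look like the following sketch, which assumes that each directory can report a mapping from a shared contact key to that entry's hash value. The function name `entries_needing_sync` and the keying scheme are hypothetical.

```python
from typing import Dict, List


def entries_needing_sync(hashes_a: Dict[str, str],
                         hashes_b: Dict[str, str]) -> List[str]:
    """Return keys of related entries whose hash entry values differ.

    Only keys present in both directories are considered here; entries that
    exist in just one directory are additions and are outside this sketch.
    """
    shared = hashes_a.keys() & hashes_b.keys()
    return sorted(key for key in shared if hashes_a[key] != hashes_b[key])
```

Only the entries returned by such a comparison would then need to be retrieved in full and compared field by field.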
Preferably, the synchronization function of the present invention compares the entries field by field to determine which fields differ between the two entries. For each field that differs across the multiple corresponding entries, the synchronization function of the present invention preferably performs a number of different actions. These actions may include selection of the most recent field, based on the time stamp of the entry, and comparison of the confidence levels of the most recent fields against one or two preferred thresholds. These thresholds may be considered to be an approval threshold and an acceptance threshold. If the confidence level equals or exceeds the approval threshold, all entries are updated to match the field value and confidence level from the most recent field. If the confidence level is less than the approval threshold but is greater than or equal to the acceptance threshold, steps can be taken to verify the results.
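The two-threshold decision can be summarized in a short sketch. The approval threshold of 80% follows the example used earlier in this description; the acceptance threshold of 50% and the names `Action` and `classify_change` are assumptions made purely for illustration.

```python
from enum import Enum


class Action(Enum):
    PROPAGATE = "update all entries"
    VERIFY = "attempt verification first"
    QUESTIONABLE = "treat as questionable data"


def classify_change(confidence: float,
                    approval: float = 0.80,
                    acceptance: float = 0.50) -> Action:
    """Map the confidence of the most recent field to a follow-up action."""
    if confidence >= approval:
        return Action.PROPAGATE
    if confidence >= acceptance:
        return Action.VERIFY
    return Action.QUESTIONABLE
```

A field exactly at either threshold falls into the higher category, matching the "equals or exceeds" and "greater than or equal to" wording above.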
Steps to verify the results may include querying external data sources for corresponding entries, comparing low confidence entries with entries in other directories, analyzing the value of the field to determine if it has a format appropriate for that field, querying external data sources and other field values (referred to hereinafter as cross field validations) to determine if the field value represents a valid result for that field, and/or initiating an interaction with the user to get confirmation on the values being presented.
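Of the verification steps listed, the format analysis lends itself to a brief sketch. The patterns below are deliberately simplistic and are assumptions made for the example, not rules taken from this disclosure; the names `FORMAT_PATTERNS` and `format_verified` are likewise hypothetical.

```python
import re

# Illustrative patterns only; a real directory would need locale-aware rules.
FORMAT_PATTERNS = {
    "email": re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$"),
    "phone": re.compile(r"^\+?[0-9][0-9\-\s().]{6,}$"),
}


def format_verified(field_name: str, value: str) -> bool:
    """Return True only if a pattern exists for the field and the value fits.

    Fields without a known pattern return False, since no format check could
    be carried out and the step cannot count as a successful verification.
    """
    pattern = FORMAT_PATTERNS.get(field_name)
    return bool(pattern and pattern.match(value))
```

A passing format check shows only that the value is plausible for its field, so it would likely carry less weight than confirmation from an external data source or from the user.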
If the entry can be verified based on any of these steps, the synchronization function of the preferred embodiment updates all entries to match the field value from the most recent entry. The confidence level will be determined by the type or types of successful verification steps. For example, for each verification that is successful, the system of the present invention may, as one embodiment, add to the confidence level rating an amount equal to half the difference between the current confidence level and 100%.
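The example adjustment just described, adding half of the remaining distance to 100% for each successful verification, can be sketched as follows. The function name `boost_confidence` is hypothetical, and confidence levels are assumed to be fractions between 0 and 1.

```python
def boost_confidence(confidence: float, successful_verifications: int) -> float:
    """Add half the gap to 1.0 once for each successful verification."""
    for _ in range(successful_verifications):
        confidence += (1.0 - confidence) / 2.0
    return confidence
```

Starting from 0.50, one successful verification yields 0.75 and a second yields 0.875; the level approaches but never reaches 100%.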
If the confidence level is below the acceptance threshold, the synchronization function of the present invention in at least one embodiment could take steps to prevent the questionable data from propagating. These steps may include updating a blank extra field in all entries with the most recent value, copying the field name into the extra field name area, and copying the confidence level as well. In the alternative, these steps may include initiating an interaction with the user to get confirmation on the values being presented. It is also possible in an embodiment of the invention to leave the result alone in the most recent record, without performing any update of the other electronic directories, while notifying the user of the discrepancy.
For the fields that are the same across multiple corresponding entries, the synchronization function of the present invention preferably performs the following actions: setting the confidence level to the highest value among all the corresponding entries and, based on the results determined from these comparisons, updating the directory entries, including the confidence level values as determined.
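The extra-field alternative might be sketched as below, assuming each entry is held as a dictionary whose "extra" slot is either empty (None) or already occupied. The dictionary layout and the name `park_questionable_value` are assumptions made for the example.

```python
from typing import Dict, List


def park_questionable_value(entries: List[Dict], field_name: str,
                            value: str, confidence: float) -> int:
    """Copy a low-confidence value into each entry's blank extra field.

    The original field is left untouched. Entries whose extra field is
    already occupied are skipped. Returns the number of entries updated.
    """
    updated = 0
    for entry in entries:
        if entry.get("extra") is None:
            entry["extra"] = {
                "name": field_name,        # field name copied to the name area
                "value": value,            # most recent, questionable value
                "confidence": confidence,  # confidence level copied as well
            }
            updated += 1
    return updated
```

In this sketch the trusted value in the original field survives, while the questionable value remains available in every directory until it is verified or discarded.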
More specifically, the confidence level is set as a function of the values of the corresponding entries. One such function is to take the maximum value. If all the entries agree, it would be reasonable to increase the confidence level above that of each one of them to reflect the increased confidence that results from seeing the same value multiple times. One such function would be setting the confidence to a value of 1 minus the product of (1−Ci), where Ci is the confidence level of the ith entry.
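Both functions mentioned above, the maximum and one minus the product of (1−Ci), are sketched below for comparison. The function names are hypothetical, and confidence levels are assumed to be fractions between 0 and 1. Note that the product form implicitly treats the entries as independent observations, which is an assumption of this sketch rather than a statement from the description.

```python
from math import prod
from typing import Iterable


def max_confidence(confidences: Iterable[float]) -> float:
    """Take the highest confidence among the agreeing entries."""
    return max(confidences)


def combined_confidence(confidences: Iterable[float]) -> float:
    """Compute 1 minus the product of (1 - Ci) over the agreeing entries."""
    return 1.0 - prod(1.0 - c for c in confidences)
```

For two agreeing entries held at 0.90 and 0.80, the maximum gives 0.90, while the product form gives 1 − (0.10 × 0.20) = 0.98, reflecting the added assurance of seeing the same value twice.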
One could also update a confidence level automatically based on user activities employing the stored data. For example, if the user receives an e-mail from an e-mail address with a low confidence rating, the confidence value can be increased. Or, if the user places a successful call to a number with a low confidence rating, the confidence level for that entry could be updated accordingly. Success here might require more than just the connection, but rather require that the call last a minimum duration to ensure that the confidence level is not increased based on a call that actually is a wrong number.
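A usage-based update of this kind might be sketched as follows. The 30-second minimum duration, the size of the increase (half the remaining gap to 100%), and the name `confidence_after_call` are all assumptions made for the example; the description specifies only that some minimum duration may be required.

```python
MIN_CALL_SECONDS = 30  # assumed minimum duration; not specified in the text


def confidence_after_call(confidence: float, connected: bool,
                          duration_seconds: float) -> float:
    """Raise confidence in a phone number only after a sufficiently long call."""
    if connected and duration_seconds >= MIN_CALL_SECONDS:
        # Assumed increase: close half of the remaining gap to 1.0.
        return confidence + (1.0 - confidence) / 2.0
    # Short or failed calls may indicate a wrong number; leave unchanged.
    return confidence
```

A five-second connected call, which may simply have reached a wrong number, leaves the confidence level where it was.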
The foregoing description has been presented for purposes of illustration. It is not exhaustive and does not limit the invention to the precise forms or embodiments disclosed. Modifications and adaptations of the invention can be made from consideration of the specification and practice of the disclosed embodiments of the invention. For example, one or more steps of methods described above may be performed in a different order or concurrently and still achieve desirable results.
Other embodiments of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. It is intended that the specification and examples be considered as exemplary only, with a true scope of the invention being indicated by the following claims.