Various embodiments of the invention are disclosed in the following detailed description and the accompanying drawings.
The invention can be implemented in numerous ways, including as a process; an apparatus; a system; a composition of matter; a computer program product embodied on a computer readable storage medium; and/or a processor, such as a processor configured to execute instructions stored on and/or provided by a memory coupled to the processor. In this specification, these implementations, or any other form that the invention may take, may be referred to as techniques. In general, the order of the steps of disclosed processes may be altered within the scope of the invention. Unless stated otherwise, a component such as a processor or a memory described as being configured to perform a task may be implemented as a general component that is temporarily configured to perform the task at a given time or a specific component that is manufactured to perform the task. As used herein, the term ‘processor’ refers to one or more devices, circuits, and/or processing cores configured to process data, such as computer program instructions.
A detailed description of one or more embodiments of the invention is provided below along with accompanying figures that illustrate the principles of the invention. The invention is described in connection with such embodiments, but the invention is not limited to any embodiment. The scope of the invention is limited only by the claims and the invention encompasses numerous alternatives, modifications and equivalents. Numerous specific details are set forth in the following description in order to provide a thorough understanding of the invention. These details are provided for the purpose of example and the invention may be practiced according to the claims without some or all of these specific details. For the purpose of clarity, technical material that is known in the technical fields related to the invention has not been described in detail so that the invention is not unnecessarily obscured.
Individuals are increasingly concerned with their privacy. For example, private information associated with individuals can be available online or in the possession of third parties. Such private information can be personal and/or can be confidential such that access to such private information can pose personal and/or financial risks. It is becoming increasingly difficult for individuals to determine if any such private information is available online or in the possession of third parties (e.g., authorized third parties, or in some cases, unauthorized third parties).
What are needed are new and improved techniques for entities, such as users and/or other entities, to determine whether private information is available online or in the possession of third parties. For example, private information can refer to any information that a user deems or desires to be maintained as private to the user and/or otherwise not generally available to the public or third parties (e.g., without the user's authorization). Various example types or categories of private information are described herein with respect to various embodiments.
Accordingly, techniques for privacy scoring are disclosed. In some embodiments, privacy scoring includes collecting information associated with an entity; and generating a privacy score based on the private information that was collected that is associated with the entity. In some embodiments, privacy scoring further includes determining private information that was collected that is associated with the entity.
In some embodiments, privacy scoring further includes outputting the privacy score. In some embodiments, privacy scoring further includes outputting a privacy report that includes the privacy score. In some embodiments, privacy scoring further includes outputting a privacy report that includes the privacy score, wherein the privacy score corresponds to an overall privacy score. In some embodiments, privacy scoring further includes outputting a privacy report that includes the privacy score and a recommendation to improve the privacy score.
In some embodiments, privacy scoring further includes alerting the entity based on the privacy score. In some embodiments, privacy scoring further includes periodically collecting information associated with the entity; and updating the privacy score. In some embodiments, privacy scoring further includes periodically collecting information associated with the entity; updating the privacy score; and alerting the entity that the privacy score has been updated.
In some embodiments, privacy scoring further includes verifying that the private information is associated with the entity (e.g., based on entity feedback and/or using various other techniques, such as described herein). In some embodiments, privacy scoring further includes verifying that the private information is associated with the entity and is private data (e.g., based on entity feedback and/or using various other techniques, such as described herein).
In some embodiments, privacy scoring further includes periodically collecting information associated with the entity. In some embodiments, privacy scoring further includes collecting information associated with the entity using an application programming interface to request data from a third party data source (e.g., to collect structured data related to the entity). In some embodiments, privacy scoring further includes collecting information associated with the entity using a site scraper to extract data from a web site (e.g., to collect unstructured data related to the entity). In some embodiments, privacy scoring further includes collecting information associated with the entity using a search engine to extract data from a plurality of web sites (e.g., to collect unstructured data related to the entity).
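By way of illustration only, the following Python sketch shows how these three collection modes (an API request to a third party data source, a site scraper, and a search engine query) might be exercised. The endpoint URLs, query parameters, and response fields used below are hypothetical placeholders and do not correspond to the API of any particular third party data source or search engine.

```python
# Illustrative sketch only: endpoint URLs, parameters, and response fields are hypothetical.
import requests

PEOPLE_SEARCH_API = "https://api.example-people-search.test/v1/search"   # hypothetical
WEB_SEARCH_API = "https://api.example-search.test/v1/query"              # hypothetical


def collect_structured(name: str, city_state: str, api_key: str) -> dict:
    """Structured collection: request data about an entity from a third party API."""
    resp = requests.get(
        PEOPLE_SEARCH_API,
        params={"name": name, "location": city_state},
        headers={"Authorization": f"Bearer {api_key}"},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()  # structured fields (name, address, age, ...) keyed by field name


def collect_unstructured(url: str) -> str:
    """Unstructured collection: scrape the raw page text from a single web site."""
    resp = requests.get(url, timeout=10)
    resp.raise_for_status()
    return resp.text


def collect_from_search(query: str, api_key: str) -> list:
    """Search-based collection: ask a search API for candidate URLs to scrape."""
    resp = requests.get(WEB_SEARCH_API, params={"q": query, "key": api_key}, timeout=10)
    resp.raise_for_status()
    return [item["url"] for item in resp.json().get("results", [])]
```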
Privacy Platform
For example, a privacy report can be output to a user. The privacy report can provide an analysis of the user's digital footprint (e.g., exposed user related data on the Internet and other publicly available data sources) and an analysis of the user's exposed private data (e.g., age, birth date, social security number, and/or other personal, confidential, or sensitive information), such as what data is exposed, where such private information is exposed, how it was exposed (e.g., to potentially infer that such data may have been exposed when the user signed up for an account with a particular third party entity or was obtained from a court record), and/or what it is being used for (e.g., targeted marketing activities, stalking activities, and/or other activities). The privacy report can also include recommendations to the user to reduce their privacy risks.
As another example, a privacy score (e.g., a privacy report that includes a privacy score) can be output to a user. The privacy score is based on a privacy risk analysis of the user's digital footprint (e.g., exposed user related data on the Internet and other publicly available data sources) and of the user's exposed private data (e.g., age, birth date, and/or other personal, confidential, or sensitive information). For example, the privacy score can be provided along with the privacy report or as part of the privacy report to provide the user with an objective and/or relative privacy risk-based measure and to facilitate the user being able to gauge their private data exposure and risks. The privacy report can also include recommendations to the user to improve their privacy score and reduce their privacy risks.
In the example shown, the user of client device 106 (hereinafter referred to as “Bob”) owns his own business (“Bob's Company”). The user of client device 108 (hereinafter referred to as “Alice”) is employed by a national company (“ACME Company”). As will be described in more detail below, Bob and Alice can each access the services of privacy platform 102 via network 104, such as the Internet, to determine whether any of their private information is available online and/or in the possession of third parties. The techniques described herein can work with a variety of client devices 106-108 including, but not limited to, personal computers, tablet computers, smartphones, and/or other computing devices.
In some embodiments, privacy platform 102 is configured to collect personal data and other data determined to be potentially associated with a user from a variety of sources, including websites 110-112, third party data sources 114, social networking websites 120-122, and other Internet or web based sources, such as blogs and forums 132-134. In some embodiments, users of the privacy platform 102, such as Alice and Bob, can also provide user related data to privacy platform 102, such as their full legal name, residence address(es), email address(es), phone number(s), employment information, age, birth date, and/or other personal or identifying information that can be used by the privacy platform to identify information that may be associated with the user (e.g., to perform targeted data collection and private data isolation as further described herein). In the examples described herein, web sites 110-112 can be any form of web site that can include content about entities, such as users, associations, corporations, government organizations, and/or other entities. Examples of social networking sites 120 and 122 include Facebook, Twitter, and Foursquare. In some examples, social networking sites 120-122 can allow users to take actions such as “checking in” to locations. Finally, personal blog 134 and online forum 132 are examples of other types of websites “on the open Web” that can include information that may be considered private by a user or other entity.
Platform 102 is illustrated as a single logical device in
Account/Entity Setup
For example, in order to access the services provided by privacy platform 102, Bob first registers for an account with the platform. At the outset of the process, he accesses interface 202 (e.g., a web-based interface) and provides information such as a desired username and password for his new account with the platform. He also provides payment information (if applicable). If Bob has created accounts for, for example, himself, his family, and/or his business on social networking sites such as sites 120 and 122, Bob can identify those accounts to platform 102 as well. In some cases, Bob can call the service provider to register and/or setup accounts via a telephony based registration/account set-up process.
Next, Bob is prompted by platform 102 to provide the name of the entity for which he wants the privacy platform services to be performed, which in this case is assumed to be himself, such that Bob can input his full legal name (e.g., "Bob Smith"), his personal residence address (e.g., "123 Peachtree St.; Atlanta, Ga. 30303"), and (optionally) the type of information that he deems to be private information (e.g., birthdate, social security number, health information, hobbies, and/or other information). This information entered by Bob is provided to auto find engine 204, which is configured to locate, across web sites on the Internet (e.g., the World Wide Web) and/or various other online third party data sources, any information that is determined to be associated with Bob, if present. The data collection performed by auto find engine 204 can include structured data collection and unstructured data collection. For example, web sites 110-112 can be identified as having information potentially associated with Bob based on content analysis (e.g., using various natural language processing techniques). In some embodiments, a search engine, such as Bing, Google, and/or Yahoo, is used to identify URLs of particular web sites that include relevant content using search interface 210 of auto find engine 204.
In some embodiments, auto find engine 204 also determines trackers associated with a user's browser (e.g., tracker cookies cached on a user's client device, such as Bob's laptop 106) to determine which companies are tracking Bob Smith (e.g., by looking at cookies stored locally on Bob Smith's browser/client). For example, various inferences can be made based on such trackers and using information from third party sources that classify such trackers as further described herein.
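A minimal sketch of such tracker classification is shown below, assuming only that cookie domains can be read from the client's browser store; the tracker domains and categories in the mapping are invented placeholders rather than real classification data.

```python
# Illustrative sketch: the tracker domains and categories below are invented placeholders.
KNOWN_TRACKERS = {
    "ads.example-network.test": "ad tracker network",
    "metrics.example-analytics.test": "analytics tracker",
}


def classify_trackers(cookie_domains):
    """Map each cookie domain that matches a known tracker to its classification."""
    findings = {}
    for domain in cookie_domains:
        normalized = domain.lstrip(".")
        if normalized in KNOWN_TRACKERS:
            findings[normalized] = KNOWN_TRACKERS[normalized]
    return findings
```

For example, classify_trackers([".ads.example-network.test", "www.example.com"]) would report a single ad tracker network under these assumed mappings.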
In some embodiments, private information extractor engine 211 extracts potentially private information from the information that is collected by auto find engine 204. For example, structured information can be processed (e.g., based on fields of the structured data) to extract potentially relevant private information associated with Bob. In addition, unstructured information can be processed (e.g., using content based analysis techniques) to extract potentially relevant private information associated with Bob.
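The following is a minimal, illustrative sketch of both extraction paths, assuming simple pattern matching for the unstructured case and known field names for the structured case; an actual extractor would apply the richer natural language processing and content analysis techniques described herein.

```python
# Illustrative sketch: patterns and field names are simplified examples.
import re

SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
DATE_PATTERN = re.compile(r"\b(?:\d{1,2}/\d{1,2}/\d{2,4}|[A-Z][a-z]+ \d{1,2}, \d{4})\b")
PRIVATE_FIELDS = ("birth_date", "ssn", "home_address", "phone")  # assumed field names


def extract_from_unstructured(text: str) -> dict:
    """Pull candidate private data out of free text with simple pattern matching."""
    return {
        "ssn_candidates": SSN_PATTERN.findall(text),
        "date_candidates": DATE_PATTERN.findall(text),
    }


def extract_from_structured(record: dict) -> dict:
    """Pull candidate private data out of a structured record by field name."""
    return {field: record[field] for field in PRIVATE_FIELDS if field in record}
```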
In some embodiments, results obtained by private information extractor engine 211 are provided to verification engine 212, which confirms whether such information is associated with the entity of interest, which is Bob in this example. In some embodiments, verification engine 212 also determines whether such information includes private information associated with the entity of interest, which is Bob in this example. Verification engine 212 can be configured to verify all results (including any obtained from sources 110-134), and can also be configured to verify (or otherwise process) just those results obtained via interface 210. As one example, for a given query, the first ten results obtained from search interface 210 can be examined. The result that has the best match score and also includes the expected entity name and physical address is designated as potentially relevant information on the queried site. As another example, based on verification and entity feedback, the collection process can be iteratively performed to execute more targeted data collection and private information extraction based on the verification and entity feedback to improve results (e.g., refined searches can be performed using the search interface 210 in subsequent iterations of the data collection and private information extraction process).
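As a sketch of the result verification heuristic described above (take the best scoring of the first ten results that also contains the expected entity name and physical address), the following function illustrates one way the check could be expressed; the result dictionaries and their 'text' and 'match_score' keys are assumed shapes for illustration, not a defined interface of verification engine 212.

```python
def verify_results(results, expected_name, expected_address, top_n=10):
    """Among the first top_n results, return the best-scoring one that mentions both
    the expected entity name and physical address, or None if no result qualifies.
    Each result is assumed to be a dict with 'text' and 'match_score' keys."""
    candidates = [
        r for r in results[:top_n]
        if expected_name.lower() in r["text"].lower()
        and expected_address.lower() in r["text"].lower()
    ]
    return max(candidates, key=lambda r: r["match_score"], default=None)
```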
In some embodiments, verification engine 212 presents results to Bob for verification that the potentially private information corresponds to information that is associated with Bob. In some embodiments, verification engine 212 also presents results to Bob for verification that the potentially private information includes Bob's private information. As an example, Bob may be shown (via interface 202) a set of URLs on each of the sites 110-112 and extracted information from such URLs that were previously determined by private information extractor engine 211 and auto find engine 204 to include potentially private information associated with Bob. Once confirmed by Bob, the source (e.g., URLs, third party data source, ad tracker network, and/or other source identifying information) along with the verified private information (e.g., extracted from the third party data source), tracker information, and any other appropriate data are stored in database 214. Examples of such other data can include information associated with the data source (e.g., classification of the data source, reputation of the data source, prominence of the data source, and/or other information) and/or any social data (e.g., obtained from social sites 120-122).
At 306, results of the private information data collection performed at 304 are verified. As one example of the processing performed at 306, verification engine 212 performs checks such as confirming that various user information received at 302 is present in a given result (e.g., using content analysis techniques and threshold matching techniques). As another example, a user can be asked to confirm that results are associated with the user and that private information is included in such results, and if so, that confirmation is received as a verification at 306. Finally, at 308, verified results are stored. As an example, source identifiers (e.g., URLs or other source identifying information) for each of the verified results are stored in database 214. Although pictured as a single database in
Data Collection and Processing
In some embodiments, once an entity (e.g., Bob Smith) has an account on privacy platform 102, collecting and processing of potentially relevant private data is performed. As shown, platform 102 includes a scheduler 402 that periodically instructs collection engine 404 to obtain data from sources such as sources 110-134. Scheduler 402 can be configured to initiate data collection based on a variety of rules. For example, it can cause data collection to occur once a day for all customers (e.g., enrolled entities) across all applicable sites. It can also cause collection to occur with greater frequency for certain entities (e.g., which pay for premium services) than others (e.g., which have free accounts). Further, collection can be performed across all sources (e.g., sources 110-134) with the same frequency or can be performed at different intervals (e.g., with collection performed on site 110 once per day and collection performed on site 112 once per week).
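For illustration, a scheduling rule of this kind could be sketched as follows; the tier names, source identifiers, and intervals are arbitrary examples and not the actual configuration of scheduler 402.

```python
# Illustrative policies; actual frequencies would be deployment configuration.
from datetime import datetime, timedelta

TIER_INTERVAL = {"premium": timedelta(hours=6), "free": timedelta(days=1)}
SOURCE_INTERVAL = {"site_110": timedelta(days=1), "site_112": timedelta(weeks=1)}


def due_collections(entities, last_run, now=None):
    """Yield (entity_id, source_id) pairs whose refresh is due. A refresh is due when
    the longer of the entity's tier interval and the source's own interval has elapsed
    since the last recorded collection for that pair."""
    now = now or datetime.utcnow()
    for entity in entities:                         # e.g., {"id": "bob", "tier": "premium"}
        for source_id, source_interval in SOURCE_INTERVAL.items():
            interval = max(TIER_INTERVAL[entity["tier"]], source_interval)
            last = last_run.get((entity["id"], source_id), datetime.min)
            if now - last >= interval:
                yield entity["id"], source_id
```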
In addition to or instead of the scheduled collection of data, data collection can also be initiated based on the occurrence of an arbitrary triggering event. For example, collection can be triggered based on a login event by a user such as Bob (e.g., based on a permanent cookie or password being supplied). Collection can also be triggered based on an on-demand refresh request by the user (e.g., where Bob clicks on a “refresh my data” button in interface 202).
In some embodiments, private data isolation engine 406 performs extraction of potentially private information associated with an entity. In some embodiments, the private data isolation engine extracts private information from structured data sets and from unstructured data sets using various techniques. For example, structured data set analysis can be performed using fields, such as name, address, past address, birth date, religion, race/ethnicity, education level, social security number, parents' names, occupation, health information/medical records, and so forth. As another example, unstructured data set analysis can be performed using various natural language processing (NLP) and contextual analysis techniques to perform entity extraction; determine associations with a particular entity, such as occupation, phone number, hobby, political party, or action taken (e.g., visited Stone Mountain Park in Atlanta, Ga.); perform inferences; and apply verification techniques (e.g., including user based feedback verification). In some embodiments, the verification provides a feedback loop that can be used by the private data isolation engine to become more accurate and to provide refined data collection and private data isolation for a given entity. In some embodiments, the private data isolation engine includes a classifier engine.
In some embodiments, extracted structured data is used to facilitate identifying a user such as Bob, and the structured data can then be used to filter the unstructured data using various techniques described herein. For example, Bob can initially provide the platform with relevant user information (e.g., Bob Smith, Atlanta, Ga. and possibly other information). The collection engine of the platform can send requests to third party data sources (e.g., Spokeo and/or other sources) using API based queries based on such relevant user information. The platform receives back structured data set results based on such queries. The private data isolation engine of the platform can isolate information that is relevant to the user and provide that as input to the collection engine, which can then perform web based crawling and/or targeted searches using search engine(s) to collect additional data that may be relevant to the user, in which such additionally collected information can include structured data and unstructured data. The private data isolation engine of the platform can also isolate information that is relevant to the user from such structured data and unstructured data. The private data isolation engine can further process the isolated information determined to be relevant to the user to extract and store (e.g., at least temporarily) potentially private data determined to be associated with the user. In some embodiments, the verification engine can verify whether the potentially private data is associated with Bob and may include private information associated with Bob (e.g., which can also include user feedback from Bob based on the extracted results). The verified results can then be used to generate a privacy report and/or a privacy report with a privacy score for Bob as further described herein. In some embodiments, such collected and extracted information is stored temporarily (e.g., in memory) for analysis, processing, and reporting purposes but need not be stored permanently or archived for longer periods of time.
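The end-to-end flow described in this paragraph can be summarized by the following illustrative outline, in which each stage (third party query, isolation, targeted search, verification) is supplied as a callable; this is an orchestration sketch under those assumptions, not the implementation of collection engine 404 or private data isolation engine 406.

```python
def privacy_pipeline(user_info, query_api, isolate, search, verify):
    """Orchestration sketch: seed with structured third party data, use the isolated
    facts to drive targeted searches, isolate again, then keep only verified findings.
    Each stage is injected as a callable so the flow itself stays source-agnostic."""
    structured = query_api(user_info)                  # e.g., name plus city/state lookup
    seed_facts = isolate(structured)                   # facts judged relevant to the user
    documents = [search(fact) for fact in seed_facts]  # targeted crawling/searching
    candidates = [isolate(doc) for doc in documents]   # isolate private data per document
    return [c for c in candidates if verify(c, user_info)]
```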
In some embodiments, the private data isolation engine also ranks sources. For example, a source that is more prominent or widely accessed can be given a higher rank than a less prominent source (e.g., a Google search result on page 1 can be deemed more prominent than a Google search result on page 100, and a Lexis Nexis search result can be deemed more prominent than a less widely used source, such as a particular individual's personal blog). The ranking of the source can be relevant information that is identified in a privacy report and/or used as a factor or weighting factor in calculating a privacy score that is generated and output to the user.
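A simple, illustrative way to express such prominence-based weighting is shown below; the geometric decay by search result page is an arbitrary example of one possible scheme, not the weighting actually used by the private data isolation engine.

```python
def prominence_weight(result_page: int, base_weight: float = 1.0, decay: float = 0.5) -> float:
    """Weight a source more heavily the earlier it appears in search results:
    a page-1 result keeps the full base weight, and later pages decay geometrically."""
    return base_weight * (decay ** (result_page - 1))
```

Under these assumed constants, a page-1 result is weighted 1.0 and a page-3 result 0.25; an analogous weight could be assigned to widely used sources (e.g., Lexis Nexis) relative to a personal blog.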
At 504, a determination is made as to which sources should be accessed. As an example, collection engine 404 can review a set of stored sources in database 214 for Bob based on a prior private information data collection process executed for Bob. The set of stored sources associated with Bob are the ones that will be used by collection engine 404 during the refresh operation. As previously mentioned, a refresh can be performed on behalf of multiple (or all) entities, instead of an individual one such as Bob. In such a scenario, portion 504 of the process can be omitted as applicable. In some embodiments, additional sources can also be accessed during a refresh operation, and such sources need not be limited to the previously identified set of sources associated with Bob based on a prior data collection operation for Bob.
At 506, information is obtained from the sources determined at 504. As shown in
Other types of source data collection engines can extract other types of data and/or communicate with other types of sources. As an example, source data collection engine 426 is configured to extract potentially private data from social site 120 using an API provided by site 120, such as Spokeo, which is a person search site that provides an API to which a person's name and their city and state (e.g., Bob Smith, Atlanta, Ga.) can be passed to retrieve their previously collected data. As another example, when an instance of source data collection engine 428 is executed on platform 102, a search is performed across the World Wide Web for blog, forum, or other web pages that may discuss potentially private data associated with Bob. In some embodiments, additional processing is performed on any results of such a search, such as content analysis to verify whether such information is associated with Bob and whether such information includes potentially relevant private information associated with Bob.
In various embodiments, information, obtained on behalf of a given entity such as Bob (or Bob's Company) or Alice (or ACME Company), is retrieved from different types of sites in accordance with different schedules. For example, while general web site data can be collected hourly, or on demand, social data (collected from sites 120-122) can be collected once a day. Data can be collected from sites on the open Web (e.g., web sites, editorials, blogs, forums, and/or other sites) once a week.
At 508, any new results (i.e., those not already present in database 214) are stored in database 214. As needed, the results are processed prior to being included in database 214. In various embodiments, database 214 supports heterogeneous records and such processing is omitted or modified as applicable.
Prior to the first time process 500 is executed with respect to Bob, no previously collected private information data associated with Bob is present in database 214. Portion 506 of the process is performed for each of the data sources applicable to Bob (via instances of the applicable source data collection engines), and the collected data is stored at 508. On subsequent refreshes of data pertinent to Bob, only new/changed information is added to database 214. In various embodiments, alerter 432 provides an alerting engine that is configured to alert Bob (e.g., via an email message, phone call, text message, or another form of communication) whenever process 500 (or a particular portion thereof) is performed with respect to his account. In some cases, alerts are only sent when new private information associated with Bob is collected, and/or when privacy scores associated with Bob (described in more detail below) change, or change by more than a threshold amount.
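A minimal sketch of such an alerting rule follows; the threshold value, the score semantics, and the injected delivery function are assumptions made for illustration only.

```python
def should_alert(previous_score, new_score, new_items, threshold=25):
    """Alert only when new private information was collected or the privacy score
    moved by more than the configured threshold since the last refresh."""
    return bool(new_items) or abs(new_score - previous_score) > threshold


def send_alert(entity_contact, message, deliver):
    """Deliver an alert through an injected transport (email, phone call, text, ...)."""
    deliver(entity_contact, message)   # 'deliver' stands in for the real channel
```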
Privacy Reporting
Platform 102 is configured to generate a variety of privacy reports on behalf of entities including users, such as Bob and Alice, and businesses or other entities, such as Bob's Company and ACME Company. As will be described in more detail below, the privacy reports provide users with perspective on whether their private information is available online or in the possession of third parties. For example, a privacy report can detail what private information associated with Bob is available online or in the possession of third parties, where such private information is available, who has access to such private information, and possibly an intended use by third parties who are determined to have access to such private information.
In some embodiments, privacy reporting performed by privacy platform 102 includes collecting information associated with an entity (e.g., Bob, Alice, or another entity); and generating a privacy report based on private information that was collected that is associated with the entity. In some embodiments, privacy reporting further includes outputting the privacy report, such as shown in
In some embodiments, whenever Bob accesses platform 102 (and/or based on the elapsing of a certain amount of time), the privacy report shown in
In region 706 of interface 700, various privacy report data are presented including various summary reports for different categories of private data. In particular, the summary reports provide Bob with a quick perspective on what private information associated with Bob is available online or in the possession of third parties. Three example categories are shown in region 706, each of which is discussed below. A category 710 for a personal related private data summary report is provided to indicate to Bob what personal related private data (e.g., birthdate, mother's maiden name, father's birthdate, and/or other personal related private data) is available online or in the possession of third parties. A category 712 for a financial related private data summary report is provided to indicate to Bob what financial related private data (e.g., financial account information and/or other financial related private data) is available online or in the possession of third parties. A category 714 for a tracker activities summary report is provided to indicate to Bob what trackers may be tracking Bob's online activities, what private data such trackers may have obtained, and how that private data may be used by such trackers (e.g., ad tracker networks).
In some embodiments, the summary reports include links or drill-down options to view more information, such as regarding a particular set of private data that was collected, a particular source of such private data, and how such private data may be used by the source or other third parties (e.g., based on stated policies associated with such third parties, past behaviors of such third parties, inferences, and/or other techniques). In some embodiments, for each category, Bob can see tips on how to reduce access to his private data online and/or by third parties by clicking on an appropriate box (e.g., boxes 720-724 for tips on improving privacy). Example recommendations can include what data should be removed from a site/source, what cookies to remove from a user's client browser, what information not to use for security/password and/or user identity verification purposes (e.g., do not use your mother's maiden name, your birthdate, and/or other information), deleting an account with a particular third party, and stopping the flow of private data into the ecosystem (e.g., recommending that a user opt out of being tracked, such as by using the browser cookie no track mechanism from a service provider to stop third party cookie tracking). In some embodiments, such boxes are only displayed for privacy issues that can/should be improved.
At 804, a privacy report for the entity based on the collected information is generated (e.g., using privacy reporting engine 602). Various techniques for generating privacy reports are discussed above. Other approaches can also be used, such as by generating a privacy report for each of the categories of private data associated with Bob to provide a composite report based on those category reports.
Finally, at 806, the privacy report is output (e.g., using interface 700). As an example, a privacy report is provided as output in region 706 of interface 700. As another example, privacy reporting engine 602 can be configured to send privacy reports to users via email (e.g., using an alerting engine, such as alerter 432).
As will now be apparent to one of ordinary skill in the art in view of the embodiments described herein, various other forms of privacy reporting can be output using the privacy platform and various techniques described herein. For example, a timeliness factor can also be reported to indicate a last time a source was visited for private data collection. As another example, information about sources determined to have private data associated with the entity can also be reported (e.g., a reputation of such sources in terms of how such sources use private data of users). Further, the various privacy factors described above need not all be presented or output in the privacy report nor need they be employed in the manners described herein. Additional factors can also be used when generating a privacy report.
In some embodiments, a privacy report is provided that also includes a privacy score, which provides a scoring based metric to inform an entity of their privacy risks, as discussed below.
Privacy Scoring
An example computation of a privacy score that can be included in a privacy report is discussed below in conjunction with
In some embodiments, whenever Bob accesses platform 102 (and/or based on the elapsing of a certain amount of time), the composite score shown at 1004 in
In region 1004 of interface 1000, a composite privacy score (728 points in this example) is depicted on a scale 1006 as shown. Example ways of computing a composite privacy score are described below. The composite privacy score provides Bob with a quick perspective, for example, on Bob's privacy risks. A variety of factors can be considered in determining a composite privacy score. Six example factors are shown in region 1008, each of which is discussed below. For each factor, Bob can see tips on how to improve his score with respect to that factor by clicking on the appropriate box (e.g., box 1022 for tips on improving score 1010). In the example shown in
Overall Score (1010): This value reflects the average or composite privacy risk score across all categories. As shown, if Bob clicks on box 1022, he will be presented with a suggestion(s), such as a list of recommendations to improve Bob's overall privacy score and minimize his privacy risks. In some embodiments, personalized advice may also be provided, such as recommending to Bob that he subscribe to automated privacy risk alerts. In some embodiments, automated privacy reporting alerts and/or privacy scoring alerts are provided as a subscription service. In some embodiments, automated privacy reporting alerts and/or privacy scoring alerts are provided as a free service (e.g., for a limited period of time).
Personal (1012): This score indicates privacy risks associated with a user's personal related private data (e.g., mother's maiden name and father's birthdate, which are often selected by users as security questions for account/identity verification, and/or other personal data). For example, if Bob clicks on box 1024, he will be presented with a suggestion(s), such as the following: “Remove your birthdate information from your web site” or “Do not use your father's birthdate for purposes of a security question/identity verification.”
Financial (1014): This score indicates privacy risks associated with a user's financial related private data. For example, if Bob clicks on box 1026, he will be presented with a suggestion(s), such as the following: “Close your account with third party X” and/or “Do not use your social security number as a username or password.”
Social Factors (1016): This score indicates privacy risks associated with a user's social related private data. For example, by clicking on box 1028, Bob will be presented with an appropriate suggestion(s) for improvement.
Tracker (1018): This score indicates privacy risks associated with tracker related activities. For example, by clicking on box 1030, Bob will be presented with an appropriate suggestion(s) for improvement, such as to remove particular tracker cookies from his client/browser and/or to recommend that Bob opt out of being tracked using a service, such as described herein.
Other (1020): This score indicates privacy risks associated with various other private related data, such as health related private data and/or other private related data. In some embodiments, entities, such as Bob, can configure their account to identify new categories of interest, such as hobbies or other categories that Bob may deem to be private data that can be monitored by the privacy platform disclosed herein. For example, by clicking on box 1032, Bob will be presented with an appropriate suggestion(s) for improvement.
In various embodiments of interface 1000, additional controls for interactions are made available. For example, a control can be provided that allows a user to see specific extractions of private data and their source(s)—including private data from sources that contributed the most to/deviated the most from the overall score (and/or individual factors). As one example, a third party source that is weighted heavily in the calculation of a score or scores can be identified and presented to the user. The user could then attempt to resolve the unauthorized or undesired usage of the user's private data by that third party source, such as by using a service offered by a service provider such as Reputation.com to assist the user to remove such identified privacy risks. As another example, problematic tracker cookies can be identified and presented to the user, allowing the user to opt out of being tracked, such as using the browser cookie no track mechanism from a service provider such as Reputation.com to stop third party cookie tracking. As yet another example, if an otherwise unauthorized or undesired disclosure of certain private data is exposed by a third party source/site, Bob can be advised to avoid using such private data for security purposes (e.g., do not use such exposed private data for security questions for access to financial accounts, and/or other security related or identity verification purposes).
A variety of weights can be assigned to the above factors when generating the composite score shown in region 1004. Further, the factors described above need not all be employed nor need they be employed in the manners described herein. Additional factors can also be used when generating a composite score. An example computation of a composite score is discussed below.
In some embodiments, scoring engine 902 computes a base score that is a weighted average of all of the private data related risks identified in each category of privacy risks, such as shown in
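By way of example only, a weighted average base score of this kind might be sketched as follows; the category weights and the 0 to 1000 display scale are illustrative assumptions and not the actual parameters of scoring engine 902.

```python
# Illustrative only: categories, weights, and the display scale are placeholders.
CATEGORY_WEIGHTS = {
    "personal": 0.3, "financial": 0.3, "social": 0.15, "tracker": 0.15, "other": 0.1,
}


def base_score(category_risks, weights=CATEGORY_WEIGHTS):
    """Weighted average of per-category risk values (each assumed to be in 0..1),
    rescaled to a 0-1000 range for display."""
    total_weight = sum(weights[c] for c in category_risks)
    average = sum(weights[c] * category_risks[c] for c in category_risks) / total_weight
    return round(average * 1000)
```

For instance, base_score({'personal': 0.8, 'financial': 0.6, 'social': 0.4, 'tracker': 0.7, 'other': 0.2}) evaluates to 605 under these assumed weights.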
As explained above, a variety of techniques can be used by scoring engine 902 in determining privacy scores. In some embodiments, scores for all types of entities are computed using the same sets of rules. In other embodiments, privacy score computation varies based on type of entity, category of user (e.g., profession, age, geography, and/or other categorization of users), configured criteria by the entity for that account (e.g., Bob can input custom configurations for his privacy reporting and privacy scoring for his account), geography of the entity, and/or other factors or considerations (e.g., privacy scores for adults using one approach and/or one set of factors, and privacy scores for doctors using a different approach and/or different set of factors). Scoring engine 902 can be configured to use a best in class entity when determining appropriate thresholds/values for entities within a given categorization. The following are yet more examples of factors that can be used in generating privacy scores.
In some embodiments, the privacy score is based on an open ended scale (e.g., the privacy score becomes higher as more private information for Bob becomes publicly available or is accessed by third parties). In some embodiments, marketing companies that are determined to have access to private information are weighted based on prominence, reputation, privacy policy, and/or other analysis of such entities (e.g., the privacy platform can allocate different reputations to different third party data sources, such as Spokeo, Lexis Nexis, and/or other sources, based on such criteria). In some embodiments, ad tracking networks/companies that are determined to have access to private information (e.g., are tracking a user such as Bob using tracker cookies) are weighted based on prominence, reputation, privacy policy, and/or other analysis of such entities (e.g., ad tracking networks can be classified based on a reputation determined for each ad tracker, such as based on the ad tracker's privacy policy and/or known/past behaviors).
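A sketch of such reputation-based weighting follows; the reputation classes and multipliers are invented placeholders used only to show how a poorer-reputation source or tracker could contribute more heavily to a user's measured exposure.

```python
# Illustrative multipliers; real values would come from classifying each source or tracker.
REPUTATION_WEIGHT = {"high_reputation": 0.5, "unknown": 1.0, "poor_reputation": 1.5}


def weighted_exposure(exposures):
    """Sum per-source exposure values, weighting sources with poorer reputations
    more heavily. 'exposures' is an iterable of (value, reputation_class) pairs."""
    return sum(value * REPUTATION_WEIGHT.get(rep, 1.0) for value, rep in exposures)
```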
At 1104, a privacy score for the entity based on the collected information is generated (e.g., using privacy scoring engine 902). Various techniques for generating privacy scores are discussed above. Other approaches can also be used, such as by determining an average score for each of the categories of private data associated with Bob and combining those average scores (e.g., by multiplying or adding them and normalizing the result).
Finally, at 1106, the privacy score is output (e.g., using interface 1000). As one example, a privacy score is provided as output in region 1004 of interface 1000. As another example, scoring engine 902 can be configured to send privacy scores to users via email (e.g., using an alerting engine, such as alerter 432).
As will now be apparent to one of ordinary skill in the art in view of the embodiments described herein, various other forms of privacy scoring can be generated and output using the privacy platform and various techniques described herein. For example, information about sources determined to have private data associated with the entity can also be used to impact a privacy score (e.g., a reputation of such sources in terms of how such sources use private data of users can be used as relative weight in the privacy score in which a lower privacy score can result from a riskier third party having private data of a user). Further, the various privacy factors described above need not all be presented or output in the privacy score nor need they be employed in the manners described herein. Additional factors can also be used when generating a privacy score. Also, various other forms of scoring or scaling can also be used, such as letter grades, scales that are commensurate with FICO scoring, and/or various other approaches using the privacy platform and techniques disclosed herein.
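As one illustration of an alternative presentation, a numeric privacy score could be mapped to a letter grade as sketched below; the grade boundaries are arbitrary examples rather than a defined scale.

```python
def letter_grade(score, boundaries=((900, "A"), (750, "B"), (600, "C"), (450, "D"))):
    """Map a numeric score to a letter grade using descending cutoffs;
    anything below the lowest cutoff receives an 'F'."""
    for cutoff, grade in boundaries:
        if score >= cutoff:
            return grade
    return "F"
```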
Although the foregoing embodiments have been described in some detail for purposes of clarity of understanding, the invention is not limited to the details provided. There are many alternative ways of implementing the invention. The disclosed embodiments are illustrative and not restrictive.
Daranyi et al., Svensk Biblioteksforskning; Automated Text Categorization of Bibliographic Records; Boras Academic Digital Archieve (BADA); artice peer reviewed [on-line], Hogskolan I Boras, vol. 16, Issue 2, pp. 1-14 as paginated or 16-29 as unpaginated of 47 pages, 2007 [retrieved on Nov. 6, 2012]. |