Authentication techniques are used to ensure that actions, for example accessing a computer or other resource, are performed only by an authorized human or other user. One way that websites and other electronic services authenticate their users is by requiring those users to supply a username and a valid password before being granted access. Typically the password is selected by the user the first time the user visits the site (e.g., as part of a registration process), and may be changed by the user as desired. Unfortunately, users sometimes forget their passwords—especially if the password is complex or used infrequently. Passwords can also be difficult to type, for example if the user is using a client with limited input capabilities.
Various embodiments of the invention are disclosed in the following detailed description and the accompanying drawings.
The invention can be implemented in numerous ways, including as a process; an apparatus; a system; a composition of matter; a computer program product embodied on a computer readable storage medium; and/or a processor, such as a processor configured to execute instructions stored on and/or provided by a memory coupled to the processor. In this specification, these implementations, or any other form that the invention may take, may be referred to as techniques. In general, the order of the steps of disclosed processes may be altered within the scope of the invention. Unless stated otherwise, a component such as a processor or a memory described as being configured to perform a task may be implemented as a general component that is temporarily configured to perform the task at a given time or a specific component that is manufactured to perform the task. As used herein, the term ‘processor’ refers to one or more devices, circuits, and/or processing cores configured to process data, such as computer program instructions.
A detailed description of one or more embodiments of the invention is provided below along with accompanying figures that illustrate the principles of the invention. The invention is described in connection with such embodiments, but the invention is not limited to any embodiment. The scope of the invention is limited only by the claims and the invention encompasses numerous alternatives, modifications and equivalents. Numerous specific details are set forth in the following description in order to provide a thorough understanding of the invention. These details are provided for the purpose of example and the invention may be practiced according to the claims without some or all of these specific details. For the purpose of clarity, technical material that is known in the technical fields related to the invention has not been described in detail so that the invention is not unnecessarily obscured.
Those who control access to computer or other resources commonly use automated techniques to verify the identity of a person or other user attempting to take an action with respect to the resource, such as accessing account or other information, performing an operation, etc. For example, if a user is unable to remember his or her password, that user typically can request that the password be reset via an automated process by which the identity of the user is attempted to be verified. Assorted schemes exist that attempt to determine whether a particular reset request has been submitted by a legitimate user that has genuinely forgotten his or her password or by a nefarious individual attempting to gain unauthorized access to an account. Unfortunately, some such schemes can be vulnerable to data-mining (e.g., asking for a mother's maiden name or county in which the user owns property, both of which can be obtained from public records). Other schemes can be vulnerable to guessing (e.g., requiring the user to supply answers to a series of questions at enrollment time such as “what is the name of your pet?” for which there are common responses). If questions attempting to elicit less common responses are used (e.g., “what is the name of your grade school teacher?”) a risk exists that a legitimate user will, in addition to forgetting the password, also forget the information needed to complete a password reset action. Other schemes, such as requiring that the user call a help desk for assistance can be expensive, and can also be subject to social engineering on the part of an attacker. If the user maintains a personal website or makes use of social networking sites, an attacker potentially has even more information to mine (e.g., for pet names, names of friends, former addresses etc.) and thus may have an even easier time gaining unauthorized access to the legitimate user's account.
As described in more detail below, using the techniques disclosed herein, a series of preferences, such as whether an individual likes onions and whether an individual listens to reggae music, are used to help determine whether a request has been submitted by a legitimate user or by an attacker. In various embodiments, additional information is used to augment the preference information supplied by the user. The techniques described herein can be used as a secondary form of authentication (e.g., for when the user has forgotten or is otherwise unable to supply an existing password) and can also be used as a primary form of authentication (e.g., instead of a traditional password, PIN, or other credential), as applicable.
The first time Alice decides to avail herself of Acme Bank's online banking features, she visits website 112 and commences an enrollment process. She is asked to choose a username and supply information such as her full name, phone number, and bank account numbers. She is also asked to select a password. As described in more detail below, she is also asked to provide additional information (e.g., her preferences for assorted things) for use in the event she ever forgets her password or is otherwise unable to authenticate herself using her selected password.
In the example shown in
System 102 provides authentication services for website 112. System 102 also provides authentication services for Acme's bank-by-phone tool 120, which allows customers to check their account balances and perform other actions via conventional telephone 118. In various embodiments, system 102 and website 112 are maintained by two different entities. For example, a website hosting company might maintain website 112 under a contract with Acme Bank, while another entity maintains authentication system 102. Similarly, when Alice visits website 112, a portion of the content rendered in her browser may be served by the web hosting company, while another portion of the content (120) may be served by system 102.
In various embodiments, the infrastructure provided by portions of authentication system 102 is located on and/or replicated across a plurality of servers rather than the entirety of authentication system 102 being collocated on a single platform. Such may be the case, for example, if the contents of database 106 are vast and/or if system 102 is used to provide authentication services to multiple websites 112 and/or other electronic resources not shown. Similarly, in some embodiments the serving of content such as website 112 is provided by system 102, rather than being provided by a separate system (e.g., provided by a web hosting company). Whenever authentication system 102 performs a task (such as receiving information from a user, authenticating the user, etc.), either a single component or a subset of components or all components of authentication system 102 may cooperate to perform the task.
In the example shown in
Item 304 (“have you been to Chicago?”) is an example of a life question, included in some embodiments to augment the preference information collected from the user, and described in more detail below.
It is typically easier for people to remember preferences (e.g., “I like mushrooms”) than arbitrary information such as a password. Additionally, one typical goal of service providers is to permit legitimate users to correctly answer authenticating information while at the same time making it unlikely for illegitimate users to correctly do so. In some embodiments the items for which a user's preference is solicited (especially in the aggregate) have a “high entropy,” meaning that given a random sample of individuals, it would be difficult to guess a particular individual's preference for a particular item, whether statistically, or based on knowledge of basic demographic or other information about an individual that can readily be obtained (e.g., through data-mining of publicly available information). Examples of items with low entropy in their corresponding user preferences include items such as “vacation” and “money” as likes and “being sick” and “pain” as dislikes. Conversely, “seafood,” “going to the opera,” and “horror movies” are examples of items with high entropy in their corresponding user preferences.
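The statistical side of "high entropy" can be illustrated with a short sketch. It assumes Shannon entropy as the measure (the text does not name a formula), and the function name `preference_entropy` and the sample counts are hypothetical, chosen only for illustration. It does not model the second part of the definition, namely how well a preference can be predicted from demographic data.

```python
import math
from collections import Counter

def preference_entropy(responses):
    """Shannon entropy, in bits, of a list of responses such as 'like'/'dislike'."""
    counts = Counter(responses)
    total = len(responses)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical samples (illustrative counts, not data from the source).
vacation = ["like"] * 97 + ["dislike"] * 3   # almost everyone gives the same answer
seafood = ["like"] * 50 + ["dislike"] * 50   # answers are evenly split

low = preference_entropy(vacation)    # close to 0 bits: easy to guess
high = preference_entropy(seafood)    # 1 bit: the maximum for a two-way answer
```

Under this assumption, an item like "seafood" gives a guesser no better than even odds, while "vacation" can be guessed correctly almost every time.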
The following are examples of items for which a given user's preferences are unlikely to be guessed by a third party, but are also likely to be remembered by that user.
In the “TV” category, users may select among different types of shows such as: reality shows, news, sports, sitcoms, dramas, movies, soap operas, game shows, and documentaries.
In the “food” category, users may select among different types of food such as: American, Barbecue, Cajun/Southern, California-Fusion, Caribbean/Cuban, Chinese/Dim Sum, Continental, Deli, Eastern-European, Fast Food/Pizza, French, German, Indian, Italian, Japanese/Sushi, Jewish/Kosher, Korean, Mediterranean, Mexican, Middle Eastern, Seafood, Soul Food, South American, Southwestern, Spanish, Thai, Vegetarian/Organic, Vegan, Vietnamese.
In the “Music” category, users may select among different styles of music such as: Acoustic, Alternative, Big Band/Swing, Blues, Christian & Gospel, Classic Rock n' Roll, Classical, Country, Dance/Electronica, Disco, Easy Listening, Folk, Hard Rock & Metal, Indie, Instrumental, Jazz, Latin, Modern Rock n' Roll, New Age, Oldies, Opera, Pop/Top 40, Punk, Rap/Hip Hop, Reggae, Show tunes, Soul/R&B, Soundtracks, World Music/Ethnic.
In the “Places” and “Activities” categories, users may select among going to different types of places/events: Amusement Parks, Antique Stores, Art Galleries, Bars/Nightclubs, Beach, Bookstores, Charity Events, Circuit Parties, Coffee Houses, Comedy Clubs, Concerts, Dance Clubs, Flea Markets, Garage Sales, Karaoke/Sing-along, Libraries, Live Theater, Movies, Museums, Opera, Parks, Political Events, Raves/Parties, Restaurants, Shopping Malls, Skate/Bike Parks, Sporting Events, Symphony, Volunteer Events.
In the “Sports” category, users may select among different types of activities: Aerobics, Auto racing/Motocross, Baseball, Basketball, Billiards/Pool, Cycling, Dancing, Football, Golf, Hockey, Inline skating, Martial arts, Running, Skiing, Soccer, Swimming, Tennis/Racquet sports, Volleyball, Walking/Hiking, Weights/Machines, Yoga.
The aforementioned categories and items are examples of things for which a user's preference can be sought. In some cases, some preferences (e.g., “I like Italian food” and “I don't like onions”) are observable, such as by family members and coworkers. Some “likes” can also leave traces (e.g., a record of a purchase of a CD of a given type). However, “dislikes” are typically less likely to leave traces that can be mined. Additionally, very few third parties will be able to observe a specific individual's preferences for items across multiple categories. For example, while a co-worker might observe Alice's preference for Italian food, the co-worker is less likely to know that Alice likes visiting comedy clubs, dislikes Karaoke, and reads romance novels (or all of that information collectively). In various embodiments, the number of likes and dislikes that a user such as Alice must supply, and from how many different categories the preferred items are selected is configurable (e.g., by an administrator, based on one or more policies).
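A configurable enrollment policy of this kind might be checked as sketched below. The class `EnrollmentPolicy`, its field names, and the function `meets_policy` are hypothetical; the text says only that the number of likes and dislikes and the number of categories are configurable.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class EnrollmentPolicy:
    # Hypothetical policy fields; an administrator would choose the values.
    min_likes: int
    min_dislikes: int
    min_categories: int

def meets_policy(selections, policy):
    """selections: list of (category, item, opinion), opinion being 'like' or 'dislike'."""
    likes = sum(1 for _, _, opinion in selections if opinion == "like")
    dislikes = sum(1 for _, _, opinion in selections if opinion == "dislike")
    categories = {category for category, _, _ in selections}
    return (likes >= policy.min_likes
            and dislikes >= policy.min_dislikes
            and len(categories) >= policy.min_categories)

policy = EnrollmentPolicy(min_likes=2, min_dislikes=1, min_categories=3)
alice = [
    ("food", "Italian", "like"),
    ("places", "Comedy Clubs", "like"),
    ("places", "Karaoke/Sing-along", "dislike"),
    ("music", "Reggae", "like"),
]
ok = meets_policy(alice, policy)  # three likes, one dislike, three categories
```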
Additional formats may also be used to collect preference information. For example, rather than soliciting an answer of “like,” “dislike,” or “no opinion,” in some embodiments the user is asked to perform a selection of more and less important events. For example, the following question is of that type: “Among the following sports, which one(s) do you prefer watching: auto racing, baseball, basketball, bowling, cricket, football, golf, hockey, soccer, ski jump, figure skating, tennis.” A similar question can be posed to ask which ones the user does not like watching.
Yet another example of a technique for collecting user preferences is to provide a question that permits several simultaneous correct entries. For example, “Describe your personality by selecting one or more suitable descriptions: introvert, extrovert, peaceful, worrying, goal-oriented, impulsive, confrontational, shy, passionate.” In this example, some of the pairs are in contradiction to each other, such as “introvert” and “extrovert,” while others are roughly synonymous, such as “introvert” and “shy.”
In addition to soliciting a user's preference for things, as previously mentioned system 102 can be configured to solicit other kinds of information about the user's life that are neither readily available through data mining, nor easily guessable. Examples include “do you sleep on the left side of the bed,” “have you been to Chicago,” “do you snore,” etc. Different types of questions can also be mixed within an interface. For example, Alice may be asked to rate some items on a scale of 1-5, to select 4 likes and dislikes using an interface as is shown in
Preferences can be captured (and later tested against) in a variety of ways. For example, in addition to answering written questions via website 112, preferences could also be captured acoustically (e.g., over the phone, via a series of voice prompts to which the user responds with a spoken “yes” or “no”). Similarly, the user could be played a set of audio clips and asked whether they like or dislike the sample. Other non-text entry methods can also be used, as applicable, such as devices used by the physically disabled.
In some cases, a designated portion of the items shown in
Comparing Received Preferences to Stored Preferences
A variety of techniques can be used to store and subsequently compare stored preferences (e.g., those received from Alice at enrollment) to subsequently received preferences (e.g., received through interface 600). For example, suppose system 102 makes use of the interface shown in
During the process shown in
In some embodiments, the preferences supplied at 204 in the process shown in
When an authentication is attempted (e.g., when Alice wishes to reset her password), system 102 retrieves the associated value U. Then, the user is requested to answer questions qi. An answer (502) corresponds to selection of one or more of the vertices associated with this question. A selected vertex is associated with the value 1, while a vertex that is not selected is associated with the value 0. A variable zij is used to record these values; thus, zij is set to 1 (resp. 0) if the vertex vij is (resp. is not) selected in this process.
To determine whether a given authentication attempt should be considered successful, the following is performed at 504: The sum of all (yijk*uij*zik) is computed, for 0<i, j, k<n+1. This sum is denoted S2. At 506, it is determined whether S2 is greater than some pre-set threshold value t2. If so, authentication is considered successful, otherwise not.
The value yijk is the “benefit” of selecting outcome j for question i during enrollment, and then subsequently selecting outcome k for question i during authentication. A low value, such as the value T (treated as minus infinity in the example that follows), can be used as a “punishment” for answering a question incorrectly, whereas a higher value is to be interpreted as a reward for answering the question correctly.
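For reference, the score and the acceptance test described above can be written compactly as follows, using the weight symbol y with indices running from 1 to n as stated in the text:

```latex
S_2 \;=\; \sum_{i=1}^{n} \sum_{j=1}^{n} \sum_{k=1}^{n} y_{ijk}\, u_{ij}\, z_{ik},
\qquad \text{authentication succeeds if and only if } S_2 > t_2 .
```

Because u and z are 0/1 indicators, only those terms survive in which outcome j was selected at enrollment and outcome k at authentication for the same question i.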
Suppose q1=“Do you like cats?” with three possible answers “yes”, “neutral,” and “no.” These possible answers correspond to three nodes v11, v12, and v13. All values u11, u12, and u13 are set to 0. If the user selects that he likes cats, then the value u11 is set to 1; if he has no strong opinion, then u12 is set to 1; and if he does not like cats, u13 is set to 1.
Additional constraint values are y111=3, y112=−5, y113=T, y121=0, y122=0, y123=0, y131=T, y132=−6, and y133=4.
When the user attempts to authenticate, the values z11, z12, and z13 are set. The nine combinations of preferences during enrollment vs. authentication are as follows:
(LIKE, LIKE): S2=y111=3
(LIKE, NO OPINION): S2=y112=−5
(LIKE, DISLIKE): S2=y113=T
(NO OPINION, LIKE): S2=y121=0
(NO OPINION, NO OPINION): S2=y122=0
(NO OPINION, DISLIKE): S2=y123=0
(DISLIKE, LIKE): S2=y131=T
(DISLIKE, NO OPINION): S2=y132=−6
(DISLIKE, DISLIKE): S2=y133=4
Thus, if the user first says he likes cats (during enrollment), and later says he does not during authentication, then the sum S2 becomes minus infinity. The same thing happens if he says that he dislikes cats during enrollment, and later says he likes them. (In various embodiments, the punishment is set to a value of much smaller absolute value. For example, while a correct answer may give 5 points, an incorrect answer may cause the loss of 20 points.) However, if he has no opinion during enrollment, then his answer during authentication always results in the sum S2=0. If he has an opinion during enrollment, and no strong opinion during authentication, the sum is set to a small negative value. If the user retains his like or dislike from enrollment to authentication, S2 is a positive number.
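The cat example can be reproduced with a short sketch. The weight table is taken from the values listed above, with T represented as minus infinity as the text describes; the function names (`one_hot`, `score_question`, `authenticate`) are hypothetical. The sketch skips terms whose indicator is 0, since multiplying 0 by floating-point minus infinity would give NaN rather than 0.

```python
import math

T = -math.inf  # the example's "punishment" value, treated as minus infinity

ANSWERS = ("like", "no opinion", "dislike")

# Y_CATS[j][k]: weight for enrollment answer j and authentication answer k (question q1).
Y_CATS = [
    [3, -5, T],   # enrolled LIKE
    [0, 0, 0],    # enrolled NO OPINION
    [T, -6, 4],   # enrolled DISLIKE
]

def one_hot(answer):
    """Indicator vector (the u or z values) for a single selected answer."""
    return [1 if a == answer else 0 for a in ANSWERS]

def score_question(y, u, z):
    """Sum of y[j][k] over the selected (j, k) pairs; zero-indicator terms are skipped."""
    return sum(y[j][k]
               for j in range(len(u)) for k in range(len(z))
               if u[j] and z[k])

def authenticate(questions, t2):
    """questions: list of (y, u, z) triples. Returns (S2, S2 > t2)."""
    s2 = sum(score_question(y, u, z) for y, u, z in questions)
    return s2, s2 > t2

same = score_question(Y_CATS, one_hot("like"), one_hot("like"))        # 3
flipped = score_question(Y_CATS, one_hot("like"), one_hot("dislike"))  # minus infinity
```

A single minus-infinity term makes the whole sum minus infinity, so one direct contradiction fails the attempt regardless of the other answers; replacing T with a finite penalty gives the softer variant mentioned in the parenthetical above.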
The assignment of low absolute weights allows for the later cancellation of incorrect answers to questions that the user has no strong opinion of (e.g., the types of questions where temporal variance is going to be the greatest). The assignment of large negative weights introduces strong negative feedback for questions where users have a strong opinion, but where the answer during authentication is incorrect. The assignment of positive weights allows for the detection of correct answers given during authentication. The assignment of low absolute weights reduces the impact of a small number of incorrect answers during authentication, where the incorrect answers are not contradictory with the user's previously stated opinion, but merely not in complete agreement with it.
As multiple questions are considered, the sum S2 corresponds to the cumulative value of all these contributions from the different questions. A sum that is greater than the set threshold t2 means that the user answered in a similar-enough manner during authentication as he did during enrollment. In some embodiments if the sum is not greater than this threshold, then the user either mistook a strong like for a strong dislike (which is unlikely) or vice versa; stated that he had no strong opinion in the authentication phase for a sufficient number of questions he stated a strong opinion for in the enrollment phase, or a combination. The threshold t2 of the authentication phase, and the values yijk are set in a manner that balances the risk for false positives with the risk for false negatives, and reflects the degree to which the answers to these questions are estimated to be maintained over time in some embodiments. The threshold t1 of the enrollment phase is set to guarantee a sufficient number of answers that are not “no strong opinion,” in turn making it impossible to authenticate by answering “no opinion” to all or too many questions. In some embodiments, several values t2 are used (e.g., one for each type of access right), out of some collection of possible values and types of account access and privileges. The value t2 can be a function of the value t1, and of some minimum value required for access, as well as of other parameters describing the user and his or her risk profile.
Questions with more than three possible answers, such as degrees of opinion, and questions that have only two possible answers, and any type of question with multiple answers can be scored by adapting the techniques described herein.
In some embodiments instead of assigning the variable yijk an arbitrary value describing the associated reward or punishment, a set of values representing yijk can be selected and saved. Each such value will be a point in a two-dimensional space, with an x-coordinate and a y-coordinate. For practical purposes, we will assume that all the x-coordinates are distinct, and that all coordinates are represented by integer values in a given range from 0 to p, where p is a system parameter. Associated with each user is a random polynomial f(x) described with random integer parameters in the same range, 0 to p, and where the polynomial is evaluated modulo p.
For an instance yijk with a large positive value, a large number of points on the curve f(x) are selected and associated with the indices i and k; for a large negative value, a small number of such points are selected; and for a value yijk in between, an intermediate number of such points are selected. The exact mapping between values of yijk and the number of selected points on the curve f(x) is a system parameter that can be customized to minimize false positives and false negatives. The variable Rik is used to denote the collection of points associated with yijk, where a large number of points from f(x) are selected if yijk is large, and a smaller number of points from f(x) are selected if yijk is small. Once a number of points on f(x) have been selected, these are stored in the record Rik, along with random points on a random polynomial f′(x) to fill up those positions that do not contain f(x) values, up to a given maximum number of values that is a system parameter. Here, f′(x) has the same degree as f(x) or a larger degree, or corresponds to a random selection of points from the appropriate space. If for each value yijk ten points in Rik are stored, then a high yijk value could be represented by ten points from f(x); a value yijk close to zero could be represented by eight values from f(x) and two values from f′(x); and the value T could be represented by ten values from f′(x). The ordering of values from f(x) and f′(x) as they are stored in Rik would be random or pseudo-random, and not disclosed.
Each value yijk would be represented in this manner. The matrix of all values Rik would be saved. This takes the place of the previously defined value U.
In the above example, the degree of the polynomial f(x) may be chosen as n*10−1. This means that, for reconstruction of the polynomial f(x) from recorded points, it is necessary to know n*10 points on the curve. The degree of f(x) could, more generally, be chosen as n*L−1−d, where L is the number of points stored per record Rik, and d is an integer value regulating the balance between false positives and false negatives, and corresponds to the total number of values from f′(x) that can be selected as all questions are answered, while still passing the authentication phase.
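The construction of a record Rik can be sketched as follows. This is a minimal illustration under stated assumptions: the modulus is taken to be the prime 2**31 - 1 (the text only calls p a system parameter), the x-coordinates start at 1 so that x=0 stays unused, and the names `random_poly`, `eval_poly`, and `build_record` are hypothetical. How a weight yijk maps to the number of genuine points is left to the caller, since the text treats that mapping as a tunable system parameter.

```python
import secrets

P = 2_147_483_647  # assumed prime modulus (2**31 - 1)

def random_poly(degree, p=P):
    """Random coefficients modulo p, lowest order first."""
    return [secrets.randbelow(p) for _ in range(degree + 1)]

def eval_poly(coeffs, x, p=P):
    """Evaluate the polynomial at x modulo p using Horner's rule."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc

def build_record(f, f_decoy, xs, n_real, p=P):
    """Return len(xs) points: n_real of them on f, the rest on f_decoy, shuffled."""
    rng = secrets.SystemRandom()
    xs = list(xs)
    rng.shuffle(xs)  # which x-coordinates carry genuine values is random
    points = [(x, eval_poly(f, x, p)) for x in xs[:n_real]]
    points += [(x, eval_poly(f_decoy, x, p)) for x in xs[n_real:]]
    rng.shuffle(points)  # stored order reveals nothing about which points are genuine
    return points

# Ten points per record, as in the example: 10 genuine for a high weight,
# 8 genuine for a weight near zero, 0 genuine for the value T.
f = random_poly(9)
f_decoy = random_poly(9)
near_zero_record = build_record(f, f_decoy, range(1, 11), n_real=8)
```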
During the authentication phase, the user selects answers. For each such answer, he selects an associated collection of points, in turn associated with the values i (of the question) and k (of the response to the question). During authentication, the machine used by the user does not know what elements are from f(x) and what elements are from f′(x). However, if a selection is made for which Rik has a large portion of values from f′(x), then it is unlikely that only points from f(x) are going to be selected, and therefore, unlikely that the polynomial f(x) can be reconstructed. If there is a failure, the machine can try another set of points corresponding to the same user selection. A large number of these can be tried. If more than a certain number, say 1000, are tried, then the login script can generate an error to the user and request that the user attempt to authenticate again. An attacker would not have to limit himself to 1000 attempts, but if he has a large number of incorrect selections, he is unlikely to ever be able to reconstruct the polynomial. A machine can determine whether a polynomial is correctly interpolated by trying to compute f(x) on an input value x given by the server. If this is done correctly, then the server will allow the client access, and call the authentication attempt successful. The machines would not have to communicate the values in the clear, but could do this over an encrypted channel, or the client machine may send a one-way function of the result f(x) for the requested value x. Since the server knows the polynomial f(x) as well as x, it can verify whether this is a correct value. It is also possible to use an f-value for a known x-coordinate, such as x=0, as a cryptographic key, provided this point on the curve is never chosen to be part of an entry Rik.
Thus, a user that answers a sufficient number of questions correctly would enable his computer to compute f(0) using standard interpolation techniques (and as described above), thereby deriving the key f(0); a computer that fails to compute f(0) would not be able to perform the associated cryptographic actions. Thus, users who fail to authenticate sufficiently well would cause their computer to be unable to perform such actions.
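The "standard interpolation techniques" referred to here can be illustrated with Lagrange interpolation over a prime field. The sketch below assumes a prime modulus (2**31 - 1, chosen for illustration) and uses hypothetical names (`eval_poly`, `interpolate_at`); it needs Python 3.8 or later for the modular inverse via `pow(den, -1, p)`. It shows that degree-plus-one genuine points recover f(0), and that a single point not on f(x) yields a different value.

```python
P = 2_147_483_647  # assumed prime modulus (2**31 - 1)

def eval_poly(coeffs, x, p=P):
    """Evaluate a polynomial (coefficients lowest order first) at x modulo p."""
    acc = 0
    for c in reversed(coeffs):
        acc = (acc * x + c) % p
    return acc

def interpolate_at(points, x0, p=P):
    """Lagrange-interpolate through `points` and evaluate at x0, modulo prime p.

    The x-coordinates must be distinct modulo p.
    """
    total = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = (num * (x0 - xj)) % p
                den = (den * (xi - xj)) % p
        total = (total + yi * num * pow(den, -1, p)) % p
    return total

f = [42, 7, 3]  # illustrative f(x) = 42 + 7x + 3x^2, so the key f(0) is 42
genuine = [(x, eval_poly(f, x)) for x in (1, 2, 3)]
key = interpolate_at(genuine, 0)

# One selected point that is not on f(x) spoils the reconstruction.
tainted = genuine[:2] + [(3, (eval_poly(f, 3) + 1) % P)]
wrong_key = interpolate_at(tainted, 0)
```

The same function covers the server's challenge check: a client holding only genuine points can compute `interpolate_at(genuine, x)` for the server's chosen x, and the server compares it (or a one-way function of it) with its own evaluation of f(x).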
In various embodiments other techniques are used to score stored (and subsequently received) preferences. For example, the entropy of a particular question can be used as a weight that is used when computing S2. Thus, a question such as “do you sleep on the left side of the bed” may be inherently worth more points (based on its entropy) than a question such as “do you like ice cream.” Special rules can also take into account questions (particularly life questions) for which a wrong answer may cause an authentication attempt to fail irrespective of the other questions being answered correctly. For example, if Alice indicates that she has been to Chicago at enrollment, and then subsequently denies having been there, such an event might indicate that an attacker is trying to impersonate Alice. Conversely, mechanisms can also be used to make sure that questions whose answers might evolve over time (e.g., fondness for certain foods considered to appeal only to adults, such as mushrooms and sushi) don't result in false negatives.
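One way to combine per-question weights with a hard-failure rule is sketched below. The point values, the question keys, and the function `weighted_score` are hypothetical, and the choice that a wrong non-critical answer simply earns nothing is an assumption made for brevity, not something the text specifies.

```python
def weighted_score(enrolled, given, weights, critical):
    """Return None on a contradicted critical question, else the sum of earned weights.

    enrolled, given: dicts mapping question key -> answer
    weights: dict mapping question key -> points (e.g., derived from entropy)
    critical: set of question keys whose contradiction fails the attempt outright
    """
    score = 0
    for question, answer in enrolled.items():
        if given.get(question) == answer:
            score += weights[question]
        elif question in critical:
            return None  # hard failure irrespective of the other answers
    return score

# Hypothetical weights: the higher-entropy question is worth more points.
weights = {"sleeps_left_side": 5, "likes_ice_cream": 1, "been_to_chicago": 4}
enrolled = {"sleeps_left_side": True, "likes_ice_cream": True, "been_to_chicago": True}
critical = {"been_to_chicago"}

all_right = weighted_score(enrolled, dict(enrolled), weights, critical)
minor_slip = weighted_score(enrolled, {**enrolled, "likes_ice_cream": False}, weights, critical)
denied_chicago = weighted_score(enrolled, {**enrolled, "been_to_chicago": False}, weights, critical)
```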
Policies
A variety of policies can be put in place based on the security and other needs of website 112 (or other appropriate electronic entity). For example, different users may have different personal thresholds for what will constitute a valid authentication and what will not, but certain global minimums can be applied simultaneously. Additionally, different actions can be taken based on factors such as by how much a threshold was exceeded. For example, in a banking context, several thresholds could be used in which if the highest threshold is exceeded, the user is permitted full access to his or her account. If the second highest threshold is exceeded, the user is permitted full access to the account, but a flag is set alerting an administrator to review the user's account once the user logs off. Other lower thresholds can also be set with their own corresponding set of permitted actions, such as allowing the user read-only access to the account, or informing the user that access will only be granted after an additional step is performed (e.g., requiring the user to make or receive a phone call, respond to a piece of email, etc.).
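The tiered thresholds in the banking example might be expressed as a small lookup, sketched below. The numeric thresholds, the action labels, and the function `permitted_action` are hypothetical placeholders; only the ordering of the tiers follows the text.

```python
# Hypothetical tiers: (threshold, action). Values are placeholders for illustration.
TIERS = [
    (20, "full_access"),
    (15, "full_access_flag_for_review"),
    (10, "read_only"),
    (5, "require_additional_step"),
]

def permitted_action(score, tiers=TIERS):
    """Return the action for the highest threshold that the score exceeds."""
    for threshold, action in sorted(tiers, key=lambda tier: tier[0], reverse=True):
        if score > threshold:
            return action
    return "deny"

action = permitted_action(16)  # exceeds 15 but not 20
```

A per-user threshold combined with a global minimum could be layered on top by comparing the score against the larger of the two before consulting the tiers.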
Policies can also be used to specify, e.g., how many likes/dislikes a user must supply, whether (e.g., in the case of the interface shown in
Additional information, such as the presence or absence of a cookie, can be used to adjust thresholds (and permitted actions/levels of access) accordingly. As another example, suppose Alice's phone 116 includes a GPS. In some embodiments a side channel may be used to capture Alice's location information and to present Alice (e.g., in interface 600) with a question that asks if she was in <location> last week. Alice's GPS coordinates can also be used to determine the selection of the threshold required to pass authentication: a person attempting to authenticate from a location close to Alice's believed location (e.g., her home address, places she frequents, etc.) may be offered a lower threshold than a person who is distant from all likely GPS positions of Alice. In some embodiments, multiple types of additional information are used/combined. For example, if Alice's GPS reports that she is currently in California, but her alleged IP address reflects that she is in Romania, the threshold score needed to gain access to her account may be increased, or she may be required to supply both a username and password and her preferences, whereas she might otherwise only be required to supply a correct username/password, etc.
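How such signals might move the required threshold is sketched below. The adjustment sizes, parameter names, and the function `adjusted_threshold` are all hypothetical; the text states only the direction of each adjustment (a known cookie or a familiar location can lower the bar, while conflicting location signals raise it).

```python
def adjusted_threshold(base, has_cookie, near_known_location, gps_region, ip_region):
    """Return the score threshold after applying contextual signals.

    gps_region / ip_region are coarse labels (or None when unavailable).
    All adjustment amounts below are illustrative placeholders.
    """
    threshold = base
    if has_cookie:
        threshold -= 2   # device has been seen before
    if near_known_location:
        threshold -= 2   # close to a place the user is believed to frequent
    if gps_region and ip_region and gps_region != ip_region:
        threshold += 5   # GPS and IP address disagree
    return threshold

usual = adjusted_threshold(10, True, True, "California", "California")
suspicious = adjusted_threshold(10, False, False, "California", "Romania")
```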
Although the foregoing embodiments have been described in some detail for purposes of clarity of understanding, the invention is not limited to the details provided. There are many alternative ways of implementing the invention. The disclosed embodiments are illustrative and not restrictive.
This application is a continuation of co-pending U.S. patent application Ser. No. 12/215,048, entitled PERFORMING AUTHENTICATION filed Jun. 23, 2008, which is a continuation-in-part of U.S. patent application Ser. No. 11/890,408, entitled METHOD AND APPARATUS FOR EVALUATING ACTIONS PERFORMED ON A CLIENT DEVICE filed Aug. 6, 2007, which claims priority to U.S. Provisional Application No. 60/836,641, entitled METHOD AND APPARATUS FOR IMPROVED WEB SECURITY filed Aug. 9, 2006, and claims priority to U.S. Provisional Application No. 60/918,781, entitled SECURE LOGGING OF CRITICAL EVENTS, ALLOWING EXTERNAL MONITORING filed Mar. 19, 2007, all of which are incorporated herein by reference for all purposes. U.S. patent application Ser. No. 12/215,048 also claims priority to U.S. Provisional Patent Application No. 60/967,675, entitled METHOD AND APPARATUS FOR LIGHT-WEIGHT AUTHENTICATION filed Sep. 6, 2007, which is incorporated herein by reference for all purposes.
Number | Date | Country |
---|---|---|
60967675 | Sep 2007 | US |
60918781 | Mar 2007 | US |
60836641 | Aug 2006 | US |

Relation | Number | Date | Country |
---|---|---|---|
Parent | 12215048 | Jun 2008 | US |
Child | 14325239 | | US |

Relation | Number | Date | Country |
---|---|---|---|
Parent | 11890408 | Aug 2007 | US |
Child | 12215048 | | US |