Recent years have witnessed the rapid proliferation of mobile devices such as smartphones and tablets. Smartphones are also becoming an important means for accessing various online services, such as online social networks, email and cloud computing. Many applications and websites allow users to store their information, passwords, etc. Users also save various contact information, photos, schedules and other personal information in their smartphones.
No one wants personal and sensitive information to be leaked to others without permission. However, a smartphone is easily stolen, and the thief can then access the personal information stored in it. Furthermore, the attacker can steal the victim's identity and launch impersonation attacks in networks, threatening the victim's personal and sensitive information, such as bank accounts, as well as the security of the networks, especially online social networks. Therefore, providing reliable access control of the information stored on smartphones, or accessible through smartphones, is very important.
Public clouds offer elastic and inexpensive computing and storage resources to both companies and individuals. Cloud customers can lease computing resources, like Virtual Machines, from cloud providers to provide web-based services to their own customers—who are referred to as the end-users. Past efforts for protecting a cloud customer's Virtual Machines tended to focus on attacks within the cloud from malicious Virtual Machines that are co-tenants on the same servers, or from compromised Virtual Machine Monitors, or from network adversaries. However, end-users can also pose serious security threats. Consider the increasingly common situation of accessing cloud-based services and data through a smartphone. Users register accounts for these services. Then they log in to their accounts from their smartphones and use these cloud services. However, after log-in, the user may leave her smartphone unattended or it may be co-opted by an attacker, and now the attacker has legitimate access to the cloud-based services and data or the sensitive data stored in the smartphone itself. Ideally, smartphone users should re-authenticate themselves, but this is inconvenient for legitimate users and attackers have no incentive to “re-authenticate.”
Further, smartphones themselves store private, sensitive and secret information related to people's daily lives. Users do not want these accessible to an attacker who has stolen the device, or has temporary access to it. Current smartphones use passwords or biometrics to authenticate the end-users during their initial login to the devices or to protected cloud services. These may be insufficient for many use cases. First, users often choose poor passwords, and passwords are vulnerable to guessing attacks, dictionary attacks, and password reuse. Also, biometrics are vulnerable to forgery attacks. A recent report shows that many users disable these authentication methods simply because they are inconvenient. Second, using just initial login authentication is not enough, since adversaries can take control of the users' smartphones after the legitimate users' initial login. Then the adversaries can access the services and data, which may be proprietary and sensitive, whether stored in the cloud or in the mobile device itself. To protect data and services, whether in the cloud or in the smartphone itself, from adversaries who masquerade as legitimate end-users, what is needed is a secure and usable re-authentication system, which is ideally both implicit and continuous. An implicit authentication method does not rely on the direct involvement of the user, but is closely related to her behavior, habits or living environment. This is more convenient than having to re-enter passwords or PINs. A continuous re-authentication method should keep authenticating the user, in addition to the initial login authentication. This can detect an adversary once he gets control of the smartphone, and can prevent him from accessing sensitive data or services via the smartphone, or inside the smartphone.
The present invention is directed to methods and systems capable of implicitly authenticating users based on information gathered from one or more sensors and an authentication model trained via a machine learning technique.
In the present invention, methods and systems are provided in which an authentication model is trained using one or more machine learning techniques, and then, data is collected, manipulated, and assessed with the authentication model in order to determine if the user is authentic.
Among the many different possibilities contemplated, the sensors may include motion and/or non-motion sensors, including but not limited to accelerometers, gyroscopes, magnetometers, heart rate monitors, pressure sensors, or light sensors, which may be located in different devices, including smartphones, wearable devices (including but not limited to smartwatches and smartglasses), implantable devices, and other sensors accessible via an internet of things (IoT) system. Further, the method can include continuously testing the user's behavior patterns and environment characteristics, and allowing authentication without interrupting the user's other interactions with a given device or requiring user input. The method may also involve the authentication model being retrained, or adaptively updated to include temporal changes in the user's patterns, where the retraining can include, but is not limited to, incorporating new data into the existing model or using an entirely new set of data. The retraining may occur automatically when the system determines that confidence in the authentication has been too low for a sufficiently long period of time, such as when the confidence scores for multiple authentications within a 20-second period are below 0.2. The method may also include determining the context of the measurements. The authentication may also involve the use of multiple features from the measurements, including one or more frequency domain features, one or more time domain features, or a combination of the two. The machine learning technique that is utilized can include, but is not limited to, decision trees, kernel ridge regression, support vector machine algorithms, random forest, naïve Bayes, k-nearest neighbors (K-NN), and least absolute shrinkage and selection operator (LASSO). Unsupervised machine learning algorithms and deep learning algorithms can also be used.
The method may also involve preventing unauthorized users from gaining access to a device or a system accessible from the device without requiring explicit user-device interaction. The method may also include an enrollment phase that includes receiving sensor data, sending the data for use in training an authentication model, and receiving the authentication model, whether the training is done by a remote server or by the device itself, such as a smartphone. In response to a failed authorization attempt, the method may also include blocking further access to a device or generating an alert. The sensor sampling rate may also be adjustable. This method may be conducted via a smartphone application. As such, it may also include utilizing only sensors that do not generate privacy-sensitive data, i.e., avoiding sensors whose measurements would require explicit permission to be used on the smartphone (such as GPS sensors, camera sensors, or microphones). The method may also include rapidly training an authentication model, such as when the training time is less than about 20 seconds. The method may also be utilized when the sensor is in one device, the authentication is accomplished in a second device, and a third device is optionally requesting the results of the authentication.
Reference is now made in detail to the description of the invention as illustrated in the drawings. While the invention will be described in connection with these drawings, there is no intent to limit it to the embodiment or embodiments disclosed therein.
The present invention is directed to methods and systems capable of implicitly authenticating users based on information gathered from one or more sensors and an authentication model trained via a machine learning technique.
The method (10) begins with an enrollment phase (20). Initially, the system must be trained in the enrollment phase. In preferred embodiments, this is done whenever a new user needs to be authenticated, or whenever sensors are added to or removed from the authentication system. The output of the enrollment phase is at least one authentication model or classifier for use by a device in authenticating a user. For example, when users want to use a smartphone to access sensitive data or cloud services, the system starts to monitor the sensors and extract particular features from the sensors' data. This process continues, and the data should be stored in a protected buffer in the smartphone, until the distribution of the collected features converges to an equilibrium, meaning the data set is large enough to build the user's profile with sufficient accuracy.
One embodiment of the enrollment phase is depicted in
The context information, when carefully chosen, can be user-agnostic: the context of the current user can be detected (24) prior to authenticating the user. In one embodiment, the usage context was first differentiated, and then a fine-grained authentication model was utilized to implement authentication under each different context. In this embodiment, when the context detection model evaluates a user, it uses a model (i.e., classifier) that was trained with other users' data, such that the context detection model can be user-agnostic. Thus, a service provider can pre-train a user-agnostic context detection model for the users to download when they first enroll, or have it pre-installed on a device. Updates to the context detection model can also be downloaded or sent to the device.
In one embodiment, the signals of the sensors' data are segmented into a series of time windows. For context detection, the data collected by the accelerometer and the ambient light sensor were utilized because they represent certain distinctive patterns of different contexts. In another embodiment, data from a gyroscope and accelerometer were utilized for differentiating moving or stationary contexts. In these embodiments, the magnitude of each sensor data sample was computed. In some embodiments, the magnitude of an accelerometer data sample (t, x, y, z) is computed as m = √(x² + y² + z²). In other embodiments, the magnitude of a data sample (t, x, y, z) may be computed as m = x + y + z. A discrete Fourier transform (DFT) can be implemented to obtain the frequency domain information. The frequency domain information is useful and is widely used in signal processing and data analysis, e.g., for speech signals and images.
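By way of non-limiting illustration, the magnitude computation and DFT described above may be sketched as follows; the (t, x, y, z) sample layout and the naive DFT are illustrative assumptions rather than a required implementation:

```python
import math
import cmath

def magnitudes(window):
    """window: list of (t, x, y, z) samples -> per-sample magnitudes."""
    return [math.sqrt(x * x + y * y + z * z) for (_, x, y, z) in window]

def dft_amplitudes(signal):
    """Naive discrete Fourier transform; returns the amplitude spectrum."""
    n = len(signal)
    return [abs(sum(signal[k] * cmath.exp(-2j * math.pi * f * k / n)
                    for k in range(n))) for f in range(n)]

window = [(0, 3.0, 4.0, 0.0), (1, 0.0, 0.0, 5.0)]
mags = magnitudes(window)          # [5.0, 5.0]
spectrum = dft_amplitudes(mags)    # frequency domain information
```

In practice an FFT routine would replace the naive O(n²) transform for longer windows.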
In such an arrangement, the feature vector for sensor i in a given time window k for a device, such as a smartphone, can be shown as incorporating both time domain and frequency domain features:
SP_i(k) = [SP_i^t(k), SP_i^f(k)] (Eq. 1)
One embodiment can have a selection of four time domain features and three frequency domain features where:
SP_i^t(k) = [mean(S_i(k)), var(S_i(k)), max(S_i(k)), min(S_i(k))]
SP_i^f(k) = [peak(S_i(k)), freq(S_i(k)), peak2(S_i(k))] (Eq. 2)
Other embodiments can have a different selection of features.
Then, the authentication feature vector for the smartphone is:
SP(k) = [SP_sensor_1(k), SP_sensor_2(k), . . . , SP_sensor_n(k)] (Eq. 3)
if n sensors are used for detection, where n can be an integer equal to or greater than 1.
Similarly, this method can also utilize the feature vector for the sensor data from (in this embodiment) the smartwatch, denoted SW (k). Therefore, the authentication feature vector for training the authentication model for this embodiment is:
Authenticate(k)=[SP(k),SW(k)] (Eq. 4)
A similar approach can be used for determining the context; however, this can be simpler. In one embodiment, it may only use features from the smartphone and not the smartwatch:
Context(k) = [SP(k)] (Eq. 5)
While the same feature vectors can be used for determining the context as for training the authentication model, preferred embodiments are configured whereby some or all of the features or feature vectors used for training the authentication model are different from the features or feature vectors used for determining context, and in more preferred embodiments, the authentication model utilizes more feature vectors than are used in determining context. The ability to differentiate users can be seen in
Rather than utilize every sensor feature, it may be beneficial to utilize only those features likely to be beneficial to distinguishing users. In one embodiment, the following statistical features derived from each of the raw sensor streams were computed in each time window: Mean: Average value of the sensor stream; Var: Variance of the sensor stream; Max: Maximum value of the sensor stream; Min: Minimum value of the sensor stream; Ran: Range of the sensor stream; Peak: The amplitude of the main frequency of the sensor stream; Peak f: The main frequency of the sensor stream; Peak2: The amplitude of the secondary frequency of the sensor stream; and Peak2 f: The secondary frequency of the sensor stream. In this embodiment, the performance of each feature can be tested, and the “bad” features can be dropped. If a feature can be used to easily distinguish two users, the feature is considered a “good” feature. For a feature to distinguish two different persons, it is necessary for the two underlying distributions to be different. Each feature was tested as to whether the feature derived from different users was from the same distribution. If most pairs of them are from the same distribution, the feature is “bad” in distinguishing two persons and it can be dropped.
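The nine per-window statistical features listed above may be computed, for illustration, as in the following sketch; the helper name and the naive DFT over the first half of the spectrum are assumptions of this sketch:

```python
import cmath
import math

def window_features(s):
    """Compute mean, var, max, min, ran, peak, peak_f, peak2, peak2_f."""
    n = len(s)
    mean = sum(s) / n
    var = sum((v - mean) ** 2 for v in s) / n
    # Amplitudes at frequency bins 1..n/2 via a naive DFT.
    amps = [abs(sum(s[k] * cmath.exp(-2j * math.pi * f * k / n)
                    for k in range(n))) for f in range(1, n // 2 + 1)]
    order = sorted(range(len(amps)), key=lambda i: amps[i], reverse=True)
    peak_i = order[0]
    peak2_i = order[1] if len(order) > 1 else order[0]
    return {
        "mean": mean, "var": var, "max": max(s), "min": min(s),
        "ran": max(s) - min(s),
        "peak": amps[peak_i], "peak_f": peak_i + 1,      # main frequency
        "peak2": amps[peak2_i], "peak2_f": peak2_i + 1,  # secondary frequency
    }
```

For example, a square wave alternating between 0 and 1 over eight samples yields its main frequency at bin 4.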
In one embodiment, the Kolmogorov-Smirnov test (KS test) was used to test if two data sets are significantly different. The KS test is a nonparametric statistical hypothesis test based on the maximum distance between the empirical cumulative distribution functions of the two data sets. The two hypotheses of a KS test are:
H0: the two data sets are from the same distribution.
H1: the two data sets are from different distributions.
A KS test reports a p-value, i.e., the probability of observing a maximum distance at least as large as the one actually observed, assuming H0 is true. If this p-value is smaller than a significance level α, usually set to 0.05, H0 can be rejected in favor of H1, because events with small probabilities rarely happen; this indicates a “good” feature for distinguishing users. For each feature, the p-value is calculated over the data points for each pair of users, and a feature can be dropped if most of its p-values are higher than α.
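For illustration, the maximum-distance statistic underlying the two-sample KS test may be computed as below; in practice a library routine (e.g., scipy's two-sample KS test) would also supply the p-value, which this sketch omits:

```python
import bisect

def ks_statistic(a, b):
    """Maximum distance between the empirical CDFs of two data sets."""
    a, b = sorted(a), sorted(b)

    def ecdf(s, x):
        # Fraction of samples in s that are <= x.
        return bisect.bisect_right(s, x) / len(s)

    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in set(a) | set(b))

same = ks_statistic([1, 2, 3, 4], [1, 2, 3, 4])      # 0.0: indistinguishable
diff = ks_statistic([1, 2, 3, 4], [11, 12, 13, 14])  # 1.0: clearly different
```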
Next, redundant features can also be considered, by computing the correlation between each pair of features. A strong correlation between a pair of features indicates that they are similar in describing a user's behavior pattern, so one of the features can be dropped. A weak correlation implies that the selected features reflect different behaviors of the user, so both features should be kept. The Pearson's correlation coefficient can be calculated between any pair of features. Then, for every pair of features, the average of all resulting correlation coefficients over all the users was taken.
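A non-limiting sketch of the redundancy check follows: Pearson's correlation coefficient between two feature columns, where a coefficient near 1 indicates that one of the pair can be dropped (the cutoff for a “strong” correlation is a design choice):

```python
import math

def pearson(x, y):
    """Pearson's correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Perfectly correlated feature pair: one of the two is redundant.
r = pearson([1, 2, 3, 4], [2, 4, 6, 8])  # 1.0
```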
Once extracted, some or all of these features may be placed into the context detection component to decide the specific context the current user is in. In some embodiments, features from only one device are utilized, even if multiple devices are used for authentication.
In one embodiment, a Decision Tree Classifier was utilized. Decision Tree Classifier is a method commonly used in data mining. In another embodiment, a Random forest algorithm was chosen, although many other machine learning algorithms could be utilized here. The goal is to create a model that predicts the value of a target variable based on several input variables. Here, the training data is first used to train a context detection tree, and then the testing data is fed to the constructed decision tree and the resulting leaf outputs the context label of the testing data.
Although there are a variety of methods for developing context models, in this embodiment, the data for building a context model was gathered by asking subjects to use a smartphone and the smartwatch freely under each of the contexts for 20 minutes, and to stay in the current context until the experiment is finished.
The context decision tree in the embodiment data set was evaluated with 10-fold cross-validation, and the confusion matrix is shown in Table I. The context detection method achieved high accuracy under four different contexts: movement inside a building, moving up or down stairs, moving outside, and standing still (static). Even the worst case, the inside context, achieves more than 97% accuracy, and the average accuracy across the four contexts is 98.1%. This provides fine-grained context information for the next-step authentication process. It is observed that the time for a context to be detected is within 4.5 milliseconds on the smartphone, which is fast enough to be applicable in real world scenarios.
Interpreting the context decision tree trained in this embodiment, it should be noted that: 1) the accelerometer could be used to differentiate the stationary context from the other three moving contexts; 2) the ambient light could further differentiate the outside movement from the inside movement and up/downstairs movement contexts; 3) the accelerometer could be used to further differentiate the inside and up/downstairs contexts. Based on these observations of the decision tree, it is recognized that these different contexts are naturally separable without dependence on particular users. Therefore, the preferred user-agnostic context decision tree can provide accurate context detection performance for different usage contexts within an acceptable processing time.
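The structure of the decision tree described above may be sketched as hand-written logic; the threshold values and derived features here are illustrative assumptions, not the trained values:

```python
def detect_context(accel_var, light_lux, accel_range):
    """User-agnostic context sketch mirroring the tree observations above."""
    if accel_var < 0.5:        # 1) accelerometer separates the static context
        return "static"
    if light_lux > 1000.0:     # 2) ambient light separates outside movement
        return "outside"
    # 3) accelerometer further separates inside vs. up/downstairs movement
    return "up/downstairs" if accel_range > 6.0 else "inside"

detect_context(0.1, 300.0, 1.0)  # -> "static"
```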
Considering the scenario where the smartphone and smartwatch may not always be connected with each other, this system can also implement a context detection method by only using the smartphone, smartwatch, or some other set of sensors entirely. The confusion matrices are shown in Table II and Table III, respectively. Comparing with Table I, it is seen that combining the smartphone and smartwatch together can provide better context detection performance than using any individual device, which shows the benefits of using multiple devices in a system.
In another embodiment, four contexts were initially tested: (1) the user uses the smartphone without moving around, e.g., while standing or sitting; (2) the user uses the smartphone while moving. No constraints are set for how the user moves; (3) the smartphone is stationary (e.g., on a table) while the user uses it; (4) the user uses the smartphone on a moving vehicle, e.g., train. However, in this embodiment, it was found that these four contexts cannot be easily differentiated: contexts (3) and (4) are easily misclassified as context (1), since (1), (3) and (4) are all relatively stationary, compared to context (2). Therefore, contexts (1), (3) and (4) were combined into one stationary context, while (2) was left as the moving context. The resulting confusion matrix in Table IV showed a very high context detection accuracy of over 99% with these 2 simple contexts. The context detection time was also very short—less than 3 milliseconds.
Before training the authentication model, it is useful to understand which sensors may be of value in distinguishing users. Mobile sensing technology has matured to a state where collecting many measurements through sensors in smartphones is now becoming quite easy through, for example, Android sensor APIs. Mobile sensing applications, such as the CMU MobiSens, run as a service in the background and can constantly collect sensors' information from smartphones. Sensors can be either hard sensors (e.g., accelerometers) that are physically-sensing devices, or soft sensors that record information of a phone's running status (e.g., screen on/off). Thus, practical sensors-based user authentication can be achieved today. While any and all sensors can be utilized, preferred embodiments utilize a finite subset of all available sensors.
In one embodiment, Fisher scores (FS) were used to help select the most promising sensors for user authentication. FS is one of the most widely used supervised feature selection methods due to its excellent performance. The Fisher Score enables finding a subset of features, such that in the data space spanned by the selected features, the distances between data points in different classes are as large as possible, while the distances between data points in the same class are as small as possible.
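For a single feature, one common form of the Fisher score — the ratio of between-class scatter to within-class scatter — may be sketched as follows (this particular single-feature formula is an assumption; variants exist):

```python
def fisher_score(values, labels):
    """Higher score = classes are far apart and internally compact."""
    classes = sorted(set(labels))
    overall = sum(values) / len(values)
    between = within = 0.0
    for c in classes:
        vc = [v for v, l in zip(values, labels) if l == c]
        mc = sum(vc) / len(vc)
        between += len(vc) * (mc - overall) ** 2   # between-class scatter
        within += sum((v - mc) ** 2 for v in vc)   # within-class scatter
    return between / within

# Two well-separated users (labels 0 and 1) yield a high score.
fisher_score([0, 1, 10, 11], [0, 0, 1, 1])  # 100.0
```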
Table V shows the FS for different sensors that are widely implemented in smartphones and smartwatches. In this embodiment, it is seen that the magnetometer, orientation sensor and light sensor have lower FS scores than the accelerometer and gyroscope because they are influenced by the environment, which can introduce various background noise unrelated to the user's behavioral characteristics; e.g., the magnetometer may be influenced by nearby magnets.
Smartphone sensor information that is not intrinsically privacy sensitive includes measurements from an accelerometer, magnetometer, gyroscope, orientation sensor, ambient light, proximity sensor, barometric pressure and temperature. Other more privacy sensitive inputs include a user's location as measured by his GPS location, WLAN, cell tower ID and Bluetooth connections. Also privacy sensitive are audio and video inputs like the microphone and camera. These privacy sensitive sensor inputs require user permissions, for example, permissions that must be explicitly given on some Android devices. The contacts, running apps, apps' network communication patterns, browsing history, screen on/off state, battery status and so on, can also help to characterize a user. In preferred embodiments, sensors are chosen that do not require explicit permissions to be given, unlike, for example, the GPS sensor, a camera, or a microphone. In preferred embodiments, sensors are selected that are commonly available on smartphones.
Therefore, in one embodiment, two sensors were selected, the accelerometer and gyroscope, because they have higher FS scores and furthermore, are the most common sensors built into current smartphones and smartwatches. These two sensors also represent different information about the user's behavior: 1) the accelerometer records coarse-grained motion patterns of a user, such as how she walks; and 2) the gyroscope records fine-grained motions of a user such as how she holds a smartphone. Furthermore, these sensors do not need the user's permissions, making them useful for continuous background monitoring in implicit authentication scenarios, without requiring user interaction. In some embodiments, only a single sensor is used. In others, two or more are used.
Once the sensors and features to be used for context and/or authentication are determined, there are still two parameters that can be considered: the window size and the size of the dataset.
The window size is an important system parameter, which determines the time that the system needs to perform an authentication, i.e., the window size directly determines the system's authentication frequency. The window size can be varied as desired, for example, from 1 second to 16 seconds. Given a window size, for each target user, the system can utilize multi-fold (e.g., 10-fold, etc.) cross-validation for training and testing. In one embodiment, 10-fold cross-validation was used, i.e., 9/10 of the data was used as training data and 1/10 as testing data. The false rejection rate (FRR) and false acceptance rate (FAR) are metrics for evaluating the authentication accuracy of a system. FAR is the fraction of other users' data that are misclassified as the legitimate user's data. FRR is the fraction of the legitimate user's data that are misclassified as other users' data. For security protection, a large FAR is more harmful than a large FRR. However, a large FRR will often degrade the usage convenience.
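The FAR and FRR definitions above may be computed from per-window decisions as in this sketch, where `truth` marks whether each window truly belongs to the legitimate user and `accepted` is the classifier's verdict (the names are illustrative):

```python
def far_frr(truth, accepted):
    """Return (FAR, FRR) from parallel lists of booleans."""
    others = [a for t, a in zip(truth, accepted) if not t]
    legit = [a for t, a in zip(truth, accepted) if t]
    far = sum(others) / len(others)                     # impostors accepted
    frr = sum(1 for a in legit if not a) / len(legit)   # user rejected
    return far, frr

far, frr = far_frr([True, True, True, False, False],
                   [True, False, True, True, False])
# far = 0.5 (1 of 2 impostor windows accepted)
# frr = 1/3 (1 of 3 legitimate windows rejected)
```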
The size of the data set also affects the overall authentication accuracy because a larger training data set provides the system more information. As shown in
In cases where the sensor measurements originally obtained are too large to process directly, a re-sampling process can be utilized. This can be done for several reasons, including to reduce the computational complexity, or to reduce the effect of noise by averaging the data points. For example, to reduce the data set by 5 times, 5 contiguous data points can be averaged into one data point.
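The 5x reduction example above corresponds to the following sketch, which averages each block of contiguous data points and drops any incomplete trailing block:

```python
def downsample(points, factor=5):
    """Average every `factor` contiguous points into one point."""
    return [sum(points[i:i + factor]) / factor
            for i in range(0, len(points) - factor + 1, factor)]

downsample([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])  # -> [3.0, 8.0]
```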
Ideally, once the user has gotten used to the device and the device-specific ‘sensor behavior’ no longer changes, and the system has observed sufficient information to have a stable estimate of the true underlying behavioral pattern of that user, the system can now train the authentication classifier, optionally under various contexts. The system and method can automatically detect a context in a user-agnostic manner and can authenticate a user based on various authentication models. That is, the system can authenticate the users without requiring any specific usage context, making it more applicable in real world scenarios.
In real world situations, systems will generally not spend significant time in enrollment (20), but rather in the test phase (30). Referring to
Referring back to
A confidence score (CS) for the k-th authentication feature vector x_k can be defined in a variety of ways. One method is to define CS as the distance between x_k and the classification boundary defined by the corresponding authentication classifier w*.
CS(k) = x_kᵀw* (Eq. 11)
As the authentication classifier w* represents the classification boundary that distinguishes the legitimate user from adversaries, a lower confidence score (a smaller distance between x_k and the boundary) represents a less confident authentication result and suggests a change in the user's behavioral pattern, in which case retraining should be performed. For an authenticated user, one preferred embodiment determines whether the confidence score stays lower than a certain threshold εCS for a period of time T; if so, the system automatically retrains the authentication models.
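A non-limiting sketch of such a retraining trigger follows; the threshold and the number of consecutive windows standing in for the period T are illustrative values, and scores are required to stay low but non-negative, since a negative score indicates an adversary rather than behavior drift:

```python
def needs_retraining(scores, threshold=0.2, min_windows=3):
    """True if the confidence score stays in [0, threshold) for a run of
    min_windows consecutive authentication windows (behavior drift)."""
    run = 0
    for cs in scores:
        run = run + 1 if 0 <= cs < threshold else 0
        if run >= min_windows:
            return True
    return False
```

A negative score resets the run, so a detected adversary cannot accumulate the consecutive low-but-positive scores needed to trigger retraining.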
A system that recognizes a user's behavior drift by checking the confidence score could then go back to the training module again, and upload the legitimate user's authentication feature vectors to the training module until the new behavior (authentication model) is learned. Advanced approaches in machine unlearning can be utilized to update the authentication models faster than retraining from scratch. After retraining the user's authentication models, it can be seen that the confidence score increases to normal values from Day 8 in
An attacker who has taken over a legitimate user's smartphone must not be allowed to retrain the authentication model. Fortunately, the attacker cannot trigger the retraining: to do so, the confidence score must remain positive for a period of time, whereas the attacker is likely to produce negative confidence scores and is detected rapidly, so his scores cannot persist long enough to trigger retraining.
Recall that the goal of an attacker is to get access to the sensitive information stored in the cloud through the smartphone, or in the smartphone. This system and method achieves very low FARs when attackers attempt to use the smartphone with their own behavioral patterns.
This system and method are secure even against masquerading attacks, where an adversary tries to mimic the user's behavior. Here, ‘secure’ means that the attacker cannot cheat the system by performing these spoofing attacks, and that the system detects such attacks in a short time. To evaluate this, an experiment was conducted utilizing a masquerading attack where the adversary not only knows the password but also observes and mimics the user's behavioral patterns. If the adversary succeeds in mimicking the user's behavioral pattern, then the system or method will misidentify the adversary as the legitimate user and he can thus use the smartphone normally.
In this experiment, subjects were asked to act as a malicious adversary whose goal was to mimic the victim user's behavior to the best of his ability. One user's data was recorded and his/her model was built as the legitimate user. The other users tried to mimic the legitimate user and cheat the system to log in. The victim user was recorded on video while utilizing a test smartphone. Subjects were asked to watch the video and mimic the behavior. Both the adversary and the legitimate user performed the same tasks, and the user's behavior was clearly visible to the adversary. Such an attack was repeated 20 times for each legitimate user and her corresponding ‘adversaries’. In order to show the ability of the test system in defending against these mimicry attacks, the percentage of people (attackers) who were still using the smartphone without being de-authenticated by the system was counted.
Such experimental results also match an analysis from a theoretical point of view. Assuming the probability of detecting the attacker in each time window is p, the chance that the attacker escapes detection over n windows is (1−p)^n. Based on experimental results, a test system can achieve accuracy higher than 90%. Thus, within only three windows, the probability for the attacker to escape detection is (1−0.9)³ = 0.001, i.e., 0.1%.
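The escape-probability arithmetic above can be reproduced directly:

```python
def escape_probability(p, n):
    """Probability an attacker evades n windows when each detects with
    probability p: the detections are treated as independent."""
    return (1 - p) ** n

escape_probability(0.9, 3)  # ~0.001, i.e., 0.1%
```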
Note that the window size for this experiment is 6 s and the actual authentication time for each window sample is 22.5 ms. Both the experimental and theoretical analysis show that the probability for the adversary to be misidentified as a legitimate user decreases very quickly with time. Therefore, embodiments of the claimed system and method can show excellent performance in defending against masquerading attacks.
Experiments utilizing various embodiments of the system and method have been conducted. In one experiment, users could use their smartphones and smartwatches as they normally do in their daily lives, without any constraints on the contexts under which they used their devices. Users were invited to take the smartphone and smartwatch for one to two weeks, and use them under free-form, real-use conditions. The accuracy of user authentication was evaluated when only the smartphone's sensor features from the accelerometer and gyroscope were used, and when both the smartphone and smartwatch's sensor features were used. The former had feature vectors with 7×2=14 elements, while the latter had feature vectors with 7×2×2=28 elements.
In another experiment, different machine learning algorithms were tested utilizing the same set of data. Some potential machine learning techniques include, but are not limited to decision trees, kernel ridge regression (KRR), support vector machine (SVM) algorithms, random forest, naïve Bayesian, k-nearest neighbors (K-NN), and least absolute shrinkage and selection operator (LASSO). It is envisioned that supervised, unsupervised and semi-supervised machine learning techniques, and deep learning techniques, can be applied. However, only certain supervised machine learning techniques are discussed in some detail herein.
Table VI shows user authentication results for a sample of machine learning techniques: Kernel ridge regressions (KRR), Support Vector Machines (SVM), linear regression, and naïve Bayes.
For the experiment disclosed above, KRR achieved the best accuracy. SVM also achieved high accuracy, but its computational complexity was much higher than that of KRR. Linear regression and naïve Bayes had significantly lower accuracy than KRR and SVM in this embodiment.
Kernel ridge regression (KRR) has been widely used for classification analysis. The advantage of KRR is that its computational complexity is much lower than that of other machine learning methods, e.g., SVM. The goal of KRR is to learn a model that assigns the correct label to an unseen testing sample. This can be thought of as learning a function that maps each data point x to a label y. The optimal classifier can be obtained analytically according to
w* = argmin_w ρ∥w∥^2 + Σ_{k=1}^N (w^T x_k − y_k)^2 (Eq. 6)
where N is the data size, x_k ∈ ℝ^{M×1} represents the transpose of Authenticate(k), the authentication feature vector, and M is the dimension of the authentication feature vector. Let X denote the M×N training data matrix X = [x_1, x_2, . . . , x_N], and let y = [y_1, y_2, . . . , y_N]. φ(x_i) denotes the kernel function, which maps the original data x_i into a higher-dimensional space of dimension J. In addition, Φ = [φ(x_1), φ(x_2), . . . , φ(x_N)] and K = Φ^T Φ. The objective function in Eq. 6 has an analytic optimal solution:
w* = Φ[K + ρI_N]^{−1} y (Eq. 7)
By utilizing certain matrix transformation properties, the computational complexity of computing the optimal solution in Eq. 7 can be largely reduced, from O(N^2.373) to O(M^2.373). This is a huge reduction in these embodiments, since N = 800 data points while M = 28 features in the authentication feature vector.
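The closed form of Eq. 7 can be sketched as follows; this is a minimal illustration with the identity kernel (so φ(x) = x), synthetic data, and the sizes mentioned in the text, not the actual implementation:

```python
import numpy as np

# Minimal KRR sketch of Eq. 7 with the identity kernel, phi(x) = x.
# Sizes mirror the text (N = 800 samples, M = 28 features); data is synthetic.
rng = np.random.default_rng(0)
N, M, rho = 800, 28, 1.0
Phi = rng.standard_normal((M, N))      # columns are feature vectors x_k
y = np.sign(rng.standard_normal(N))    # +1 / -1 labels

K = Phi.T @ Phi                        # N x N kernel matrix
w_star = Phi @ np.linalg.solve(K + rho * np.eye(N), y)   # Eq. 7

# A new sample is classified by the sign of w*^T x.
x_new = rng.standard_normal(M)
label = np.sign(w_star @ x_new)
```

Using `np.linalg.solve` rather than an explicit matrix inverse is the standard numerically stable choice for this kind of closed-form solution.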
The computational complexity of KRR is directly related to the data size according to Eq. 7, but it can be reduced so that it is instead directly related to the feature size. According to Eq. 7, the optimal classifier is
w* = Φ[K + ρI_N]^{−1} y
Define S = ΦΦ^T, where Φ = [φ(x_1), φ(x_2), . . . , φ(x_N)]. By utilizing a matrix transformation method, the optimal solution of Eq. 6 is equivalent to
w* = [S + ρI_J]^{−1} Φy (Eq. 8)
The dominant computational cost in Eq. 7 and Eq. 8 comes from inverting a matrix. Therefore, based on Eq. 7 and Eq. 8, the computational complexity is approximately min(O(N^2.373), O(J^2.373)). If the identity kernel is utilized, the computational complexity can be reduced from O(N^2.373) to O(M^2.373) and becomes independent of the data size. Specifically, it is possible to construct 28-dimensional feature vectors (e.g., 4 time-domain features and 3 frequency-domain features for each of two sensors, for each device). Thus, in the embodiment where 9/10 of the data was used for training, the time complexity was reduced from O((800×9/10)^2.373) = O(720^2.373) to only O(28^2.373). In this embodiment, the average training time was 0.065 seconds and the average testing time was 18 milliseconds, indicating the effectiveness of the system.
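The equivalence of the two forms, which is what makes this complexity reduction possible, can be checked numerically; this sketch uses the identity kernel (so J = M) and the illustrative sizes from the text:

```python
import numpy as np

# Verify that Eq. 7 (an N x N inversion) and Eq. 8 (a J x J inversion,
# here J = M for the identity kernel) yield the same classifier.
rng = np.random.default_rng(1)
N, M, rho = 720, 28, 1.0               # 9/10 of 800 points, 28 features
Phi = rng.standard_normal((M, N))
y = rng.standard_normal(N)

w_dual = Phi @ np.linalg.solve(Phi.T @ Phi + rho * np.eye(N), y)    # Eq. 7
w_primal = np.linalg.solve(Phi @ Phi.T + rho * np.eye(M), Phi @ y)  # Eq. 8

assert np.allclose(w_dual, w_primal)   # same solution, much cheaper inversion
```

The identity behind this is the "push-through" property Φ(Φ^TΦ + ρI_N)^{−1} = (ΦΦ^T + ρI_J)^{−1}Φ.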
In other embodiments, Support Vector Machines (SVMs) are a preferred method. SVMs are state-of-the-art large-margin classifiers, a class of supervised machine learning algorithms. After obtaining the features from the sensors, an SVM can be used as the classification algorithm in the system. The training data is represented as D = {(x_i, y_i) ∈ X × Y : i = 1, 2, . . . , n} for n data-label pairs. For binary classification, the data space is X = ℝ^d and the label set is Y = {−1, +1}. The predictor is a function mapping X to Y, with objective function J(w, D). The SVM finds a hyperplane in the training inputs that separates the two data sets such that the margin is maximized:
w* = argmin_{w ∈ ℝ^d} λ∥w∥^2 + Σ_{i=1}^n l(w, x_i, y_i) (Eq. 9)
where
l(w, x_i, y_i) = max(1 − y_i w^T x_i, 0) (Eq. 10)
The margin in SVM is 1/∥w∥. So, Eq. 9 minimizes the reciprocal of the margin (the first term) together with the misclassification loss (the second term). The loss function in SVM is the hinge loss (Eq. 10).
Sometimes, the original data points need to be mapped to a higher-dimensional space by using a kernel function, so as to make the training inputs easier to separate. In one embodiment of the classification, the smartphone owner's data is labeled as positive and all the other users' data as negative. Such a model is then exploited to perform authentication. Ideally, only the user who is the owner of the smartphone is authenticated, and any other user is not. In one embodiment, LIBSVM was selected to implement the SVM. The input in this embodiment was n positive data points from the legitimate user and n negative data points from n randomly selected other users, although other embodiments utilize other configurations, including but not limited to using n positive points from the legitimate user and m negative data points from z other users. The n positive points could be gathered in a variety of ways, such as gathering sensor data within a period of time before, during, and after the legitimate user's use of an explicit form of authentication (such as signing in with a password, PIN, or biometric sensor). The output is the user profile for the legitimate user.
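The owner-versus-others training step can be sketched with a linear SVM trained by subgradient descent on Eq. 9–10. This is an illustrative stand-in: the described embodiment used LIBSVM (typically with a kernel), and the data below is synthetic rather than real sensor features:

```python
import numpy as np

# Linear SVM via subgradient descent on Eq. 9-10 (illustrative sketch).
# Each x_i stands in for a 28-element authentication feature vector,
# labeled +1 (owner) or -1 (other users).
rng = np.random.default_rng(2)
n, M, lam, lr, epochs = 200, 28, 0.01, 0.05, 300
X = np.vstack([rng.normal(+0.7, 1.0, (n, M)),    # owner samples
               rng.normal(-0.7, 1.0, (n, M))])   # other users' samples
y = np.concatenate([np.ones(n), -np.ones(n)])

w = np.zeros(M)
for _ in range(epochs):
    active = y * (X @ w) < 1                     # nonzero hinge loss (Eq. 10)
    # subgradient of lam*||w||^2 plus the mean hinge loss
    grad = 2 * lam * w - (y[active, None] * X[active]).sum(axis=0) / len(y)
    w -= lr * grad

train_acc = np.mean(np.sign(X @ w) == y)         # owner vs. others separation
```

Averaging the loss term (rather than summing, as in Eq. 9) only rescales λ and keeps the step size well behaved; the optimization problem is otherwise the same.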
In one embodiment, the testing module of the authentication app in a smartphone runs as threads inside the smartphone system process. An application was developed to monitor the average CPU and memory utilization of the smartphone and smartwatch while running the authentication app which continuously requested sensor data at a rate of 50 Hz on a Nexus 5 smartphone and a Moto 360 smartwatch. The CPU utilization for this embodiment was 5% on average and never exceeded 6%. The CPU utilization (and hence energy consumption) would scale with the sampling rate. The memory utilization in this embodiment was 3 MB on average. This is small enough to have negligible effect on overall smartphone performance.
To measure the battery consumption of the authentication app, the following four testing scenarios were considered: (1) phone locked (i.e., not being used) and app off; (2) phone locked and app running; (3) phone under use and app off; and (4) phone under use and app running. For scenarios (1) and (2), the test time was 12 hours each: the smartphone battery was charged to 100%, the battery level was checked after 12 hours, and the average drop from 100% was reported. For scenarios (3) and (4), "under use" means the user used the phone periodically, typing notes during the use periods; use and non-use alternated in five-minute intervals for a total test time of 60 minutes, and again the battery drop at the end was reported. Scenario 1 used 2.8% of the battery; scenario 2, 4.9%; scenario 3, 5.2%; and scenario 4, 7.6%. Thus, for this embodiment, the app consumed 2.1% more battery power when running on a locked phone (scenario 2 vs. 1) and 2.4% more when the phone was under use (scenario 4 vs. 3), an acceptable cost for daily usage.
The overall authentication performance of one embodiment is seen in Table VII, when the system had a window size of 6 seconds and the data size of 800. As seen in Table VII, the authentication methodology works well with just the smartphone, even without contexts; by using only the smartphone without considering any context, the system was shown to achieve authentication accuracy up to 83.6%. Further, auxiliary devices are helpful: by combining sensor data from the smartwatch with the smartphone sensor data, the authentication performance increases significantly over that of the smartphone alone, reaching 91.7% accuracy, with better FRR and FAR. Additionally, context detection is beneficial for authentication: the authentication accuracy is further improved, when the finer-grained context differences are taken into consideration, reaching 93.3% accuracy with the smartphone alone, and 98.1% accuracy with the combination of smartphone and smartwatch data. Lastly, the overall time for implementing context detection followed by user authentication is less than 21 milliseconds. This is a fast user authentication testing time, with excellent authentication accuracy of 98%, making this method and system efficient and applicable in real world scenarios.
Simpler embodiments that do not use contexts, auxiliary devices (like a smartwatch), or frequency-domain features are also possible. In these simpler embodiments, using more than one sensor can still improve authentication accuracy.
The pseudo code for implicit data collection in Android smartphones is given in Listing 1. The application contains two parts. The first part is an Activity, which is a user interface on the screen. The second part is a Service, which is running in the background to collect data. Each sensor measurement consists of three values, so a vector is constructed from these nine values from three sensors.
Different sampling rates were considered in the experiment, to construct data points.
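The segmentation of a sensor stream into windows and the per-window feature extraction can be sketched as follows. The 50 Hz rate and 6 s window come from the text; the particular 4 time-domain and 3 frequency-domain features below are illustrative choices, not the exact feature set used in the experiments:

```python
import numpy as np

# Hedged sketch of window segmentation and feature extraction.
FS, WIN_SECONDS = 50, 6
WIN = FS * WIN_SECONDS                 # 300 samples per window

def window_features(signal: np.ndarray) -> np.ndarray:
    """Return 7 features (4 time-domain + 3 frequency-domain) for one stream."""
    time_feats = [signal.mean(), signal.std(), signal.min(), signal.max()]
    spec = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1 / FS)
    freq_feats = [freqs[spec[1:].argmax() + 1],  # dominant non-DC frequency
                  spec.sum(),                    # total spectral magnitude
                  spec[1:].mean()]               # mean non-DC magnitude
    return np.array(time_feats + freq_feats)

# One 6 s window of one sensor stream -> a 7-element feature vector;
# stacking streams from multiple sensors and devices yields the full
# authentication feature vector (e.g., 7 x 2 sensors x 2 devices = 28).
stream = np.random.default_rng(3).standard_normal(WIN)
feature_vector = window_features(stream)
```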
Listing 1. Pseudo code for PU dataset collection using Android smartphones.
First, as seen in
A three-sensor-based system was also compared with one- and two-sensor-based authentication experiments. From
In addition, this method can work in conjunction with other authentication methods. In some embodiments, both implicit authentication methods and explicit authentication methods, including biometrics such as fingerprint, or a signature or a graphical “password or pass-picture” entry, are used to authenticate a user. In one embodiment, the user is authenticated implicitly by the sensor data that is, in part, gathered while the user is explicitly authenticating, via a passcode, PIN, graphical password, signature or fingerprint. In other embodiments, the user is implicitly authenticated continuously, and an interface to allow explicit authentication is presented only if the user has already been implicitly authenticated. In a preferred embodiment, explicit authentication occurs first, and if that is successful then continuous implicit authentication is performed, with an interface presented for explicit authentication again only when implicit authentication fails.
With reference to
In addition, the sensor data may also help to further authenticate an explicit authentication mechanism, including one utilizing one or more fingerprints, or a signature or a graphical "password or pass-picture" entry. In some embodiments, the processor may be configured to store acquired sensor data while a user is explicitly authenticating, such as via a passcode, PIN, signature or fingerprint. The features from the stored sensor data can be used to add security to the explicit authentication, which could otherwise be defeated by a forged signature or a stolen biometric or PIN. In other embodiments, the processor is configured to implicitly authenticate users continuously, and explicitly authenticate at certain times (such as when the user turns the phone on, or wakes the phone after a period of inactivity). In still other embodiments, the processor is configured to implicitly authenticate users continuously, and is further configured to display an interface to allow explicit authentication only if the user is currently implicitly authenticated. In a preferred embodiment, explicit authentication occurs first, and if that is successful then continuous implicit authentication is performed, with an interface presented for explicit authentication again only when implicit authentication fails.
The first device (420) may also have communication links (476) with tertiary devices without sensors (494) that communicate with the device (420). These tertiary devices (494) may interact with the device (420) in order to authenticate a user before allowing the user access to the tertiary device (494). For example, a thin client may utilize a connected smartphone's authentication process in order to provide access to the thin client.
In addition, a communication link can be created between the first device (420) and a remote server (492) through a transceiver. In
The smartphone is in communication with an authentication server (640) and app server (650) in the cloud. In some embodiments, the app server can be in communication with a cloud customer or end user that is running cloud management client software (662) on a device capable of running the software, which may include the smartphone or other devices which may include, but are not limited to, a local machine (660). These communications will typically utilize secure communications via a known method, such as utilizing a secure sockets layer (SSL)/transport layer security (TLS) protocol.
The smartphone also runs the authentication testing module as a background service. Upon enrollment, in this embodiment, a context model/feature context dataset (642) is downloaded from the Authentication server (640) and provided to the testing module (632), for use with context detection (634). In the testing module, sensor data from the smartphone and smartwatch are sent to the feature extraction components (633) in both the time domain and the frequency domain, where fine-grained time-frequency features are extracted to form the authentication feature vector.
The extracted features are sent for context detection (634), both the features and the detected context are sent to the authentication server (640) to train (644) authentication models (646) based, for example, on different contexts. The authentication server provides efficient computation and enables the training data set to use sensor feature vectors of other enrolled smartphone users. When a legitimate user first enrolls in the system, the system keeps collecting the legitimate user's authentication feature vectors for training the authentication model. The system deploys a trusted Authentication cloud server to collect sensors' data from all the participating legitimate users. To protect a legitimate user's privacy, all users' data are anonymized. In this way, a user's training module can use other users' sensor data but has no way to know the other users' identities or behavioral characteristics. The training module uses the legitimate user's authentication feature vectors and other people's authentication feature vectors in the training algorithm to obtain the authentication model. After training, the authentication model is downloaded to the smartphone. The training module does not participate in the authentication testing process and is only needed for retraining when the device recognizes a user's behavioral drift, which is done online and automatically. Therefore, the system does not require continuous communication between the smartphone and the Authentication Server.
In some embodiments, behavioral drift can also be determined within the testing module (632) to update the authentication models (635) in the device (630) itself, without needing to go to the Authentication Server (640).
The authentication models (646) are then sent back to the testing module (632) on the smartphone (630), where they are stored locally (635) for use with the authentication module (637). To authenticate a user, the extracted features can be sent to a classifier (636) in the authentication module (637), where, in conjunction with locally-stored authentication models (635), a positive or negative authentication result is determined. This result is communicated to a response module (638). If the authentication results indicate the user is legitimate, then the Response Module will allow the user to use the cloud apps (639, 652) to access the data stored in the cloud (654) or cloud services (656) in the app server. Otherwise, the Response Module can take appropriate action, including but not limited to locking the smartphone, refusing access to the security-critical data, or performing further checking. If the legitimate user is misclassified, in order to unlock the smartphone, several possible responses can be implemented, depending on the situation and security requirements. For example, the legitimate user must explicitly re-authenticate by using a biometric that may have been required for initial log-in, e.g., a fingerprint. The legitimate user is motivated to unlock his device, whereas the attacker does not want to use his fingerprint because it will leave a trace to his identity. This architecture allows such explicit unlocking mechanisms, but is not restricted to one such mechanism.
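The response-module decision described above can be sketched minimally; the action names and the score threshold below are illustrative placeholders, not part of the claimed architecture:

```python
# Hedged sketch of the response-module decision: a positive classifier score
# keeps the session open, a negative one triggers explicit re-authentication.
def respond(score: float, threshold: float = 0.0) -> str:
    """Map a per-window classifier score to a response action."""
    if score > threshold:
        return "allow"                   # legitimate user: continue session
    return "require_explicit_auth"       # e.g., fingerprint re-authentication

assert respond(0.8) == "allow"
assert respond(-0.3) == "require_explicit_auth"
```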
Various modifications and variations of the invention in addition to those shown and described herein will be apparent to those skilled in the art without departing from the scope and spirit of the invention, and fall within the scope of the claims. Although the invention has been described in connection with specific preferred embodiments, it should be understood that the invention as claimed should not be unduly limited to such specific embodiments.
This application claims benefit of U.S. Provisional Application No. 62/293,152, filed Feb. 9, 2016, which is hereby incorporated in its entirety by reference.
This invention was made with government support under Grant CNS-1218817 awarded by the National Science Foundation. The government has certain rights in the invention.