The disclosure relates to methods and systems for using a bone conduction signal and an audio signal for two-way or multifactor authentication of a user. More specifically, the methods and systems use a bone conduction signal from a wearable device and an audio signal from a smart device to authenticate a user of the smart device.
Smart voice assistants or smart devices typically include a vocal or audio-based authentication process, if they include any authentication at all. For example, a user may speak a specific phrase or statement and the smart device may recognize the user based on the unique audio characteristics of the user's voice when speaking the specific phrase. Such audio characteristics are usually determined during an enrollment or initialization period. However, a user's submission during enrollment or initialization may not match a current attempt to authenticate due to differences in tone and volume, thus resulting in a mismatch or denial of access to the smart device. Further, a third party may spoof or copy a user's audio signal and utilize such a spoof or copy to imitate the user, thus potentially gaining access to sensitive and/or private user data or information.
Accordingly, Applicants have recognized a need for systems and methods to utilize two-factor or multifactor authentication to enable user access to a smart device, the two-factor or multifactor authentication including verifying a user via a bone conduction signal from a wearable device and an audio signal obtained by the smart device. The present disclosure is directed to embodiments of such systems and methods.
The present disclosure is generally directed to systems and methods for utilizing two-factor or multifactor authentication to enable user access to a smart device, the two-factor or multifactor authentication including verifying a user via a bone conduction signal from a wearable device and an audio signal obtained by the smart device. In such embodiments, a smart device may request an input from a user prior to granting access to a portion of or all of the functionality and/or data of the smart device. Prior to a request for such an input, the smart device may prompt the user to initialize or enroll with the smart device. Such an enrollment or initialization may include prompting a user to speak a particular phrase or phrases using each of one or more wearable devices associated with the user. In other words, the user may speak or submit a phrase for each wearable device the user desires to utilize for subsequent verifications and/or authentications. In such embodiments, the smart device and/or an authentication circuitry may store the audio signals, obtained from a microphone or other sensor of the smart device, and the bone conduction signals, obtained from each of the one or more wearable devices. Further, an identifier or tag may be stored alongside each submission or entry, the identifier or tag being associated with one of the one or more wearable devices. The identifier or tag may be a device name obtained via data in a connection (for example, Bluetooth, WiFi, or other signal communication standard) between the wearable device and the smart device and/or authentication circuitry. Further still, more than one user may access a smart device. In such embodiments, different users may be able to access different functionality and/or data of the smart device. In such examples, when initializing or enrolling a user, a user identifier may be generated and stored with the associated wearable devices and level of access. In an embodiment, functionality may include the smart device, after determining consistency and verifying the bone conduction signal and audio signal, utilizing natural language processing to interpret voice commands from the user and, based on the interpretation, performing an action or operation. The smart device may be configured to provide additional functionality, such as granting access to virtual or physical locations and/or data, enabling a user to purchase goods and/or utilize services (for example, play music, view video of cameras connected to and/or proximate the smart device, among other services and/or functions), and allowing a user to access a computing device or user profile, among other functionality.
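By way of illustration only, the enrollment records described above might be organized as in the following Python sketch; the field names (user_id, device_tag, access_level, and so on) and the structure are hypothetical and are not mandated by the disclosure.

```python
from dataclasses import dataclass, field

import numpy as np


@dataclass
class EnrollmentRecord:
    """Hypothetical record stored by the smart device and/or authentication circuitry."""
    user_id: str                         # generated when the user enrolls
    device_tag: str                      # e.g., Bluetooth/WiFi device name of the wearable
    access_level: str                    # portion of functionality/data the user may access
    audio_signal: np.ndarray             # enrollment audio from the smart device microphone
    bone_conduction_signal: np.ndarray   # enrollment signal from the wearable device


@dataclass
class EnrollmentStore:
    """Keeps one record per (user, wearable device) pair."""
    records: list[EnrollmentRecord] = field(default_factory=list)

    def add(self, record: EnrollmentRecord) -> None:
        self.records.append(record)

    def for_user(self, user_id: str) -> list[EnrollmentRecord]:
        return [r for r in self.records if r.user_id == user_id]
```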
The smart device and/or authentication circuitry may include one or more trained models. The smart device and/or authentication circuitry may include a bone conduction verification model and an audio conduction verification model. The bone conduction verification model may be trained based on previous users' bone conduction signals, previous users' enrollment or initialization bone conduction signals, and the outcome of previous verifications. Further, when a user submits a bone conduction signal enrollment or initialization, the bone conduction signal enrollment or initialization may be utilized to refine, re-train, or further train the bone conduction verification model. The audio conduction verification model may be trained based on previous users' audio signals, previous users' enrollment or initialization audio signals, and the outcome of previous verifications. Further, when a user submits an audio signal enrollment or initialization, the audio signal enrollment or initialization may be utilized to refine, re-train, or further train the audio conduction verification model.
As noted, prior to providing access to some or all of the functionality of the smart device, the smart device and/or authentication circuitry may request authentication and/or verification. In response to such a prompt, the smart device and/or authentication circuitry may wait until an audio signal and/or bone conduction signal is received. Once one signal (for example, the audio signal or bone conduction signal) is received by the smart device and/or authentication circuitry, the smart device and/or authentication circuitry may wait for the other signal to be received (for example, the bone conduction signal or the audio signal, respectively). If the other signal is not received within a pre-selected or specified time interval, the smart device and/or authentication circuitry may deny the user access to some or all functionality (for example, a user may play a song, but not access private or sensitive data or information relating to the user). Further, the smart device and/or authentication circuitry may prompt the user to resubmit a phrase or phrases for authentication or verification up to a pre-determined or pre-selected number of times.
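A minimal sketch of this wait-and-retry behavior is shown below, assuming hypothetical get_audio and get_bone callables that poll the microphone and the wearable device, a 30-second window, and a three-attempt limit; none of these specifics are required by the disclosure.

```python
import time


def await_both_signals(get_audio, get_bone, timeout_s=30.0, max_retries=3):
    """Wait until both an audio signal and a bone conduction signal arrive.

    get_audio / get_bone are hypothetical callables that return a captured
    signal, or None if nothing has been received yet.
    """
    for attempt in range(max_retries):
        deadline = time.monotonic() + timeout_s
        audio, bone = None, None
        while time.monotonic() < deadline:
            audio = audio if audio is not None else get_audio()
            bone = bone if bone is not None else get_bone()
            if audio is not None and bone is not None:
                return audio, bone           # both factors received; proceed to consistency check
            time.sleep(0.05)
        # the other signal never arrived within the pre-selected interval
        print(f"Signal missing; prompting user to resubmit (attempt {attempt + 1})")
    return None                              # deny access to some or all functionality
```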
Once the smart device and/or authentication circuitry receives both signals (for example, the bone conduction signal and the audio signal), the smart device and/or authentication circuitry may determine whether the bone conduction signals and audio signals are consistent. In other words, whether the bone conduction signals and audio signals are based on the same phrase and/or from the same user. Such a determination may include pre-processing each signal (for example, delaying or aligning one of the signals to match the other and/or reducing noise of the signals by passing the signals through a Wiener filter or other audio or signal filter, among other pre-processing steps), determining a consistency score, and/or determining whether the consistency score is greater than or equal to a consistency threshold. Such a consistency score may be based on similarities between the audio signal and/or bone conduction signal. In such embodiments, since the audio signal and bone conduction signal are from the same user, some aspects of each signal should be consistent. After consistency is determined, the audio signal may be verified and the bone conduction signal may be verified. Such verifications may occur simultaneously, substantially simultaneously, and/or in sequence. Once verified, the smart device and/or authentication circuitry may grant access to some or all functionality of the smart device to the user.
In other embodiments, rather than a smart device obtaining an audio signal to be utilized for two-factor or multi-factor authentication of a user, a device may obtain an identification signal via a sensor of the device. Such an identification signal may include a signal generated by one or more of a badge scan, an identification card scan, a retinal scan, a fingerprint scan, or other scan configured to obtain a signal to be utilized for authentication. In such embodiments, authentication of a user may include authenticating a user based on the bone conduction signal and the identification signal.
Accordingly, an embodiment of the disclosure is directed to a method for two-way authentication of a user. The method may include receiving a bone conduction signal from a user via one or more wearable devices. The method may include receiving an audio signal from the user via a microphone separate from the one or more wearable devices, the audio signal corresponding to the bone conduction signal. The method may include determining a consistency score for the audio signal and corresponding bone conduction signal. The method may include, in response to the consistency score being greater than or equal to a consistency threshold, verifying, using an audio conduction model (AC model), that the audio signal is associated with a particular user. The method may include verifying, using a bone conduction model (BC model), that the bone conduction signal is associated with the particular user. The method may include, in response to verification of the audio signal and bone conduction signal, enabling, for the particular user, access to a smart device.
In an embodiment, the determination of the consistency score may include: (a) determining a marginal bone conduction power distribution with respect to time; (b) selecting a time range of interest based on an average of the marginal bone conduction power distribution with respect to time; (c) determining a marginal audio conduction power distribution and a marginal bone conduction power distribution with respect to frequency; (d) selecting the top frequency index M from the marginal bone conduction power distribution with respect to frequency and the top frequency index N from the marginal audio conduction power distribution with respect to frequency; (e) determining a correlation matrix between the top frequency index M and the top frequency index N; and (f) generating, based on the correlation matrix, the consistency score.
In an embodiment, the method may include, prior to determining the consistency score, pre-processing the bone conduction signal. The pre-processing of the bone conduction signal may include passing the bone conduction signal through (a) a low-pass filter to remove noise generated by human motion and (b) a Wiener filter to remove noise. The method may include, prior to determining the consistency score, passing the audio signal through a Wiener filter to remove noise. The method may include, prior to determination of a consistency score and verification, prompting a user to submit an initial or enrollment (a) bone conduction signal and (b) audio signal. The AC model may be trained using a plurality of responses prior to submission of an initial or enrollment audio signal. The AC model may further be trained using a submitted initial or enrollment audio signal. The BC model may be a convolutional neural network (CNN). The CNN may be trained using the initial or enrollment bone conduction signal.
In an embodiment, the method may include, if one or more of the audio signal or bone conduction signal are not verified, preventing the particular user from accessing the smart device. The method may include, prior to determining the consistency score, delaying the earliest received of the bone conduction signal and the audio signal to align the bone conduction signal and the audio signal.
Another embodiment of the disclosure is directed to a non-transitory machine-readable storage medium storing processor-executable instructions for execution by at least one processor. Such execution may cause the at least one processor to receive a bone conduction signal from a user via one or more wearable devices. The execution may cause the at least one processor to receive an audio signal from the user via a microphone separate from the one or more wearable devices, the audio signal corresponding to the bone conduction signal. The execution may cause the at least one processor to determine a consistency score for the audio signal and corresponding bone conduction signal. The execution may cause the at least one processor to, in response to the consistency score being greater than or equal to a consistency threshold: (a) verify, using an audio conduction model (AC model), that the audio signal is associated with a particular user; (b) verify, using a bone conduction model (BC model), that the bone conduction signal is associated with the particular user; and (c) in response to verification of the audio signal and bone conduction signal, enable, for the particular user, access to a smart device.
In an embodiment, the smart device may include the microphone. The bone conduction signal may be received via wireless communication. The consistency threshold may be based on similarities between bone conduction signals and audio signals that indicate the bone conduction signal and the audio signal are from the particular user.
Another embodiment of the disclosure is directed to a method for two-way authentication of a user. The method may include prompting a particular user to submit initial or enrollment (a) audio signals and (b) bone conduction signals. The method may include updating an audio conduction model (AC model) and a bone conduction model (BC model) based on the received initial or enrollment (a) audio signal and (b) bone conduction signals. The method may include, after reception of the initial or enrollment (a) audio signal and (b) bone conduction signals, receiving a bone conduction signal from a user via one or more wearable devices. The method may further include receiving an audio signal from the user via a microphone separate from the one or more wearable devices, the audio signal corresponding to the bone conduction signal. The method may include determining a consistency score for the audio signal and corresponding bone conduction signal. The method may include, in response to the consistency score being greater than or equal to a consistency threshold, verifying, using the AC model, that the audio signal is associated with a particular user. The method may include verifying, using the BC model, that the bone conduction signal is associated with the particular user. The method may include, in response to verification of the audio signal and bone conduction signal, enabling, for the particular user, access to a smart device.
In an embodiment, the initial or enrollment audio signal may be received via the microphone. The microphone may be included in the smart device. The initial or enrollment bone conduction signals may be received from one of one or more wearable devices. The method may include prompting the particular user to submit initial or enrollment (a) audio signals and (b) bone conduction signals for each of the one or more wearable devices. The initial or enrollment (a) audio signals and (b) bone conduction signals may include one or more specific phrases. The BC model may be a convolutional neural network (CNN). The CNN may be trained using the initial or enrollment bone conduction signal to generate corresponding initial embedded bone conduction vectors. The method may include, prior to verification via the BC model, generating embedded bone conduction vectors using the bone conduction signals and the initial embedded bone conduction vectors.
Still other aspects and advantages of these and other embodiments are discussed in detail herein. Moreover, it is to be understood that both the foregoing information and the following detailed description provide merely illustrative examples of various aspects and embodiments, and are intended to provide an overview or framework for understanding the nature and character of the claimed aspects and embodiments. Accordingly, these and other objects, along with advantages and features herein disclosed, will become apparent through reference to the following description and the accompanying drawings. Furthermore, it is to be understood that the features of the various embodiments described herein are not mutually exclusive and may exist in various combinations and permutations.
These and other features, aspects, and advantages of the disclosure will become better understood with regard to the following descriptions, claims, and accompanying drawings. It is to be noted, however, that the drawings illustrate only several embodiments of the disclosure and, therefore, are not to be considered limiting of the scope of the disclosure.
So that the manner in which the features and advantages of the embodiments of the systems and methods disclosed herein, as well as others that will become apparent, may be understood in more detail, a more particular description of the embodiments briefly summarized above may be had by reference to the following detailed description, one or more embodiments of which are further illustrated in the appended drawings, which form a part of this specification. It is to be noted, however, that the drawings illustrate only various embodiments of the systems and methods disclosed herein and are therefore not to be considered limiting of their scope, as the systems and methods may admit other equally effective embodiments.
The present disclosure is directed to systems and methods for two-way authentication or multifactor authentication of a user. Such authentication may occur in response to a user's attempt to access a device or smart device. Typically, a smart device or other device may utilize audio or actual passwords or PINs to authenticate a user. However, a user's speech may be spoofed or copied, or a user's password determined or stolen, thus allowing a third party to take control of the user's account. To prevent such issues, the systems and methods described herein may utilize two-factor or multifactor authentication. For example, a user attempting to access functionality or data associated with a particular device or smart device may be prompted to submit a phrase or one or more phrases. The data may include personal data, data associated with the user, data stored on the smart device, data associated with the smart device, and/or data stored in a specified location accessible via the smart device. Functionality of the smart device may include (for example, the smart device may be configured to perform) accepting voice commands, performing an action based on the received voice commands, granting access to virtual or physical locations and/or data, and/or enabling a user to purchase goods and/or utilize services (for example, play music, view video of cameras connected to and/or proximate the smart device, among other services and/or functions), among other functionality. In an example, the smart device may, after determining consistency and verifying the bone conduction signal and audio signal, utilize natural language processing to interpret voice commands from the user and, based on the interpretation, perform an action or operation. Further, to ensure proper authentication, the user may submit such a phrase or one or more phrases while wearing a wearable device. While the smart device may include a microphone or other sensor to obtain an audio signal or other identification signal from the user, the wearable device may be configured to obtain a bone conduction signal from the user and transmit the bone conduction signal to the smart device and/or an authentication circuitry. Thus, at least two different factors may be gathered to authenticate a user.
Further, such systems and methods may authenticate the user when the bone conduction signal and audio signal or other signal are received. In such examples, the smart device and/or authentication circuitry may first determine whether the bone conduction signal and audio signal or other signal are consistent. If the signals (for example, the bone conduction signal and audio signal or other signal) are consistent, then each signal (for example, the bone conduction signal and audio signal or other signal) may be verified. If each signal (for example, the bone conduction signal and audio signal or other signal) is verified, then the smart device and/or authentication circuitry may authenticate the user.
The systems and methods may utilize various machine learning and/or signal processing techniques to analyze the bone conduction signal and the audio signal or other signal. Additionally, machine learning models or classifiers used may be trained based on a user's enrollment or initial submission of one or more phrases. In other words, the smart device and/or authentication circuitry may prompt the user to submit the enrollment or initialization. Such an enrollment or initialization may be prompted for each wearable device the user utilizes (for example, a headphone, headset, earbud, smart glasses, smart phone, virtual reality (VR) headset, and/or augmented reality (AR) headset, among other devices), as the interaction between such wearable devices and the user's skull may produce different bone conduction signals. Once enrolled or initialized, the smart device and/or authentication circuitry may update, refine, re-train, or further train any machine learning models, probabilistic models, and/or statistical models utilized in such systems and methods. Thus, when the smart device and/or authentication circuitry verifies a bone conduction signal and/or audio signal or other signal, the smart device and/or authentication circuitry may grant the user access to all or some portion of the functionality and/or data for the smart device or other devices.
A solution to such issues, as noted, includes the use of a wearable device in conjunction with a microphone or other sensor of a smart device or other device. The user can provide an audio signal and a bone conduction signal and, based on such signals, be authenticated. Other and/or different signals may be utilized in the systems and methods. Further, a user may submit an enrollment or initialization for each of one or more wearable devices the user owns or that the user wishes to utilize for authentication.
A user may include a person attempting to utilize the smart device 112. A user may also include a person, third person, and/or designee who may be directed, permitted, and/or appointed by a person who is verified as a user or accessor able to authorize other users.
The term “server” or “server device” is used to refer to any computing device capable of functioning as a server, such as a master exchange server, web server, mail server, document server, or any other type of server. A server may be a dedicated computing device or a server module (for example, an application) hosted by a computing device that causes the computing device to operate as a server. A server module (for example, server application) may be a full function server module, or a light or secondary server module (for example, light or secondary server application) that is configured to provide synchronization services among the dynamic databases on computing devices. A light server or secondary server may be a slimmed-down version of server type functionality that can be implemented on a computing device, such as a smart phone, thereby enabling it to function as an Internet server (for example, an enterprise e-mail server) only to the extent necessary to provide the functionality described herein.
As used herein, a “non-transitory machine-readable storage medium” may be any electronic, magnetic, optical, or other physical storage apparatus to contain or store information such as executable instructions, data, and the like. For example, any machine-readable storage medium described herein may be any of random access memory (RAM), volatile memory, non-volatile memory, flash memory, a storage drive (for example, a hard drive), a solid state drive, any type of storage disc, and the like, or a combination thereof. The memory may store or include instructions executable by the processor.
As used herein, a “processor” or “processing circuitry” may include, for example, one processor or multiple processors included in a single device or distributed across multiple computing devices.
The smart device 112, as noted, may include a memory 124 (for example, a non-transitory machine-readable storage medium). The memory 124 may include instructions or one or more sets of instructions. The instructions may be executed by the processor 122 of the smart device 112. Such instructions may include instructions to authenticate the user 108. The instructions to authenticate a user 108 may comprise one or more sets of instructions, sub-instructions, modules, and/or other routines or sub-routines. For example, such instructions may include a first set of instructions. The first set of instructions may include instructions to, when executed, determine whether a received bone conduction signal 110 is consistent with an audio signal 120. Prior to or in conjunction with execution of the first set of instructions to determine consistency, instructions (for example, a second set of instructions) to pre-process the bone conduction signal 110 and audio signal 120 may be executed. Such pre-processing may include aligning the bone conduction signal 110 with the audio signal 120 (for example, by delaying an earliest received signal to align similar features or aspects of the signals) and passing each of the bone conduction signal 110 and the audio signal 120 through a noise reduction program, algorithm, and/or filter (for example, a Wiener filter). In another embodiment, pre-processing may include converting the bone conduction signal 110 and audio signal 120 to binary waveforms for determining consistency.
After pre-processing, consistency of the signals may be determined via execution of the instructions noted above. For example, filtered and/or noise reduced signal waveforms (for example, waveforms for the bone conduction signal 110 and audio signal 120) may be compared to determine whether the waveforms or features of the waveforms are similar, substantially similar, or consistent. The instructions, when executed, may determine a probability or score based on an amount of similar or consistent features or an amount of differences between the waveforms. The probability or score may indicate whether or not the signals (for example, the bone conduction signal 110 and audio signal 120) are consistent. In such examples, the smart device 112 or instructions may utilize a threshold or consistency threshold to determine whether the generated probability or score indicates consistency or not. For example, if the generated probability or score is greater than or equal to the consistency threshold, then the instructions, when executed, may determine that the signals (for example, the bone conduction signal 110 and audio signal 120) are consistent, while if the generated probability or score is less than the consistency threshold, then the instructions, when executed, may determine that the signals (for example, the bone conduction signal 110 and audio signal 120) are not consistent. If the signals (for example, the bone conduction signal 110 and audio signal 120) are determined to be not consistent, instructions, when executed, may notify the user 108 of inconsistent signals (for example, such a notification may be generated and sent to the user 108). Further, if the bone conduction signal 110 and audio signal 120 are inconsistent, then instructions to prompt the user 108 to resubmit a phrase or phrases for authentication may be executed. In an embodiment, a prompt for the user 108 to resubmit a phrase or phrases may be submitted a predetermined or preselected number of times. After the prompt to resubmit has been given the predetermined or preselected number of times and the user 108 has submitted corresponding responses resulting in additional inconsistent bone conduction signals 110 and audio signals 120, the user 108 may be locked out of the smart device 112 until the user's 108 identity can be verified (for example, via an email or phone call) or until after a predetermined amount of time has passed. In such examples, locking a user may prevent the user from accessing data on or associated with the smart device 112 and/or utilizing functionality of the smart device 112. Further, after one or more attempts or a pre-determined or pre-selected number of attempts, the smart device 112 may generate a notification indicating a number of unsuccessful attempts at access. The smart device 112 may then transmit such a notification to the user 108. Further, the notification may be sent via an alternative form of communication, such as email, text message, phone call, or other form of communication.
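The threshold comparison and lockout policy described above could be organized roughly as follows; the 0.7 threshold, three-attempt limit, and five-minute lockout are illustrative assumptions only.

```python
import time


class ConsistencyGate:
    """Illustrative gate applying a consistency threshold and a lockout policy."""

    def __init__(self, consistency_threshold=0.7, max_attempts=3, lockout_s=300.0):
        # the 0.7 threshold, 3 attempts, and 5-minute lockout are assumed for illustration
        self.threshold = consistency_threshold
        self.max_attempts = max_attempts
        self.lockout_s = lockout_s
        self.failed_attempts = 0
        self.locked_until = 0.0

    def check(self, consistency_score: float) -> bool:
        if time.monotonic() < self.locked_until:
            return False                      # still locked out
        if consistency_score >= self.threshold:
            self.failed_attempts = 0
            return True                       # signals deemed consistent
        self.failed_attempts += 1
        print("Signals inconsistent; please resubmit the phrase.")   # notify the user
        if self.failed_attempts >= self.max_attempts:
            self.locked_until = time.monotonic() + self.lockout_s
            print("Too many unsuccessful attempts; access locked.")  # e.g., also email/text
        return False
```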
If the bone conduction signal 110 and audio signal 120 are determined to be consistent, instructions to verify each signal may be executed. The steps or operations for verification may include passing the bone conduction signal 110 and audio signal 120 through a corresponding trained model or classifier. Each corresponding trained model or classifier may be trained using a number of previous or pre-collected and stored signals and outcomes (for example, previously submitted bone conduction signals 110 or audio signals 120, previously submitted enrollment bone conduction signals 110 or audio signals 120, and an indicator to indicate whether the bone conduction signals 110 or audio signals 120 were verified). The trained model or classifier may also be re-trained or further trained using an initial or enrollment submission from the user 108 (for example, an initial or enrollment bone conduction signal or audio signal). In an embodiment, the trained model or classifier may be a supervised or unsupervised learning model. In an embodiment, the trained model or classifier may be based on one or more of decision trees, random forest models, random forests utilizing bagging or boosting (as in, gradient boosting), neural network methods, support vector machines (SVM), other supervised learning models, other semi-supervised learning models, other unsupervised learning models, or some combination thereof, as will be readily understood by one having ordinary skill in the art. Other types of models may be utilized to verify a bone conduction signal 110 or audio signal 120, such as a meta-analytical model or another statistical or probabilistic model. Upon verification (or failure of verification) of the bone conduction signal 110 and audio signal 120, instructions may be executed to grant or deny access of the user 108 to the smart device 112.
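One plausible form of the per-signal verification step, assuming the trained model exposes an embedding via a hypothetical predict method that can be compared against an enrollment template, is sketched below; the cosine-similarity comparison and the 0.8 threshold are assumptions and not the specific scoring rule of the disclosure.

```python
import numpy as np


def verify_signal(embed_model, signal_features, enrolled_embedding, threshold=0.8):
    """Embed the submitted signal and compare it against the user's enrollment template.

    embed_model is a hypothetical trained model (for example, a CNN for bone
    conduction or a speaker model for audio); the cosine similarity and the
    0.8 threshold are illustrative assumptions.
    """
    embedding = np.asarray(embed_model.predict(signal_features)).ravel()
    enrolled = np.asarray(enrolled_embedding).ravel()
    cosine = float(np.dot(embedding, enrolled) /
                   (np.linalg.norm(embedding) * np.linalg.norm(enrolled) + 1e-12))
    return cosine >= threshold
```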
The system 100 may include a first device 138 and a second device 146. The first device 138 may include a computing device, a security device, or a device configured to receive a signal, communicate with the second device 146, offer functionality or services, offer access to data, and/or offer physical or virtual access to a location, among other functionality. The first device 138 may include a sensor 142 and communications circuitry 140. The sensor 142 may receive or sense a first signal 152 from the user 108 (for example, an identification signal). The first signal 152 may be an audio signal, a bone conduction signal, and/or signals associated with a badge scan, identification scan, retinal scan, fingerprint scan, facial scan (for example, facial recognition scan), gesture scan (for example, limb or body gesture recognition scan), and/or a scan of some other aspect of a user 108. The first device 138 may include a communications circuitry 140. The communications circuitry 140 may provide or transmit the first signal 152 to communications circuitry or a communications interface of the authentication device 144.
As noted, the system 100 may include a second device 146. The second device 146 may include a sensor 148 and a communications circuitry 150. The sensor 148 may sense a second signal 154 from the user 108. In an embodiment, the second signal 154 may be a different type of signal than the first signal 152 (for example, the first signal 152 may be an audio signal, while the second signal 154 may be a bone conduction signal). The second signal 154 may be an audio signal, a bone conduction signal, and/or signals associated with a badge scan, identification scan, retinal scan, fingerprint scan, facial scan (for example, facial recognition scan), gesture scan (for example, limb or body gesture recognition scan), and/or a scan of some other aspect of a user 108. The second device 146 may include a communications circuitry 150. The communications circuitry 150 may provide or transmit the second signal 154 to the communications circuitry 140 of the first device 138 (as illustrated in
For example, the user 108 may generate a first signal 152 and a second signal 154. The first signal 152 may be received or sensed by the sensor 142 of the first device 138, while the second signal 154 may be received or sensed by the sensor 148 of the second device 146. In another embodiment, additional signals may be utilized and/or sensed by the devices and/or other additional devices. The second signal 154 may be transmitted to the first device 138 or the authentication device 144 via communications circuitry 150. The first device 138 may then transmit the first signal 152, in addition to, in some embodiments, the second signal 154, to the authentication device 144 via communications circuitry 140. The authentication device 144 may determine whether the signals are consistent (for example, from the same user 108). The authentication device 144 may determine a consistency score based on the two signals. For example, a user 108 may swipe or move a badge against a badge scanner (for example, the first device 138) and speak into an audio device (for example, the second device 146), thus generating a first signal (for example, based on the badge scan) and a second signal (for example, based on the audio signal). The authentication device 144 may determine the consistency score based on whether the badge associated with the user 108 matches an audio signal from the same user 108. As noted, different and/or additional signals may be utilized in such operations, such as a bone conduction signal and/or signals associated with an identification scan, retinal scan, fingerprint scan, facial scan (for example, facial recognition scan), gesture scan (for example, limb or body gesture recognition scan), and/or a scan of some other aspect of a user 108. After the consistency is determined, the authentication device 144 may verify each signal from the user 108 (for example, whether each of the signals are authentic or, in other words, actually from the user 108). Verification, as described, may utilize models trained using submitted or previously submitted enrollment or submissions from the user 108. After verification, the authentication device 144 may grant the user 108 access to utilize functionality of one or more devices (for example, the first device 138, the second device 146, and/or additional devices), access to data, and/or physical or virtual access to a location.
Before granting the user 202 access, the smart device and/or authentication circuitry may determine whether the bone conduction signal 214 is consistent with the audio signal 212 (see voice activity consistent 216). In other words, the smart device and/or authentication circuitry may determine whether the bone conduction signal 214 is from or originates from the same user 202 that provided the audio signal 212. During the consistency check, the smart device and/or authentication circuitry may compare and/or analyze the bone conduction signal 214 and audio signal 212 to determine consistency. If the bone conduction signal 214 and audio signal 212 are determined to be inconsistent, then the smart device may remain locked (see 220).
If the bone conduction signal 214 and audio signal 212 are determined to be consistent, then the smart device and/or authentication circuitry may perform bone conduction signal 214 and audio signal 212 verification. In other words, the bone conduction signal 214 may be passed through a trained model or classifier and/or compared to previous submissions or enrollments to determine whether the bone conduction signal 214 is from or originates from a particular user 202. Similarly, the audio signal 212 may be passed through a trained model or classifier and/or compared to previous submissions or enrollments to determine whether the audio signal 212 is from or originates from a particular user 202. Such operations may occur simultaneously or in sequence. If at any point the smart device and/or authentication circuitry determines that either the bone conduction signal 214 or audio signal 212 is not from the user 202, then the smart device may lock the user out or disable further access to the smart device by the user. If both the bone conduction signal 214 and audio signal 212 are verified, then the smart device may unlock or allow the user to access features of, functionality of, or data stored in the smart device.
After denoising, the bone conduction signal and audio signal may be passed or transmitted to a synchronization module 306. The synchronization module 306 may utilize cross-correlation to align the bone conduction signal with the audio signal. The synchronization module 306 may delay the earlier of the bone conduction signal and audio signal so that the bone conduction signal and audio signal reach a maximum cross-correlation, such that the signals may be further processed and analyzed to determine if they are from the same user.
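A minimal numpy sketch of this synchronization step, assuming both signals are one-dimensional arrays sampled at the same rate; real systems would also need to handle differing sample rates and buffer lengths.

```python
import numpy as np


def align_by_cross_correlation(bone, audio):
    """Delay the earlier signal so the pair reaches maximum cross-correlation."""
    # full cross-correlation; the index of its peak gives the relative lag
    xcorr = np.correlate(audio, bone, mode="full")
    lag = int(np.argmax(xcorr)) - (len(bone) - 1)
    if lag > 0:
        # bone conduction content starts earlier: delay (pad) the bone conduction signal
        bone = np.concatenate([np.zeros(lag), bone])
    elif lag < 0:
        # audio content starts earlier: delay (pad) the audio signal
        audio = np.concatenate([np.zeros(-lag), audio])
    n = min(len(bone), len(audio))
    return bone[:n], audio[:n]
```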
After synchronization, a consistency score may be determined. The score may be generated using a voice activity detection module 308 and a similarity check module 310. The voice activity detection module 308 may convert the bone conduction signal and the audio signal to binary waveforms. The binary waveforms may be compared to, at the least, ensure that voice and bone conduction activity is occurring at the same time. Such analysis ensures that an in-depth consistency check is not performed if the bone conduction signal and audio signal do not at least include matching activity time frames.
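The binary-waveform comparison might be sketched as follows; the frame length, energy threshold, and overlap ratio are illustrative parameters, not values from the disclosure.

```python
import numpy as np


def voice_activity(signal, frame_len=400, energy_ratio=0.1):
    """Convert a signal to a binary activity waveform (1 = activity, 0 = silence).

    Frames whose short-time energy exceeds a fraction of the peak frame energy
    are marked active; frame_len and energy_ratio are illustrative parameters.
    """
    n_frames = len(signal) // frame_len
    if n_frames == 0:
        return np.zeros(0, dtype=int)
    frames = np.reshape(signal[: n_frames * frame_len], (n_frames, frame_len))
    energy = np.sum(frames ** 2, axis=1)
    return (energy >= energy_ratio * np.max(energy)).astype(int)


def activity_overlaps(bone, audio, min_overlap=0.8):
    """Check that bone conduction and audio activity occur over the same time frames."""
    vb, va = voice_activity(bone), voice_activity(audio)
    n = min(len(vb), len(va))
    active = (vb[:n] | va[:n]).sum()
    if active == 0:
        return False
    return (vb[:n] & va[:n]).sum() / active >= min_overlap
```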
The bone conduction signal and audio signal may then be analyzed and compared in the similarity check module 310. The similarity check module 310 may normalize the signals, chunk them into frames, and transform them into a time-frequency grid via a short-time Fourier transform. The processed signals may then be compared and a score generated. Depending on the score, it may be determined whether the bone conduction signal and the audio signal are from the same user (for example, a high score indicating that the bone conduction signal and the audio signal are consistent and a low score indicating that the bone conduction signal and the audio signal are inconsistent). Additionally, the consistency score may be compared to a threshold (for example, a consistency threshold) and, if the score exceeds the threshold, the bone conduction signal and the audio signal may be determined to be consistent.
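A short sketch of building the time-frequency grid consumed by the similarity check, assuming a 16 kHz sample rate and scipy's STFT; the frame sizes are illustrative choices.

```python
import numpy as np
from scipy.signal import stft


def time_frequency_grid(signal, fs=16000, frame_len=512, hop=256):
    """Normalize a signal, chunk it into frames, and transform it into a
    time-frequency grid via a short-time Fourier transform.

    Returns (frequencies, times, |Y(f, t)|), the magnitude grid used by the
    similarity and consistency computations.
    """
    x = np.asarray(signal, dtype=float)
    x = x / (np.max(np.abs(x)) + 1e-12)                      # normalize amplitude
    f, t, Y = stft(x, fs=fs, nperseg=frame_len, noverlap=frame_len - hop)
    return f, t, np.abs(Y)
```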
In another embodiment, the bone conduction signal and the audio signal may be converted to waveforms and the waveforms may be compared to determine whether certain features are similar. For example, the waveforms may simply be compared to determine each time of activity (for example, as illustrated in the air-bone waveform 311). In another example, specific frequencies may be compared. In another example, or in addition to the waveform, the signals may be converted to a binary waveform (as noted above), which simply denotes activity using a square waveform (for example, as illustrated in voice activity 313, where a 1 would indicate activity and a 0 would indicate non-activity).
In an embodiment, if the bone conduction signal and the audio signal are determined to be consistent, then features may be extracted from each of the bone conduction signal and the audio signal. The bone conduction features may be extracted using a convolutional neural network (CNN) or other neural network or machine learning model. The CNN may leverage an image-classification method to extract image-like feature maps using a time-frequency analysis. The audio features extracted may include various spectral characteristics of the audio signal. In an embodiment, extracting features may include converting the audio signal to an audio signal feature vector and converting the bone conduction signal to a bone conduction signal feature vector.
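As one hypothetical realization of the CNN-based feature extraction, a small PyTorch module that maps a time-frequency feature map to a fixed-length embedding might look like the following; the layer sizes and the 128-dimensional output are assumptions made for the sketch, as the disclosure only specifies that a CNN (or other model) may be used.

```python
import torch
import torch.nn as nn


class BoneConductionEmbedder(nn.Module):
    """Illustrative CNN mapping a bone conduction time-frequency feature map
    (batch, 1, freq, time) to a fixed-length embedding vector."""

    def __init__(self, embedding_dim=128):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((4, 4)),        # fixed-size feature map regardless of input size
        )
        self.fc = nn.Linear(32 * 4 * 4, embedding_dim)

    def forward(self, spectrogram):
        x = self.features(spectrogram)
        return self.fc(torch.flatten(x, start_dim=1))
```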
As noted, the audio features may be extracted after the consistency check. After the features are extracted, the audio features or vectors may be applied to a model or classifier to further reduce the vectors. The reduced audio vectors, as noted, may be added to the air-bone speaker embedding 338.
In another embodiment, prior to authentication of the user 318, the smart device and/or authentication circuitry may prompt the user to submit an audio signal and bone conduction signal. Such signals may be submitted for each of one or more wearable devices of the user. The results may be stored in a database 326 for future use. Further, the model or classifier used in relation to the bone conduction signal and the audio signal may be trained using the submission.
At block 402, the smart device may determine whether an audio signal has been received. The smart device may wait until such a signal is received. In an embodiment, the smart device may prompt the user to submit the response (for example, the audio signal and the bone conduction signal).
At block 404, the wearable device and/or smart device may determine whether the bone conduction signal has been received. In an embodiment, the smart device and/or authentication circuitry may wait for a specified period of time for the bone conduction signal. For example, the smart device and/or authentication circuitry may wait about 30 seconds, about 1 minute, or up to about 5 minutes. Longer periods of time between reception of signals may indicate a potential attack or spoof via a third party. After the period of time has lapsed, the smart device and/or authentication circuitry may deny access for the user to utilize the smart device. In an embodiment, a portion of functionality of the smart device may be available for use by the user.
At block 406, the smart device and/or authentication circuitry may analyze the audio and bone conduction signals. Such analysis may include, at a high level, a consistency check and verification. The consistency check may include preprocessing the bone conduction signal and the audio signal. If the signals are consistent, then the smart device and/or authentication circuitry may verify the audio signal and the bone conduction signal. The verification may include applying the audio signal or audio features or vectors and the bone conduction signal or bone conduction features or vectors to a trained model or classifier corresponding to the signal (for example, a model or classifier specific to either the bone conduction signal or the audio signal). The output of such a model may be transmitted to a probabilistic linear discriminant analysis (PLDA) model or other statistical or probabilistic model to determine a score.
At block 408, the smart device and/or authentication circuitry may authenticate the user. The smart device and/or authentication circuitry may determine, based on scores generated during analysis of the bone conduction signal and audio signal, that the user is authentic or is verified.
At block 410, if the user is authenticated, the smart device and/or authentication circuitry may grant access, at block 412, to the smart device. If the user is not authenticated, the smart device and/or authentication circuitry, at block 414, may deny access to the smart device.
At block 502, the smart device and/or authentication circuitry may determine whether a user is enrolled or has been initialized. In other words, the smart device and/or authentication circuitry may determine whether the user has submitted an initial phrase or one or more phrases.
At block 504, if the user has not been enrolled or initialized, then the smart device and/or the authentication circuitry may prompt a user to submit one or more phrases for initialization or enrollment. The one or more phrases may be random phrases and/or specified phrases (for example, specified by the smart device). In such examples, the smart device may issue or transmit a prompt as a text based message or as an audio prompt. In an embodiment, the smart device and/or authentication circuitry may issue or transmit the prompt for each of the user's wearable devices.
At block 506, the smart device and/or authentication circuitry may issue or transmit a prompt as to whether an additional wearable device is to be enrolled or initialized. If an additional wearable device is to be initialized or enrolled, then the smart device and/or authentication circuitry may prompt the user to wear the wearable device and submit one or more phrases.
At block 508, the smart device and/or authentication circuitry may update a trained bone conduction model with the submitted one or more phrases. The trained bone conduction model may be utilized to verify a user's bone conduction signal. Utilizing the user's submitted one or more phrases may ensure that an accurate verification may occur. At block 510, the smart device and/or authentication circuitry may update a trained audio model with the submitted one or more phrases. The trained audio model may be utilized to verify a user's audio signal. Utilizing the user's submitted one or more phrases may ensure that an accurate verification may occur.
At block 512, the smart device and/or authentication circuitry may determine whether an audio signal has been received. The smart device and/or authentication circuitry may wait for such a signal prior to proceeding to the next operation. In another embodiment, the smart device and/or authentication circuitry may receive the bone conduction signal prior to the audio signal.
At block 514, the smart device and/or the authentication circuitry may determine whether the bone conduction signal has been received. In an embodiment, the smart device and/or authentication circuitry may wait for a specified or preselected period of time prior to denying access to the user. The smart device and/or the authentication circuitry may wait for about 30 seconds, about 1 minute, or up to about 5 minutes. Longer periods of time between reception of signals may indicate a potential attack or spoof via a third party. In an embodiment, the bone conduction signal is received by the wearable device at substantially the same time as the audio signal is received by the microphone or other sensor.
If the bone conduction signal is received, at block 516, the smart device and/or authentication circuitry may preprocess the audio signal. Preprocessing the audio signal may include passing the audio signal through a low-band pass filter, through a Wiener filter, and/or utilizing other noise reduction techniques, as will be understood by those skilled in the art.
At block 518, the smart device and/or the authentication circuitry may preprocess the bone conduction signal. Preprocessing the bone conduction signal may include passing the bone conduction signal through a band pass filter, through a Wiener filter, and/or utilizing other noise reduction techniques, as will be understood by those skilled in the art. The band pass filter may remove frequencies below about 20 Hz and frequencies above about 2 kHz.
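A brief sketch of the preprocessing in blocks 516 and 518, assuming a 16 kHz sample rate and standard scipy filters; the filter order is an illustrative choice, and an additional low-pass or band-limiting stage could be added to the audio path as described above.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt, wiener


def preprocess_bone_conduction(signal, fs=16000):
    """Band-pass the bone conduction signal to roughly 20 Hz-2 kHz (removing
    low-frequency motion noise and high-frequency content), then apply a
    Wiener filter for additional noise reduction."""
    sos = butter(4, [20.0, 2000.0], btype="bandpass", fs=fs, output="sos")
    band_limited = sosfiltfilt(sos, np.asarray(signal, dtype=float))
    return wiener(band_limited)


def preprocess_audio(signal):
    """Audio path: Wiener filtering for noise reduction."""
    return wiener(np.asarray(signal, dtype=float))
```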
At block 520, the smart device and/or the authentication circuitry may delay the earliest received signal to align the preprocessed bone conduction signal and the preprocessed audio signal. The time that either signal is received may not match the other. As such, to ensure an accurate consistency check, the smart device and/or the authentication circuitry may shift the earliest received signal in relation to time. Thus, at least from the perspective of alignment according to or in relation to time, the preprocessed bone conduction signal and the preprocessed audio signal may align.
At block 522, the smart device and/or the authentication circuitry may determine a bone conduction power distribution with respect to time. Frequencies with higher spectral power in each signal tend to produce higher correlations, considering the amplitude. Therefore, selecting the frequencies with the highest or largest power helps ensure accurate consistency checks. Such a determination may be represented by $y_b(t) \leftarrow \sum_{f} y_b(f, t)$, where $y_b(f, t)$ is the time-frequency representation of the bone conduction signal.
At block 524, the smart device and/or the authentication circuitry may select the time range of interest for the bone conduction power distribution. As indicated, the higher the power at a particular frequency, the more likely a bone conduction signal and audio signal are to correlate if the signals are from the same user. As such, the smart device and/or the authentication circuitry may select the time range with the highest power or spectral power. Such an operation may be represented by $t' \leftarrow \arg_t\,[\,y_b(t) \ge \theta\,]$, where $\theta$ is a threshold (for example, based on an average of the bone conduction power distribution).
At block 526, the smart device and/or the authentication circuitry may determine an audio power distribution and/or bone conduction power distribution with respect to frequency. Such operations may be represented by, for audio, $y_a(f) \leftarrow \sum_{t'} y_a(f, t')$ and, for bone conduction, $y_b(f) \leftarrow \sum_{t'} y_b(f, t')$.
At block 528, the smart device and/or the authentication circuitry may select the top frequencies of the bone conduction signal and audio signal power distributions. Such an operation may be represented by $m \leftarrow \arg \operatorname{Sort}(y_a(f))_{1:M}$ and $n \leftarrow \arg \operatorname{Sort}(y_b(f))_{1:N}$. At block 530, the smart device and/or authentication circuitry may determine a correlation matrix, as represented by $C \leftarrow \operatorname{Corr}_{M \times N}[\,y_a(f, t'),\, y_b(f, t')\,]$. Using the correlation matrix, at block 532, the smart device and/or the authentication circuitry may generate a consistency score based on the correlation matrix $C$ (for example, by aggregating the correlation values in $C$ into a single score).
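Blocks 522 through 532 might be implemented roughly as in the following numpy sketch, where Ya and Yb are the aligned magnitude time-frequency grids for the audio and bone conduction signals (for example, as produced by the STFT sketch above); the top-M/N defaults and the final aggregation of the correlation matrix into a single score are assumptions for illustration.

```python
import numpy as np


def consistency_score(Ya, Yb, M=10, N=10):
    """Sketch of blocks 522-532: Ya and Yb are (freq x time) magnitude grids
    for the audio and bone conduction signals, already aligned in time."""
    # marginal bone conduction power over time, and the time range of interest
    yb_t = Yb.sum(axis=0)                              # y_b(t) = sum_f y_b(f, t)
    t_idx = np.nonzero(yb_t >= yb_t.mean())[0]         # t' where y_b(t) >= theta (theta = average)

    # marginal power distributions over frequency, restricted to t'
    ya_f = Ya[:, t_idx].sum(axis=1)                    # y_a(f) = sum_{t'} y_a(f, t')
    yb_f = Yb[:, t_idx].sum(axis=1)                    # y_b(f) = sum_{t'} y_b(f, t')

    # top-M audio and top-N bone conduction frequency indices
    m = np.argsort(ya_f)[::-1][:M]
    n = np.argsort(yb_f)[::-1][:N]

    # M x N correlation matrix between the selected frequency rows
    C = np.corrcoef(Ya[m][:, t_idx], Yb[n][:, t_idx])[:len(m), len(m):]
    C = np.nan_to_num(C)                               # guard against constant rows

    return float(np.mean(np.abs(C)))                   # aggregate C into a single score
```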
At block 534, the smart device and/or the authentication circuitry may determine whether the consistency score indicates that the bone conduction signal and audio signal are consistent or from the same user. The smart device and/or the authentication circuitry may utilize a consistency threshold to make this determination (for example, the bone conduction signal and audio signal may be determined to be consistent when the consistency score is greater than or equal to the consistency threshold).
At block 536, the smart device and/or the authentication circuitry may generate an audio feature vector. Using a neural network, a max pooling layer, or a fully connected layer, at block 538, the smart device and/or the authentication circuitry may reduce the audio feature vectors. At block 540, the smart device and/or the authentication circuitry may verify the reduced audio feature vector with the enrolled user template and/or a classifier.
At block 542, the smart device and/or the authentication circuitry may generate a bone conduction feature vector. Using a neural network, a max pooling layer, or a fully connected layer, at block 544, the smart device and/or the authentication circuitry may reduce the bone conduction feature vector. At block 546, the smart device and/or the authentication circuitry may verify the reduced bone conduction feature vector with the enrolled user template and/or a classifier.
At block 548, the smart device and/or the authentication circuitry may utilize the results of block 546 and block 540 to determine whether the bone conduction signal and the audio signal have been verified. If the signals have been verified, at block 550, the smart device and/or the authentication circuitry may enable access for the user.
If the signals have not been verified, at block 552, the smart device and/or the authentication circuitry may prompt the user to resubmit an audio and bone conduction signal for resubmission (for example, submit an additional attempt). The smart device and/or the authentication circuitry may allow for a specified number of resubmissions or at least one resubmission. Once that number has been met, the smart device and/or the authentication circuitry may deny access to the user at block 554.
The flowchart blocks support combinations of means for performing the specified functions and combinations of operations for performing the specified functions. It will be understood that one or more blocks of the flowcharts, and combinations of blocks in the flowcharts, can be implemented by special purpose hardware-based computing devices which perform the specified functions, or combinations of special purpose hardware and software instructions.
In some embodiments, some of the operations above may be modified or further amplified. Furthermore, in some embodiments, additional optional operations may be included. Modifications, amplifications, or additions to the operations above may be performed in any order and in any combination.
This application is related to U.S. Provisional Application No. 63/268,999, filed Mar. 8, 2022, titled “SYSTEMS AND APPARATUS FOR MULTIFACTOR AUTHENTICATION USING BONE CONDUCTION AND AUDIO SIGNALS,” U.S. Provisional Application No. 63/269,001, filed Mar. 8, 2022, titled “METHOD FOR MULTIFACTOR AUTHENTICATION USING BONE CONDUCTION AND AUDIO SIGNALS,” and U.S. Provisional Application No. 63/380,229, filed Oct. 19, 2022, titled “SYSTEMS AND METHODS FOR CONTINUOUS, ACTIVE, AND NON-INTRUSIVE USER AUTHENTICATION,” the disclosures of which are incorporated herein by reference in their entirety.
In the drawings and specification, several embodiments of systems and methods to provide two-way authentication for a user via a smart device or device and a wearable device have been disclosed, and although specific terms are employed, the terms are used in a descriptive sense only and not for purposes of limitation. Embodiments of systems and methods have been described in considerable detail with specific reference to the illustrated embodiments. However, it will be apparent that various modifications and changes can be made within the spirit and scope of the embodiments of systems and methods as described in the foregoing specification, and such modifications and changes are to be considered equivalents and part of this disclosure.