This disclosure concerns verifying a user, such as, but not limited to, verifying a user who answered the questions that comprise an unproctored test. Aspects of the invention include computerised methods, software and hardware.
Unproctored (unsupervised) testing of people is suitable in some situations, for example assessing candidates for employment or students for grading purposes. This has occurred because there are significant cost savings in unproctored testing when compared to proctored (supervised and typically onsite) testing. However, the cost savings come with some disadvantages.
In particular, the unproctored testing process is more open to abuse than proctored testing. For example, cheating is made easier by having someone else perform the test on the person's behalf. This is unfair to the remaining candidates or students and at the same time lessens the accuracy and therefore value of the tests for the organisation relying on the testing results.
In a first aspect there is provided a computer implemented method to determine verifying data of a user, the method comprising:
It is an advantage of the method that transition patterns can be determined that can then be suitably used to later verify the same user. This makes collecting the verification data convenient.
Impersonation is a common cheating strategy. Existing modes of verification separate the testing and verification and therefore can be more easily gamed. It is an advantage that combining testing and verification data collection, so that an integrated input is generated by the user, makes it more difficult to cheat.
It is a further advantage that the transition patterns are free from bias based on the user's native tongue.
The sequence of questions also forms a test of the user's suitability or qualification for a role that the user is being tested for. The test may be unproctored. Security measures exist for unproctored testing but they impose an additional burden on the user. Many of these security measures cause users to abandon the process. It is an advantage that verification of the user can be implemented passively, that is, by generating the verification data from the actual suitability testing (which may be unproctored), making it less intimidating than existing security measures. This in turn leads to a reduced abandonment rate.
The method of determining the transition pattern comprises determining from each subset one or more of the following features:
The comparison of subsets may be based on a comparison of the features of the subsets, and the method may comprise determining a probability distribution function of one or more features of each subset.
The comparison of subsets may be based on a distance measure, such as Euclidean distance, of the respective probability distribution function.
Where the questions are in substantially increasing or decreasing order of difficulty, the distance measures are combined to form a time series vector. The questions in increasing or decreasing order may form all or part of a test.
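By way of illustration only (the function names, bin count and value range below are assumptions, not part of the disclosure), a probability distribution function of a feature over each subset of questions can be approximated as a normalised histogram, successive subsets compared with a Euclidean distance, and the distances collected into a time series vector:

```python
import math

def histogram_pdf(values, bins=10, lo=0.0, hi=1.0):
    """Approximate the probability distribution function of a feature
    over one subset as a normalised histogram."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        counts[min(int((v - lo) / width), bins - 1)] += 1
    return [c / len(values) for c in counts]

def euclidean_distance(p, q):
    """Euclidean (L2) distance between two discrete PDFs."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def transition_vector(subsets):
    """Compare the PDF of each subset with the next; the successive
    distances form the time series vector."""
    pdfs = [histogram_pdf(s) for s in subsets]
    return [euclidean_distance(pdfs[i], pdfs[i + 1])
            for i in range(len(pdfs) - 1)]
```

Identical subsets yield a zero distance, while a shift in the feature distribution between subsets yields a positive distance.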
The data representative of the transition pattern may include a model of the transition pattern, such as a probability model.
The method may further comprise receiving further data from the user while responding to the questions being one or more of:
The method may further comprise extracting features from the further data and determining probability and distance measures of these features, wherein the storing step also includes storing data representative of these probability and distance measures.
The method further comprises presenting the sequence of questions to the user.
The sequence of questions may be questions in a cognitive ability test.
In a second aspect the invention is software, that is, computer readable instructions recorded on computer readable media that, when executed by a computer, cause it to operate in accordance with the method described above.
In a third aspect, there is provided a computer system for generating verifying data of a user, the system comprising:
The system may further comprise a testing module to present the sequence of questions to the user.
In a fourth aspect there is provided a computer implemented method for verifying a user as having previously responded to a first sequence of questions, the method comprising:
Existing verification methods are typically based on statistical inference in a re-test scenario. That is, the user is asked to repeat the unproctored test in a proctored setting and the answers provided by the user are statistically compared to the original answers. It is an advantage of this method that the first and second cognitive ability tests can be different, enabling the verification phase to also provide more information on the cognitive ability of the user. Further, the second test can be significantly shorter than the first cognitive ability test. Therefore it is also an advantage that this method is less time-intensive than the prior art.
The step of verifying the user may comprise determining a likelihood score that the user previously responded to the first sequence of questions.
The method may further comprise determining the verifying data, being the representation of the transition pattern, as described above.
The method of determining a transition pattern in the voice data may be the same as described above.
The method may further comprise receiving further data from the user while responding to the second sequence of questions being one or more of:
The comparison of the transition pattern to the previously stored representation of a transition pattern may be based on an assessment of whether the transition patterns share a similar statistical distribution.
In a fifth aspect the invention is software, that is, computer readable instructions recorded on computer readable media that, when executed by a computer, cause it to operate in accordance with the method described directly above.
In a sixth aspect there is provided a computer system to verify a user as having previously performed a first cognitive ability test, the system comprising:
Of course, where suitable, optional features of the first aspect described above are also optional features of the remaining aspects described here.
Non-limiting examples will now be described with reference to the following drawings, in which:
An example of testing users, in this case candidates for employment, will now be described. In this example there are multiple candidates that will be assessed for suitability for the employment on offer. The assessment includes a sequence of tests which at least includes:
Typically, after each assessment only some of the candidates will proceed to the next assessment. The assessments are specifically provided in this order as the order reflects increasing costs. As a result the more costly tests are applied to fewer candidates.
Referring also to
A computer system diagram is provided in
The first stage 50 is typically initiated by a candidate using their computer 12. The candidate wishes to apply 70 for employment and the application process requires a CAT be taken by the candidate.
In this example, using the computer 12 the candidate completes the CAT which is administered 72 by the server 10 over the internet 14. The computer 12 includes software such as an internet browser, that enables the computer to display the CAT as received from the server 10 on a display device of the computer 12. The computer 12 includes a sound capture device, such as a microphone 16, and other input devices such as a mouse and/or alphanumeric keypad (e.g. keyboard or touch screen displaying a keypad) that allow the candidate to answer questions that form the CAT.
The server 10 includes a testing module 20, which can be considered as a combination of software and hardware, including a processor, that is able to provide the content of the CAT to the computer 12.
In this example the CAT is presented to the candidate on a display device of computer 12, such as a monitor.
The candidate performs the CAT by answering a sequence of questions that are displayed, where each question in this example is designed with a certain difficulty level, which varies and in turn requires a corresponding level of effort to answer. The answers typically include a combination of oral answers and mouse clicks. In some examples there is also keyed input. The candidate's answers, received via the microphone and mouse input, are sent by the computer 12, typically via a web browser session. Other input can include keyboard input. Each input is in time series order so that the inputs that form the answer to the same question can be aligned. In some examples, the input is also time indexed or question reference indexed.
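A minimal sketch of how such time-indexed multimodal input might be aligned per question follows; the event structure and field names are assumptions for illustration, not specified in the disclosure:

```python
from collections import defaultdict

# Each input event carries a modality, a timestamp and a question reference.
events = [
    {"modality": "voice",    "t": 0.2, "question": 1},
    {"modality": "mouse",    "t": 0.5, "question": 1},
    {"modality": "keyboard", "t": 1.1, "question": 2},
    {"modality": "voice",    "t": 1.4, "question": 2},
]

def align_by_question(events):
    """Group events by question reference, keeping each group in
    time series order so inputs answering the same question align."""
    grouped = defaultdict(list)
    for e in sorted(events, key=lambda e: e["t"]):
        grouped[e["question"]].append(e)
    return dict(grouped)
```

With the sample events above, each question maps to its inputs in the order they were produced.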
In some examples the answers received will be used as input to the testing module 20 to effect the selection of questions to provide to the computer 12.
The voice data and, where appropriate, mouse input data and keyboard input data received by the server 10 are provided to an identity verification module 23, which again can be considered a combination of hardware and software. In particular, the voice data is provided to the voice assessment module 22, the mouse input data to the mouse assessment module 25 and the keyboard input data to the keyboard assessment module 27, where each of the modules 22, 25 and 27 forms part of the identity verification module 23 and is used to generate an identity signature of the user.
Referring now to
In particular, regarding the generation of the voice transition pattern in feature extraction stage 300, the voice data 84 is automatically analysed to estimate 482 raw features that are indicative of voice transitions between different cognitive loads, being:
These extracted raw features are also in the time sequence order.
The sequential patterns of each of the time series features demonstrate how an individual's voice transitions when the amount of mental effort put into the test changes, and are referred to as voice transition patterns 200.
An example of extracting the voice transition pattern from voice signals is shown in
To determine a pattern in the transition features from low cognitive load to high cognitive load, pairs of PDFs across a certain time frame (e.g. seven times the sliding window size) are compared. A Euclidean L2 distance measure is calculated from each of the PDF pairs 503, reflecting the trend of change in the voice feature distribution as the cognitive load changes.
The PDFs can be compared using any suitable distance measure other than the Euclidean L2 distance measure. That means a measure is determined for each comparison of PDFs 501 of each feature. All the distance measures combined create time-series distance vectors that form the sequential feature vector representing the voice transition patterns 400 of the candidate under different cognitive loads.
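The windowed comparison described above can be sketched as follows; the window size, step, lag and bin settings are illustrative assumptions rather than values fixed by the disclosure:

```python
import math

def sliding_windows(series, size, step):
    """Split a time series of raw voice features into overlapping windows."""
    return [series[i:i + size] for i in range(0, len(series) - size + 1, step)]

def window_pdf(window, bins, lo, hi):
    """Normalised histogram approximating the PDF of one window."""
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in window:
        counts[min(int((v - lo) / width), bins - 1)] += 1
    return [c / len(window) for c in counts]

def transition_pattern(series, size=20, step=10, lag=7, bins=8, lo=0.0, hi=1.0):
    """Compare PDF pairs `lag` windows apart with a Euclidean L2 distance;
    the distances form the time-series vector representing the voice
    transition pattern of one feature."""
    pdfs = [window_pdf(w, bins, lo, hi) for w in sliding_windows(series, size, step)]
    return [math.sqrt(sum((a - b) ** 2 for a, b in zip(pdfs[i], pdfs[i + lag])))
            for i in range(len(pdfs) - lag)]
```

A flat series yields an all-zero pattern, while a series whose distribution shifts between the paired windows yields large distances at the shift.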
In the same stage, other voice biometrics 404 are also applied to the standard spectrum features 402, such as Linear Predictive Coding (LPC) coefficients and Mel-Frequency Cepstral Coefficients (MFCC), which are also derived from the input voice 84. The parameters of the standard voice biometrics features 404 are concatenated to form a time-series feature vector.
Where keyboard and mouse inputs are present in the test, the patterns of these extra inputs can be used to enhance the identity verification. In the case of keystrokes 490, typing dynamics 406 such as timing and hit-force are recorded, and the distances among the combinations of possible key sequences 403 are calculated from the recorded data. Similarly, mouse movement input 491 is also analysed for trajectory and speed features 410 and in turn the distances between certain moves are derived 412. The distance values measured are concatenated to form another time-series feature vector.
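One common form of typing dynamics is the digraph latency, the time between consecutive key presses. As an illustrative sketch (the representation of keystrokes and the choice of mean latency as the summary statistic are assumptions, not from the disclosure):

```python
def digraph_latencies(strokes):
    """Collect the timing distance for each consecutive key pair (digraph).
    Each keystroke is a (key, press_time) tuple in time order."""
    feats = {}
    for (k1, t1), (k2, t2) in zip(strokes, strokes[1:]):
        feats.setdefault((k1, k2), []).append(t2 - t1)
    return feats

def latency_vector(strokes, pairs):
    """Mean latency per digraph of interest, concatenated into a
    feature vector; absent digraphs contribute zero."""
    feats = digraph_latencies(strokes)
    return [sum(feats[p]) / len(feats[p]) if p in feats else 0.0
            for p in pairs]
```

Fixing the list of digraphs of interest gives every candidate a feature vector of the same length, which is what later concatenation into the super vector requires.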
Then, in feature combination stage 302, the feature vectors derived from the distance measures 403 and 412 (from the keystroke features 406 and the collection of mouse movement features 410 respectively), the feature vector 400 derived from the voice transition pattern, and the feature vector 404 of the standard voice biometrics features 402 are concatenated to create a high-dimensional super vector 494 as a time series.
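The concatenation step itself is straightforward; assuming each modality produces one feature vector per time step, the per-step vectors are joined end to end (a sketch, with illustrative parameter names):

```python
def super_vector(voice_transition, voice_biometric, keystroke, mouse):
    """Concatenate the per-time-step feature vectors from each modality
    into one high-dimensional super vector per time step."""
    return [vt + vb + ks + ms
            for vt, vb, ks, ms in zip(voice_transition,
                                      voice_biometric,
                                      keystroke,
                                      mouse)]
```

The result is a time series of super vectors suitable for input to the statistical modelling stage.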
As shown in
The parameters of the resulting GMMs and HMM 86, including but not limited to:
That signature 86 is then stored in a datastore 24, that is, computer storage of the server 10.
The testing module 20 also determines 74 a result of the CAT. That is, it determines in real time whether the candidate's performance on the CAT is sufficient to move onto stage two 52. If so, the successful result is communicated 76 to the candidate, such as by display on the monitor once the CAT is complete, or by email.
The second stage 52 uses the stored voice identity signature 86 to verify that the person appearing for the interview is the same person that conducted the CAT 72 at the earlier time. The second stage 52 makes use of the discovery that the change of voice data received from a candidate during a specific cognitive task from a certain difficulty level to another is suitably unique and consistent. In particular, the change in the voice data of the candidate under different cognitive loads is consistent for a particular individual.
As a result the candidate can be asked to perform a shorter version of the CAT 78 as a verification test in a proctored setting to generate a new voice transition pattern, along with other identity features, that can be compared to the voice identity signature 86.
Again the candidate would use a computer 40 to perform the test 78 at the interview site 62. The features of the computer 40 are the same as those of the computer 12; that is, the computer is able to communicate with the server 10 via the internet 14 to deliver the test 78, displays the test on a monitor and receives input from the user as answers to the test 78, the input typically being voice data recorded from microphone 41 and mouse movement data received by a mouse.
Again the test 78 is delivered by the testing module 20 of the server, and in response the voice input 84, together with keyboard and mouse data, is delivered to the identity verification module 23 which works in verification mode.
The verification result indicates whether or not the person taking the short verification test is sufficiently likely to be the same person who took the full-length test. If so, the candidate is taken to have passed 79 the verification test and proceeds to the interview 75. Alternatively, they fail 77 the test, which can lead to being disqualified 73 from progressing further in the selection process. The result of the verification test is also stored in the datastore 24.
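The pass/fail decision can be sketched as a likelihood score against the stored signature. The disclosure models the signature with GMMs and an HMM; the sketch below simplifies this to a single diagonal Gaussian per signature and an assumed acceptance threshold, purely for illustration:

```python
import math

def gaussian_log_likelihood(x, means, variances):
    """Log-likelihood of one feature vector under a diagonal Gaussian
    signature (a simplification of the GMM/HMM signature)."""
    ll = 0.0
    for xi, mu, var in zip(x, means, variances):
        ll += -0.5 * (math.log(2 * math.pi * var) + (xi - mu) ** 2 / var)
    return ll

def verify(frames, means, variances, threshold):
    """Average log-likelihood of the new test's frames against the stored
    signature; the candidate passes if the score clears the threshold."""
    score = sum(gaussian_log_likelihood(f, means, variances)
                for f in frames) / len(frames)
    return score >= threshold, score
```

Frames close to the stored means score high and pass; frames far from them score low and fail, which corresponds to the pass 79 / fail 77 branches above.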
An example of the voice data taken from two candidates under different conditions is shown in
The waveform 702′ of the voice data of the candidate 700 is also shown in
A similar example of a different candidate 800 is also shown in
What can be seen from this representation is that how these features associated with the same word change from low cognitive load to high is significantly different across the two speakers. However, the exact change pattern of each speaker needs to be captured using the statistical model described above before it can be used for identification.
It should be understood that various arrangements of the computers 10 and 12 can be provided that will be able to perform stage one 50 or stage two 52. Software on the computer 12 could be application software, in which case the modules 20 and 23 and the function of the datastore 24 of the server 10 could be incorporated into the computer 12.
The user may be tested for suitability for a role. For example, the second stage 52 could be performed at enrolment for, say, an academic course rather than at the interview.
Other inputs that can be used include eye movements, biological sensors, pen gestures and pressures, such as input via a stylus or finger on a touch sensitive screen.
Computer systems have the necessary input and output ports to communicate over the internet.
In another example, the questions may be presented to the user over a phone.
The cognitive load associated with the questions of the CAT may not be in increasing sequential order. For example, the questions may be presented in sets, where each set is presented sequentially and the questions in each set are in increasing cognitive load. However, the last question of a set may have a higher cognitive load than the first question of the next set. In this case, when comparing PDFs of windows to determine the voice transition pattern 200, the associated cognitive load of the windows will need to be determined to ensure that they have different cognitive loads and that the windows are consistently compared as low to high, or high to low, to determine a pattern. For example, the time series voice data may have associated tags that indicate the question difficulty being responded to at various time points in the voice data. The tags are also stored with the voice data and are referenced when making the comparisons.
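The tag-aware selection described above can be sketched as follows; the representation of tags as numeric difficulty levels attached to each window is an assumption for illustration:

```python
def low_to_high_pairs(window_tags, window_pdfs):
    """Select only window pairs whose difficulty tags go from low to high
    cognitive load, so every comparison is consistently oriented."""
    pairs = []
    for i in range(len(window_tags)):
        for j in range(i + 1, len(window_tags)):
            if window_tags[i] < window_tags[j]:
                pairs.append((window_pdfs[i], window_pdfs[j]))
    return pairs
```

Windows with equal or decreasing difficulty are skipped, so the resulting distances all describe the same low-to-high direction of transition.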
It will be appreciated by persons skilled in the art that numerous variations and/or modifications may be made to the invention as shown in the specific embodiments without departing from the scope of the invention as broadly described. The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive.
Number | Date | Country | Kind |
---|---|---|---|
2011905420 | Dec 2011 | AU | national |
Number | Name | Date | Kind |
---|---|---|---|
5815197 | Kakii | Sep 1998 | A |
5915001 | Uppaluru | Jun 1999 | A |
6119084 | Roberts | Sep 2000 | A |
6411933 | Maes | Jun 2002 | B1 |
6632174 | Breznitz | Oct 2003 | B1 |
6879968 | Hayakawa | Apr 2005 | B1 |
RE38884 | Kakii | Nov 2005 | E |
6973575 | Arnold | Dec 2005 | B2 |
7054811 | Barzilay | May 2006 | B2 |
7321855 | Humble | Jan 2008 | B2 |
7400879 | Lehaff | Jul 2008 | B2 |
7490043 | Tavares | Feb 2009 | B2 |
7536304 | Di Mambro et al. | May 2009 | B2 |
7583197 | Wesby Van Swaay | Sep 2009 | B2 |
7636855 | Applebaum et al. | Dec 2009 | B2 |
7716055 | McIntosh et al. | May 2010 | B1 |
7933771 | Chang | Apr 2011 | B2 |
8185646 | Headley | May 2012 | B2 |
8239677 | Colson | Aug 2012 | B2 |
8332223 | Farrell et al. | Dec 2012 | B2 |
8407762 | Bidare | Mar 2013 | B2 |
8412530 | Pereg et al. | Apr 2013 | B2 |
8590018 | Thavasi | Nov 2013 | B2 |
8818810 | Weng et al. | Aug 2014 | B2 |
9002706 | Lopez | Apr 2015 | B2 |
9378366 | Tegreene | Jun 2016 | B2 |
20040162726 | Chang | Aug 2001 | A1 |
20020190124 | Piotrowski | Dec 2002 | A1 |
20030046083 | Devinney et al. | Mar 2003 | A1 |
20050089172 | Fujimoto | Apr 2005 | A1 |
20050969906 | Barzilay | May 2005 | |
20050171774 | Applebaum | Aug 2005 | A1 |
20050182618 | Azara | Aug 2005 | A1 |
20050288820 | Wu | Dec 2005 | A1 |
20050288930 | Shaw | Dec 2005 | A1 |
20060021003 | Fisher et al. | Jan 2006 | A1 |
20060102717 | Wood | May 2006 | A1 |
20060104348 | Chen et al. | May 2006 | A1 |
20070067159 | Basu | Mar 2007 | A1 |
20070094021 | Bossemeyer, Jr. et al. | Apr 2007 | A1 |
20070147592 | Ikumi et al. | Jun 2007 | A1 |
20070177017 | Kyle | Aug 2007 | A1 |
20070255564 | Yee et al. | Nov 2007 | A1 |
20080222722 | Navratil | Sep 2008 | A1 |
20080306811 | Goldman | Dec 2008 | A1 |
20100091953 | Kim et al. | Apr 2010 | A1 |
20100161327 | Chandra | Jun 2010 | A1 |
20100217097 | Chen et al. | Aug 2010 | A1 |
20100228656 | Wasserblat et al. | Sep 2010 | A1 |
20110136085 | Leroy | Jun 2011 | A1 |
20110162067 | Shuart et al. | Jun 2011 | A1 |
20110207099 | Chen et al. | Aug 2011 | A1 |
20110223576 | Foster et al. | Sep 2011 | A1 |
20120130714 | Zeljkovic | May 2012 | A1 |
20120323796 | Udani | Dec 2012 | A1 |
20130006626 | Aiyer | Jan 2013 | A1 |
20130097682 | Zeljkovic | Apr 2013 | A1 |
20140347265 | Aimone | Nov 2014 | A1 |
20150096002 | Shuart | Apr 2015 | A1 |
Number | Date | Country |
---|---|---|
0058947 | Oct 2000 | WO |
2010066310 | Jun 2010 | WO |
2011035271 | Mar 2011 | WO |
2011088415 | Jul 2011 | WO |
Entry |
---|
“Investigation and Evaluation of Voice Stress Analysis Technology,” Darren Haddad, Sharon Walter, Roy Ratley and Megan Smith, Air Force Research Laboratory, Information Directorate, Rome Research Site, Rome, New York, In-House Technical Memorandum, Nov. 2001. |
John Sweller et al., “Cognitive Architecture and Instructional Design”, Educational Psychology Review, vol. 10, No. 3, 1998, pp. 251-296. |
R. Huang et al, “Dialect Classification on Printed Text Using Perplexity Measure and Conditional Random Fields”, IEEE, 2007, pp. IV-993-IV-996. |
F. Bimbot et al., “An Alternative Scheme for Perplexity Estimation”, IEEE, 1997, pp. 1483-1486. |
P.E. Kenne et al., “Topic change and local perplexity in spoken legal dialogue”, Nov. 1996, IEEE Xplore Conference: Spoken Language, pp. 721-723. |
Number | Date | Country | |
---|---|---|---|
20130185071 A1 | Jul 2013 | US |