The invention relates to a system and method of automatically distinguishing between computers and humans based on responses to enhanced Completely Automated Public Turing test to tell Computers and Humans Apart (“e-captcha”) challenges that do not merely challenge the user to recognize skewed or stylized text.
Automatically (e.g., by a computer) distinguishing humans from computers in a networked computing environment, such as the Internet, is a difficult task. Many automated computational tasks are performed by computer processes (e.g., “bots”) that automatically crawl websites to obtain content. Such bots are growing in sophistication and are able to provide inputs to websites and other online resources. Bots may increase traffic in online networks, reducing available bandwidth for actual human users. Furthermore, bots may increasingly be used for malicious activities, such as denial of service attacks.
Another way in which bots have been used to exploit computer networks is through crowd-sourced tasks. In the context of natural language understanding (“NLU”), crowds can generate creative input for open ended questions, which then can be used as bootstrapping data for NLU models. It is difficult, however, to prevent the use of bots that act as human responders in order to “game” the system.
Conventional Completely Automated Public Turing test to tell Computers and Humans Apart (“CAPTCHA”)-style challenges have been developed in an attempt to distinguish humans from computers. Conventional CAPTCHA challenges provide an image of characters, typically skewed to make it difficult for bots to perform image recognition, but usually easy for the human brain to decipher. However, such challenges can be prone to image recognition techniques and also cannot be used to further screen respondents based on whether they know the answer to a challenge (i.e., to select only those respondents with a basic knowledge of particular subject matter). These and other drawbacks exist with conventional CAPTCHA challenges.
The invention addressing these and other drawbacks relates to a system and method of automatically distinguishing between computers and humans based on responses to e-captcha challenges that do not merely challenge the user to recognize skewed or stylized text. For example, an e-captcha challenge may require a skill-based action to be completed. A skill-based action does not include merely recognizing text (whether or not skewed) from an image. Rather, the skill-based action may require a user to have some knowledge in a domain (e.g., answer a trivia question) or solve a problem (e.g., solve a math problem).
A database of e-captcha challenges may be maintained. A given e-captcha challenge may be specific to a particular knowledge domain. As such, e-captchas may be used not only to distinguish between computers and humans, but also to determine whether a human should be validated based on whether the human can demonstrate knowledge in the particular knowledge domain. For instance, participants in crowd-sourced tasks, in which unmanaged crowds are asked to perform tasks, may be screened using an e-captcha challenge. This not only validates that a participant is a human (and not a bot, for example, attempting to game the crowd-sourced task), but also screens the participant based on whether they can successfully respond to the e-captcha challenge.
In another crowd-sourced example, a crowd-sourced task administrator may wish for only those participants who can utter a particular utterance to be able to participate in an unmanaged crowd task involving the collection of utterances from a crowd. This may be advantageous when seeking participants who can utter certain sounds. Alternatively, a crowd-sourced task administrator may wish for only those participants who cannot utter a particular utterance to be able to participate (in a sense, only those who fail the validation may participate). This may be advantageous when seeking participants who cannot utter certain sounds, so that natural language processors can be trained for those individuals (e.g., to account for accents, twangs, etc.).
Such screening may allow a crowd-sourced task administrator, who wishes to have only participants having certain knowledge relating to a given knowledge domain perform the task, to screen such participants. An e-captcha may be used to ensure that the participants have such knowledge (or else they will not be validated and will not be permitted to perform the task).
E-captchas may be used in other contexts as well. For example, a classic rock music site may wish to prevent bots from entering the music site as well as ensure only those who have a basic knowledge of classic rock be permitted to enter the site. In such a case, an e-captcha having an e-captcha challenge relating to classic rock (e.g., classic rock trivia, recognize all or a portion of a classic rock song, etc.) may be presented to users, who must satisfy the challenge in order to gain entry into the site. A gaming site, for enhanced enjoyment/challenge, may present an e-captcha having an embedded game having an objective that must be completed in order to gain entry into the gaming site. Other contexts may be used as well, as would be apparent based on the disclosure herein.
In some instances, an e-captcha may be required not only at the beginning of a session, but throughout the session. For instance, a user may be required to pass an e-captcha challenge to access a resource (e.g., a website), but also may be required to periodically pass additional e-captchas to continue with the session (or get kicked off).
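By way of a non-limiting sketch, periodic re-validation during a session may be tracked with a simple timer. The class name `ValidatedSession`, the `revalidate_after_seconds` interval, and the injectable `clock` are hypothetical and are not part of the disclosure; they merely illustrate one way a session could decide when another e-captcha is due.

```python
import time
from typing import Callable, Optional


class ValidatedSession:
    """Tracks when a session last passed an e-captcha challenge.

    The class name and the revalidation interval are illustrative
    assumptions, not elements of the described system.
    """

    def __init__(self, revalidate_after_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._interval = revalidate_after_seconds
        self._clock = clock
        # None until the first challenge has been passed.
        self._last_passed: Optional[float] = None

    def record_pass(self) -> None:
        """Call when the user passes an e-captcha challenge."""
        self._last_passed = self._clock()

    def needs_challenge(self) -> bool:
        """True at session start and whenever the interval has elapsed."""
        if self._last_passed is None:
            return True
        return self._clock() - self._last_passed >= self._interval
```

A resource may call `needs_challenge()` before serving each request and, when it returns true, present a further e-captcha or end the session.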
These and other objects, features, and characteristics of the system and/or method disclosed herein, as well as the methods of operation and functions of the related elements of structure and the combination of parts and economies of manufacture, will become more apparent upon consideration of the following description and the appended claims with reference to the accompanying drawings, all of which form a part of this specification, wherein like reference numerals designate corresponding parts in the various figures. It is to be expressly understood, however, that the drawings are for the purpose of illustration and description only and are not intended as a definition of the limits of the invention. As used in the specification and in the claims, the singular form of “a”, “an”, and “the” include plural referents unless the context clearly dictates otherwise.
The invention addressing these and other drawbacks relates to a system and method of automatically distinguishing between computers and humans based on responses to e-captcha challenges that do not merely challenge the user to recognize skewed or stylized text. As used herein, the term “e-captcha” may refer to information related to a challenge that must be satisfied (e.g., a question to be answered, a game objective to be completed, a problem to be solved, etc.), rather than merely an image of text to be reproduced, in order to pass validation. An e-captcha may include the challenge itself, client executable scripts or code that is configured to, when executed at the client, generate an interface for presenting and receiving a response to the e-captcha challenge, and/or other information related to an e-captcha challenge.
Exemplary System Architecture
E-captcha subscriber 105 may include a system (e.g., operated by an entity) that wishes to automatically distinguish between computers and humans. For instance, e-captcha subscriber 105 may include a crowd-sourcing service that wishes to prevent bots from accomplishing tasks intended for unmanaged human crowds, a website operator that wishes to prevent bots from automatically accessing its website, and/or other systems that wish to automatically distinguish between computers and humans. E-captcha subscriber 105 may request e-captchas from computer system 110. In some implementations, e-captcha subscriber 105 may request certain types of e-captchas. For instance, e-captcha subscriber 105 may request an e-captcha relating to a particular (knowledge/subject matter) domain, e-captchas having challenges in a particular format, e-captchas having response inputs in a particular format, and/or other types of e-captchas. Responsive to such requests, computer system 110 may provide and validate e-captchas, which may be obtained from e-captcha database 112.
E-captcha database 112 may store information related to e-captchas. For instance, e-captcha database 112 may store e-captcha challenges, correct responses/solutions to the e-captcha challenges, e-captcha specifications, and/or other information described herein. An example of the types of e-captchas are illustrated in
Having described a high level overview of the system, attention will now be turned to a more detailed description of computer system 110.
Computer system 110 may include one or more processors 212 (also interchangeably referred to herein as processors 212, processor(s) 212, or processor 212 for convenience), one or more storage devices 214 (which may store various instructions described herein), and/or other components. Processors 212 may be programmed by one or more computer program instructions. For example, processors 212 may be programmed by an e-captcha specification module 220, an e-captcha generator and validator 222, an Automated Speech Recognition (“ASR”) engine 224, and/or other instructions 230 that program computer system 110 to perform various operations.
As used herein, for convenience, the various instructions will be described as performing an operation, when, in fact, the various instructions program the processors 212 (and therefore computer system 110) to perform the operation.
Defining and Storing E-Captchas
In an implementation, e-captcha specification module 220 may obtain e-captcha challenges and their correct responses to be stored (e.g., in e-captcha database 112) and later provided to an e-captcha subscriber 105. A correct response may include a correct answer to a question, a completion of an objective (e.g., a completion of a level, a score achieved in a game, etc.), a solution to a problem (e.g., a solution to a math problem, a completion of a puzzle in a particular arrangement of puzzle pieces, etc.), and/or other response deemed suitable.
In some instances, a given response may be completely correct, partially correct, or incorrect. The response may be assigned a score based on its level of correctness according to one or more scoring parameters that specify how the response should be scored. Such scoring parameters will be different depending on the type of e-captcha challenge to which they pertain. A given response may be deemed satisfactory if the score exceeds a predetermined threshold score. E-captcha specification module 220 may store the scoring parameters and the predetermined threshold score in e-captcha database 112.
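As a minimal sketch of partial-credit scoring, assume a challenge whose correct response has several parts (e.g., several answers to a multi-part problem) and a score equal to the fraction of parts answered correctly. The names `ScoringParameters`, `score_response`, and `is_satisfactory`, and the fraction-based rule itself, are assumptions made for illustration; other scoring parameters may be used.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class ScoringParameters:
    """Hypothetical scoring parameters stored with a challenge."""
    max_score: float   # score awarded to a completely correct response
    threshold: float   # predetermined threshold score


def score_response(response_parts: Sequence[str],
                   correct_parts: Sequence[str],
                   params: ScoringParameters) -> float:
    """Score a multi-part response by the fraction of matching parts."""
    if not correct_parts:
        raise ValueError("a challenge needs at least one correct part")
    # Compare position by position, ignoring case and surrounding spaces.
    # Missing trailing parts simply earn no credit.
    matched = sum(
        1
        for given, expected in zip(response_parts, correct_parts)
        if given.strip().lower() == expected.strip().lower()
    )
    return params.max_score * matched / len(correct_parts)


def is_satisfactory(score: float, params: ScoringParameters) -> bool:
    """A response is satisfactory only if its score exceeds the threshold."""
    return score > params.threshold
```

Under these assumptions, a response with two of three parts correct scores two thirds of `max_score`, which passes a threshold of 0.6 but would fail a threshold of 0.75.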
In an implementation, e-captcha specification module 220 may obtain the e-captcha challenges and their correct responses from e-captcha subscriber 105. For instance, e-captcha subscriber 105 may provide its own set of challenges to be used to validate users. E-captcha specification module 220 may store the received set of challenges in e-captcha database 112 on behalf of e-captcha subscriber 105 and then later provide an appropriate e-captcha challenge to the subscriber for validating a user.
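For illustration only, storing subscriber-provided challenges and later retrieving them may be sketched with Python's built-in `sqlite3` module standing in for e-captcha database 112. The table layout, column names, and the functions `store_challenges` and `find_challenges` are hypothetical; the disclosure does not prescribe a schema or a particular database product for this step.

```python
import sqlite3
from typing import Dict, Iterable, List, Tuple

# Hypothetical table layout for subscriber-provided challenges.
SCHEMA = """
CREATE TABLE IF NOT EXISTS challenges (
    id               INTEGER PRIMARY KEY,
    subscriber_id    TEXT NOT NULL,
    domain           TEXT NOT NULL,
    prompt           TEXT NOT NULL,
    correct_response TEXT NOT NULL
)
"""


def store_challenges(conn: sqlite3.Connection, subscriber_id: str,
                     challenges: Iterable[Dict[str, str]]) -> None:
    """Store a subscriber's challenges and their correct responses."""
    conn.execute(SCHEMA)
    with conn:  # commits on success, rolls back on error
        conn.executemany(
            "INSERT INTO challenges "
            "(subscriber_id, domain, prompt, correct_response) "
            "VALUES (?, ?, ?, ?)",
            [(subscriber_id, c["domain"], c["prompt"], c["correct_response"])
             for c in challenges],
        )


def find_challenges(conn: sqlite3.Connection, subscriber_id: str,
                    domain: str) -> List[Tuple[str, str]]:
    """Return (prompt, correct_response) pairs for a subscriber and domain."""
    return conn.execute(
        "SELECT prompt, correct_response FROM challenges "
        "WHERE subscriber_id = ? AND domain = ? ORDER BY id",
        (subscriber_id, domain),
    ).fetchall()
```

Keying each row by a subscriber identifier is one way to keep a subscriber's own challenge set separate from challenges generated by an administrator.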
In an implementation, e-captcha specification module 220 may obtain the e-captcha challenges and their correct responses from an administrator of computer system 110, who may generate the challenges.
In an implementation, e-captcha specification module 220 may allow an e-captcha subscriber 105 to specify, in advance (before requesting an e-captcha for validating a user), which set of e-captchas should be provided when requested. For instance, e-captcha specification module 220 may present a selectable listing of knowledge domains, types, levels of difficulty, and/or other e-captcha parameters to specify e-captchas. In this manner, a given e-captcha subscriber 105 may specify certain types of e-captcha challenges to validate users.
Types of E-Captchas
In an implementation, an e-captcha challenge may be configured in various formats, such as text, image or video, audio, a game, problem solving, and/or other format. It should be noted that text formats may be converted into and presented in an image format. For instance, a question in text format: “What state is known as the Evergreen State?” may be converted into an image format prior to presentation to a user. Although not illustrated, other characteristics of the e-captcha challenge may be stored as well, such as a response mechanism (e.g., open text, multiple choice, utterance, gesture, etc.) used to respond to the challenge.
In the illustrated implementation, a given e-captcha challenge may be selected for validating a user based on a desired e-captcha domain 310, a level of difficulty, a format of the question, a response mechanism, and/or other characteristic known about the e-captcha challenge. In this manner, different types of e-captcha challenges may be selected and presented to validate a user.
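The selection described above may be sketched as a filter over stored challenges followed by a random pick among the matches. The `Challenge` record, its field names, and the `select_challenge` function are assumptions made for this example; the characteristics mirror those named in the text (domain, level of difficulty, format, response mechanism).

```python
import random
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Challenge:
    """Hypothetical record for one stored e-captcha challenge."""
    challenge_id: str
    domain: str
    difficulty: int
    challenge_format: str
    response_mechanism: str
    prompt: str
    correct_response: str


def select_challenge(challenges: Sequence[Challenge], *,
                     domain: Optional[str] = None,
                     difficulty: Optional[int] = None,
                     challenge_format: Optional[str] = None,
                     response_mechanism: Optional[str] = None,
                     rng: Optional[random.Random] = None
                     ) -> Optional[Challenge]:
    """Pick a random challenge matching every characteristic that is given.

    A characteristic left as None places no constraint on the selection.
    Returns None when no stored challenge matches.
    """
    wanted = {
        "domain": domain,
        "difficulty": difficulty,
        "challenge_format": challenge_format,
        "response_mechanism": response_mechanism,
    }
    candidates = [
        c for c in challenges
        if all(value is None or getattr(c, name) == value
               for name, value in wanted.items())
    ]
    if not candidates:
        return None
    return (rng or random).choice(candidates)
```

Calling `select_challenge(pool)` with no characteristics selects at random from the whole pool, while `select_challenge(pool, domain="math", difficulty=2)` narrows the pool first.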
Providing and Validating E-Captchas
Referring back to
In an implementation, in an operation 402, process 400 may include receiving a request for an e-captcha. The request may be received from an e-captcha subscriber 105, who wishes to validate one of its users. Alternatively, the request may be received from an agent operating on end user device 120. The agent may include code that programs end user device 120 to request an e-captcha and communicates with e-captcha subscriber 105 or computer system 110 to provide a response.
In an implementation, in an operation 404, process 400 may include identifying a type of e-captcha to provide based on the request. For example, the request may include an e-captcha specification that includes one or more e-captcha parameters that specify the type of e-captcha requested. For example, the e-captcha parameters may specify an e-captcha domain 310, a level of difficulty, a format of the challenge (e.g., text, image or video, audio, game, problem solving, etc.), a response mechanism, and/or other characteristic of an e-captcha challenge.
In some instances, the request may include identifying information that identifies an e-captcha subscriber 105 associated with the request. In these instances, process 400 may include obtaining a pre-stored profile that includes e-captcha parameters predefined for the e-captcha subscriber. In instances where the request does not specify any e-captcha parameter or e-captcha subscriber identifying information, process 400 may include randomly identifying an e-captcha to provide.
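One possible way to resolve the e-captcha parameters for a request is sketched below. The request keys `specification` and `subscriber_id`, the `profiles` mapping, and the function name are hypothetical. The order of precedence (an explicit specification first, then a pre-stored profile) is also an assumption, since the text does not state which source prevails when both are present.

```python
from typing import Any, Dict, Mapping


def resolve_parameters(request: Mapping[str, Any],
                       profiles: Mapping[str, Mapping[str, Any]]
                       ) -> Dict[str, Any]:
    """Return the e-captcha parameters that apply to a request.

    An empty result means no constraint, in which case an e-captcha
    may be identified at random.
    """
    # 1. Parameters carried in the request itself (assumed to take priority).
    specification = request.get("specification")
    if specification:
        return dict(specification)

    # 2. Parameters predefined for the identified subscriber.
    subscriber_id = request.get("subscriber_id")
    if subscriber_id is not None and subscriber_id in profiles:
        return dict(profiles[subscriber_id])

    # 3. Neither is available: no constraint, so selection is random.
    return {}
```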
In an implementation, in an operation 406, process 400 may include causing the e-captcha to be provided. For instance, an e-captcha may include the identified e-captcha challenge, code necessary to present and receive a response to the e-captcha challenge, and/or other information. In instances where code is provided in an e-captcha, the e-captcha may be configured as a standalone agent able to be executed at the client. For instance, the e-captcha may be configured as JavaScript, Microsoft® Silverlight®, Adobe Flash™ technology scripts, HTML5, and/or other client executable instructions that include an e-captcha challenge and instructions configured to cause the e-captcha challenge to be displayed. In other instances, the e-captcha may include only the e-captcha challenge, in which case the recipient (e.g., e-captcha subscriber 105 or end user device 120) will configure the e-captcha challenge to be displayed and provide the response.
In an implementation, in an operation 408, process 400 may include receiving and scoring a response to the e-captcha (i.e., response to the e-captcha challenge). In implementations where the response mechanism is a text (free-form) input, the score may be based on a comparison of the text to text from the correct response (which may be retrieved from e-captcha database 112). An edit distance, a measure of dissimilarity between the response and the correct response, may be used to determine the score. In implementations where the response mechanism is a gesture and text input is expected, the gesture may be converted to text and the aforementioned edit distance analysis may be conducted. Such gesture-to-text conversion may be performed by the end user device 120 and/or process 400. In implementations where the response mechanism is a gesture and a shape is expected, conventional shape analysis may be performed to determine a level of matching between the input gesture shape and the correct shape. In implementations where the response mechanism is an utterance, process 400 may use ASR engine 224 to convert the utterance to a string, and the aforementioned edit distance analysis may be conducted.
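The edit distance analysis may be sketched as follows, using the Levenshtein distance (the minimum number of single-character insertions, deletions, and substitutions) as the measure of dissimilarity. The text does not name a specific edit distance, so the choice of Levenshtein distance, the normalization of the distance into a score between 0 and 1, and the function names are all assumptions made for illustration.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two strings (dynamic programming)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (or match)
            ))
        previous = current
    return previous[-1]


def similarity_score(response: str, correct: str) -> float:
    """Map the edit distance to a score in [0, 1]; 1.0 is an exact match.

    Case and runs of whitespace are normalized first (an assumption).
    """
    r = " ".join(response.lower().split())
    c = " ".join(correct.lower().split())
    longest = max(len(r), len(c))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(r, c) / longest
```

Because the score is derived from text alone, the same functions may be applied to a typed response, to text converted from a gesture, or to a string produced by a speech recognizer from an utterance.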
In an implementation, in an operation 410, process 400 may include determining whether the score is adequate. For instance, the score may be compared to a predetermined score threshold.
In an implementation, in an operation 412, responsive to a determination that the score is adequate, process 400 may include providing information indicating validation. In these implementations, the user providing the response is presumptively a human having sufficient knowledge of the subject matter of the e-captcha challenge and may be permitted to continue on to a next step, whether that includes being permitted to participate in a crowd-sourced task, continue to a website, etc.
In an implementation, in an operation 414, responsive to a determination that the score is inadequate, process 400 may include providing information indicating failure. In these implementations, the user providing the response is presumptively a computer/bot or a human having insufficient knowledge of the subject matter of the e-captcha challenge. In either case, the user may not be permitted to continue on to a next step, whether that includes being permitted to participate in a crowd-sourced task, continue to a website, etc.
In an implementation, in an operation 502, process 500 may include receiving a request for an e-captcha. The request may be received from an e-captcha subscriber 105 or from an agent operating on end user device 120, in a manner similar to operation 402. In operation 502, the request may include identifying information that identifies a user or an end user device 120.
In an implementation, in an operation 504, process 500 may include identifying a type of e-captcha to provide based on the request, in a manner similar to operation 404.
In an implementation, in an operation 506, process 500 may include causing the e-captcha to be provided, in a manner similar to operation 406.
In an implementation, in an operation 508, process 500 may include receiving and scoring a response to the e-captcha, in a manner similar to operation 408.
In an implementation, in an operation 510, process 500 may include storing the score in association with the identifying information. For instance, over time, scores associated with the identifying information may be used to build a profile of the user or the end user device 120 identified by the identifying information. In this manner, process 500 may understand the types of e-captchas that have been successfully or unsuccessfully completed by a user (or end user device 120). Alternatively or additionally, the profile of the user may be used to disqualify the user if the user does not maintain a certain number or percentage of validations per attempts.
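A minimal sketch of such a profile is shown below. The names `ValidationProfile` and `ProfileStore`, the in-memory dictionary, and the `min_pass_rate` and `min_attempts` settings are hypothetical; the sketch only illustrates storing scores against identifying information and disqualifying on the ratio of validations to attempts.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ValidationProfile:
    """Scores and validation count kept for one user or end user device."""
    scores: List[float] = field(default_factory=list)
    validations: int = 0

    @property
    def attempts(self) -> int:
        return len(self.scores)

    @property
    def pass_rate(self) -> float:
        return self.validations / self.attempts if self.attempts else 0.0


class ProfileStore:
    """In-memory stand-in for persistent profile storage (an assumption)."""

    def __init__(self, min_pass_rate: float, min_attempts: int = 1) -> None:
        self._profiles: Dict[str, ValidationProfile] = {}
        self._min_pass_rate = min_pass_rate
        self._min_attempts = min_attempts

    def record(self, identity: str, score: float, threshold: float) -> bool:
        """Store a score for the identity and report whether it passed."""
        profile = self._profiles.setdefault(identity, ValidationProfile())
        passed = score > threshold
        profile.scores.append(score)
        if passed:
            profile.validations += 1
        return passed

    def is_disqualified(self, identity: str) -> bool:
        """True once enough attempts exist and the pass rate is too low."""
        profile = self._profiles.get(identity)
        if profile is None or profile.attempts < self._min_attempts:
            return False
        return profile.pass_rate < self._min_pass_rate
```

Requiring a minimum number of attempts before disqualifying is a design choice made here so that a single early failure does not exclude a user.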
In an implementation, in an operation 512, process 500 may include determining whether the score is adequate. For instance, the score may be compared to a predetermined score threshold.
In an implementation, in an operation 514, responsive to a determination that the score is inadequate, process 500 may include providing information indicating failure.
In an implementation, in an operation 516, responsive to a determination that the score is adequate, process 500 may include providing information indicating validation.
Each response mechanism illustrated in
The one or more processors 212 illustrated in
Furthermore, it should be appreciated that although the various instructions are illustrated in
The description of the functionality provided by the different instructions described herein is for illustrative purposes, and is not intended to be limiting, as any of the instructions may provide more or less functionality than is described. For example, one or more of the instructions may be eliminated, and some or all of its functionality may be provided by other ones of the instructions. As another example, processor(s) 212 may be programmed by one or more additional instructions that may perform some or all of the functionality attributed herein to one of the instructions.
The various instructions described herein may be stored in a storage device 214, which may comprise random access memory (RAM), read only memory (ROM), and/or other memory. The storage device may store the computer program instructions (e.g., the aforementioned instructions) to be executed by processor 212 as well as data that may be manipulated by processor 212. The storage device may comprise floppy disks, hard disks, optical disks, tapes, or other storage media for storing computer-executable instructions and/or data.
The various databases described herein may be, include, or interface to, for example, an Oracle™ relational database sold commercially by Oracle Corporation. Other databases, such as Informix™, DB2 (Database 2) or other data storage, including file-based, or query formats, platforms, or resources such as OLAP (On Line Analytical Processing), SQL (Structured Query Language), a SAN (storage area network), Microsoft Access™ or others may also be used, incorporated, or accessed. The database may comprise one or more such databases that reside in one or more physical devices and in one or more physical locations. The database may store a plurality of types of data and/or files and associated data or file descriptions, administrative information, or any other data.
The various components illustrated in
The various processing operations and/or data flows depicted in
Other implementations, uses and advantages of the invention will be apparent to those skilled in the art from consideration of the specification and practice of the invention disclosed herein. The specification should be considered exemplary only, and the scope of the invention is accordingly intended to be limited only by the following claims.
This application is a continuation of U.S. patent application Ser. No. 14/846,923, filed Sep. 7, 2015, entitled “SYSTEM AND METHOD OF PROVIDING AND VALIDATING ENHANCED CAPTCHAS”, which is hereby incorporated by reference herein in its entirety.
Number | Date | Country | |
---|---|---|---|
20170068809 A1 | Mar 2017 | US |
Number | Date | Country | |
---|---|---|---|
Parent | 14846923 | Sep 2015 | US |
Child | 15275720 | US |