A data service may provide services for free on the internet. A malicious entity may take advantage of these services using software applications that pretend to be human users. The software applications may overtax the server for the data service, hijack the data service for nefarious use, or interrupt normal use of the data service. For example, the software applications may set up fake free e-mail accounts to send out spam, hoard sale products for nefarious purposes, or may strip mine a public database.
This Summary is provided to introduce a selection of concepts in a simplified form that is further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
Embodiments discussed below relate to controlling access to an online data service. A communication interface may establish a human interactive proof session with a user device accessing an online data service. The communication interface may iteratively send an audio proof challenge set having multiple audio proof challenges each asking a semantic query to the user device for presentation to a user. A processor may provide access to the online data service based in part on at least one proof response having a semantic reply indicating a human user.
In order to describe the manner in which the above-recited and other advantages and features can be obtained, a more particular description is set forth and will be rendered by reference to specific embodiments thereof which are illustrated in the appended drawings. Understanding that these drawings depict only typical embodiments and are not therefore to be considered to be limiting of its scope, implementations will be described and explained with additional specificity and detail through the use of the accompanying drawings.
Embodiments are discussed in detail below. While specific implementations are discussed, it should be understood that this is done for illustration purposes only. A person skilled in the relevant art will recognize that other components and configurations may be used without parting from the spirit and scope of the subject matter of this disclosure. The implementations may be a machine-implemented method, a tangible machine-readable storage medium having a set of instructions detailing a method stored thereon for at least one processor, or a human interactive proof portal.
As opposed to a standard human interactive proof, in which a user reads or listens to a word then types that word for transmission to a human interactive proof portal, a user may listen to an audio proof challenge with a semantic query and infer or interpret the semantic query to come up with the answer. A semantic query is a question designed to provoke a human cognitive process to determine the answer. The human interactive proof portal may engage the user in an interactive session with multiple audio challenges and responses in a varying pattern.
As the user interacts with each audio proof challenge, the user may interpret the question and content by providing a proof response. The human interactive proof portal may calculate the statistics of the correctness of the proof response with each interaction. The human interactive proof portal may leverage the statistics of the response time on each interaction. When the user response score is at or above the response threshold, the human interactive proof portal may determine the human interactive proof session successful and the user is allowed to proceed with the intended task. When the user response score is below a response threshold and the number of attempts reaches the challenge set size, the human interactive proof portal may determine the human interactive proof session unsuccessful.
The human interactive proof portal may provide a semantic query as an audio data file in an interactive pattern. The user may understand and interpret the query to come up with a response on each interaction. For each interactive session, the human interactive proof portal may provide a collection of semantic queries. Each semantic query may have a template with placeholders. The placeholders may be randomly filled from a vocabulary set. When the semantic query is constructed, the corresponding correct response may be generated at runtime. The semantic query may be targeted to receive a response from a limited response pool. With this pattern, the human interactive proof portal may create a corpus of random semantic queries on each user interaction session.
The human interactive proof portal may frame each semantic query to allow a genuine user to answer the semantic query quickly. In a typical interactive session, the user may experience a series of semantic questions, answering them one by one until the human interactive proof portal reaches a verdict, either by reaching the response threshold or by reaching the challenge set size.
The human interactive proof portal may track the minimum number of audio proof challenges in a human interactive proof session, the maximum number of audio proof challenges in a human interactive proof session, a lower response threshold below which the human interactive proof session fails, and an upper response threshold at or above which the human interactive proof session succeeds.
The human interactive proof portal may serve one semantic query as audio content in each interaction. When the user solves the challenge and enters the answer, the user may receive the next semantic query as an audio proof challenge. On receiving each answer, the human interactive proof portal may compute the user response score leveraging the statistics of time taken to enter the answer. Additionally, the human interactive proof portal may factor in the geo-location of the user, the reputation of the internet protocol address, user success rate on previous responses, and other user data to determine if the user is a benign human or a malicious actor. If the number of interactions is greater than or equal to the minimum number of interactions and the user response score is at or above the upper response threshold, the human interactive proof portal may judge the human interactive proof session successful and allow the user access to the online data service. If the number of interactions is greater than or equal to the minimum number of interactions and the user response score is below the lower response threshold, the human interactive proof portal may judge the human interactive proof session unsuccessful and deny the user access to the online data service. If the number of interactions is greater than or equal to the minimum number of interactions and the user response score is between the upper and the lower response thresholds, the human interactive proof portal may provide further audio proof challenges. If the number of interactions is equal to the maximum number of interactions and the user response score is below the upper response threshold, the human interactive proof portal may judge the human interactive proof session unsuccessful and deny the user access to the online data service.
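The four cases above may be easier to follow as a single decision function. The following is a minimal sketch only, not the portal's actual implementation. The names SessionLimits, Verdict, and judge_session are hypothetical, as is the numeric scale of the score. The description does not say what happens before the minimum number of interactions is reached or when the score exactly equals the lower response threshold; the sketch assumes that further challenges are sent in both situations.

```python
from dataclasses import dataclass
from enum import Enum


class Verdict(Enum):
    SUCCESS = "success"    # grant access to the online data service
    FAILURE = "failure"    # deny access to the online data service
    CONTINUE = "continue"  # send a further audio proof challenge


@dataclass(frozen=True)
class SessionLimits:
    """Hypothetical container for the four tracked session parameters."""
    min_challenges: int
    max_challenges: int
    lower_threshold: float
    upper_threshold: float


def judge_session(interactions: int, score: float, limits: SessionLimits) -> Verdict:
    # Assumption: no verdict is reached before the minimum number of interactions.
    if interactions < limits.min_challenges:
        return Verdict.CONTINUE
    # At or above the upper response threshold: session successful.
    if score >= limits.upper_threshold:
        return Verdict.SUCCESS
    # Below the lower response threshold: session unsuccessful.
    if score < limits.lower_threshold:
        return Verdict.FAILURE
    # Between the thresholds with the challenge set exhausted: unsuccessful.
    if interactions >= limits.max_challenges:
        return Verdict.FAILURE
    # Between the thresholds with challenges remaining: keep going.
    return Verdict.CONTINUE
```

With hypothetical limits such as SessionLimits(3, 6, 0.4, 0.8), a score of 0.6 after four interactions would lead to a further challenge, while the same score after the sixth interaction would end the session as unsuccessful.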
Thus, in one embodiment, a human interactive proof portal may control access to an online data service. A communication interface may establish a human interactive proof session with a user device accessing an online data service. The communication interface may iteratively send an audio proof challenge set having multiple audio proof challenges each asking a semantic query to the user device for presentation to a user. A processor may provide access to the online data service based in part on at least one proof response having a semantic reply indicating a human user.
The human interactive proof portal 140 may consider other factors in determining a response threshold, such as the reputation of the internet protocol address, the geo-location of the user, statistics about the interaction time during a human interactive proof session, response success rate, or other factors. The human interactive proof portal 140 may use a geo-location database 160 to identify a geo-location for the user device 110 by using the internet protocol address originating the access request to identify the actual geo-location.
The processor 220 may include at least one conventional processor or microprocessor that interprets and executes a set of instructions. The memory 230 may be a random access memory (RAM) or another type of dynamic data storage that stores information and instructions for execution by the processor 220. The memory 230 may also store temporary variables or other intermediate information used during execution of instructions by the processor 220. The data storage 240 may include a conventional ROM device or another type of static data storage that stores static information and instructions for the processor 220. The data storage 240 may include any type of tangible machine-readable medium, such as, for example, magnetic or optical recording media, a digital video disk, or a corresponding drive. A tangible machine-readable medium is a physical medium storing machine-readable code or instructions, as opposed to a signal. Having instructions stored on computer-readable media as described herein is distinguishable from having instructions propagated or transmitted, as the propagation transfers the instructions rather than storing them, as can occur with a computer-readable medium having instructions stored thereon. Therefore, unless otherwise noted, references to computer-readable media/medium having instructions stored thereon, in this or an analogous form, refer to tangible media on which data may be stored or retained. The data storage 240 may store a set of instructions detailing a method that when executed by one or more processors cause the one or more processors to perform the method. The data storage 240 may also be a database or a database interface with the audio proof challenge database 150 or the geo-location database 160.
The input/output device interface 250 may include one or more conventional mechanisms that permit a user to input information to the computing device 200, such as a keyboard, a mouse, a voice recognition device, a microphone, a headset, a gesture capture device, a touch screen, etc. The input/output device interface 250 may include one or more conventional mechanisms that output information to the user, including a display, a printer, one or more speakers, a headset, or a medium, such as a memory, or a magnetic or optical disk and a corresponding disk drive. The communication interface 260 may include any transceiver-like mechanism that enables computing device 200 to communicate with other devices or networks. The communication interface 260 may include a network interface or a transceiver interface. The communication interface 260 may be a wireless, wired, or optical interface. The clock 270 may provide timing information for various functions performed by a user device 110 or a human interactive proof portal 140. For example, the clock 270 may record a challenge response time for each audio proof challenge or an overall response time for a human interactive proof session.
The computing device 200 may perform such functions in response to a processor 220 executing sequences of instructions contained in a computer-readable medium, such as, for example, the memory 230, a magnetic disk, or an optical disk. Such instructions may be read into the memory 230 from another computer-readable medium, such as the data storage 240, or from a separate device via the communication interface 260.
The human interactive proof portal 140 may establish a human interactive proof session with the user device 110 to determine whether to grant access to the online data service 122. The human interactive proof portal 140 may send an audio proof challenge set having multiple audio proof challenges for the user device 110 to solve. The audio proof challenge database 150 may store a pre-defined set of audio proof challenges or may store a set of semantic query templates and a vocabulary set to facilitate the human interactive proof portal 140 with the automatic generation of the audio proof challenges.
Each semantic query template 322 may have one or more associated vocabulary sets 332 in the vocabulary set database 330. A vocabulary set 332 is a set of one or more words that may be input into the semantic query template 322. For example, the vocabulary set may be “trees books tables cats dogs”. The human interactive proof portal 140 may input the vocabulary set 332 into the semantic query template 322 to create an audio proof challenge 312. Based on the previous examples, the audio proof challenge 312 may be “Write down how many books? 3 books, 2 tables, 1 books.” The proof response to this audio proof challenge 312 may be four.
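The counting example above can be sketched in a few lines to show how a template and a vocabulary set might yield both the challenge text and its correct proof response at runtime. This is an illustration under stated assumptions rather than the described implementation: the template string, the function build_count_challenge, and its parameters are hypothetical, the target word is assumed to appear exactly twice in the listing, and the conversion of the text into an audio data file is omitted.

```python
import random

# Hypothetical template with two placeholders, modeled on the example above.
COUNT_TEMPLATE = "Write down how many {target}? {listing}."


def build_count_challenge(vocabulary, rng, items=3, max_count=5):
    """Fill the template from the vocabulary set; return (text, correct answer)."""
    target = rng.choice(vocabulary)
    others = [word for word in vocabulary if word != target]
    # Assumption: the target word appears twice so the user has to add counts.
    words = [target, target] + rng.sample(others, items - 2)
    rng.shuffle(words)
    counts = [rng.randint(1, max_count) for _ in words]
    listing = ", ".join(f"{n} {w}" for n, w in zip(counts, words))
    text = COUNT_TEMPLATE.format(target=target, listing=listing)
    # The correct proof response is derived at generation time, not stored.
    answer = sum(n for n, w in zip(counts, words) if w == target)
    return text, answer


if __name__ == "__main__":
    vocabulary = "trees books tables cats dogs".split()
    text, answer = build_count_challenge(vocabulary, random.Random())
    print(text)
    print(answer)
```

Because the counts and words are drawn at random for every call, the same template may produce a different query and a different correct response in each session.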
The geo-location database 160 may store a location record to indicate optimum use parameters at each geo-location.
The human interactive proof portal 140 may maintain a user record of the user device 110.
The human interactive proof portal 140 may vary a template complexity between a predecessor semantic query template 322 and a successor semantic query template 322 (Block 812). The human interactive proof portal 140 may calculate a benchmark response time based on a template complexity (Block 814). The human interactive proof portal 140 may generate the successor audio proof challenge automatically from a successor semantic query template 322 (Block 816). The human interactive proof portal 140 may send a successor audio proof challenge 604 asking a successor semantic query to the user device 110 for presentation to the user (Block 818). A successor proof challenge 604 is a proof challenge that follows a predecessor proof challenge. The human interactive proof portal 140 may receive from the user device 110 a successor proof response 606 having a successor semantic reply indicating a human user (Block 820). If the human interactive proof portal 140 has not sent the complete set of audio proof challenges (Block 822), the human interactive proof portal 140 may generate the next successor audio proof challenge (Block 816).
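As a rough illustration of Blocks 812 and 814, the sketch below varies a complexity level from one challenge to the next and derives a benchmark response time from it. Every specific here is an assumption: the description does not define how complexity is measured or how the benchmark is computed, so the integer complexity levels, the linear formula, the default constants, and the function names are all hypothetical.

```python
import random


def next_complexity(previous, levels, rng):
    """Pick a complexity level that differs from the predecessor's (Block 812)."""
    choices = [level for level in levels if level != previous]
    return rng.choice(choices)


def benchmark_response_time(complexity, base_seconds=4.0, seconds_per_level=2.0):
    """Assumed linear model: harder templates allow more time (Block 814)."""
    return base_seconds + seconds_per_level * complexity


def plan_challenges(set_size, levels, rng):
    """Return a (complexity, benchmark seconds) pair for each challenge in the set."""
    plan = []
    previous = None
    for _ in range(set_size):
        complexity = next_complexity(previous, levels, rng)
        plan.append((complexity, benchmark_response_time(complexity)))
        previous = complexity
    return plan


if __name__ == "__main__":
    for complexity, seconds in plan_challenges(5, [1, 2, 3], random.Random()):
        print(complexity, seconds)
```

A measured challenge response time could then be compared against the benchmark for that challenge, for example to flag answers that arrive implausibly fast or slow for the template's complexity.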
The human interactive proof portal 140 may base the number of audio proof challenges sent in a human interactive proof session on a user's performance during the human interactive proof session.
Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms for implementing the claims.
Embodiments within the scope of the present invention may also include computer-readable storage media for carrying or having computer-executable instructions or data structures stored thereon. Such computer-readable storage media may be any available media that can be accessed by a general purpose or special purpose computer. By way of example, and not limitation, such computer-readable storage media can comprise RAM, ROM, EEPROM, CD-ROM or other optical disk storage, magnetic disk storage or other magnetic data storages, or any other medium which can be used to carry or store desired program code means in the form of computer-executable instructions or data structures. Combinations of the above should also be included within the scope of the computer-readable storage media.
Embodiments may also be practiced in distributed computing environments where tasks are performed by local and remote processing devices that are linked (either by hardwired links, wireless links, or by a combination thereof) through a communications network.
Computer-executable instructions include, for example, instructions and data which cause a general purpose computer, special purpose computer, or special purpose processing device to perform a certain function or group of functions. Computer-executable instructions also include program modules that are executed by computers in stand-alone or network environments. Generally, program modules include routines, programs, objects, components, and data structures, etc. that perform particular tasks or implement particular abstract data types. Computer-executable instructions, associated data structures, and program modules represent examples of the program code means for executing steps of the methods disclosed herein. The particular sequence of such executable instructions or associated data structures represents examples of corresponding acts for implementing the functions described in such steps.
Although the above description may contain specific details, they should not be construed as limiting the claims in any way. Other configurations of the described embodiments are part of the scope of the disclosure. For example, the principles of the disclosure may be applied to each individual user where each user may individually deploy such a system. This enables each user to utilize the benefits of the disclosure even if any one of a large number of possible applications does not use the functionality described herein. Multiple instances of electronic devices each may process the content in various possible ways. Implementations are not necessarily in one system used by all end users. Accordingly, the appended claims and their legal equivalents should only define the invention, rather than any specific examples given.