The present invention relates to systems and methods for detecting and adjusting for response bias, and in particular to systems and methods for detecting and adjusting for response bias in aggregated data collected in online surveys.
Surveys—a research instrument that asks a sample population one or more questions—are among the most common methodologies for collecting human response data in both academic and industry settings. Given the ubiquitous nature of the Internet as well as personal computers and mobile devices, most academic and industry-focused surveys are delivered to respondents using an online survey delivery platform. For instance, a search for the quoted phrase “online survey” in Clarivate Analytics' Web of Science search engine returned more than 26,000 results (Aug. 22, 2019). In Google Scholar, the same search phrase (on the same date) returned more than 414,000 results. Online surveys are pervasive and are utilized in a broad range of contexts.
Data from surveys is aggregated and used to make inferences about a population. Aggregating data refers to mathematically combining self-reported data from multiple respondents in an online survey into a sum, average, or other summary statistic. For example, answers to questions about product satisfaction may be aggregated to infer how satisfied a population is with a product or service. In another example, one may aggregate answers about whether people intend to adhere to a certain policy or behave in a certain way. This aggregate data may be used to make decisions, including resource allocation, product enhancements, or policy changes relevant to the population.
The marketplace for such online survey platforms was valued at US$4 billion in 2017 and is expected to have a compound annual growth rate of 11.25%, reaching a market size of nearly US$7 billion by 2022. Academic researchers as well as countless industries including retail, market research, healthcare, financial services, and manufacturing, to name a few, use online surveys to gain a better understanding of respondent opinions, perceptions, intentions and behaviors. Numerous survey platform providers support the online delivery of surveys including Zoho Corporation Pvt. Ltd., Medallia Inc., Confirmit, Inqwise, SurveyMonkey, Campaign Monitor, QuestionPro, and Qualtrics. Online surveys are an established and growing data collection method in nearly all public and private sectors including education, non-profits, various for-profit industries, and all governmental sectors. Most of these survey delivery platforms are cloud-based, allowing respondents to utilize a broad range of computing devices including personal computers, notebooks, tablets and mobile devices to make responses using a standard web browser or specialized app, as shown in
It is with these observations in mind, among others, that various aspects of the present disclosure were conceived and developed.
It is therefore an object of the present invention to remedy the deficiencies in the art by disclosing systems and methods that determine and adjust for response bias. In certain embodiments, the system receives data associated with a user's input device in the course of a survey and calculates one or more metrics from the data. Metrics are measures of the user's interaction with the survey via an input device, including navigation, item selections, and data entry. The system then calculates the user's response bias from the metrics and outputs results of the survey. In that output, the results are adjusted for the user's response bias.
It is another object of the invention to calculate the response bias as one or more response bias scores.
It is yet another object of the invention to apply signal isolation on human-computer interaction data in order to calculate the response bias.
It is yet another object of the invention to determine that response bias exists when the one or more previously calculated metrics moderate the relationship between items on the survey and a predicted outcome.
In describing a preferred embodiment of the invention illustrated in the drawings, specific terminology will be resorted to for the sake of clarity. However, the invention is not intended to be limited to the specific terms so selected, and it is to be understood that each specific term includes all technical equivalents that operate in a similar manner to accomplish a similar purpose. Several preferred embodiments of the invention are described for illustrative purposes; it being understood that the invention may be embodied in other forms not specifically shown in the drawings.
Surveys—a research instrument that asks a sample population one or more questions—are among the most common methodologies for collecting human response data in both academic and industry settings. Data is often aggregated across multiple questions and individuals to make an inference about the sample population. A critical threat to the validity of survey results is a category of factors referred to as response biases; i.e., a tendency to respond to questions on some basis other than the question content. Response biases can have a detrimental effect on the quality of the results of a survey study, resulting in summary statistics that do not accurately represent the sample population. The present system and method relate to how changes in hand movement trajectories and fine motor control, captured by tracking human-computer interaction (HCI) dynamics—i.e., changes in typing, mouse-cursor movements, touch pad movements, touch screen interaction, device orientation on smartphones and tablet computers, etc.—can help estimate response biases in aggregated survey data. The raw fine-grained HCI interaction data is collected at millisecond precision and converted into various statistical metrics. For a computer mouse, for example, the raw data consists of X- and Y-coordinates, timestamps, and clicks. Regardless of the type of HCI device, its raw data is converted into a variety of statistical metrics related to a collection of continuous measures (e.g., movement speed, movement accuracy, completion time) and binary measures (e.g., page exits, answer switching, text entry and editing characteristics, etc.). These measures are then aggregated into a set of Response Bias Scores (RBS) that are used to moderate the relationship between a survey construct and a predicted variable to detect and adjust for response biases.
At a high level, the execution of a successful online survey follows several steps. First, planning, where the objectives and rationale for the survey are established. Here, many planning activities can occur depending on the context, including timelines, objectives, research question and hypothesis development, literature review, and so on. Second, once the planning is completed, the survey design process begins, which includes the creation of the online survey, determining the target population, determining sample sizes, and pilot testing to refine and optimize the survey language and delivery process. Third, survey deployment occurs, where the online survey is sent to the target population, response rates are monitored, and reminders are sent to any individuals who have yet to respond. Fourth, data preparation and analysis are conducted in order to produce aggregated summary statistics and a report of those findings (
Data Quality Problems with Online Surveys
While there is widespread and global use of online surveys, there is a large and growing body of literature related to various data quality concerns. Many factors can cause poor data quality. First, a vast literature, generally referred to as psychometrics, relates to the theory and technique of psychological measurement. Specifically, psychometrics focuses on the development, evaluation and use of survey-based tests. Psychometrics also establishes standards related to testing operations including test design and development, scores, scales, norms, score linking, cut scores, test administration, scoring, reporting, score interpretation, test documentation, and rights and responsibilities of test takers and test users. In essence, poorly designed or poorly executed survey-based data collection leads to data quality concerns. Clearly, psychometrics plays a large and established role in determining the data quality of a survey.
In addition to survey design concerns, there are a variety of potential problems that can negatively influence the quality of survey responses such as non-response biases where people in a particular demographic fail to respond at the same rate as other populations or coverage biases where a sample is (or is not) representative of the target population. To overcome some of these threats to the validity of the results, researchers and pollsters employ a variety of sampling approaches to account for, or attempt to nullify, these possible validity threats. These avoidance and correction techniques are not only time-consuming and expensive, but their efficacy for improving the validity and clarity of results is also questionable. The focus in this innovation is not on psychometrics and basic survey design, delivery, or sampling. The focus of the present disclosure is on evaluating how human response biases influence the quality of a respondent's answers and how to apply a proxy measure for various types of response biases in predictive statistical models.
Online surveys are used to test some of the most utilized theories in the behavioral sciences as well as assess values, beliefs, competency, product preference, and political opinions. Surveys are a valuable research methodology for collecting information on respondents' characteristics, actions, or opinions and also help to answer questions regarding “how” and “why”.
A threat to the validity of survey results, however, is a category of factors referred to as response biases. A response bias (also known as a survey bias) is the tendency of people to respond to questions on some basis other than the question content. For example, a person might misrepresent an answer in such a manner that others view it more favorably (i.e., a type of response bias called a social desirability bias). In general, people have the tendency to portray themselves in the best light, particularly when asked about personal traits, attitudes, and behaviors, which often causes respondents to falsify or exaggerate answers. In other situations, a person might not be sure how to answer a question because of a lack of knowledge of the area or a lack of understanding of the question. Thus, there are several types of factors that can bias survey responses.
Acquiescence bias refers to the tendency of respondents to agree with all the questions in a survey. Relatedly, nay-saying is the opposite form of the acquiescence bias, where respondents excessively choose to deny or not endorse statements in a survey or measure. Demand bias refers to the tendency of respondents to alter their response or behavior simply because they are part of a study (i.e., hypothesis guessing with a desire to help or hurt the quality of the results). Extreme responding bias refers to the tendency of respondents to choose the most (or least) extreme options or answers available. Prestige bias refers to the tendency of respondents to overestimate their personal qualities. Social desirability bias, introduced above, refers to the tendency of respondents to misrepresent an answer in such a manner that others will view it more favorably. Unfamiliar content bias refers to the tendency of respondents to choose answers randomly because they do not understand the question or do not have the knowledge to answer the question. Finally, satisficing bias refers to the tendency of respondents to give less thoughtful answers due to being tired of answering questions or unengaged with the survey completion process. Table 1 provides a summary of these common types of response biases.
Response biases can have a detrimental effect on the quality of inferences made from data aggregated in a survey study. For example, the significant results of a study might be due to a systematic response bias rather than the hypothesized effect. On the other hand, a hypothesized effect might not be significant because of a response bias. For example, the intention-behavior gap—a phenomenon that describes why intentions do not always lead to behaviors—may be attributed to response biases in some situations. For instance, a person may give a socially desirable, yet inaccurate answer about their intentions to perform a given behavior (e.g., a New Year's resolution to increase exercising when the person knows they are not likely to change their current behavior). As a result, their behavior is not consistent with their reported intentions. Thus, in order to increase the validity of many types of survey studies, it is critical to understand and control for response biases. Response biases can lead to both Type 1 errors (i.e., detecting an effect that isn't present) and Type 2 errors (i.e., failing to detect an effect that is present).
Various strategies, primarily related to the design of research protocols and the wording of questions, help reduce response bias. Most of these strategies involve deceiving the subject or are related to the way questions in surveys and questionnaires are presented to those in a study. None of these approaches is able to detect and measure the effects of various response biases; rather, they simply apply empirically derived methods for reducing or identifying possible bias (see Table 2 for previously developed bias identification and reduction methods).
Satisficing is a decision-making strategy or cognitive heuristic that entails searching through the available alternatives until an acceptability threshold is met. Survey respondents who do not put forth maximum effort, for a variety of reasons including fatigue, feelings of compliance to a request, being required to answer in order to gain compensation, or fulfilling a requirement of a job or academic course, are following a satisficing response strategy. As such, respondents following a satisficing response strategy expend only the amount of effort needed to make an acceptable or satisfactory response. Also, respondents may begin a survey and provide ample effort for some period, but then may lose interest and become increasingly fatigued, impatient or distracted. When respondents engage in satisficing, there are many different strategies used to minimize effort. A speeding strategy refers to responding quickly without carefully considering the response in order to minimize effort. A straight-lining strategy refers to responding such that all answers are the same (
Researchers have developed two basic techniques for detecting when respondents may be engaging in various types of satisficing behavior. In general, when satisficing, a respondent lacks full engagement with the task. In addition to finding an abnormal pattern of responses through an abnormal amount of statistical variability or by visual inspection, there are two well-established approaches for assessing such lack of engagement: completion times and attention check questions.
Regarding completion times, most modern online survey platforms can report page-level start and finish times so that the duration of time spent completing the overall survey is compared to total population averages. Those significantly faster than population averages may suggest that the respondent engaged in satisficing. Greater precision is obtained by examining not just the total completion time for the entire survey, but by comparing page- or question-level completion times to population averages. This approach works best for selection type responses (e.g., choose A, B, C, or D) versus open ended questions where there would likely be greater variation in response times from a population of respondents. A weakness of this approach is that it cannot account for those who are multitasking or leaving-and-returning to the page. So, while assessing completion times of the overall survey, pages, or even questions can aid in finding some types of satisficing behavior, it cannot evaluate the extent to which a respondent is cognitively engaged when completing the question. For instance, a completion time that is “too fast” may suggest a response bias on a multi-item choice response (e.g., speeding). However, when a respondent is too slow, this delay may be caused by thoughtful and extensive deliberation and answer switching or be due to a lack of engagement (e.g., delayed due to responding to a friend's text message on their smartphone while completing a survey on a desktop computer). Without also understanding the extent to which a person is engaged in responding to questions on the survey, completion time alone is an incomplete measure of engagement.
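For illustration only, the population-average comparison described above can be expressed as a simple screening routine; the two-standard-deviation cutoff used below is an arbitrary assumption rather than an established criterion.

    // Sketch of a completion-time screen: flag respondents whose completion time is
    // unusually fast relative to the population average (assumed cutoff: 2 standard deviations).
    function flagFastCompletion(populationTimesMs, respondentTimeMs, cutoffSd = 2) {
      const mean = populationTimesMs.reduce((a, b) => a + b, 0) / populationTimesMs.length;
      const sd = Math.sqrt(populationTimesMs.reduce((a, b) => a + (b - mean) ** 2, 0) / populationTimesMs.length);
      const z = (respondentTimeMs - mean) / sd;
      return z < -cutoffSd; // true suggests possible speeding or satisficing
    }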
A second approach, which is much more widely utilized, is the use of attention check questions (also called a trap or red-herring question) and consistency check questions. Such “check” questions are embedded in one or more locations in the survey where the respondent is asked to respond in a particular way. For example, a common attention check question is as follows: “Select option B as your answer choice.” Similarly, a consistency check question is designed to focus on the same information of a prior question, but asked in a different way (e.g., one question worded positively and the other negatively). The responses from these two consistency check questions can be later compared to infer a level of engagement of a respondent based on whether the two questions are answered in a consistent or inconsistent manner.
There are many limitations to using attention- and consistency-check questions for understanding which respondents are engaging in satisficing. First, professional test takers from online sites like Amazon's Mechanical Turk are skilled at identifying such checks, much more so than traditional subject pool participants. Additionally, there is an increasing body of work suggesting that eliminating participants who fail such checks is likely to introduce a demographic bias to the study. Further, because such checks are spread sparsely throughout a survey, they cannot aid in identifying when or where a respondent is following a satisficing strategy.
Data Quality Problems with Online Professional Crowdsourced Respondents
Crowdsourcing is the distribution of tasks to large groups of individuals via a flexible, open call, where respondents are paid a relatively small fee for their participation. Increasingly, researchers from a broad range of domains are using various crowdsourcing platforms for quickly finding convenience samples to complete online surveys. Examples of such online crowdsourcing sites include Amazon's Mechanical Turk (MTurk), Qualtrics, Prolific, SurveyMonkey, and numerous others. MTurk is arguably one of the largest suppliers of crowdsourced respondents in academic papers; a Google Scholar search found that more than 34,000 papers contained the phrase “Mechanical Turk” between 2006 and 2019 (as of Aug. 22, 2019). It is clear from this work that online professional crowdsourced respondents are here to stay and that their acceptance and use will continue to grow.
While crowdsourced platforms work hard to establish and maintain the quality of their workers, there are concerns that some workers will be motivated to establish multiple accounts and use IP proxies and multiple machines to repeatedly complete the same task. Another concern is the use of bots—computer robots—that can repeatedly complete the same survey. Further, there are concerns that professional survey takers are fundamentally different than the general population. For instance, an American undergraduate university student is about 4,000 times more likely than an average American to be a subject in a research study. However, the typical worker from a crowdsourced platform like MTurk completes more surveys in a week than the typical undergraduate completes in a lifetime. Frequent MTurk workers are “fluent” in participating in online studies and completing surveys, and are more likely to lose natural human reactions to various types of questions. Thus, as researchers increasingly utilize these easy-to-collect samples, there is a greater need for more sophisticated approaches for detecting data quality problems.
Responding to a survey question is in essence making a decision. Nobel Prize-winning economist Herbert Simon proposed a simple yet elegant three-step decision-making process—referred to as intelligence, design and choice—that is applied here to explain how a person completes survey questions (
Table 1 (above) outlines common response biases that reduce the data quality of a survey. For some of these biases, the respondent has pre-decided their response, such as when engaging in acquiescence or extreme responding as well as various types of satisficing. In these contexts, respondents are likely to be less engaged in the intelligence process (i.e., skipping or quickly browsing the question), less engaged in response deliberation (i.e., less searching for a response that best matches an objective), and quicker to select a response (or type in an answer). Such respondents will have overall faster normalized response times and show lower levels of deliberation and reconsideration.
Alternatively, other forms of response biases act to slow the intelligence process, response deliberation and final response selection or generation. For instance, a person wanting to provide a more socially desirable answer will more likely iterate between reading the question, evaluating possible responses, and selecting a final response.
Similarly, a person who is influenced by a demand or prestige bias, as well as one who does not understand what the question is asking or what the correct answer should be, will also be more likely to engage in additional deliberation (i.e., cycling through the intelligence, design, and choice processes) to identify the response that best aligns with their response objectives (see Table 4). Such respondents will therefore have longer normalized response times and show greater deliberation and answer switching. Thus, different response biases generate meaningful and predictable differences in how questions are processed, how responses are identified, and ultimately how selections are made (or how text-based answers are generated [i.e., typing speed changes, excessive editing, etc.]) (
In sum, some response biases result in faster, less deliberative responses and behaviors; some response biases result in slower, more deliberative responses and behaviors (see
When the response bias is due to slower and greater deliberation (
1. Lack of question understanding. One possible cause for a slow and greater deliberative response is due to a lack of understanding regarding the meaning of the question (i.e., the respondent doesn't understand the question and is therefore unsure about its answer). This is generally referred to as an “unfamiliar content” bias (see Table 1).
2. Lack of knowledge of correct answer. A second possible cause for a low confidence response is due to a lack of knowledge about the correct answer to the question as well as considering multiple competing answers (i.e., the respondent understands the question, but is unsure of the answer or is considering multiple possible answers). Again, this is referred to as an “unfamiliar content” bias.
3. Not wanting to provide information. A third possible cause for a slower and more deliberative response is due to a reluctance to share an answer due to embarrassment (e.g., social desirability bias). Other types of response bias that slow the deliberative processes include demand and prestige biases. In general, for these types of response biases, the respondent understands the question, knows the answer to the question, but is hesitant to share this truthful answer.
Regardless of the reason why a person answers a question with a high response bias response, it is valuable for the researchers or organizations requesting information via the online survey to know when and where (i.e., which question(s)) a respondent is more likely to have entered a biased response. Depending on the question content and question response format (e.g., radio buttons, text, sliders, check boxes, etc.), Response Bias Scores (RBS) can be calculated from a set of metrics reflecting how a person entered information and interacted with the online form or questionnaire.
When a response bias is due to faster and less deliberation (
In sum, various factors act to increase the likelihood of highly biased responses. Some biased responses will have longer response times and show greater deliberation and answer switching. Conversely, respondents responding with inadequate deliberation will have faster response times and show less deliberation and answer switching (
When people complete an online survey, they utilize a range of modern computing technologies like workstations, laptops, tablets and smartphones. Each type of device is equipped with an array of capabilities and sensors to provide an enhanced user experience and capture typing, navigation, selections, and screen orientation. In addition to providing an enhanced user experience, these sensors can be used to measure changes in users' fine motor control with high granularity and precision. For example, various human-computer interaction (HCI) devices such as computer mice, touch pads, touch screens, keyboards, accelerometers, and so on, provide an array of data that is collected at millisecond intervals. Thus, all human-computer interaction devices (e.g., keyboard, mouse, touch screen, etc.) as well as screen and device orientation sensors (e.g., gyroscopes and accelerometers) stream data with very fine detail and precision. This data can be used not only to interact with the survey system, but also to capture and measure the fine motor movements of users. For example, a computer mouse streams finely grained data (e.g., X-Y coordinates, clicks, timestamps) at millisecond precision that can be translated into a large number of statistical metrics that can be used to calculate changes in speed, movement efficiency, targeting accuracy, click latency, and so on as a user interacts with the survey over time. Likewise, other related devices and sensors (e.g., keyboards, touch pads, track balls, touch screens, etc.) provide similar raw data that can be used to capture a user's fine motor control and related changes over time.
Past research by the inventors has shown that this data can be collected, analyzed and interpreted in near real-time, and that it provides insights for a broad range of applications including emotional changes, cognitive load, system usability and deception. The inventors have developed deep expertise in automatically collecting and analyzing users' navigation as well as data entry behaviors such as typing fidelity and selection making. This approach works on all types of computing devices by embedding a small JavaScript library (or other equivalent technology), referred to as the Raw Data Collector module (RDC), into a variety of online systems (i.e., the survey hosted on the platform and delivered by a web browser). The RDC acts as a “listener” to capture all movements, typing dynamics, and events (e.g., answer switches, page exits, etc.). Once embedded, the script collects and sends the raw HCI device data—movements, events and orientation (if relevant)—to a secure web service in order to be stored and analyzed. In addition to JavaScript, the RDC could be implemented in a variety of other ways, including as a hidden rootkit or tracking application running in the background that captures and stores data of similar content and granularity. The RDC may be coded in any programming language known in the art.
Recent neuroscience research has unequivocally demonstrated that strong linkages exist between changes in cognitive processing (e.g., cognitive conflict, emotion, arousal, etc.) and changes in hand movements (i.e., fine motor control). When a person is operating with some types of response bias, such as social desirability, demand, prestige or unfamiliar content bias, they are more likely to experience cognitive or moral conflict as well as emotional changes. Such respondents are therefore more likely to experience hesitations and engage in answer switching as they consider and reconsider their response. Such heightened cognitive activity will also more likely result in less movement or typing precision, and increased movement and selection delays as compared to when the individual is responding in a non-biased manner. Alternatively, respondents who are engaging in acquiescence, demand, extreme responding and satisficing biases will be less cognitively engaged as they more superficially process questions, more quickly search for acceptable responses and more quickly make selections. Such lower cognitive activity will more likely result in higher movement precision (e.g., straight lining), fewer delays, and less answer switching as compared to when the individual is responding in an engaged, contemplative non-biased manner.
In addition to an increase in predictable movement anomalies (
Mouse cursor tracking was originally explored as a cost-effective and scalable alternative to eye tracking to denote where people devote their attention in an HCl context. For example, research has shown that eye gaze and mouse-cursor movement patterns are highly correlated with each other. When scanning search results, the mouse often follows the eye and marks promising search results (i.e., the mouse pointer stops or lingers near preferred results). Likewise, people often move their mouse while viewing web pages, suggesting that the mouse can indicate where people focus their attention. In selecting menu items, the mouse often tags potential targets (i.e., hovers over a link) before selecting an item. Monitoring where someone clicks can also be used to assess the relevance of search results. Finally, by continuously recording mouse position, the awareness, attraction, and avoidance of content can be assessed (e.g., people avoiding ads, not looking at text because of frustration, or struggling to read the text). Consequently, mouse tracking is often applied as a usability assessment tool for visualizing mouse-cursor movements on webpages, and to develop heat maps to indicate where people are devoting their attention.
However, as the ability for more fine-grained measurement and analysis of mouse-cursor movements has improved, mouse cursor tracking has also become a scientific methodology that can be used to provide objective data about a person's decision making and other psychological processes. A concise review of mouse tracking literature suggests that the “movements of the hand . . . offer continuous streams of output that can reveal ongoing dynamics of processing, potentially capturing the mind in motion with fine-grained temporal sensitivity.” Accordingly, hundreds of recent studies have chosen mouse tracking as a methodology for studying various cognitive and emotional processes. For example, mouse-cursor tracking has been shown to predict decision conflict, attitude formation, concealment of racial prejudices, response difficulty, response certainty, dynamic cognitive competition, perception formation, and emotional reactions to name a few.
In the present system and method, it is explained how hand movement trajectories, typing fluency and editing, answer switching and so on, captured by tracking human-computer interaction dynamics—i.e., fluency and changes in typing, mouse-cursor, track pad, touch screen, touch pad, device orientation, etc.—can help estimate response biases and various forms of satisficing strategies when a respondent completes an online survey. Recall, a response bias occurs when respondents answer questions on some basis other than the question content. This ‘other’ basis (e.g., desires to answer in a socially desirable way, fatigue from a long survey, tendency to answer positively, satisficing, etc.) changes how people view questions and generate answers. For example, if a question is influenced by social desirability bias, a person may experience conflict between what they know is the truthful answer and what they know is a more socially desirable answer, whereas a person not influenced by social desirability bias would not have this conflict. Similarly, a person suffering from survey fatigue would give less attention to the question and answers, and answer in a more efficient way. The present system and method draws on the Response Activation Model to explain how these different allocations of attention influence a person's fine motor control as measured through mouse-cursor movements.
The Response Activation Model (RAM) explains how hand movement trajectories are programmed in the brain and executed (e.g., how the brain programs and executes moving the mouse cursor to a destination). When a person wants to move the hand to or in response to a stimulus (whether using a mouse, touch pad or other HCl device), the brain starts to prime a movement response toward or in response to the stimulus. To prime a movement response refers to programming an action (transmitting nerve impulses to the hand and arm muscles) toward the stimulus. However, the resulting movement is not only influenced by this intended movement, rather it is influenced by all stimuli with action-based potential. A stimulus with action-based potential refers to any, potentially multiple, stimuli that could capture a person's attention. For example, in a multiple-choice survey question, stimuli with actionable potential may include all answers that capture a person's attention.
When two or more stimuli with actionable potential even briefly capture a person's attention, “responses to both stimuli are programmed in parallel”. This is an automatic, subconscious process that allows the body to react more quickly to stimuli that a person may eventually decide to move towards. This priming causes the hand to deviate from its intended movement as the observed hand movement is a product of all primed responses, both intended and non-intended. For example, if one is intending to move the mouse cursor to an answer on the survey, and another answer catches a person's attention because of its social desirability, the hand will prime movements toward this new answer in addition to the intended answer. Together, this priming will cause the trajectory of movement to deviate from the path leading directly to the intended destination. Throughout the movement, the brain will compensate for these departures by inhibiting the unintended movement if the person decides not to move to it, and automatically programming corrections to the trajectory based on continuous visual feedback, ultimately reaching the intended destination.
Based on the RAM, the present disclosure now discusses how response biases influence the way a person generates or selects answers when completing questions on an online survey. Specifically, the RAM informs how changing cognitions influence hand movements that can be captured with various HCI devices (e.g., touch, mouse cursor, keyboard, etc.); of course, different devices may use different measures to capture meaningful movements and behaviors (e.g., when using a keyboard, changing cognitions will influence the fluency of the typing). As an example,
It is proposed that RBS will moderate the relationship between a survey construct (i.e., measurement items) and a predicted variable in the presence of response biases. As discussed above, some response biases will result in slower question processing, with greater response deliberation and increased answer switching, while other types of response biases will result in faster question processing, with less response deliberation and decreased answer switching relative to non-biased responses.
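One conventional way to formalize this proposed moderating role, presented here only as a sketch using standard moderated-regression notation, is:

    Outcome = b0 + b1(Construct) + b2(RBS) + b3(Construct × RBS) + error

where a statistically significant interaction coefficient b3 indicates that the RBS moderate the construct-outcome relationship and, therefore, that a response bias is likely present.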
Some response biases cause respondents to consider answers that are not true. For example, social desirability bias occurs when someone misrepresents an answer by selecting the answer that is more socially desirable, even if it is not accurate. In such cases, several answers may catch a respondent's attention—the truthful answer and also the more socially desirable answers. People will move the mouse-cursor toward these different answers that capture their attention. As explained by the RAM, this reaction is often automatic and subconscious. The brain is particularly likely to prime movements toward answers that capture attention because the answers have “actionable potential”—they represent potential targets on the page that a person might move the mouse-cursor to in order to select an answer. As a result of moving the computer mouse away from the most direct path toward competing answers, response deviations will increase. An example of this is shown in
Some other response biases can cause respondents to deliberate less on the questions and answers, and answer questions more directly. For example, various types of satisficing bias (e.g., survey fatigue or speeding) cause respondents to pay less attention to the question content (i.e., intelligence), less attention to possible answers (i.e., design), and less attention to response selection (i.e., choice), resulting in less response deviation and lower response time. As respondents cognitively engage less with the survey question and possible responses, the RAM suggests that the decision-making process will not stimulate movement deviations at a normal rate, resulting in more direct answers and fewer movement deviations than normal. Likewise, less deliberation will result in a decrease in response time and behavioral events like answer switching.
Because response biases can influence the allocation of attention (both lower and higher) and thereby influence various aspects of navigation, behaviors and time, the Response Bias Scores (RBS) derived from this data moderate the influence of a survey construct (i.e., measurement item) on a predicted outcome variable. As a surrogate for response biases, RBS can account for unexplained variation in models influenced by response biases and provide valuable insight into the true relationship between the survey construct and the predicted variable. For example, in response biases that cause greater deliberation (e.g., social desirability bias), a negative moderating effect suggests that the greater the response bias, the smaller the effect of the survey construct on the predicted outcome. This bias may even lead to a type 2 error if the bias is prevalent enough. Further, a significant moderation effect suggests that response biases are likely influencing the results, and a non-significant moderation effect suggests that the results are likely not being influenced by response biases that impact RBS (although they still might be influenced by other biases). In sum, RBS will moderate the relationship between a survey construct and a predicted variable in scenarios influenced by response biases.
Similar to our logic regarding metrics related to navigation and behavioral events, response time also plays a role in generating RBS and therefore also moderates the relationship between a survey construct (i.e., measurement item) and a predicted variable when response biases are present. Again, it is important to note that the scope of this logic is only applicable to response biases that influence a person's attention allocation to different answers.
Response time is the duration taken to answer a given question and is another indicator of potential response biases. Biases that cause users to deliberate between choosing the accurate answer and choosing another answer naturally take more time, in addition to causing more navigation and behavioral anomalies.
Biases that cause less deliberation among answers will have the opposite effect. For example, satisficing due to survey fatigue causes respondents to pay less attention to questions and quickly select answers without as much deliberation. In doing so, respondents give less attention to competing answers on the survey and answer more quickly, resulting in lower response time (
Because response biases influence response time—either decreasing or increasing depending upon the type of bias—response time moderates the influence of a survey construct on a predicted variable similarly to navigation and behavioral anomalies. Response time is therefore an important component when calculating RBS.
In sum, if the RBS has a significant moderating effect, it suggests that a response bias is present and can be controlled for by including the moderating effect in statistical analyses. The moderating effect can then provide valuable insight into the true relationship between a survey construct and a predicted variable, helping avoid Type 1 and Type 2 errors.
There are a wide range of contexts where this innovation can be applied and used to significantly improve the understanding of the true relationship between survey constructs and predicted variables for a given population of respondents. For example, there are countless examples where researchers have explored various aspects of respondents' traits, attitudes and behaviors and some predicted outcome where response biases were likely prevalent. Examples include:
Gaining clearer insight and greater confidence in the results of countless types of survey studies, in virtually all aspects of society, has tremendous potential for improving the targeting of education, use of scarce resources, and interventions.
To explore the potential analytic value and validity of the present approach, a preliminary study was conducted that examined the relationship between a survey construct (intentions to attend class) and actual behavior (class attendance over five weeks). Intention is a central construct in several prevalent behavioral theories such as the Technology Acceptance Model and the Theory of Planned Behavior, to name just two. Additionally, understanding intentions is also a critical concern in various aspects of business and society where response biases may reduce data quality (e.g., “Who do you intend to vote for?” “Do you intend to purchase product A or B?” “Do you intend to increase your exercise this year?”). Because stated intentions often fail to robustly predict actual behavior—termed the intention-behavior gap in the extant literature—and because various types of response biases are a central cause of lower data quality in such studies, it was believed this research context would provide a proper setting for testing the central idea of the disclosed inventive concepts.
The preliminary study only focused on a single type of response bias (i.e., social desirability bias). Additionally, the preliminary study used a single measure of navigation efficiency and a single measure of time from mouse movement data. Another weakness is that both navigation efficiency and time were measured as simple magnitudes of deviation and time. No behavioral events were captured, reported, or included in this analysis. Importantly, data quality can be much more accurately estimated when measured using a broad set of measures (i.e., dozens of metrics rather than two simple metrics). For example, using multiple measures for navigation, behaviors and time (see Table 5, Table 6 and Table 7) makes it more likely that a broader range of variance caused by a response bias is captured. Also, using more sophisticated analytic approaches for understanding response bias substantially increases both the accuracy and power of the measurement method (i.e., improving the r-squared of the predictive model).
Next, the present disclosure details these more diagnostic measures and this more sophisticated analytic approach. In our description, we denote what is unique compared to the preliminary study reported above.
The present system and method analyzes and scores how a respondent selects or generates an answer when completing questions in an online survey. For each respondent, and for each question on the survey, an algorithm generates Response Bias Scores (RBS) related to a) navigation efficiency, b) response behaviors and c) time for each construct being measured. These three scores are used to moderate the relationship between a construct and an outcome to adjust for response bias. To create these scores, the following process is followed (
1. A raw data collector (RDC)—or equivalent technology to capture fine grained data related to human-computer interaction—is embedded into an existing survey (step 1302).
2. The RDC covertly collects fine-grained data about how each of the survey responses is selected or generated (step 1304). This fine-grained data reflects the navigation speed and accuracy, typing speed and editing characteristics, behavioral events such as answer switches or hovers, as well as non-answer-generation related behaviors such as leaving and returning to the survey and the duration of such events, to name a few.
3. The RDC sends the fine-grained data to a Storage and Processing System (SPS) at pre-established intervals for storage and processing (step 1306).
4. The SPS analyzes the response data to generate Response Bias Scores (RBS) for each question, and for each user, storing these results for later retrieval (step 1308).
Each of these steps is further described below:
Step 1302: Embedding Raw Data Collector (RDC) into the Survey
At step 1302, the survey system delivers the survey with the embedded RDC (or equivalent technology) to a user on a computer or other electronic device. In many instances, the RDC will use JavaScript, a programming language for the web that is supported by most web browsers including Chrome, Firefox, Safari, Internet Explorer, Edge, Opera, and most others. Additionally, most mobile browsers for smartphones support JavaScript. Other methods for capturing similar data are also contemplated by the disclosed inventive concepts. In other instances, the RDC will use a programming language that is inherent to the mobile app or desktop application being monitored.
Most of the commercial online survey systems support the embedding of JavaScript directly into a survey. Alternatively, a survey can be embedded into a website that has JavaScript enabled. A JavaScript library (or equivalent hardware or software that achieves the same purpose) is embedded into an online survey, covertly recording and collecting fine-grained movements, events and data entry (i.e., behaviors). For example, when a respondent utilizes a mouse to interact with a survey, the RDC (implemented using JavaScript or other methods) records all movements within the page (i.e., x-y coordinates, time stamps) as well as events like mouse clicks or data entry into an html form-field element. Likewise, if a respondent is entering text with a keyboard or touchscreen, various aspects of the interaction are captured depending upon the capabilities of the device and the RDC.
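By way of a non-limiting sketch, a browser-based RDC might register listeners along the following lines; the particular events captured and the use of element identifiers are illustrative assumptions rather than a definitive implementation.

    // Illustrative sketch of a browser-based raw data collector (RDC): capture mouse
    // movements, clicks, keystroke timing, answer changes, and page exits/returns,
    // each with a millisecond timestamp, into an in-memory buffer.
    const rdcBuffer = [];

    function record(type, detail) {
      rdcBuffer.push({ type, t: Date.now(), ...detail });
    }

    document.addEventListener('mousemove', (e) => record('move', { x: e.pageX, y: e.pageY }));
    document.addEventListener('click', (e) => record('click', { x: e.pageX, y: e.pageY, target: e.target.id || null }));
    document.addEventListener('keydown', () => record('key', { field: document.activeElement.id || null }));
    document.addEventListener('change', (e) => record('answer', { field: e.target.id || null, value: e.target.value }));
    window.addEventListener('blur', () => record('pageExit', {}));
    window.addEventListener('focus', () => record('pageReturn', {}));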
At step 1304, the RDC that is embedded into the online survey system collects a range of movement, navigation, orientation, data entry (e.g., clicks, choices, text, etc.) and events (e.g., answer switches, answer hovers, leaving/returning to the survey, etc.) depending on the capabilities of the device and the RDC. Thus, depending on whether the respondent utilizes a tablet computer, smartphone, traditional computer, or laptop, the fine-grained interaction data is captured by the RDC while the respondent is interacting with the survey (
In essence, the RDC collects raw data related to how a person interacts with the survey, whereas the survey system collects the respondent's final response selection or entry. For example,
At step 1306, at predetermined intervals (e.g., time-based [e.g., every 1 second] or event-based [at the completion of a single question or a page of questions]), raw data is sent to the Storage and Processing System (SPS) for storage (
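Continuing the sketch above, the buffered records might be transmitted to the SPS on a timer and when the respondent leaves the page; the endpoint path, the respondent identifier, and the one-second interval are hypothetical placeholders.

    // Illustrative sketch of interval-based transmission of buffered raw data to the SPS.
    const SPS_ENDPOINT = '/sps/ingest';        // hypothetical endpoint
    const RESPONDENT_ID = 'respondent-001';    // hypothetical identifier supplied by the survey platform

    function flushBuffer() {
      if (rdcBuffer.length === 0) return;
      const batch = rdcBuffer.splice(0, rdcBuffer.length); // remove and capture all pending records
      navigator.sendBeacon(SPS_ENDPOINT, JSON.stringify({ respondentId: RESPONDENT_ID, events: batch }));
    }

    setInterval(flushBuffer, 1000);                    // time-based flush (e.g., every 1 second)
    window.addEventListener('pagehide', flushBuffer);  // final flush when the respondent leaves the survey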
At step 1308, the algorithm generates Response Bias Scores (RBS) for each question, and for each user, that completes an online survey. The algorithm comprises two primary processes, each with several sub-processes:
Both of these processes are described below.
Step 1308 (Subprocess 1): Segmenting and Isolating Data
In many of the controlled studies reported in the extant mouse-cursor-tracking literature, the task environment is highly artificial in order to more easily segment and isolate data to a particular stimulus (e.g., a question, image, etc.). For instance,
Human-computer interaction (HCI) behaviors (e.g., movements, clicks, accelerometer data, gyroscope data, touch screen data, etc.) can be attributed to a particular question in two ways: a) direct interactions with the html question elements, and b) inferred associations based on the user's navigation of questions on a form.
The Signal Isolation Algorithm (SIA) consists of two primary steps (
First, HCI behaviors can be attributed to a question through direct interactions with the html question elements, such as clicking a radio button, making a selection, or entering text into a form field associated with that question.
Second, HCI behaviors can be attributed to a question by inferring an association based on interactions with the questions on the form. This is done by analyzing the behaviors before and after answering the question, although these behaviors may not be specifically on the question elements.
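As a sketch of one way this two-step attribution could be implemented (the element-to-question mapping and the use of answer-completion timestamps for the inferred association are assumptions made for illustration):

    // Sketch of signal isolation: attribute each raw event to a question either
    // (a) directly, when the event occurred on an element mapped to that question, or
    // (b) by inference, when the event occurred before that question was answered and
    //     after the preceding question was answered.
    function isolateSignals(events, elementToQuestion, answerTimes) {
      // answerTimes: array of { questionId, t } sorted by answer-completion time (ascending)
      const byQuestion = {};
      const assign = (qid, ev) => { (byQuestion[qid] = byQuestion[qid] || []).push(ev); };

      for (const ev of events) {
        const direct = ev.target && elementToQuestion[ev.target];
        if (direct) {
          assign(direct, ev); // (a) direct interaction with a question element
          continue;
        }
        const next = answerTimes.find((a) => a.t >= ev.t); // (b) inferred association
        if (next) assign(next.questionId, ev);
      }
      return byQuestion;
    }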
Step 1308 (Subprocess 2): Calculating Response Bias Scores
Online surveys can contain a broad range of data entry fields (Table 5). When a respondent completes an online survey, they may respond to some of the questions with relatively efficient, confident and likely unbiased responses. Alternatively, they may respond to other questions more slowly and with a lack of confidence due to some type of response bias or may answer quickly without adequately deliberating on the question (i.e., satisficing). Depending on the type of question or field type, the way a person completes the field with low- or high-response bias may differ. As described above (
To detect a biased response, a systematic process is used to capture binary and continuous anomalies from the raw HCI data. A binary anomaly refers to a behavioral event on the survey that either occurs or does not occur (e.g., an answer switch). A continuous anomaly refers to a numeric value indicating the degree of anomaly, and is applicable to metrics that are continuous in nature (e.g., attraction, hesitation, etc.). Depending on the type of data field that is being used (e.g., typing text or numbers, selecting a radio button or checkbox with a mouse, making a choice from a dropdown list, etc.), different metrics are used to capture and store the presence or absence of binary and continuous anomalies (see Table 6, Table 7 and Table 8).
The algorithm for calculating Response Bias Scores (RBS) contains the following steps, as outlined in
1. Convert raw data into metrics (step 1902);
2. Normalize metrics (step 1904);
3. Aggregate raw metrics into three meta variables: a) navigation efficiency, b) response behaviors and c) time metrics (step 1906); and
4. Use meta variables to detect and adjust for response bias in relationships (step 1908). Each of these steps is discussed in greater detail below.
Convert Raw Data into Metrics (Step 1902)
The raw data is converted into several metrics. These metrics fall into three categories: a) navigation efficiency, b) response behaviors and c) time metrics.
Navigation efficiency metrics refer to metrics that define how far a person deviated from answering the questions directly (or in a straight line connecting the beginning point and the answer). Examples of navigation efficiency are shown in Table 6 below.
Response behaviors metrics refer to behaviors performed on the question directly (e.g., changing answers, hovering over question, etc.). Examples of response behaviors are shown in Table 7 below.
Time based metrics refer to timing between events. Examples of time metrics are shown in Table 8 below.
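To illustrate the conversion, the following sketch computes one simplified metric from each category for a single question's events (a path-efficiency ratio, a binary answer-switch indicator, and a response time); the formulas are simplified stand-ins for the fuller sets of metrics listed in Table 6, Table 7 and Table 8.

    // Sketch of converting a single question's raw events (assumed ordered by timestamp)
    // into one navigation efficiency metric, one response behavior, and one time metric.
    function questionMetrics(events) {
      const moves = events.filter((e) => e.type === 'move');
      const dist = (a, b) => Math.hypot(b.x - a.x, b.y - a.y);

      // Navigation efficiency: total cursor distance traveled relative to the straight-line
      // distance between the first and last recorded positions (1 = perfectly direct).
      let traveled = 0;
      for (let i = 1; i < moves.length; i++) traveled += dist(moves[i - 1], moves[i]);
      const ideal = moves.length > 1 ? dist(moves[0], moves[moves.length - 1]) : 0;
      const navigationEfficiency = ideal > 0 ? traveled / ideal : 1;

      // Response behavior: whether the respondent switched answers (binary, T/F).
      const answerSwitch = events.filter((e) => e.type === 'answer').length > 1;

      // Time metric: elapsed time from the first to the last event on the question.
      const responseTime = events.length > 1 ? events[events.length - 1].t - events[0].t : 0;

      return { navigationEfficiency, answerSwitch, responseTime };
    }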
So that data in each category can be compared and combined, the time-based metrics and the navigation efficiency metrics must be normalized (note, response behavior metrics are already normalized as binary (T/F)).
To normalize the continuous metrics (the time-based and navigation efficiency metrics), a centering and scaling process takes place. This process entails calculating the mean and standard deviation for each metric. Then, the mean is subtracted from each value, and the difference is divided by the standard deviation. This converts all continuous metrics to a similar scale.
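A minimal sketch of this centering and scaling step:

    // Sketch of centering and scaling (z-scoring) one continuous metric across respondents.
    function zScores(values) {
      const mean = values.reduce((a, b) => a + b, 0) / values.length;
      const sd = Math.sqrt(values.reduce((a, b) => a + (b - mean) ** 2, 0) / values.length);
      return values.map((v) => (sd > 0 ? (v - mean) / sd : 0));
    }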
Aggregate Raw Metrics into Three Meta Variables: a) Navigation Efficiency, b) Response Behaviors and c) Time Metrics (Step 1906)
Next, the numerous metrics must be combined into three indicators of response bias: navigation efficiency, response behaviors, and time metrics.
Multiple techniques can be used to combine the metrics in each category into a meta score (e.g., weighted averages, regressions, factor analysis, etc.). Here we describe one example.
For the continuous metrics (navigation efficiency and time metrics), the metrics for each category can be grouped for each participant, and then the median value can be taken for each participant for navigation efficiency and time metrics.
For the response behaviors metrics, the number of TRUE cases can be summed to create a combined metric.
The three combined metrics are referred to as Response Bias Scores (RBS).
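Continuing the example approach just described (medians of the normalized continuous metrics and a count of TRUE behavioral events), the three scores might be computed per respondent as follows; the field names are illustrative assumptions.

    // Sketch of aggregating a respondent's normalized metrics into three Response Bias Scores.
    function responseBiasScores(respondent) {
      // respondent.navigation, respondent.time: arrays of normalized continuous metrics
      // respondent.behaviors: array of binary (true/false) behavioral events
      const median = (arr) => {
        const s = [...arr].sort((a, b) => a - b);
        const mid = Math.floor(s.length / 2);
        return s.length % 2 ? s[mid] : (s[mid - 1] + s[mid]) / 2;
      };
      return {
        navigationEfficiency: median(respondent.navigation),
        responseBehaviors: respondent.behaviors.filter(Boolean).length, // count of TRUE events
        timeMetrics: median(respondent.time),
      };
    }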
The three combined metrics can be used to detect and control for response bias in analyses.
To check for response bias, the metrics are used to statistically moderate the relationship between collected self-reported variables (survey items) and an outcome. If the moderating relationship is significant, this indicates that a response bias is present.
The moderating relationship can be adjusted for by multiplying the self-reported variables (survey items) by the significant moderating variables (navigation efficiency, response behaviors, and time metrics). Other adjusting techniques can be used as well, such as squaring, cubing, or adjusting the moderating variable before combining with the self-report variables.
Alternatively, significant moderating variables can be used to filter out biased data, and therefore have cleaned data for analysis.
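As a sketch of how the data might be prepared for the checking, adjusting, and filtering described above, interaction terms can be constructed by multiplying the self-reported variable by each Response Bias Score, and a simple filter can remove likely biased rows; the resulting columns would then be supplied to any standard regression routine, and the filtering cutoff is an arbitrary assumption.

    // Sketch of preparing moderation terms. Each input row holds a respondent's
    // self-reported construct score (x), outcome (y), and Response Bias Scores (rbs).
    function buildModerationRows(data) {
      return data.map((d) => ({
        x: d.x,
        y: d.y,
        nav: d.rbs.navigationEfficiency,
        beh: d.rbs.responseBehaviors,
        time: d.rbs.timeMetrics,
        // Interaction terms: self-reported variable multiplied by each moderating variable.
        xNav: d.x * d.rbs.navigationEfficiency,
        xBeh: d.x * d.rbs.responseBehaviors,
        xTime: d.x * d.rbs.timeMetrics,
      }));
    }

    // Alternative: filter out likely biased responses using an assumed cutoff on the
    // normalized continuous scores.
    function filterBiased(data, cutoff = 2) {
      return data.filter((d) => Math.abs(d.rbs.navigationEfficiency) < cutoff
        && Math.abs(d.rbs.timeMetrics) < cutoff);
    }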
Deploying an online survey is a four-step process that includes planning, survey design, survey deployment, and data preparation and analysis. This innovation aids in detecting and measuring various types of response biases by embedding a Raw Data Collector (RDC) into an online survey. The RDC captures various types of data including navigation, question answering behaviors, time, and other types of events at millisecond precision. This fine-grained data is sent from the online survey to a Storage and Processing System (SPS), where Response Bias Scores (RBS) are calculated and stored. Once the execution of the online survey is completed, RBS are provided for inclusion in data analysis (
At the Storage and Processing System 1406, the system embeds the JavaScript Listener 1407 into the online survey of
This innovation provides three important benefits over existing approaches for dealing with response biases. First, Response Bias Scores (RBS) can be used to identify whether response biases are present in a study; i.e., when the RBS do not significantly moderate the relationship between a survey construct and a predicted variable, a response bias is not present. Second, RBS provide novel insight into understanding how response biases influence relationships, insight that is often difficult or impossible to obtain through other measures and approaches. Third, and most importantly, the statistical metrics used to capture RBS help to account for various types of response biases in predictive statistical models, thus improving the explanatory power of the relationship between a survey construct and a predicted variable.
Certain embodiments are described herein as including one or more modules 112. Such modules 112 are hardware-implemented, and thus include at least one tangible unit capable of performing certain operations and may be configured or arranged in a certain manner. For example, a hardware-implemented module 112 may comprise dedicated circuitry that is permanently configured (e.g., as a special-purpose processor, such as a field-programmable gate array (FPGA) or an application-specific integrated circuit (ASIC)) to perform certain operations. A hardware-implemented module 112 may also comprise programmable circuitry (e.g., as encompassed within a general-purpose processor or other programmable processor) that is temporarily configured by software or firmware to perform certain operations. In some example embodiments, one or more computer systems (e.g., a standalone system, a client and/or server computer system, or a peer-to-peer computer system) or one or more processors may be configured by software (e.g., an application or application portion) as a hardware-implemented module 112 that operates to perform certain operations as described herein.
Accordingly, the term “hardware-implemented module” encompasses a tangible entity, be that an entity that is physically constructed, permanently configured (e.g., hardwired), or temporarily configured (e.g., programmed) to operate in a certain manner and/or to perform certain operations described herein. Considering embodiments in which hardware-implemented modules 112 are temporarily configured (e.g., programmed), each of the hardware-implemented modules 112 need not be configured or instantiated at any one instance in time. For example, where the hardware-implemented modules 112 comprise a general-purpose processor configured using software, the general-purpose processor may be configured as respective different hardware-implemented modules 112 at different times. Software may accordingly configure a processor 102, for example, to constitute a particular hardware-implemented module at one instance of time and to constitute a different hardware-implemented module 112 at a different instance of time.
Hardware-implemented modules 112 may provide information to, and/or receive information from, other hardware-implemented modules 112. Accordingly, the described hardware-implemented modules 112 may be regarded as being communicatively coupled. Where multiple of such hardware-implemented modules 112 exist contemporaneously, communications may be achieved through signal transmission (e.g., over appropriate circuits and buses) that connect the hardware-implemented modules. In embodiments in which multiple hardware-implemented modules 112 are configured or instantiated at different times, communications between such hardware-implemented modules may be achieved, for example, through the storage and retrieval of information in memory structures to which the multiple hardware-implemented modules 112 have access. For example, one hardware-implemented module 112 may perform an operation, and may store the output of that operation in a memory device to which it is communicatively coupled. A further hardware-implemented module 112 may then, at a later time, access the memory device to retrieve and process the stored output. Hardware-implemented modules 112 may also initiate communications with input or output devices.
As illustrated, the computing system 100 may be a general purpose computing device, although it is contemplated that the computing system 100 may include other computing systems, such as personal computers, server computers, hand-held or laptop devices, tablet devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronic devices, network PCs, minicomputers, mainframe computers, digital signal processors, state machines, logic circuitries, distributed computing environments that include any of the above computing systems or devices, and the like.
Components of the general purpose computing device may include various hardware components, such as a processor 102, a main memory 104 (e.g., a system memory), and a system bus 101 that couples various system components of the general purpose computing device to the processor 102. The system bus 101 may be any of several types of bus structures including a memory bus or memory controller, a peripheral bus, and a local bus using any of a variety of bus architectures. For example, such architectures may include Industry Standard Architecture (ISA) bus, Micro Channel Architecture (MCA) bus, Enhanced ISA (EISA) bus, Video Electronics Standards Association (VESA) local bus, and Peripheral Component Interconnect (PCI) bus also known as Mezzanine bus.
The computing system 100 may further include a variety of computer-readable media 107 that includes removable/non-removable media and volatile/nonvolatile media, but excludes transitory propagated signals. Computer-readable media 107 may also include computer storage media and communication media. Computer storage media includes removable/non-removable media and volatile/nonvolatile media implemented in any method or technology for storage of information, such as computer-readable instructions, data structures, program modules or other data, such as RAM, ROM, EEPROM, flash memory or other memory technology, CD-ROM, digital versatile disks (DVD) or other optical disk storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other medium that may be used to store the desired information/data and which may be accessed by the general purpose computing device. Communication media includes computer-readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism and includes any information delivery media. The term “modulated data signal” means a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. For example, communication media may include wired media such as a wired network or direct-wired connection and wireless media such as acoustic, RF, infrared, and/or other wireless media, or some combination thereof. Computer-readable media may be embodied as a computer program product, such as software stored on computer storage media.
The main memory 104 includes computer storage media in the form of volatile/nonvolatile memory such as read only memory (ROM) and random access memory (RAM). A basic input/output system (BIOS), containing the basic routines that help to transfer information between elements within the general purpose computing device (e.g., during start-up) is typically stored in ROM. RAM typically contains data and/or program modules that are immediately accessible to and/or presently being operated on by processor 102. For example, in one embodiment, data storage 106 holds an operating system, application programs, and other program modules and program data.
Data storage 106 may also include other removable/non-removable, volatile/nonvolatile computer storage media. For example, data storage 106 may be: a hard disk drive that reads from or writes to non-removable, nonvolatile magnetic media; a magnetic disk drive that reads from or writes to a removable, nonvolatile magnetic disk; and/or an optical disk drive that reads from or writes to a removable, nonvolatile optical disk such as a CD-ROM or other optical media. Other removable/non-removable, volatile/nonvolatile computer storage media may include magnetic tape cassettes, flash memory cards, digital versatile disks, digital video tape, solid state RAM, solid state ROM, and the like. The drives and their associated computer storage media provide storage of computer-readable instructions, data structures, program modules and other data for the general purpose computing device 100.
A user may enter commands and information through a user interface 140 or other input devices 145 such as a tablet, electronic digitizer, a microphone, keyboard, and/or pointing device, commonly referred to as mouse, trackball or touch pad. Other input devices 145 may include a joystick, game pad, satellite dish, scanner, or the like. Additionally, voice inputs, gesture inputs (e.g., via hands or fingers), or other natural user interfaces may also be used with the appropriate input devices, such as a microphone, camera, tablet, touch pad, glove, or other sensor. These and other input devices 145 are often connected to the processor 102 through a user interface 140 that is coupled to the system bus 101, but may be connected by other interface and bus structures, such as a parallel port, game port or a universal serial bus (USB). A monitor 160 or other type of display device is also connected to the system bus 101 via user interface 140, such as a video interface. The monitor 160 may also be integrated with a touch-screen panel or the like.
The general purpose computing device may operate in a networked or cloud-computing environment using logical connections of a network interface 103 to one or more remote devices, such as a remote computer. The remote computer may be a personal computer, a server, a router, a network PC, a peer device or other common network node, and typically includes many or all of the elements described above relative to the general purpose computing device. The logical connection may include one or more local area networks (LAN) and one or more wide area networks (WAN), but may also include other networks. Such networking environments are commonplace in offices, enterprise-wide computer networks, intranets and the Internet.
When used in a networked or cloud-computing environment, the general purpose computing device may be connected to a public and/or private network through the network interface 103. In such embodiments, a modem or other means for establishing communications over the network is connected to the system bus 101 via the network interface 103 or other appropriate mechanism. A wireless networking component including an interface and antenna may be coupled through a suitable device such as an access point or peer computer to a network. In a networked environment, program modules depicted relative to the general purpose computing device, or portions thereof, may be stored in the remote memory storage device.
The system and method of the present invention may be implemented by computer software that permits the accessing of data from an electronic information source. The software and the information in accordance with the invention may be within a single, free-standing computer or it may be in a central computer networked to a group of other computers or other electronic devices. The information may be stored on a computer hard drive, on a CD-ROM disk or on any other appropriate data storage device.
The foregoing description and drawings should be considered as illustrative only of the principles of the invention. The invention is not intended to be limited by the preferred embodiment and may be implemented in a variety of ways that will be clear to one of ordinary skill in the art. Numerous applications of the invention will readily occur to those skilled in the art. Therefore, it is not desired to limit the invention to the specific examples disclosed or the exact construction and operation shown and described. Rather, all suitable modifications and equivalents may be resorted to, falling within the scope of the invention.
This application claims the benefit of U.S. Provisional App. No. 62/769,342, filed Nov. 19, 2018, the entire contents of which are hereby incorporated by reference.
Filing Document: PCT/US2019/061817
Filing Date: 11/15/2019
Country: WO
Kind: 00