The present invention relates to systems and methods for the AI-assisted analysis of user experience studies that allow for insight generation regarding the usability of a website. Generally, this type of testing is referred to as “User Experience” or merely “UX” testing.
The Internet provides new opportunities for business entities to reach customers via web sites that promote and describe their products or services. Often, the appeal of a web site and its ease of use may affect a potential buyer's decision to purchase the product/service.
Especially as user experiences continue to improve and online competition becomes increasingly aggressive, the ease of use of a particular retailer's website may have a material impact upon sales performance. Unlike a physical shopping experience, there are minimal hurdles to a user going to a competitor for a similar service or good. Thus, in addition to traditional motivators (e.g., competitive pricing, return policies, brand reputation, etc.), the ease with which a website can be navigated is of paramount importance to a successful online presence.
As such, assessing the appeal, user friendliness, and effectiveness of a web site is of substantial value to marketing managers, web site designers, and user experience specialists; however, this information is typically difficult to obtain. Focus groups are sometimes used to achieve this goal, but the process is long, expensive, and not reliable, in part due to the size and demographics of the focus group, which may not be representative of the target customer base.
In more recent years, advances have been made in the automation and implementation of mass online surveys for collecting user feedback. Typically, these systems include survey questions, or potentially a task on a website followed by feedback requests. While such systems are useful in collecting some information regarding user experiences, the studies often suffer from biases in responses and from the limited types of feedback collected.
In order to overcome these limitations, systems and methods have been developed to provide more immersive user experience testing which utilize AI analytics, audio and video recording, and improved interfaces. These systems and methods have revolutionized user experience testing, but still fundamentally rely upon the ability to recruit sufficient numbers of qualified and interested participants.
Sourcing capable participants is always a challenge, and becomes particularly difficult when very large studies are performed or many studies operate in parallel. Traditionally, companies would solicit individuals to join focus groups. Such methods were generally effective in collecting small groups of willing participants, but were extremely resource intensive and failed to scale in any appreciable manner. With the advent of the internet, more individuals could be solicited in a much more cost-effective manner. These populations are aggregated by survey provider groups, and can serve as a source of willing participants. However, even these large participant pooling companies are generally unable to fulfill the needs of truly scaled UX studies. Additionally, these pooled participant sources are often unable to deliver the quality of participants desired.
One critical component of getting quality participants is understanding the characteristics of those participants. These characteristics can be leveraged to match the participants against the study needs. This increases the chance that the participants meet the study criteria, which in turn increases participation rates and reduces time to testing for the study. Collecting participant characteristics in a manner that allows for deployment across different studies is not trivial. Advances in machine learning, however, have made characterization of participants more viable.
It is therefore apparent that an urgent need exists for advancements in the sourcing of participants, and especially in the characterization of participant attributes for user experience studies. Such systems and methods allow for modified participant sourcing based upon participant attributes, reducing time to field the study, and reducing study costs.
To achieve the foregoing and in accordance with the present invention, systems and methods for characterization of participant attributes for user experience studies are provided. An intelligent sourcing engine is capable of delivering qualified and scalable numbers of participants for large, complex and multiple parallel user experience studies in a manner not available previously.
The system includes the ability to collect screener question and response pairs and determine the type of question. The question and response pairs may be processed for topic and entity extraction using machine learning (ML) models. From the collected topics and entities, a dictionary of attributes may be generated and eventually expanded/added to as new information regarding the participant becomes available. This attribute dictionary may take the form of a vector dictionary, in some particular embodiments.
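By way of a hedged illustration only, such a per-participant vector dictionary might be sketched as follows; the category names and vector encodings shown are assumptions for illustration, not the actual storage format:

```python
from typing import Dict, List

# A participant's attribute dictionary: each category name maps to a
# fixed-length vector encoding the attributes extracted for that category.
AttributeDictionary = Dict[str, List[float]]

def update_attribute(profile: AttributeDictionary,
                     category: str, vector: List[float]) -> None:
    """Add or overwrite the vector for one attribute category as new
    information about the participant becomes available."""
    profile[category] = vector

participant: AttributeDictionary = {}
update_attribute(participant, "finances", [0.8, 0.1, 0.0])    # assumed encoding
update_attribute(participant, "technology", [1.0, 0.0, 1.0])  # assumed encoding
```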
In some cases, the type of question being posed may dictate how the response is processed. There are four possible question types in some embodiments. These include a Boolean style question, a quantitative question, a single response question and a multi-response type question. In some embodiments, the collection of the response data for a Boolean style question is merely a collection of the binary (or more) response selection. In contrast, a quantitative type of response may be subjected to an entity extraction. A single response will undergo a topic and entity extraction, while a multi-response may undergo a multi-topic extraction with corresponding entity extractions. In some cases, general topic and entity models may be employed. In other embodiments, the response entity extraction model may be tuned or selected based upon topic type.
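A minimal sketch of this type-dependent routing follows; the extraction helpers are stand-ins for the ML models described above, not a real library API:

```python
from enum import Enum, auto

class QuestionType(Enum):
    BOOLEAN = auto()
    QUANTITATIVE = auto()
    SINGLE_RESPONSE = auto()
    MULTI_RESPONSE = auto()

def extract_topic(text: str) -> str:
    """Stub standing in for the topic model."""
    return "topic:" + text.lower()[:20]

def extract_entities(text: str) -> list:
    """Stub standing in for the entity model."""
    return text.split()

def process_pair(qtype: QuestionType, response: str) -> dict:
    """Route a response to the extraction dictated by its question type."""
    if qtype is QuestionType.BOOLEAN:
        return {"selection": response}          # record the selection only
    if qtype is QuestionType.QUANTITATIVE:
        return {"entities": extract_entities(response)}
    if qtype is QuestionType.SINGLE_RESPONSE:
        return {"topic": extract_topic(response),
                "entities": extract_entities(response)}
    items = [i.strip() for i in response.split(",")]   # multi-response
    return {"topics": [extract_topic(i) for i in items],
            "entities": [extract_entities(i) for i in items]}
```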
After attribute collection, the system may target particular individuals for the study based upon known or imputed attribute information. Questions may be recommended to the screener based upon templates of “correct” question types. Fulfillment predictions are made using the collected attributes. These include predicting conversion rates, time to field, study duration, and ultimately the time to completion for the study. Lastly, study participants are onboarded and the study is performed.
Note that the various features of the present invention described above may be practiced alone or in combination. These and other features of the present invention will be described in more detail below in the detailed description of the invention and in conjunction with the following figures.
In order that the present invention may be more clearly ascertained, some embodiments will now be described, by way of example, with reference to the accompanying drawings, in which:
The present invention will now be described in detail with reference to several embodiments thereof as illustrated in the accompanying drawings. In the following description, numerous specific details are set forth in order to provide a thorough understanding of embodiments of the present invention. It will be apparent, however, to one skilled in the art, that embodiments may be practiced without some or all of these specific details. In other instances, well known process steps and/or structures have not been described in detail in order to not unnecessarily obscure the present invention. The features and advantages of embodiments may be better understood with reference to the drawings and discussions that follow.
Aspects, features and advantages of exemplary embodiments of the present invention will become better understood with regard to the following description in connection with the accompanying drawing(s). It should be apparent to those skilled in the art that the described embodiments of the present invention provided herein are illustrative only and not limiting, having been presented by way of example only. All features disclosed in this description may be replaced by alternative features serving the same or similar purpose, unless expressly stated otherwise. Therefore, numerous other embodiments and modifications thereof are contemplated as falling within the scope of the present invention as defined herein and equivalents thereto. Hence, use of absolute and/or sequential terms, such as, for example, “will,” “will not,” “shall,” “shall not,” “must,” “must not,” “first,” “initially,” “next,” “subsequently,” “before,” “after,” “lastly,” and “finally,” is not meant to limit the scope of the present invention as the embodiments disclosed herein are merely exemplary.
The present invention relates to the sourcing of participants for user experience testing and subsequent insight generation. While such systems and methods may be utilized with any user experience environment, embodiments described in greater detail herein are directed to providing participants for user experience studies in an online/webpage environment. Some descriptions of the present systems and methods will also focus nearly exclusively upon the user experience within a retailer's website. This is intentional in order to provide a clear use case and brevity to the disclosure, however it should be noted that the present systems and methods apply equally well to any situation where a user experience in an online platform is being studied. As such, the focus herein on a retail setting is in no way intended to artificially limit the scope of this disclosure.
In the following, it is understood that the term ‘usability’ refers to a metric scoring value for judging the ease of use of a target web site. A ‘client’ refers to a sponsor who initiates and/or finances the usability study. The client may be, for example, a marketing manager who seeks to test the usability of a commercial web site for marketing (selling or advertising) certain products or services. ‘Participants’ may be a selected group of people who participate in the usability study and may be screened based on a predetermined set of questions. ‘Remote usability testing’ or ‘remote usability study’ refers to testing or study in accordance with which participants (using their computers, mobile devices, or otherwise) remotely access a target web site in order to provide feedback about the web site's ease of use, connection speed, and the level of satisfaction the participant experiences in using the web site. ‘Unmoderated usability testing’ refers to communication with test participants without a moderator; e.g., a software, hardware, or combined software/hardware system can automatically gather the participants' feedback and record their responses. The system can test a target web site by asking participants to view the web site, perform test tasks, and answer questions associated with the tasks.
To facilitate the discussion,
Data processing unit 120 includes a browser 122 that enables a user (e.g., usability test participant) using the data processing unit 120 to access target web site 110. Data processing unit 120 includes, in part, an input device such as a keyboard 125 or a mouse 126, and a participant browser 122. In one embodiment, data processing unit 120 may insert a virtual tracking code to target web site 110 in real-time while the target web site is being downloaded to the data processing unit 120. The virtual tracking code may be a proprietary JavaScript code, whereby the run-time data processing unit interprets the code for execution. The tracking code collects participants' activities on the downloaded web page such as the number of clicks, key strokes, keywords, scrolls, time on tasks, and the like over a period of time. Data processing unit 120 simulates the operations performed by the tracking code and is in communication with usability testing system 150 via a communication link 135. Communication link 135 may include a local area network, a metropolitan area network, and a wide area network. Such a communication link may be established through a physical wire or wirelessly. For example, the communication link may be established using an Internet protocol such as the TCP/IP protocol.
Activities of the participants associated with target web site 110 are collected and sent to usability testing system 150 via communication link 135. In one embodiment, data processing unit 120 may instruct a participant to perform predefined tasks on the downloaded web site during a usability test session, in which the participant evaluates the web site based on a series of usability tests. The virtual tracking code (e.g., a proprietary JavaScript) may record the participant's responses (such as the number of mouse clicks) and the time spent in performing the predefined tasks. The usability testing may also include gathering performance data of the target web site such as the ease of use, the connection speed, and the satisfaction of the user experience. Because the web page is modified not on the original web site, but on the downloaded version in the participant data processing unit, usability can be tested on any web site, including competitors' web sites.
Data collected by data processing unit 120 may be sent to the usability testing system 150 via communication link 135. In an embodiment, usability testing system 150 is further accessible by a client via a client browser 170 running on data processing unit 190. Usability testing system 150 is further accessible by user experience researcher browser 180 running on data processing unit 195. Client browser 170 is shown as being in communications with usability testing system 150 via communication link 175. User experience research browser 180 is shown as being in communications with usability testing system 150 via communications link 185. A client and/or user experience researcher may design one or more sets of questionnaires for screening participants and for testing the usability of a web site. Usability testing system 150 is described in detail below.
In one exemplary embodiment, the testing of the target website (page) may provide data such as ease of access through the Internet, its attractiveness, ease of navigation, the speed with which it enables a user to complete a transaction, and the like. In another exemplary embodiment, the testing of the target web site provides data such as duration of usage, the number of keystrokes, the user's profile, and the like. It is understood that testing of a website in accordance with embodiments of the present invention can provide other data and usability metrics. Information collected by the participant's data processing unit is uploaded to usability testing system 150 via communication link 135 for storage and analysis.
Data processing unit 125 may send the collected data to usability testing system 150 via communication link 135′, which may be a local area network, a metropolitan area network, a wide area network, or the like, and which enables usability testing system 150 to establish communication with data processing unit 125 through a physical wire or wirelessly using a packet data protocol such as the TCP/IP protocol or a proprietary communication protocol.
Usability testing system 150 includes a virtual moderator software module running on a virtual moderator server 230 that conducts interactive usability testing with a usability test participant via data processing unit 125 and a research module running on a research server 210 that may be connected to a user research experience data processing unit 195. User experience researcher 181 may create tasks relevant to the usability study of a target web site and provide the created tasks to the research server 210 via a communication link 185. One of the tasks may be a set of questions designed to classify participants into different categories or to prescreen participants. Another task may be, for example, a set of questions to rate the usability of a target web site based on certain metrics such as ease of navigating the web site, connection speed, layout of the web page, ease of finding the products (e.g., the organization of product indexes). Yet another task may be a survey asking participants to press a “yes” or “no” button or write short comments about participants' experiences or familiarity with certain products and their satisfaction with the products. All these tasks can be stored in a study content database 220, which can be retrieved by the virtual moderator module running on virtual moderator server 230 to forward to participants 120. Research module running on research server 210 can also be accessed by a client (e.g., a sponsor of the usability test) 171 who, like user experience researchers 181, can design her own questionnaires since the client has a personal interest in the target website under study. Client 171 can work together with user experience researchers 181 to create tasks for usability testing. In an embodiment, client 171 can modify tasks or lists of questions stored in the study content database 220. In another embodiment, client 171 can add or delete tasks or questionnaires in the study content database 220. In yet another embodiment, client 171 may be user experience researcher 181.
In some embodiments, one of the tasks may be an open or closed card sorting study for optimizing the architecture and layout of the target website. Card sorting is a technique that shows how online users organize content in their own minds. In an open card sort, participants create their own names for the categories. In a closed card sort, participants are provided with a predetermined set of category names. Client 171 and/or user experience researcher 181 can create a proprietary online card sorting tool that executes card sorting exercises over large groups of participants in a rapid and cost-effective manner. In an embodiment, the card sorting exercises may include up to 100 items to sort and up to 12 categories to group. One of the tasks may include categorization criteria such as asking participants questions like “why do you group these items like this?” Research module on research server 210 may combine card sorting exercises and online questionnaire tools for detailed taxonomy analysis. In an embodiment, the card sorting studies are compatible with SPSS applications.
In an embodiment, the card sorting studies can be assigned randomly to participant 120. User experience (UX) researcher 181 and/or client 171 may decide how many of those card sorting studies each participant is required to complete. For example, user experience researcher 181 may create a card sorting study with 12 tasks, group them into 4 groups of 3 tasks, and specify that each participant need only complete one task from each group.
After presenting the thus created tasks to participants 120 through virtual moderator module (running on virtual moderator server 230) and communication link 135, the actions/responses of participants will be collected in a data collecting module running on a data collecting server 260 via a communication link 135′. In an embodiment, communication link 135′ may be a distributed computer network and share the same physical connection as communication link 135. This is, for example, the case where data collecting module 260 is located physically close to virtual moderator module 230, or where they share the usability testing system's processing hardware. In the following description, software modules running on associated hardware platforms will have the same reference numerals as their associated hardware platform. For example, virtual moderator module will be assigned the same reference numeral as the virtual moderator server 230, and likewise the data collecting module will have the same reference numeral as the data collecting server 260.
Data collecting module 260 may include a sample quality control module that screens and validates the received responses, eliminating participants who provide incorrect responses, do not belong to a predetermined profile, or do not qualify for the study. Data collecting module 260 may include a “binning” module that is configured to classify the validated responses and store them into corresponding categories in a behavioral database 270.
Merely as an example, responses may include gathered web site interaction events such as clicks, keywords, URLs, scrolls, time on task, navigation to other web pages, and the like. In one embodiment, virtual moderator server 230 has access to behavioral database 270 and uses the content of the behavioral database to interactively interface with participants 120. Based on data stored in the behavioral database, virtual moderator server 230 may direct participants to other pages of the target web site and further collect their interaction inputs in order to improve the quantity and quality of the collected data and also encourage participants' engagement. In one embodiment, virtual moderator server may eliminate one or more participants based on data collected in the behavioral database. This is the case if the one or more participants provide inputs that fail to meet a predetermined profile.
Usability testing system 150 further includes an analytics module 280 that is configured to provide analytics and reporting to queries coming from client 171 or user experience (UX) researcher 181. In an embodiment, analytics module 280 is running on a dedicated analytics server that offloads data processing tasks from traditional servers. Analytics server 280 is purpose-built for analytics and reporting and can run queries from client 171 and/or user experience researcher 181 much faster (e.g., 100 times faster) than conventional server systems, regardless of the number of clients making queries or the complexity of queries. The purpose-built analytics server 280 is designed for rapid query processing and ad hoc analytics and can deliver higher performance at lower cost, and thus provides a competitive advantage in the field of usability testing and reporting and allows a company such as UserZoom (or Xperience Consulting, SL) to get a jump start on its competitors.
In an embodiment, research module 210, virtual moderator module 230, data collecting module 260, and analytics server 280 are operated in respective dedicated servers to provide higher performance. Client (sponsor) 171 and/or user experience researcher 181 may receive usability test reports by accessing analytics server 280 via respective links 175′ and/or 185′. Analytics server 280 may communicate with a behavioral database via a two-way communication link 272.
In an embodiment, study content database 220 may include a hard disk storage or a disk array that is accessed via iSCSI or Fiber Channel over a storage area network. In an embodiment, the study content is provided to analytics server 280 via a link 222 so that analytics server 280 can retrieve the study content such as task descriptions, question texts, related answer texts, products by category, and the like, and generate together with the content of the behavioral database 270 comprehensive reports to client 171 and/or user experience researcher 181.
Shown in
Referring still to
The following process flow is best understood together with
User interface input devices 412 may include a keyboard, pointing devices such as a mouse, trackball, touchpad, or graphics tablet, a scanner, a barcode scanner, a touch screen incorporated into the display, audio input devices such as voice recognition systems, microphones, and other types of input devices. In general, use of the term input device is intended to include all possible types of devices and ways to input information to processing device. User interface output devices 414 may include a display subsystem, a printer, a fax machine, or non-visual displays such as audio output devices. The display subsystem may be a cathode ray tube (CRT), a flat-panel device such as a liquid crystal display (LCD), or a projection device. In general, use of the term output device is intended to include all possible types of devices and ways to output information from the processing device.
Storage subsystem 406 may be configured to store the basic programming and data constructs that provide the functionality in accordance with embodiments of the present invention. For example, according to one embodiment of the present invention, software modules implementing the functionality of the present invention may be stored in storage subsystem 406. These software modules may be executed by processor(s) 402. Such software modules can include codes configured to access a target web site, codes configured to modify a downloaded copy of the target web site by inserting a tracking code, codes configured to display a list of predefined tasks to a participant, codes configured to gather participant's responses, and codes configured to cause participant to participate in card sorting exercises. Storage subsystem 406 may also include codes configured to transmit participant's responses to a usability testing system.
Memory subsystem 408 may include a number of memories including a main random access memory (RAM) 418 for storage of instructions and data during program execution and a read only memory (ROM) 420 in which fixed instructions are stored. File storage subsystem 410 provides persistent (non-volatile) storage for program and data files, and may include a hard disk drive, a floppy disk drive along with associated removable media, a Compact Disk Read Only Memory (CD-ROM) drive, an optical drive, removable media cartridges, and other like storage media.
Now that systems and methods of usability testing have been described at a high level, attention will be directed to the improved methods and systems employed for the sourcing of participants for these usability studies. As noted, the outcome of these studies is entirely dependent upon having suitable participants. The most advanced UX testing platform is worthless without sufficient numbers of qualified participants to engage in the testing.
The intelligent sourcing engine 520 may communicate with the panel sources 510a-n via the internet or other suitable information transfer network. The intelligent sourcing engine 520 likewise interfaces with a usability testing system 150, or with multiple independent UX experience systems, to receive studies 530a-m. Examples of study 530a-m requesters may include unified testing platforms such as UserZoom, or simpler survey tools such as SurveyMonkey, Qualtrics, or Google Forms questionnaires.
The studies 530a-m include information regarding the study scope, participant requirements, and in some embodiments the price the study is willing to expend upon the participants. Alternatively, the study may be assigned a pricing tier, indicating the level of service contract the study originator has entered into with the usability testing platform.
The participant panel sources 510a-n likewise provide information to the intelligent sourcing engine 520, such as the total available participants on their platform, names or other identifiers for their participants, and collected known attributes for their participants. There are a few attributes that are almost universally collected by panel sources. These include participant gender and age, for example. However, some panel sources may collect additional panelist information beyond these most basic attributes. These other collected data points may include marital status, political affiliation, race, household income, interests, location, home ownership status, dietary restrictions/preferences, education levels, number of people in the household, and the like.
The intelligent sourcing engine 520 consumes the panelist information provided by the panel sources 510a-n and combines it with collected analytics for the potential participants. These potential participants are then initially filtered to exclude historically ineligible participants. The intelligent sourcing engine 520 then performs complex matching of the panel sources to the studies 530a-m based upon participant cost/price, quality, time to field/speed, and availability concerns. This matching step includes considerations for study requirements, be they targetable attributes (known to the system) or non-targetable attributes (attributes which must be estimated for the participant population). The process by which this matching occurs shall be discussed in significant detail further below.
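The following sketch suggests one way such filtering and matching could be composed; the dataclass fields, thresholds, and sort key are assumptions for illustration rather than the engine's actual logic:

```python
from dataclasses import dataclass

@dataclass
class PanelSource:
    name: str
    cost_per_participant: float  # price
    quality_score: float         # 0..1 historical quality
    avg_days_to_fill: float      # speed
    available: int               # availability

def match_sources(sources: list, min_quality: float,
                  max_cost: float, max_days: float) -> list:
    """Drop sources failing the study constraints, then rank the rest
    cheapest-first for invitation allocation."""
    eligible = [s for s in sources
                if s.quality_score >= min_quality
                and s.cost_per_participant <= max_cost
                and s.avg_days_to_fill <= max_days]
    return sorted(eligible, key=lambda s: s.cost_per_participant)
```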
Turning to
Additionally, the intelligent sourcing engine 520 may include a repository of preconfigured business rules 523. These rules may be supplied directly by the study provider, or may be generated automatically based upon the contractual obligations existing between the study provider and the intelligent sourcing engine 520 entity. For example, one study provider may enter into a contract whereby they pay a flat fee for unlimited studies, each designed with under 100 concurrent participants and a guaranteed participant field time of less than 30 days. The system may extrapolate out the rules as being no more than 100 fielded participants at any time, minimum cost per participant, minimum quality threshold, and fill rate/speed of participant sourcing less than 30 days. The system will therefore source participants that are above the needed quality threshold at the lowest price possible to meet the 30 day commitment. If more than 100 participants are needed, to the degree allowed by the 30 day commitment, the system will throttle participant sourcing to maintain a level of less than 100 participants fielded at any given time. If it is not possible to meet the 30 day requirement and the less-than-100 participant cap, then the system will reject the most recent study, and suggest a contract upgrade to a larger participant number.
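A toy feasibility check in the spirit of the extrapolated rules above might look like the following; the per-participant fielding duration is an assumed figure used only to bound throughput:

```python
def vet_study_request(requested: int, concurrent_cap: int = 100,
                      committed_days: int = 30,
                      days_per_participant: int = 10) -> str:
    """Apply flat-fee contract rules: cap concurrent participants and
    verify the request fits the guaranteed field time."""
    batches = max(committed_days // days_per_participant, 1)
    max_fillable = concurrent_cap * batches   # throttled batches in the window
    if requested > max_fillable:
        return "reject: suggest a contract upgrade to a larger participant tier"
    if requested > concurrent_cap:
        return "accept: throttle sourcing to stay under the concurrent cap"
    return "accept: source lowest-cost participants above the quality threshold"

print(vet_study_request(250))  # fits in 3 throttled batches of 100
```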
As can be seen, the preconfigured business rules 523 have a significant impact upon how the system sources the participants, the speed of participant sourcing, and which criteria may exclude possible participant sub-populations. This rule data 523, along with the study data 522 defining the study parameters, is supplied to a study query and estimation server 521. This server 521 uses the constraints to determine which populations of participants are likely available given the source and panelist database 524 information regarding the numbers and types of participants available. The initial raw data in the source and panelist database 524 is collected from the panel sources 510a-n. This includes the number and unique identifier information for their potential participants, as well as any collected attribute information for them. The system over time is capable of augmenting this dataset with recorded quality metrics for participants, the likelihood of them engaging with specific studies, discovered attributes, and imputed attributes. Discovered attributes include attributes for which the participant provides direct feedback, whereas imputed attributes are predictions of attributes based upon correlation models. These correlation models may be rule driven, or may be generated using known machine learning techniques. An example of an imputed attribute such models may generate is that individuals who are known to have an income above $175,000 (known attribute) are likely to be consumers of luxury goods (imputed attribute).
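The income-to-luxury-goods example above can be captured as a single rule-driven imputation; the threshold and attribute names below simply restate that example and are not exhaustive:

```python
def impute_attributes(known: dict) -> dict:
    """Rule-driven imputation: derive likely attributes from known ones."""
    imputed = {}
    if known.get("household_income", 0) > 175_000:
        # Known high income -> likely consumer of luxury goods.
        imputed["luxury_goods_consumer"] = True
    return imputed

print(impute_attributes({"household_income": 200_000}))
# {'luxury_goods_consumer': True}
```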
In addition to determining the sample availability, the study query and estimation server 521 is likewise tasked with determining the pricing and estimated time in field. As noted before, sometimes these criteria are predetermined by a service level contract. In such flat-fee structures the system defaults to the lowest price possible that delivers the other required criteria. However, when one or more of these criteria are not dictated by the business rules, the study query and estimation server 521 can generate the expected cost and/or speed of the participant sourcing based upon the known source data.
In some situations the study query and estimation server 521 will determine that a study, as proposed, is not commercially feasible. In such situations the study query and estimation server 521 may flag the study request with an error and propose alternate study requirements. For example, the cost, speed, quality and number/availability of individuals are interrelated. For a given quality threshold, the speed, cost and number can be modeled as a topographical surface chart. If a study client wants to increase the speed of participant sourcing, either the number must be reduced, the cost increased, or some combination of the two. Very fast and large study groups will be very expensive to field.
An example of such a surface graph is provided at 1600 of
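As a toy illustration of the interrelation such a surface captures (the functional form and constants here are assumptions, not the actual pricing model):

```python
def sourcing_cost(participants: int, days_in_field: float,
                  base_rate: float = 5.0, urgency_k: float = 30.0) -> float:
    """Toy cost surface: cost grows with participant count and with the
    urgency premium that rises as the field window shrinks."""
    return participants * base_rate * (1.0 + urgency_k / days_in_field)

print(sourcing_cost(100, 30))  # 1000.0: a month in the field
print(sourcing_cost(100, 15))  # 1500.0: halving the window raises cost
```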
As noted, for some study criteria, it may simply be impossible (commercially or physically) to meet the required participant sourcing. In the case of a physical impossibility, the system will respond with a simple error and a request for the criteria to be adjusted. Going back to the above example, if a study author wants to survey 10,000 participants with the aforementioned computer programming experience, in two weeks, it is likely not physically possible to source that study, regardless of the price the study author is willing to pay. In the case of a commercial impossibility, the system will still throw an error, but will also propose an adjustment that enables the study to move forward. For example, assume the study author wants 100 computer programmers to engage in a two week study, but is on a basic flat-fee service contract. To fulfill the participant request, the query and estimation server 521 determines that the cost of such a study is well outside of the threshold cost assumed for this basic service contract. The system may then propose that the study author either extend the study length by three additional weeks, or upgrade their service contract to a premium level (thereby allowing for higher priced participants to be sourced).
Returning to
After filtering, a supply estimator 573 uses the study criteria to determine the likelihood that any one supplier can provide the needed number of participants.
For unknown targetable attributes, the targetable attribute predictor 581 may use statistical techniques to determine the number of participants in the supply that, to a certain confidence level, have the attribute. The targetable attribute predictor 581 will map the supply population to the most granular population for which data is available, and extrapolate the attribute prevalence within the supply population.
For example, assume the targetable attribute of interest is whether participants are parents. Demographic information about birthrates and family status by age is known for state level geographic areas. A panel supply 510 based in the western United States, consisting of participants predominantly between 20-30 years old, can have the prevalence of being a parent estimated by using this state and age demographic data. In this example, parental rates for this age bracket are below the general population level. Furthermore, for the states at issue, the trends are even lower. This mapping of the supply population to the most granular populations for which the attribute is known allows the targetable attribute predictor 581 to more accurately determine the number of individuals in the supply populations that meet the targetable criteria.
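A sketch of this demographic mapping follows; the cells and rates are invented placeholders, not census figures:

```python
def estimate_prevalence(supply_demographics: dict, attribute_rates: dict) -> float:
    """Map the supply population onto the most granular demographic cells
    for which attribute data exists and extrapolate expected prevalence."""
    return sum(count * attribute_rates.get(cell, 0.0)
               for cell, count in supply_demographics.items())

supply = {("WA", "20-30"): 400, ("OR", "20-30"): 200}          # panelists per cell
parent_rates = {("WA", "20-30"): 0.22, ("OR", "20-30"): 0.20}  # assumed rates
print(estimate_prevalence(supply, parent_rates))  # 128.0 expected parents
```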
In a similar vein, the non-targetable attribute estimator 582 generates estimates for non-targetable attributes that are desired for the study in the supply populations. Non-targetable attributes are more ephemeral than targetable attributes. These are attributes that change (such as the participant having an ailment like the flu) or attributes that are obscure and would not be commonly collected (such as how many 18th century French novels the individual owns, for example). Non-targetable attributes must be entirely estimated based upon incidence of the attribute in a given population (in much the same manner as targetable attribute estimations), but this is often not possible, as even in the aggregate there is little information available regarding the prevalence of these attributes. As such, the system generally begins a small-scale sampling of the various populations, subjecting these sampled individuals to questions to determine the frequency of the non-targetable attribute. Once statistically sufficient (e.g., seventy-fifth, eighty-fifth, ninetieth or ninety-fifth percentile confidence) data has been collected, the estimate for the prevalence of the non-targetable attribute may be determined for the given supply. The statistical methodologies for sampling, and for determining frequency within a larger population to a given confidence level, are known in the field of statistical analysis, and as such will not be discussed in exhaustive detail for the sake of brevity.
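For completeness, a minimal normal-approximation interval of the kind alluded to might be computed as follows (a standard statistical identity, not a system-specific method):

```python
import math

def prevalence_interval(hits: int, sampled: int, z: float = 1.96) -> tuple:
    """95% (z=1.96) confidence interval for attribute prevalence
    estimated from a small-scale sample of the population."""
    p = hits / sampled
    half_width = z * math.sqrt(p * (1 - p) / sampled)
    return (p - half_width, p + half_width)

# e.g., 12 of 80 sampled panelists report the obscure attribute:
print(prevalence_interval(12, 80))  # roughly (0.072, 0.228)
```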
After the supply populations have thus been winnowed down to the total numbers of participants that likely meet the study criteria, an invite number calculator 583 is capable of determining how many individuals from each panel supplier 510a-n could conceivably be extended an invitation to join the study. This determination is based upon past sign-up frequency for the given panel supplier, compared against time in field/speed requirements, and adjusted for macro-factors that may impact study participation.
For example, assume it is found that supplier A has 250 members that meet the study criteria, and supplier B has 150 members that are expected to meet the criteria. In the past, of the eligible individuals in supplier A, generally 30% join an offered study within a two week period. For supplier B, it is found that 50% of the members join after two weeks. Thus, if a study were to be completed within that two week period, supplier A and supplier B could each be counted upon for 75 participants. However, assume that this study is occurring over the Christmas and New Year holidays. Historically, participation rates drop dramatically during this time period, for the sake of this example by two thirds. Thus, for the given study, it is likely that each of these suppliers is only able to provide 25 participants.
In the above manner the invite number calculator 583 determines the capacities the panel sources 510a-n are realistically able to provide for a given study. This process has been simplified here; additional metrics, such as the number of participants involved in alternate studies, the closeness of attributes between these concurrent studies, and participant fatigue factors may likewise be included in the supply estimations. In particular, multiple overlapping studies may drain the availability of participants. This is especially true for studies for which the participant attributes overlap. Clustering algorithms, or least mean squares functions, may be utilized to define the degree of attribute overlap. This value can be used to weight (via a multiplication function) against study size to determine a factor of interference. This factor may be scaled based upon prior experience of the reduction in participation rates when multiple overlapping studies occur, and is used to reduce the estimated participant number (either by subtracting an absolute number of “tied up” participants, or via a weighting/multiplication of the estimated participant numbers by the scaled factor). Likewise, the raw number of participants (or numbers modified by closeness of attributes, as previously discussed) that occurred in the two, four or six weeks prior to the present study may be used to determine a “fatigue” reduction in participants. A few individuals will enjoy and endeavor to engage in one study after another. However, many individuals tire of responding to studies, and will throttle engagement in a cyclical manner. This fatigue factor may likewise be used to adjust the expected number of participants available, in some select embodiments.
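The worked example above reduces to a small calculation; the factor values merely restate the holiday scenario, and the fatigue adjustment is shown as an assumed multiplier:

```python
def expected_participants(eligible: int, join_rate: float,
                          seasonal_factor: float = 1.0,
                          fatigue_factor: float = 1.0) -> int:
    """Expected yield from one supplier: historical join rate scaled by
    seasonal (e.g., holiday) and fatigue/interference factors."""
    return round(eligible * join_rate * seasonal_factor * fatigue_factor)

# Holidays cut participation by two thirds, per the example above:
print(expected_participants(250, 0.30, seasonal_factor=1/3))  # 25 (supplier A)
print(expected_participants(150, 0.50, seasonal_factor=1/3))  # 25 (supplier B)
```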
Returning to
Regardless of the metrics relied upon to collect quality measures, even when insufficient data has been collected for any one participant, if the supplier as a whole is shown to have a quality issue, failing the quality cutoff threshold, that supplier may be entirely discounted from the offer extension process.
Generally, after the threshold quality issue is determined, the offer extender 574 ranks the suppliers by price, and allocates the participant invitations to the suppliers in ascending order of their respective price/cost. For example, suppose supplier A in our earlier example has 25 available participants, as was determined, each costing $5 to engage. Supplier B was also determined to have 25 available participants; however, supplier B costs $7 per test participant. For a study requiring 40 participants, supplier A would be extended 25 invitations, and supplier B only 15 invitations.
However, when two suppliers are substantially similar in cost, the system may alternatively determine the invite allocation by looking at the relative capacity of the various sources, and leveling the load imposed upon any given supplier. The load leveler 572 performs this tracking of participant demands being placed on any given panel supplier 510a-n and makes load leveling determinations by comparing these demands against the total participants available from each supplier. For the purposes of this activity, “substantially similar in cost” may mean less than five, ten, or fifteen percent deviation in cost, depending upon the embodiment.
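One plausible sketch combining the price-ranked allocation with the load-leveling tie-break follows; the grouping rule and proportional split are illustrative simplifications, not the patented algorithm:

```python
def allocate_invitations(suppliers: list, needed: int,
                         similar_pct: float = 0.10) -> dict:
    """Allocate invitations cheapest-first; suppliers within ~10% on cost
    ("substantially similar") share the allocation in proportion to
    capacity, leveling the load between them."""
    suppliers = sorted(suppliers, key=lambda s: s["cost"])
    invites = {s["name"]: 0 for s in suppliers}
    i = 0
    while needed > 0 and i < len(suppliers):
        group = [suppliers[i]]
        while (i + len(group) < len(suppliers) and
               suppliers[i + len(group)]["cost"]
               <= group[0]["cost"] * (1 + similar_pct)):
            group.append(suppliers[i + len(group)])
        capacity = sum(s["capacity"] for s in group)
        take = min(needed, capacity)
        for s in group:  # proportional split levels load within the group
            invites[s["name"]] = min(round(take * s["capacity"] / capacity),
                                     s["capacity"])
        needed -= take
        i += len(group)
    return invites

# The $5/$7 example above: the cheaper supplier is exhausted first.
print(allocate_invitations(
    [{"name": "A", "cost": 5, "capacity": 25},
     {"name": "B", "cost": 7, "capacity": 25}], needed=40))
# {'A': 25, 'B': 15}
```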
After invitations to join the study are sent to one or more of the panel suppliers 510a-n, the rate of acceptance can be monitored, and the number of invitations sent modified by a supply throttle 575. For example, if a lower cost supplier ends up filling participants much faster than anticipated, then it is likely the estimates for the available participants were incorrect, and the total number of invitations to this supplier can be increased while the number for a higher cost supplier is ratcheted back. Additionally, it may be beneficial to batch release invitations to the suppliers in order to spread out study engagement. This allows the study systems to reduce spikes in computational demand; further, by extending study time to the limits of the service agreement with a client, the costs to the study provider can be more readily managed. Moreover, initial study results oftentimes lead to changes in the study questions or objectives in order to explore specific insights more fully. By extending study invitation release, the throttle 575 allows time for such study updates to occur.
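The throttle's core adjustment can be hinted at in a few lines; the linear scaling is an assumption, chosen only to show the feedback loop:

```python
def throttle_invites(planned: int, expected_join_rate: float,
                     observed_join_rate: float) -> int:
    """Scale the remaining invitation batch when observed joins diverge
    from the estimate, slowing or speeding the release accordingly."""
    if observed_join_rate <= 0:
        return planned
    return round(planned * expected_join_rate / observed_join_rate)

# Joins arriving at twice the expected rate -> halve the remaining batch:
print(throttle_invites(100, 0.30, 0.60))  # 50
```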
Returning to
Now that the systems for intelligent participant sourcing have been described in detail, attention will be turned to example processes and methods executed by these systems. For example,
After initialization in this manner, returning to
Subsequently, the selection of the participants is performed (at 930).
If a single source has the capacity to meet a study's demands, and the source is substantially the lowest price provider, then all participants can be invited from that single source (at 1150). Often however, no single source can meet the participant demands, or the sources that can are more expensive than other available sources. In this case, the sources are ranked by price (at 1130). The participants are then sourced from this price ranked listing of suppliers responsive to the speed requirements, and where the pricing and speed are substantially comparable, based upon load leveling between suppliers (at 1140) as previously discussed.
Regardless of whether the participants are sourced from a single provider, or multiple providers, the system subsequently monitors the participant join rates (at 1160), as well as collected information regarding the participants. This collected information may be leveraged to update the participant and source database, and the join rates are utilized to throttle or speed up invitation rates if they differ from expected participant join rates (at 1170).
Returning to
Returning to
Next attention will be directed to an example process for participant sourcing pricing determination. This pricing determination may operate in parallel with the above described participant sourcing. As noted before, in some cases the study authors have entered into a service agreement whereby a subscription style fee is charged to the client by the intelligent sourcing engine entity for a particular level of service. Having more participants, higher quality participants, or faster in-the-field time may require the client to upgrade to higher tier service agreements, as has been already discussed in some detail. However, in alternative embodiments, it may be desirable to have a “pay as you go” style participant sourcing. In such situations the client/study author provides desired quality, speed, and participant numbers desired, and the system performs a pricing calculation for delivering the required participant pool.
The next step in the process is to estimate the pool size available for the given study (at 1440).
After reducing the pool of possible participants by targetable attributes, a similar process may be performed based upon an estimate of the prevalence of non-targetable attributes (at 1530). As noted before, non-targetable attributes are typically extremely obscure or ephemeral, and thus cannot generally be estimated based upon demographics or correlations to other attributes. Instead, prevalence data must be acquired by sampling the participant pool, as is known in the art of statistical analysis. After the pool has thus been further narrowed, an error adjustment may be applied to the pool size based upon the confidence levels of the estimations (at 1540). For example, if the panel source is able to provide data on the number of participants, and attribute data such that no estimations are required, the total number of available participants is fairly assured, and little or no error adjustment is required. However, if the population is determined based upon estimations of targetable attributes where the correlations are weak, and the demographic frequency data is coarse, then the estimate of the population size may be subject to more error. In such a case, based upon the desired business risk, an error adjustment may be applied to artificially reduce the population size. A smaller population will cause the price per participant to rise. As such, the error adjustment causes the overall price to increase, reducing the competitiveness of the final pricing, but conversely building in more pricing “cushion” against incorrect estimates of the populations.
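The successive narrowing and error adjustment can be summarized numerically; the linear confidence penalty below is an assumed form of the adjustment, not the system's actual formula:

```python
def adjusted_pool(pool: int, targetable_rate: float,
                  non_targetable_rate: float,
                  estimate_confidence: float) -> int:
    """Narrow the raw pool by estimated attribute prevalence, then shrink
    it further when estimates are weak, building in pricing cushion."""
    estimated = pool * targetable_rate * non_targetable_rate
    return round(estimated * estimate_confidence)

# 10,000 panelists, 20% with the targetable attribute, 5% with the
# non-targetable one, and only 80% confidence in those estimates:
print(adjusted_pool(10_000, 0.20, 0.05, 0.80))  # 80
```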
Returning to
By applying the required time-to-field criteria and the number of participants desired, the system can generate the requisite price (at 1460) to fulfill the participant sourcing needs of the usability study. As noted before, in some situations the study requirements may simply be impossible to meet. This is especially true if the attributes required of the participants are rare or specialized, or during high demand time periods. In such circumstances, a price may be generated for an altered set of study conditions (e.g., a lower participant number, or a longer length of time to field), and this alternative study may be presented to the study author for approval, with an explanation of why their prior study design was not possible.
Moving on to the attribute characterization of study participants,
Panelist attributes can be unlimited in variety. However, in some particular embodiments, the attributes are restricted to the following categories: finances, pets, health, family, education, vehicles, real estate (living situation), technology, career, and shopping channels. When limited to a known set of categories, the attributes for any given individual may be stored as a dictionary of vectors. In some cases, certain fields may be considered “sensitive” (such as a health category) due to the need to comply with privacy laws. Coding of the vector may be deidentified, encrypted, or otherwise protected in order to ensure compliance with such privacy regulations.
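One way such protected coding could be sketched is with a one-way hash over values in sensitive categories; the category set and the choice of hashing (rather than, say, encryption) are assumptions for illustration:

```python
import hashlib

SENSITIVE_CATEGORIES = {"health"}  # assumed sensitive set

def encode_attribute(category: str, value: str) -> str:
    """Deidentify values in sensitive categories before storage so the
    stored vector dictionary complies with privacy regulations."""
    if category in SENSITIVE_CATEGORIES:
        return hashlib.sha256(value.encode("utf-8")).hexdigest()
    return value

print(encode_attribute("pets", "two dogs"))      # stored in the clear
print(encode_attribute("health", "has asthma"))  # stored as a digest
```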
Turning to
In example
In some embodiments, each attribute analysis module 1815 includes a topic and entity analyzer 1816 for the question and an entity and topic analyzer 1817 for the response. The output of each attribute analysis module 1815 is then provided to a decoder 1860 for the determination of the attribute.
In
The quantitative module 1830 includes a second topic model and entity model (collectively 1832) that analyzes the question. An example of a quantitative question would include the following: “Your salary is how many dollars per year?” In some limited embodiments, the second topic model and entity model may be the same as the first topic model and entity model. In alternative embodiments, these models may be tailored based upon the question type determined before. Further, the quantitative module 1830 includes a response entity model 1834. This entity model may be the same as the question entity model or may be selected from a plurality of entity models. In some cases, the response entity model is selected based upon the output of the question topic model. Outputs from the various models are then provided to the attribute decoder 1860 for conversion to a vector value which is saved (or updated) for the given panelist.
The single response module 1840 includes a third topic model and entity model (collectively 1842) that analyzes the question. An example of a single response question would include the following: “What is your favorite shopping brand?” In some limited embodiments, the third topic model and entity model may be the same as the first and/or second topic models and entity models. In alternate embodiments, these models may be tailored based upon the question type determined before. Further, the single response module 1840 includes a response topic and entity model (collectively 1844). These entity and topic models may be the same as the question entity and topic model or may be selected from a plurality of entity and/or topic models. In some cases, the response entity model is selected based upon the output of the question topic model. Outputs from the various models are then provided to the attribute decoder 1860 for conversion to a vector value which is saved (or updated) for the given panelist.
The multi-response module 1850 includes a fourth topic model and entity model (collectively 1852) that analyzes the question. An example of a multi-response question would include the following: “List the financial institutions you bank with?” In some limited embodiments, the fourth topic model and entity model may be the same as the first, second and/or third topic models and entity models. In alternative embodiments, these models may be tailored based upon the question type determined before. Further, the multi-response module 1850 includes a response topics and entity model (collectively 1854). These entity and topic models may be selected from a plurality of entity and/or topics models. In some cases, the response entity model is selected based upon the output of the question topic model. Outputs from the various models are then provided to the attribute decoder 1860 for conversion to a vector value which is saved (or updated) for the given panelist.
The questions for screening participants are generally created by the study authors based upon the desires of the given study. For example, if a brand caters to wealthy men between 25-45 years old, it may be desirable to screen for these attributes for the given study. After the data is collected, however, the data can further be stored for the given participants. This generates a data-rich set of information regarding participant attributes that may be drawn upon in later studies to increase study efficiency and speed.
After the screener questions are deployed to the participants, the system collects the question and response sets for each screened participant (at 1920). The questions are then analyzed to determine type, and they are routed to the appropriate analytical module (either physical or logical), for processing (at 1930). Each question and response set are then processed, as seen in
In this example process, a series of determinations are made as to what type of question is present in the question/response pairings. Initially, a decision is made whether the question is a Boolean type (at 2005). If so, the process continues with processing the question/response pairings as Boolean pairs (at 2010). If the question is not a Boolean type, the system may determine if the question is quantitative in nature (at 2015). If so, the question/response pairs are processed as quantitative types (at 2020). If the question is not quantitative, the system makes a determination if the question is a single response type question (at 2025). If so, the question/response pairs are processed as single responses (at 2030). However, if the question is not a single response, the system processes the response as a multi-response type question/response pairing (at 2040). Again, the questions may be annotated to help determine what type of question is being asked, or may be classified by an ML classifier.
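Where annotation is unavailable, the ML classification mentioned above could resemble the following off-the-shelf text classifier; three of the four training questions echo the examples given in this disclosure, the Boolean one is invented, and a production system would of course train on a far larger annotated corpus:

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

questions = [
    "Do you own a car?",                              # boolean (invented)
    "Your salary is how many dollars per year?",      # quantitative
    "What is your favorite shopping brand?",          # single response
    "List the financial institutions you bank with",  # multi-response
]
labels = ["boolean", "quantitative", "single", "multi"]

clf = make_pipeline(TfidfVectorizer(), LogisticRegression(max_iter=1000))
clf.fit(questions, labels)
print(clf.predict(["Do you have any pets?"]))  # likely ['boolean']
```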
In contrast, the quantitative type processing, as seen in greater detail in relation to
Single response type processing, as seen in relation to
As with the Boolean processing, the selection of models may be dependent upon the type of processing, and may be impacted by the topic modeling. For example, a model may be extremely proficient at entity extraction when it relates to technology, but less accurate when the topic is finance. As such, based upon the topic of the question, the entity model may be selected (or its weights changed) to ensure optimal classification accuracy.
In some specific embodiments, the topics may include age, gender, accounts, memberships, activities, hobbies, apps, websites, automotive, college, company, country, residence, estate, security, education, employment, engineering, finance, gaming, health, home, income, insurance, household, language, mobile, occupation, food, payment, pets, photography, politics, reading, relationships, shopping, smoking, software, taxes, technology, transportation, and travel. In other embodiments, the topic model(s) may include additional or different topic types.
Named entity recognition, in contrast, identifies quantity responses, time window references, organizations and the like. For example, in some particular embodiments, the named entities may include any of people, nationalities, religions, political groups, buildings, airports, destinations, companies, agencies, institutions, countries, cities, states, bodies of water, mountains, other locations, objects, vehicles, foods, named hurricanes, battles, wars, sports events, weather events, other events, products, works of art, books, songs, legal documents, languages, dates (absolute and relative), times, percentages, quantities, weights, distances, money, order of things, and other numerical values. In some cases, named entity recognition modeling is the most complex and varied of the modeling being performed. As such, a wider variety of usable models may be employed in some embodiments. Different named entity recognition models may perform better than others based upon topic or entity type. As such, using feedback from the topic models may be useful in ensuring higher accuracy of the downstream models leveraged.
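As a hedged sketch, an off-the-shelf recognizer such as spaCy's (one of many usable models, not necessarily the one employed here) covers several of the entity classes listed above; this assumes the en_core_web_sm model has been installed separately:

```python
import spacy

# Requires: pip install spacy && python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")
doc = nlp("I bank with Wells Fargo and earn $85,000 per year in Seattle.")
for ent in doc.ents:
    # Typically yields e.g. Wells Fargo/ORG, $85,000/MONEY, Seattle/GPE
    print(ent.text, ent.label_)
```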
Returning to
Lastly, the process may field participants for a study in an intelligent manner (at 1970). Here “intelligent” means optimized to select and field participants that are most likely to meet the study requirements, most likely to complete the study, and overall optimized for study efficiency and timing.
Here there is initially a screener fraud detection step using the attributes (at 2510). In some cases, partial vector space coordinates or distances may be statistically indicative of a fraudulent participant. In other situations, inconsistencies in answers may flag the user as fraudulent. For example, a user who has answered that their occupation is a doctor, teacher, engineer and cashier suggests that the user is providing randomized answers and/or answers he or she thinks will be accepted for the study, rather than accurate responses. Additionally, the system may be able to scrape public sources, such as LinkedIn, to independently validate the authenticity of a user. Moreover, as AI “deep fake” systems are more frequently employed, the system may further be able to visit historical/archived versions of these public sources from before such ‘deep fake’ profiles were feasible. For example, a 40 year old participant should be expected to have an online presence that spans back nearly twenty years. By analyzing historical data sources, it is possible to more accurately uncover such fraudulent profiles. These individuals may be isolated or otherwise excluded from the study. Subsequently, the participants may be filtered by attributes (at 2520). The studies include mandatory and optional-but-preferred qualities for the participants. The participants' attributes may be compared against these requirements to filter the participant pool to only suitable participants. A conversion prediction is then performed (at 2530). Different attribute vector profiles may yield very different conversion rates for participants. For example, middle aged affluent participants may have a lower conversion rate than younger and less financially established individuals. By collecting a large variety of attributes for the individuals, very accurate conversion predictions may be established. Further, if the participant is a “regular” participant, the actual conversion rate for said individual may be calculated and used to override the predicted conversion rate.
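The answer-inconsistency check from the occupation example can be sketched directly; the threshold structure is an assumption for illustration:

```python
def flag_inconsistent(answers: dict, max_distinct: dict) -> bool:
    """Flag a participant whose repeated answers to one attribute are
    implausibly varied (e.g., four different occupations)."""
    return any(len(values) > max_distinct.get(field, 1)
               for field, values in answers.items())

suspect = {"occupation": {"doctor", "teacher", "engineer", "cashier"}}
print(flag_inconsistent(suspect, {"occupation": 1}))  # True -> isolate/exclude
```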
The participant providers may then be prioritized (at 2540) based upon their ability to supply sufficient participants in the required time period at a given budget. These calculations rely upon the contracted time to field the study and the expected conversion rates of the participants belonging to the provider. The calculations may also be based upon the customer's prioritization of quality versus time of fulfillment.

Individual participants within a provider's pool may then be targeted for the study based upon their known attributes (at 2550). This targeting may be based upon the previously identified attributes, or upon the imputed likelihood of an attribute. For example, the distance between a known attribute and an unknown desired attribute may be calculated using an ontology or other distance function. For instance, the user having an iPhone may be a known attribute, while having an Apple Watch may be the desired attribute. The iPhone and Apple Watch may be closely related to one another in a distance model; conversely, the distance between an Android phone and an Apple Watch may be greater. As such, if the system knows the types of phones the users have, the system may target iPhone users over Android users if the study is looking for participants who have Apple Watches.
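The attribute-distance targeting can be sketched, by way of illustration only, as a hop count over a toy ontology; the edges below are invented for the example and any ontology, embedding space, or other distance function could be substituted.

```python
# Sketch of attribute-distance targeting over an assumed toy ontology.
from collections import deque

ONTOLOGY = {  # undirected edges between related attributes (hypothetical)
    "iphone": {"apple_watch", "smartphone"},
    "apple_watch": {"iphone"},
    "android_phone": {"smartphone"},
    "smartphone": {"iphone", "android_phone"},
}

def distance(a: str, b: str) -> float:
    """Breadth-first hop count between two attributes (inf if unrelated)."""
    seen, frontier = {a}, deque([(a, 0)])
    while frontier:
        node, d = frontier.popleft()
        if node == b:
            return d
        for nxt in ONTOLOGY.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, d + 1))
    return float("inf")

# iPhone owners sit one hop from the desired attribute; Android owners, three.
print(distance("iphone", "apple_watch"))         # 1
print(distance("android_phone", "apple_watch"))  # 3
```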
The system may then also provide recommendations to the study for questions to further identify needed attribute information (at 2560). The system can recommend extra screener questions depending upon which types of audiences the client has used before on similar studies, and/or suggest other screener questions based upon a description of the needed audience.
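One possible sketch of this recommendation step is a nearest-neighbor lookup over past audience descriptions; the corpus, the example questions, and the TF-IDF similarity measure are assumptions chosen for brevity.

```python
# Sketch of step 2560: past studies with similar audience descriptions
# contribute their extra screener questions. All data here is hypothetical.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

PAST_STUDIES = [
    ("US iPhone owners who shop online weekly",
     ["Do you own an Apple Watch?"]),
    ("frequent business travelers with airline status",
     ["Which airline do you fly most often?"]),
]

def recommend_questions(audience: str, top_k: int = 1) -> list[str]:
    """Return screener questions from the top_k most similar past studies."""
    descriptions = [d for d, _ in PAST_STUDIES]
    vec = TfidfVectorizer().fit([audience] + descriptions)
    scores = cosine_similarity(vec.transform([audience]),
                               vec.transform(descriptions))[0]
    ranked = sorted(zip(scores, PAST_STUDIES), key=lambda t: -t[0])[:top_k]
    return [q for _, (_, questions) in ranked for q in questions]

print(recommend_questions("Apple device owners who buy goods online"))
```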
The system may then generate fulfillment predictions based upon the attributes (at 2570). These predictions include determining the likelihood of conversion of a given individual, the expected time to completion for the study, and an estimate of the time to field. The rarity of the desired attributes, compared against the number of participants known to possess such attributes, may be consumed by ML models that are tuned to predict the conversion rates of the individuals. Knowledge of the conversion rates, combined with the number of invitations extended for the study, informs the time-to-field period. Knowledge of the study scope, compared against similarly scoped historical studies, helps inform the prediction of the time to completion for the study.

Lastly, the participants are screened and onboarded (at 2580), as discussed in greater detail previously. The participants may be screened when onboarding, but oftentimes they are screened at the beginning of the test, when the audience criteria are applied. Another option is to keep screening regularly, following a specific strategy to keep the participants' information up to date.
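Returning to the fulfillment predictions of step 2570, the relationship between conversion rates, invitations, and time to field reduces to simple expectation arithmetic, sketched below; the rates and the steady hourly invitation rate are invented inputs standing in for the ML model outputs described above.

```python
# Back-of-the-envelope sketch of step 2570 under assumed inputs.
def expected_completes(conversion_rates: list[float]) -> float:
    """Sum of per-invitee conversion probabilities (known or predicted)."""
    return sum(conversion_rates)

def time_to_field(needed: int, avg_conversion: float,
                  invites_per_hour: float) -> float:
    """Hours until `needed` completes, assuming a steady invitation rate."""
    return needed / (avg_conversion * invites_per_hour)

rates = [0.12, 0.40, 0.08, 0.25]     # predicted; 0.40 is a known "regular"
print(expected_completes(rates))     # 0.85 expected completes
print(time_to_field(50, 0.15, 200))  # ~1.67 hours under these assumptions
```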
Some portions of the above detailed description may be presented in terms of algorithms and symbolic representations of operations on data bits within a computer memory. These algorithmic descriptions and representations are the means used by those skilled in the data processing arts to most effectively convey the substance of their work to others skilled in the art. An algorithm is, here and generally, conceived to be a self-consistent sequence of operations leading to a desired result. The operations are those requiring physical manipulations of physical quantities. Usually, though not necessarily, these quantities take the form of electrical or magnetic signals capable of being stored, transferred, combined, compared, and otherwise manipulated. It has proven convenient at times, principally for reasons of common usage, to refer to these signals as bits, values, elements, symbols, characters, terms, numbers, or the like.
The algorithms and displays presented herein are not inherently related to any particular computer or other apparatus. Various general-purpose systems may be used with programs in accordance with the teachings herein, or it may prove convenient to construct more specialized apparatus to perform the methods of some embodiments. The required structure for a variety of these systems will appear from the description below. In addition, the techniques are not described with reference to any particular programming language, and various embodiments may, thus, be implemented using a variety of programming languages.
In alternative embodiments, the machine operates as a standalone device or may be connected (e.g., networked) to other machines. In a networked deployment, the machine may operate in the capacity of a server or a client machine in a client-server network environment or as a peer machine in a peer-to-peer (or distributed) network environment.
The machine may be a server computer, a client computer, a virtual machine, a personal computer (PC), a tablet PC, a laptop computer, a set-top box (STB), a personal digital assistant (PDA), a cellular telephone, an iPhone, a Blackberry, a processor, a telephone, a web appliance, a network router, switch or bridge, or any machine capable of executing a set of instructions (sequential or otherwise) that specify actions to be taken by that machine.
While the machine-readable medium or machine-readable storage medium is shown in an exemplary embodiment to be a single medium, the terms “machine-readable medium” and “machine-readable storage medium” should be taken to include a single medium or multiple media (e.g., a centralized or distributed database, and/or associated caches and servers) that store the one or more sets of instructions. The terms “machine-readable medium” and “machine-readable storage medium” shall also be taken to include any medium that is capable of storing, encoding or carrying a set of instructions for execution by the machine and that cause the machine to perform any one or more of the methodologies of the presently disclosed technique and innovation.
In general, the routines executed to implement the embodiments of the disclosure may be implemented as part of an operating system or a specific application, component, program, object, module or sequence of instructions referred to as “computer programs.” The computer programs typically comprise one or more instructions set at various times in various memory and storage devices in a computer, and when read and executed by one or more processing units or processors in a computer, cause the computer to perform operations to execute elements involving the various aspects of the disclosure.
Moreover, while embodiments have been described in the context of fully functioning computers and computer systems, those skilled in the art will appreciate that the various embodiments are capable of being distributed as a program product in a variety of forms, and that the disclosure applies equally regardless of the particular type of machine or computer-readable media used to actually effect the distribution.
While this invention has been described in terms of several embodiments, there are alterations, modifications, permutations, and substitute equivalents, which fall within the scope of this invention. Although sub-section titles have been provided to aid in the description of the invention, these titles are merely illustrative and are not intended to limit the scope of the present invention. It should also be noted that there are many alternative ways of implementing the methods and apparatuses of the present invention. It is therefore intended that the following appended claims be interpreted as including all such alterations, modifications, permutations, and substitute equivalents as fall within the true spirit and scope of the present invention.
This application, Attorney Docket No. UZM-2301, entitled “Systems And Methods For Attribute Characterization of Usability Testing Participants,” is a Continuation-In-Part Application and claims the benefit of U.S. application Ser. No. 18/344,538, Attorney Docket No. UZM-1905-C2, entitled “System And Method For An Intelligent Sourcing Engine For Study Participants,” filed on Jun. 29, 2023, by inventor Mestres et al. Application Ser. No. 18/344,538, Attorney Docket No. UZM-1905-C2, is a Continuation Application and claims priority to U.S. application Ser. No. 17/750,283, Attorney Docket No. UZM-1905-C1, filed May 20, 2022, entitled “System And Method For An Intelligent Sourcing Engine For Study Participants,” now U.S. Pat. No. 11,704,705, issued Jul. 18, 2023. Application Ser. No. 17/750,283, Attorney Docket No. UZM-1905-C1, is a Continuation Application and claims priority to U.S. application Ser. No. 17/063,368, Attorney Docket No. UZM-1905-US, entitled “Systems And Methods For An Intelligent Sourcing Engine For Study Participants,” now U.S. Pat. No. 11,348,148, issued May 31, 2022, which application claims priority to U.S. Provisional Application No. 62/913,142, Attorney Docket No. UZM-1905-P, filed Oct. 9, 2019, of the same title, now expired. All of the above-listed applications/patents are incorporated herein in their entirety by this reference.
Provisional Applications:

Number | Date | Country
---|---|---
62/913,142 | Oct. 2019 | US

Continuations:

Parent Number | Parent Date | Country | Child Number
---|---|---|---
17/750,283 | May 2022 | US | 18/344,538
17/063,368 | Oct. 2020 | US | 17/750,283

Continuation in Parts:

Parent Number | Parent Date | Country | Child Number
---|---|---|---
18/344,538 | Jun. 2023 | US | 18/533,043