This disclosure relates to vestibular testing systems and methods.
The vestibular system of the inner ear enables one to perceive body position and movement. In an effort to assess the integrity of the vestibular system, it is often useful to test its performance. Such tests are often carried out at a vestibular clinic.
Vestibular clinics typically measure reflexive responses like balance or the vestibulo-ocular reflex (VOR) to diagnose a subject's vestibular system. The VOR is one in which the eyes rotate in an attempt to stabilize an image on the retina. Because the magnitude and direction of the eye rotation depend on the signal provided by the vestibular system, observations of eye rotation provide a basis for inferring the state of the vestibular system. Measurements of eye movement are useful for diagnosing some failures of the vestibular system.
Some patients tested in vestibular clinics can report perceptual vestibular problems and test normal on diagnostic tests that assess the VOR. For example, these diagnostic tests may use reflexive vestibular responses and vestibular perception associated with different neural pathways than those tested in the clinics. The tests may measure average VOR metrics such as gain and phase that may fail to diagnose some vestibular problems. Some disorders may include subtle physiological responses that VOR diagnostic tests are unable to measure. For example, VOR tests typically assess responses to motions with relatively large amplitudes, but some diagnoses may require conducting tests having motions with small amplitudes.
The present disclosure is related to apparatus and methods for estimating a vestibular function of a subject. In one aspect, the document describes apparatus that include a motion platform for supporting a subject and an input device configured to receive confidence ratings from the subject. The motion platform is configured to execute one or more motions. The confidence ratings are related to the subject's perception of the one or more motions. The apparatus further include a processor configured to fit a cumulative distribution function to the confidence ratings, determine a relationship configured to link the cumulative distribution function to an underlying noise distribution, and output parameters associated with the vestibular function based at least in part on the relationship. The parameters also provide an estimation of the vestibular function of the subject.
In another aspect, this document features methods for estimating a vestibular function of a subject. The methods include providing one or more motion stimuli to a subject. The methods further include receiving confidence ratings from the subject and fitting a cumulative distribution function to the confidence ratings. The confidence ratings indicate the subject's perceptions of the motion stimuli. The methods further include determining a relationship configured to link the cumulative distribution function to an underlying noise distribution and generating parameters associated with the vestibular function based at least in part on the relationship. The parameters also provide an estimation of the vestibular function of the subject.
In a further aspect, one or more machine-readable storage devices have encoded thereon computer readable instructions for causing one or more processors to perform operations as described herein. The operations include providing one or more motion stimuli to a subject and receiving confidence ratings from the subject. The confidence ratings indicate the subject's perceptions of the motion stimuli. The operations further include fitting a cumulative distribution function to the confidence ratings, determining a scaling parameter configured to link the cumulative distribution function to an underlying noise distribution, and generating parameters associated with the vestibular function. The parameters include the scaling parameter. The parameters also provide an estimation of the vestibular function of the subject.
Implementations of the above aspects can include one or more of the following features. The relationship can be represented by a scaling parameter configured to link the cumulative distribution function associated with the confidence ratings to a cumulative distribution function associated with the underlying noise distribution. The parameters can include the scaling parameter. The cumulative distribution function associated with the confidence ratings and the cumulative distribution function associated with the underlying noise distribution can both be Gaussian. The cumulative distribution function can be fitted to the confidence ratings using a maximum likelihood criterion.
The cumulative distribution function associated with the confidence ratings can be different from a cumulative distribution function associated with the underlying noise distribution.
In some examples, the parameters can include a width and bias of the vestibular function.
In some examples, the confidence ratings include any of a quasi-continuous rating, a binary rating, an N-level discrete rating, or a wagering rating.
The technologies described herein can provide several advantages. For example, the time required to test a subject can be reduced. In particular, the confidence ratings can be used to account for the underlying noise distribution, which in turn can reduce the overall number of trials needed to determine the parameters associated with the vestibular function for a given subject. The additional consideration of the confidence ratings can also decrease the variability of the parameters to provide more precise estimations of the parameters associated with the vestibular function of the subject.
Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which the subject matter of this disclosure belongs. Although methods and materials similar or equivalent to those described herein can be used in the practice or testing of the implementations described herein, suitable methods and materials are described below. All publications, patent applications, patents, and other references mentioned herein are incorporated by reference in their entirety. In case of conflict, the present specification, including definitions, will control. In addition, the materials, methods, and examples are illustrative only and not intended to be limiting.
Other features and advantages will be apparent from the following detailed description, and from the claims.
Like reference symbols in the various drawings indicate like elements.
Perceptual thresholds are commonly assayed in the lab and clinic. When precision and accuracy are required, thresholds are quantified by fitting a psychometric function to forced-choice data. However, this approach can require a hundred trials or more to yield accurate (i.e., small bias) and precise (i.e., small variance) psychometric parameter estimates. The present disclosure demonstrates that confidence probability judgments combined with a model of confidence can yield psychometric parameter estimates that are markedly more precise and/or markedly more efficient than methods using a signal detection model without consideration of confidence (e.g., confidence-agnostic methods). Specifically, both human data and simulations show that including confidence probability judgments for as few as twenty trials can yield psychometric parameter estimates that match the precision of those obtained from a hundred trials using confidence-agnostic analyses. Such efficiency advantages are especially beneficial for tasks (e.g., taste, smell, and vestibular assays) that require more than a few seconds for each trial, but the benefits would also accrue for many other tasks.
Measuring thresholds is a psychophysical procedure; applications range from experimental psychology to neuroscience to economics to engineering. Fitting psychometric functions using categorical data analyses that describe the relationship between a stimulus characteristic (e.g., amplitude) and a subject's forced-choice categorical responses provides a standard approach used to estimate thresholds. A comprehensive analysis concluded that maximum likelihood methods can be used when accuracy and precision of psychometric function fit parameters are important and, further, showed that more than a hundred forced-choice trials can be required to yield acceptable fit parameter estimates. Because many trials can be needed to yield accurate and precise psychometric fits, studies spanning fifty years have reported efforts to improve threshold test efficiency (i.e., to reduce the number of trials), but only modest efficiency improvements have accumulated. This may be due to binary/binomial distributions, which can have high variability at near-threshold stimulus levels, where the maximal information can be attained on each trial.
While forced choice procedures can be simple and robust, subjects can know how confident they are for each response. “Confidence” as used herein is a belief in the validity of what a subject believes and is widely considered a form of metacognition, because it involves self-monitoring of perceptual performance. In other words, confidence reflects self-assessment of the conviction in a decision of a subject being tested.
Confidence has been studied in humans using a variety of techniques including probability judgments. In fact, confidence probability judgments (i.e., confidence ratings provided using a nearly continuous scale between 0 and 100% or 50% and 100%) can provide the most common assessment of confidence.
One use of confidence recordings is in “confidence calibration” studies where confidence is compared to actual performance, where a data set may be classified as “well-calibrated” or classified as indicative of “overconfidence” or “underconfidence.” Specifically, assuming that a subject reported 90% confidence that a given motion was rightward for 10 separate trials at a given stimulus level, on average, perfect calibration of these confidence reports is assumed when 9 out of 10 of these trials are in the rightward direction, while overconfidence would be indicated by 5 out of 10 being rightward.
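The calibration comparison described above can be sketched in a few lines of Python. This is a minimal illustration only; the function name `calibration_gap` and the example trial lists are hypothetical and are not part of the disclosure.

```python
def calibration_gap(reported_confidence, outcomes):
    """Reported confidence minus the observed proportion of matching trials.

    A gap near zero suggests good calibration, a positive gap suggests
    overconfidence, and a negative gap suggests underconfidence.
    """
    observed = sum(outcomes) / len(outcomes)
    return reported_confidence - observed


# Ten trials, each reported with 90% confidence that the motion was rightward.
# 1 = the motion actually was rightward, 0 = it was not (hypothetical data).
well_calibrated = [1, 1, 1, 1, 1, 1, 1, 1, 1, 0]  # 9 of 10 rightward
overconfident = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]    # 5 of 10 rightward

gap_calibrated = calibration_gap(0.9, well_calibrated)
gap_overconfident = calibration_gap(0.9, overconfident)
print(gap_calibrated, gap_overconfident)
```

In practice the gap would be averaged over many trials and stimulus levels, since ten trials give only a coarse estimate of the observed proportion.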
Probability judgments are not typically directly used to help estimate psychometric function parameters. Typically, confidence is not recorded. A confidence rating (e.g., “uncertain”) can be recorded and used as part of a psychometric fit procedure, but these approaches do not model how confidence quantitatively changes as the stimulus is varied. Instead these approaches can include one additional decision boundary for each added category (e.g., “uncertain”)—and can add one free parameter to the fit algorithm for each additional decision boundary.
We describe herein a confidence signal detection (CSD) model, which combines a confidence function (
When this process is repeated many times for many different stimulus levels controlled by an operator, a psychometric dataset is generated, which can be quantified by fitting a psychometric function to the dataset. Such a psychometric function can reflect an expected average performance at each stimulus level. For a Gaussian noise distribution, the psychometric function is a Gaussian cumulative distribution function, as shown in
A relationship between a confidence function and the psychometric function can also be determined. For a well-calibrated subject performing a symmetric task (i.e., a task having uniform priors), a perfect confidence calibration can be defined to occur when subjective confidence matches objectively assessed accuracy. For example, if a subject reported with 90% confidence that a given motion was rightward for ten separate trials at a given rightward stimulus level, perfect calibration would be reflected, on average, by 9 out of 10 of these trials actually being in the rightward direction. Because average decision-making performance is represented by the psychometric function, a perfectly calibrated confidence is reflected by a confidence function that is substantially identical to the psychometric function, i.e., χ(x)=Ψ(x)=φ(x, μ, σ). For example, for a given trial, consider that a well-calibrated subject has sampled a decision variable having an amplitude of 2.0 (i.e., one trial from the distribution shown in
Assuming that the subject has an accurate estimate of the noise distribution (σ=1 and μ=0) and ignoring the effect of any underlying neural processes, a well calibrated subject calculates the confidence by directly mapping the decision variable for each trial onto a respective “confidence function” to determine a confidence probability judgment, which for this trial yields 97.7%, i.e., φ(x=2, μ=0, σ=1)=0.977. When confidence is similarly calculated for each sampled decision variable (each of the trials represented in
In
In
We use an example to illustrate the relationship between psychometric functions and confidence for a direction-recognition forced-choice task. A typical perceptual direction-recognition paradigm begins with well-controlled stimuli that are either positive or negative; the subject's task is to determine whether the motion is positive (“rightward”) or negative (“leftward”). The stimuli provided to a subject (
We now add a confidence model to the above-described signal detection approach. If we sample a very positive decision variable for one trial (e.g., if the stimuli were very large), we can assume high confidence that the motion was positive. If we sample a positive decision variable near the decision boundary on another trial, we can decide that the motion was positive but have much less confidence in that decision. Like the empiric relationship captured by a psychometric function ($\hat{\psi}(x)$), a quantitative empiric relationship between confidence and the stimulus can be represented by a confidence function ($\hat{\chi}(x)$). As for the psychometric function, with enough data, this empiric confidence function can be assumed to be representative of neural processes that can be captured by a confidence function, $\hat{\chi}(x) \approx \chi(x)$.
As described in detail in the Methods section entitled “Confidence Maximum Likelihood Fit Technique,” we utilize this CSD model to help improve psychometric parameter estimates. More specifically, we present a confidence analysis technique that utilizes this CSD model. We describe, develop, and investigate this model using previously published analytic, simulation, and experimental approaches. To help evaluate the contributions that confidence can make to psychometric function estimation, we report: (1) human studies for a direction-recognition task in which the subjects were required to report whether they rotated toward their left or right, and (2) simulation results for psychometric functions that range from 0 to 1, which are used for direction-recognition data analysis. We report that psychometric functions estimated using confidence probability judgments can require about 5 times fewer trials to yield the same performance as forced-choice psychometric methods without confidence analysis.
Such improved test efficiency should be realized for any forced-choice task where confidence can be reported, but may be especially important for perceptual tasks involving olfaction, gustation, equilibrium, or any other task where individual trials take, for example, tens of seconds as well as for clinical applications where more efficient and/or more precise perceptual measures could lead to improved patient diagnoses.
In some implementations, the CSD model can be used to analyze confidence probability judgments. This is illustrated using
The asterisks in the entries of Table 1 denote that the smallest stimulus magnitude was set in accordance with the minimum motion that could be reliably provided by the MOOG platform (described below with reference to
Twelve subjects completed the SVV study. Stimuli (repeated sixty times at each amplitude) provided at 0.75, 1, 1.25, and 1.5 times (as shown via the figure legend) the baseline threshold were randomly intermixed. Histograms for correct response time (RT) (
Generally, each motion of the motion platform 110 can be described by a motion profile that includes information about the direction of motion and other features related to the motion. For example, a motion can be a translational motion along any of the three perpendicular axes x, y, and z of a coordinate system centered on the head of the subject 150. Referring to
The motion profile can include amplitude and frequency of the velocity and acceleration of the motion. The magnitudes of the acceleration and velocity vary with time, whereas the frequency remains constant. For example, a translational motion starts with a zero velocity, accelerates to a maximum velocity, and decelerates to zero again. In one example, the acceleration is sinusoidal and can be expressed as a(t)=A sin(2πft), where a(t) is the acceleration at time t, A is the acceleration amplitude, and f is the frequency. With such acceleration, starting from zero, the translational velocity v(t) at time t is v(t)=(A/(2πf))[1−cos(2πft)].
Similarly, a rotational motion can include a sinusoidal angular acceleration and an angular velocity, both of which are expressed in a manner similar to the translational acceleration and velocity of the above-noted equations for a(t) and v(t).
The motion platform 110 moves the subject along a trajectory in a spatial coordinate system while following a velocity profile. The velocity profile relates the magnitude of velocity to time. At the beginning and end of the motion, the magnitude of the velocity is zero. At some point in between, the velocity reaches a maximum magnitude, referred to herein as “peak velocity” or “peak stimulus velocity.” In many applications, the velocity profile is one cycle of such a velocity oscillation. The reciprocal of the period of this sine wave is referred to herein as “frequency” or “motion frequency.” As noted above, the shape of the velocity profile can be sinusoidal. However, other shapes are possible, such as those defined by superpositions of weighted and/or timeshifted components.
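As a worked illustration of the single-cycle sinusoidal profile, the following Python sketch evaluates a(t)=A sin(2πft) and its integral from rest, v(t)=(A/(2πf))[1−cos(2πft)], and checks that the velocity is zero at the start and end of the cycle and peaks at mid-cycle with value A/(πf). The function names and the example values (a 1 Hz motion with a peak velocity of 4 units per second) are illustrative assumptions.

```python
import math


def acceleration(t, amplitude, frequency):
    """Sinusoidal acceleration a(t) = A sin(2*pi*f*t)."""
    return amplitude * math.sin(2 * math.pi * frequency * t)


def velocity(t, amplitude, frequency):
    """Velocity from integrating a(t) starting at rest."""
    omega = 2 * math.pi * frequency
    return amplitude / omega * (1 - math.cos(omega * t))


def peak_velocity(amplitude, frequency):
    """Maximum of v(t), reached at mid-cycle, t = 1 / (2 f)."""
    return amplitude / (math.pi * frequency)


def amplitude_for_peak(peak, frequency):
    """Acceleration amplitude needed to reach a desired peak velocity."""
    return math.pi * frequency * peak


f = 1.0                               # motion frequency in Hz (one cycle lasts 1 s)
A = amplitude_for_peak(4.0, f)        # amplitude giving a peak velocity of 4 units/s
for t in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"t={t:.2f}  a={acceleration(t, A, f):8.3f}  v={velocity(t, A, f):6.3f}")
```

The same expressions apply to rotational motion when A is read as an angular acceleration amplitude.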
The motion platform 110 can have a translational motion in either x, y, or z direction. Accordingly, the translational motion in each direction is referred to as “x-translation”, “y-translation”, or “z-translation”, respectively. In addition, the motion platform can have various rotational motions. Rotation about the x axis is referred to as “roll” rotation, rotation about the y axis is referred to as “pitch” rotation, and rotation about the z axis is referred to as “yaw” rotation. The movements can be caused by the stimulus signal provided by the controller 120.
In some implementations, the controller 120 can change the orientation of the motion platform 110. Alternatively, a person can manually change the orientation. For example, the motion platform can be rotated 90 degrees to the side such that the subject 150 is lying on his or her side. Considering the variety of orientations of the motion platform 110, it is useful to refer to a motion of the motion platform 110 (or the subject 150) using X, Y, and Z coordinates with respect to the fixed earth 160 (or ground). Such coordinates are referred to as “earth coordinates” in this specification. The Z direction is referred to as “earth-vertical” and either the X or Y direction is referred to as “earth-horizontal”.
In the example illustrated in
In some implementations, the motion platform 110 can be moved to be oriented such that the body orientation of the subject 150 is different from the upright position.
Accordingly, the motion platform 110 may move the subject 150 in a variety of configurations depending on the body orientation, type, or direction of motion in head coordinates. In some implementations, the motion platform 110 can be configured to provide only one or several types of motions and body orientations. In this specification, a motion along, or aligned with, a specific direction may refer to motion in positive and negative directions of the specific direction. Similarly, a motion parallel to a specific direction may refer to motion which is parallel or antiparallel to the specific direction.
During operation, the subject 150 provides an input to the input device 130 to communicate his or her perception of motion to the processor 140.
In the example shown in
In some implementations, the input device 130 can receive a binary response from the subject 150 through locations 136 and 137. After receiving the binary response, the input device 130 can further receive a confidence rating through the confidence rating menu 138. For example, the subject 150 can augment his or her binary response by providing a confidence rating including: (1) a quasi-continuous rating (e.g., 50% confidence to 100% confidence); (2) a binary rating (e.g., guessing versus certain); (3) a quinary rating (e.g., 1 to 5 where 1 is “guessing” and 5 is “certain,” or vice versa) or an N-level discrete rating (e.g., 1 to N where 1 is “guessing” and N is “certain” or vice versa); or (4) a wagering rating (e.g., the user wagers 1-10 points with each response and loses the wagered number of points if the response is incorrect or gains the wagered number of points if the response is correct). The confidence rating can also be a combination of the forms (1)-(4). As described elsewhere herein, the received confidence rating can be used to: (1) improve the quality of estimating the psychometric function; (2) improve the efficiency of targeting stimulus levels in real-time via a closed-loop system during a psychometric test; (3) reduce the negative impacts of indecision; (4) help evaluate subjects with psychometric (e.g., vestibular) dysfunctions; or (5) help evaluate malingerers. It is also understood that the confidence rating can be received before or simultaneously with the binary response.
As described above, the input device 130 can receive both the binary response and the confidence rating for a given motion, in other words, for each trial. The received data (e.g., binary response, confidence rating) can be communicated to the processor 140. The processor 140 can estimate a psychometric function and its threshold based on the communicated data. The communication can be done in a wired or wireless (e.g., WiFi, Bluetooth, or Near Field Communication) manner.
The controller 120 can instruct (e.g., by providing stimuli signals) a predefined set of motions to the motion platform. Alternatively, the controller 120 can instruct the motion platform based on the input received by the input device 130. For example, the processor 140 is configured to instruct the controller 120 to cause execution of those motions for which expected information about a subject's perception of those motions would most contribute to improving an estimate of a subject's vestibular threshold. Such an estimate can be used to construct a vestibulogram, which shows the subject's vestibular threshold at different frequencies.
Referring back to
This section presents a maximum likelihood analysis developed to help estimate psychometric function fit parameters. This technique simultaneously fits both a psychometric function ($\hat{\psi}(x)$) and a confidence function ($\hat{\chi}(x)$) to confidence probability judgments.
For the process depicted in the generalized flow chart of
$$x_j^{\text{upper}} = \hat{\chi}^{-1}\left(c_j^{\text{upper}}\right)$$
$$x_j^{\text{lower}} = \hat{\chi}^{-1}\left(c_j^{\text{lower}}\right)$$
At step G, the operator, with this range for the decision variables for the given stimulus ($s_j$), calculates the probability of this specific confidence probability judgment given the fitted psychometric function:
$$p_j = \hat{\Psi}\left(x_j^{\text{upper}}\right) - \hat{\Psi}\left(x_j^{\text{lower}}\right)$$
At step H, the operator repeats steps F and G n times, for example, once for data from each of n trials and computes an appropriate cost function, $C(\hat{\vec{\theta}}; \vec{c}, \vec{s}) = g(p_j)$. At step I, the operator repeats steps F through H while varying the fit parameters ($\hat{\vec{\theta}}$) to optimize the cost function.
For the process depicted in the flow chart of the specific model of
$$x_j^{\text{upper}} = \phi^{-1}\left(c_j^{\text{upper}}, 0, \hat{k}\hat{\sigma}\right)$$
$$x_j^{\text{lower}} = \phi^{-1}\left(c_j^{\text{lower}}, 0, \hat{k}\hat{\sigma}\right)$$
At step G, the operator, with this range for the decision variables for the given stimulus ($s_j$), calculates the probability of this specific confidence probability judgment given the fitted psychometric function, $p_j = \phi(x_j^{\text{upper}}, s_j + \hat{\mu}, \hat{\sigma}) - \phi(x_j^{\text{lower}}, s_j + \hat{\mu}, \hat{\sigma})$. At step H, the operator repeats steps F and G n times, for example, once for data from each of n trials, and calculates a log likelihood function by summing the logarithm of each of the n probability values:
$$\log L(\hat{\mu}, \hat{\sigma}, \hat{k}) = \sum_{j=1}^{n} \log p_j$$
At step I, the operator repeats steps F through H while varying $\hat{\mu}$, $\hat{k}$, and $\hat{\sigma}$ to maximize the log likelihood function.
To describe the confidence-based technique we assume that the fitted psychometric function can be represented by a Gaussian cumulative distribution function ($\phi$) having two fit parameters ($\hat{\mu}$, $\hat{\sigma}$):
$$\hat{\psi}(x) = \phi(x; \hat{\mu}, \hat{\sigma}) \qquad (1)$$
where $\hat{\mu}$ represents shifts in the psychometric function (i.e., mean value of the noise distribution) and $\hat{\sigma}$ represents the width of the psychometric function (i.e., standard deviation of the noise distribution), which is often referred to as the threshold for direction-recognition tasks. Assuming that subjects based their confidence assessment on the signal used to make their decision, we modeled the fitted confidence function as a Gaussian cumulative distribution function having one additional free parameter, a confidence-scaling factor ($\hat{k}$) that scales this average confidence function to account for under-confidence or over-confidence, as previously demonstrated in
$$\hat{\chi}(x) = \phi(x; \hat{\mu}, \hat{k}\hat{\sigma}) \qquad (2)$$
We assume a Gaussian confidence function for simplicity, but other shapes of the confidence function were investigated via simulations to evaluate the impact of this assumption. Noise was not explicitly included in this relationship; noise may be present in the mapping from a decision variable to the confidence response, and we evaluate the impact of additive noise via simulations.
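The effect of the confidence-scaling factor in equation (2) can be illustrated with a short Python sketch. It evaluates the Gaussian confidence function for a decision variable of 2.0 with a noise mean of 0 and standard deviation of 1, the values used in the well-calibrated example earlier in this description (which gives about 97.7%). The function name and the specific scaling values 2.0 and 0.5 are illustrative assumptions; under this parameterization a scaling factor above 1 flattens the confidence function (reported confidence closer to 50%) and a factor below 1 steepens it.

```python
from statistics import NormalDist


def confidence(x, mu_hat, sigma_hat, k_hat):
    """Gaussian confidence function of equation (2): phi(x; mu, k * sigma)."""
    return NormalDist(mu_hat, k_hat * sigma_hat).cdf(x)


decision_variable = 2.0
calibrated = confidence(decision_variable, 0.0, 1.0, k_hat=1.0)
flatter = confidence(decision_variable, 0.0, 1.0, k_hat=2.0)   # less extreme report
steeper = confidence(decision_variable, 0.0, 1.0, k_hat=0.5)   # more extreme report

print(f"k=1.0: {calibrated:.3f}")
print(f"k=2.0: {flatter:.3f}")
print(f"k=0.5: {steeper:.5f}")
```

With a scaling factor of 1 the confidence function coincides with the psychometric function of equation (1), which is the perfectly calibrated case.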
$$\hat{x}_j = \hat{\chi}^{-1}(c_j) = \phi^{-1}(c_j; \hat{\mu}, \hat{k}\hat{\sigma}) \qquad (3)$$
where $c_j$ represents a confidence probability judgment, $\hat{\chi}^{-1}(c_j)$ represents the inverse fitted confidence function, and $\phi^{-1}$ represents the inverse cumulative Gaussian. The precise probabilistic interpretation of a confidence probability judgment depends on the resolution of the subjective scale provided to the subject. Our subjects provided a confidence probability judgment using a scale that had a resolution of 1%. Therefore, when a subject provided a confidence probability judgment of 70%, we set the lower ($c_j^{\text{lower}}$) and upper ($c_j^{\text{upper}}$) bin limits to 69.5% and 70.5%, respectively. Using equation 3, lower ($x_j^{\text{lower}}$) and upper ($x_j^{\text{upper}}$) decision variable limits can be calculated for a given confidence probability judgment, $c_j$.
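The binning step can be sketched as follows. This minimal Python example maps one full-range confidence judgment, expressed as a fraction between 0 and 1, to lower and upper decision variable limits through the inverse cumulative Gaussian of equation (3). The function name, the clamping constant, and the example parameter values are assumptions made for illustration; the clamp only keeps the bin limits inside the open interval (0, 1), where the inverse is defined.

```python
from statistics import NormalDist

EPS = 1e-9  # assumption: keeps bin limits strictly inside (0, 1)


def decision_variable_limits(c, mu_hat, sigma_hat, k_hat, resolution=0.01):
    """Return (x_lower, x_upper) for one confidence judgment c in [0, 1]."""
    confidence_fn = NormalDist(mu_hat, k_hat * sigma_hat)
    c_lower = min(max(c - resolution / 2, EPS), 1 - EPS)
    c_upper = min(max(c + resolution / 2, EPS), 1 - EPS)
    return confidence_fn.inv_cdf(c_lower), confidence_fn.inv_cdf(c_upper)


# A 70% judgment on a 1% scale spans the bin from 69.5% to 70.5%.
x_lower, x_upper = decision_variable_limits(0.70, mu_hat=0.0, sigma_hat=1.0, k_hat=1.0)
print(x_lower, x_upper)
```

Judgments at the ends of the scale (0% or 100%) produce a bin that extends far into the tail, which is why the limits are clamped rather than passed to the inverse directly.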
As illustrated schematically (
$$p_j = \hat{\Psi}\left(x_j^{\text{upper}}\right) - \hat{\Psi}\left(x_j^{\text{lower}}\right) = \phi\left(x_j^{\text{upper}}; s_j + \hat{\mu}, \hat{\sigma}\right) - \phi\left(x_j^{\text{lower}}; s_j + \hat{\mu}, \hat{\sigma}\right) \qquad (4)$$
where $s_j$ is the stimulus provided on that trial.
Repeating this process for each of the N trials, we can then calculate the log likelihood by simply summing the log of each of the individual trial likelihoods, which can be written as:
$$\log L(\hat{\mu}, \hat{\sigma}, \hat{k}) = \sum_{j=1}^{N} \log p_j$$
We find the maximum likelihood fit by numerically finding the three fit parameters ($\hat{\mu}$, $\hat{\sigma}$, $\hat{k}$) that maximize the value of this log likelihood function. This method assumes that the confidence judgment utilizes a similar decision variable as the binary decision-making process. The methods described herein also assume that all processes and mechanisms (e.g., decision boundary, confidence estimation, etc.) are stationary (i.e., constant) across time. This stationarity assumption is included in some psychometric function fits as well.
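The fitting procedure can be sketched end to end in Python using only the standard library. Several choices below are assumptions for illustration rather than the method as practiced: the inverse confidence function is centered at zero, as in the step F description, whereas equation (3) writes a mean of the fitted shift, and the sketch does not try to reconcile the two; a coarse grid search stands in for a numerical optimizer; and `simulate` is a hypothetical data generator used only to exercise the fit.

```python
import math
import random
from statistics import NormalDist

EPS = 1e-9  # keeps bin limits strictly inside (0, 1) for inv_cdf


def log_likelihood(mu, sigma, k, stimuli, confidences, resolution=0.01):
    """Sum over trials of log p_j, with p_j as in equation (4)."""
    confidence_fn = NormalDist(0.0, k * sigma)  # assumption: centered at zero
    total = 0.0
    for s, c in zip(stimuli, confidences):
        c_lower = min(max(c - resolution / 2, EPS), 1 - EPS)
        c_upper = min(max(c + resolution / 2, EPS), 1 - EPS)
        x_lower = confidence_fn.inv_cdf(c_lower)
        x_upper = confidence_fn.inv_cdf(c_upper)
        noise = NormalDist(s + mu, sigma)
        p = noise.cdf(x_upper) - noise.cdf(x_lower)
        total += math.log(max(p, 1e-300))  # floor avoids log(0)
    return total


def fit_grid(stimuli, confidences, mus, sigmas, ks):
    """Return the (mu, sigma, k) grid point with the highest log likelihood."""
    candidates = ((m, s, k) for m in mus for s in sigmas for k in ks)
    return max(candidates, key=lambda th: log_likelihood(*th, stimuli, confidences))


def simulate(n, mu, sigma, k, levels, rng):
    """Hypothetical generator: full-range confidences on a 1% scale."""
    stimuli = [rng.choice(levels) for _ in range(n)]
    confidences = []
    for s in stimuli:
        x = rng.gauss(s + mu, sigma)  # sampled decision variable
        confidences.append(round(NormalDist(0.0, k * sigma).cdf(x), 2))
    return stimuli, confidences


rng = random.Random(0)
stimuli, confidences = simulate(200, mu=0.0, sigma=1.0, k=1.0,
                                levels=[-2.0, -1.0, 1.0, 2.0], rng=rng)
mu_hat, sigma_hat, k_hat = fit_grid(
    stimuli, confidences,
    mus=[-0.5, -0.25, 0.0, 0.25, 0.5],
    sigmas=[0.25, 0.5, 0.75, 1.0, 1.25, 1.5, 2.0, 3.0],
    ks=[0.5, 0.75, 1.0, 1.25, 1.5, 2.0],
)
print(mu_hat, sigma_hat, k_hat)
```

A grid search keeps the sketch dependency-free and easy to inspect; a gradient-free or quasi-Newton optimizer started from the best grid point would give finer parameter estimates.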
To provide a direct comparison of the confidence-based fitting method to standard binary forced-choice fitting methods, we fit psychometric curves to the binary data using a maximum likelihood approach. For our forced-choice direction-recognition task, the subject's directional responses are binary (e.g., left or right) and the psychometric function ranges from 0 to 1. A Gaussian cumulative distribution function was fitted to the data in the MATLAB® programming language using a generalized linear model with a probit link function. An example of a psychometric function fit to binary data is shown in
The general technique used to fit a psychometric function and a confidence function to the confidence data was described with respect to
Each subject was seated in a racing-style chair with a five-point harness; his/her head was fixed relative to the chair and platform via an adjustable helmet. Each subject wore a pair of noise cancelling earpieces that also provided the ability to communicate with the experimenter. All motions were performed in darkness. Subjects performed a binary forced-choice direction-recognition task in response to upright whole-body yaw rotation. Aural white noise began playing in the subject's earpiece 300 ms before motion commenced and ended when the motion ended. This aural cue was provided to mask any potential directional auditory cues and also informed the subject when a trial began and ended. When the motion and white noise ended, a tablet computing device illuminated and subjects were required to report the motion direction perceived and a confidence probability judgment. Single cycles of sinusoidal acceleration at 1 Hz were used as the motion stimuli. Motion stimuli were generated using a MOOG® six-degree-of-freedom motion platform. There was a pause of at least 3 s between motions. An adaptive sampling procedure, a standard 3-Down/1-Up (3D/1U) staircase using PEST rules, was utilized. The initial stimulus amplitude was 4°/s.
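The 3-Down/1-Up logic can be illustrated with a minimal Python sketch: the stimulus amplitude is lowered after three consecutive correct responses and raised after any incorrect response. The step rule used here (halving and doubling) is a deliberate simplification chosen for illustration and is not the PEST step-size rule used in the study; the function name, the step factor, and the floor value are hypothetical.

```python
def staircase_3d1u(correct_responses, start=4.0, step_factor=2.0, floor=0.01):
    """Return the stimulus amplitude presented on each trial.

    correct_responses: iterable of booleans, one per trial, in order.
    """
    level = start
    consecutive_correct = 0
    presented = []
    for correct in correct_responses:
        presented.append(level)
        if correct:
            consecutive_correct += 1
            if consecutive_correct == 3:   # three in a row: make the task harder
                level = max(level / step_factor, floor)
                consecutive_correct = 0
        else:                              # any error: make the task easier
            level = level * step_factor
            consecutive_correct = 0
    return presented


levels = staircase_3d1u([True, True, True, False, True])
print(levels)  # [4.0, 4.0, 4.0, 2.0, 4.0]
```

A 3-Down/1-Up rule concentrates trials near the stimulus level at which roughly 80% of responses are correct, which is where each trial is most informative about the threshold.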
Subject responses, both the direction responses (i.e., left or right) and quantitative confidence probability judgments having a resolution of 1%, were recorded using a tablet computing device (e.g., an iPad® tablet computing device). Before each trial, the tablet computing device backlighting was turned off. When the trial ended, the tablet computing device was automatically illuminated to display sliders (one on the left and one on the right) that ranged from 50% to 100%. The subject tapped on the left side of the tablet computing device to report perceived motion to the left and tapped on the right side to report perceived motion to the right. Subjects could then move the selected slider up/down to indicate their confidence. To avoid biasing the subject's confidence responses, a slider position was not displayed until the subject touched the screen to indicate their confidence (i.e., no initial slider position was provided to the subject). The subject's responses including both directions (left/right) and confidence (50% to 100%) were displayed on the screen. The subject could adjust their response until satisfied. The subject then invoked a button on the tablet computing device labeled “Confirm”. At the beginning of the testing, the subjects practiced for a few trials to improve their understanding of the task.
These human studies utilized a half-range task in which subjects used a scale between 50% and 100%, inclusive. Confidence probability judgment tasks can be full-range tasks or half-range tasks. Full-range scales range between 0% and 100%, while half-range scales range between 50% and 100%. For our half-range task, a subject could report that they perceived negative motion and report 84% confidence. For a full range task with the subject asked to report their confidence that the motion was positive, the equivalent response would be a 16% confidence that the motion was positive. To plot, model, and fit the data, we used this mathematical equivalence to convert each half-range confidence rating to a full-range rating.
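The half-range to full-range conversion described above can be sketched as follows. This is a minimal illustration; the function name and the encoding of direction as +1 (positive) or -1 (negative) are assumptions introduced here, and confidence is expressed as a probability between 0 and 1.

```python
def half_to_full_range(direction, half_range_confidence):
    """Convert a half-range rating to confidence that the motion was positive.

    direction: +1 if the subject reported positive motion, -1 if negative
               (hypothetical encoding).
    half_range_confidence: confidence in the reported direction, 0.5 to 1.0.
    """
    if direction not in (1, -1):
        raise ValueError("direction must be +1 or -1")
    if not 0.5 <= half_range_confidence <= 1.0:
        raise ValueError("half-range confidence must lie in [0.5, 1.0]")
    if direction == 1:
        return half_range_confidence
    # Confidence c that motion was negative equals 1 - c that it was positive.
    return 1.0 - half_range_confidence


# Example from the text: negative motion reported with 84% confidence
# corresponds to 16% confidence that the motion was positive.
full = half_to_full_range(-1, 0.84)
```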
Subject instructions indicated that the motion direction would be selected randomly and that the directions of previous motions would not impact the next motion direction. Instructions also indicated that there were no expectations regarding the distribution of confidence assessments and that subjects should report the confidence that they experienced for each specific trial. Subjects were informed that “ . . . if you are guessing much of the time, this is OK, and if you are very certain much of the time this is OK, too.” Subjects were not provided information regarding their confidence indications. During the initial training that did not exceed 10 practice trials, subjects were informed whether their left/right responses were correct or incorrect. During test sessions, subjects were not informed whether their responses were correct or incorrect.
Four healthy human subjects (2 male, 2 female, 26-34 years old) were each tested on six different days. Informed consent was obtained from all subjects prior to participation in the study. The study was approved by the local ethics committee and was performed in accordance with the ethical standards laid down in the 1964 Declaration of Helsinki.
For one subject, since the computer randomized the motion direction for each trial just before the trial and since the adaptive staircase targets stimuli where an average subject may get about 20% of the trials incorrect, this subject did not have information to guide his binary reports or confidence judgments on each individual trial. As noted in the results, this subject's responses did not differ from the other subjects in any noticeable manner.
All simulations were performed using MATLAB® computing language, Release R2015a (The Mathworks®, Inc.) using parallel IBM® BladeCenter® HS21 XMs with 3.16 GHz Xeon® processors and 8 GB of RAM. These simulations used the same standard adaptive sampling procedure used for the human studies. Specifically, we used a 3-Down/1-Up (3D/1U) staircase having a hundred trials. The simulated 3D/1U staircases began at a stimulus level of four. The size of the change in stimulus magnitude was determined using PEST (parameter estimation by sequential testing) rules.
For all four simulated data sets included and described herein, the psychometric function, Ψ(x)=ϕ(x;μ=0.5,σ=1), and the fitted psychometric function, {circumflex over (Ψ)}(x)=ϕ(x;{circumflex over (μ)},{circumflex over (σ)}), were modeled as cumulative Gaussians.
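A simulated test session of the kind described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the step-size rule (halve the step at each reversal, starting from a hypothetical initial step of half the starting level) is a simplified stand-in for the full PEST rules, and the function names, the minimum stimulus level, and the random seed are illustrative only.

```python
import math
import random


def cumulative_gaussian(x, mu, sigma):
    """Cumulative Gaussian phi(x; mu, sigma)."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def simulate_staircase(n_trials=100, start_level=4.0, mu=0.5, sigma=1.0,
                       min_level=0.01, seed=0):
    """Simulate a 3-Down/1-Up session; return (stimulus, said_positive) pairs."""
    rng = random.Random(seed)
    level = start_level
    step = start_level / 2.0   # hypothetical initial step size
    correct_in_a_row = 0
    last_move = 0              # +1 = level went up, -1 = level went down
    trials = []
    for _ in range(n_trials):
        direction = rng.choice((-1, 1))          # random motion direction
        stimulus = direction * level
        # Probability of a "positive" report follows the psychometric function.
        said_positive = rng.random() < cumulative_gaussian(stimulus, mu, sigma)
        correct = said_positive == (direction > 0)
        trials.append((stimulus, said_positive))

        move = 0
        if correct:
            correct_in_a_row += 1
            if correct_in_a_row == 3:            # three correct: step down
                move, correct_in_a_row = -1, 0
        else:                                    # one incorrect: step up
            move, correct_in_a_row = 1, 0

        if move != 0:
            if last_move != 0 and move != last_move:
                step /= 2.0                      # simplified rule: halve on reversal
            level = max(level + move * step, min_level)
            last_move = move
    return trials


session = simulate_staircase()
```

The returned pairs can then be passed to any psychometric fitting routine; the fitting step itself is not shown here.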
For the first simulated data set (as represented, for example, in the first column of
For the first three simulated data sets (
Two simulation data sets included additive noise. For the simulations shown in
Fitted psychometric function ({circumflex over (μ)},{circumflex over (σ)}) and confidence scaling ({circumflex over (k)}) parameters for each of our four subjects for yaw rotation about an earth-vertical rotation axis are shown in FIGS. 4A to 4L depicting the mean and 5 depicting the standard deviation. Parameter fits are plotted versus the number of trials in increments of 5 trials starting at the 15th trial. To demonstrate raw performance for individual test sessions,
Consistent with studies utilizing adaptive procedures, the confidence-agnostic estimates of the width of the psychometric function ({circumflex over (σ)}) took between fifty and a hundred trials to stabilize (
In contrast, estimates of the width parameter ({circumflex over (σ)}) using the confidence fit technique could require fewer than twenty trials to reach stable levels (
Furthermore, the parameter estimates obtained using confidence-agnostic psychometric fits (
The estimates of the shift of the psychometric functions ({circumflex over (μ)}) showed a qualitatively similar pattern; the estimates that utilized confidence reached stable levels sooner and were more precise than the estimates provided by the analysis that does not account for confidence. We also note that, for three of our subjects (
We also simulated tens of thousands of test sessions to test the confidence fit procedures described herein. The simulations mimicked the human studies, the difference being that we defined the simulated psychometric (Ψ(x)) and confidence (χ(x)) functions. Since we knew these simulated functions, this allowed us to quantify parameter fit accuracy. For all simulated data sets, we fit the binary forced-choice data without confidence analysis and compared and contrasted these fits with the CSD fits. Histograms show fitted parameters after twenty (
Mimicking the format previously used for the human data (
Tables 2-5 summarize results from 10,000 simulated test sessions for direct quantitative comparisons. These fit parameters are the same as shown graphically in
The Brier Score (BS) and its three decomposed components, Reliability (REL), Resolution (RES), and Uncertainty (UNC), are shown. For definitions and descriptions, see Yates J F. Judgment and decision making. Prentice-Hall, Inc, 1990. We use a formulation that results from dividing Brier's original formulation by two. The deviance for each of the two fits is also shown. Mean (and standard deviation) across 6 trials for each subject and across 10,000 simulations for each simulated data set are provided.
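For illustration, the Brier score and a three-component decomposition can be computed as sketched below. This minimal Python sketch assumes the commonly used decomposition in which BS = REL - RES + UNC, with trials grouped by identical confidence values; the function name and the example data are hypothetical, and the exact formulation used for the tables may differ in detail. The score here is the mean squared difference between the full-range confidence and the binary outcome, which corresponds to Brier's original two-category formulation divided by two.

```python
from collections import defaultdict


def brier_decomposition(forecasts, outcomes):
    """Return (BS, REL, RES, UNC) for probability forecasts of binary outcomes.

    forecasts: full-range confidence that the motion was positive (0 to 1).
    outcomes:  1 if the motion was positive, 0 otherwise.
    """
    n = len(forecasts)
    base_rate = sum(outcomes) / n
    bs = sum((f - o) ** 2 for f, o in zip(forecasts, outcomes)) / n

    # Group outcomes by identical forecast value (e.g., 1% confidence steps).
    groups = defaultdict(list)
    for f, o in zip(forecasts, outcomes):
        groups[f].append(o)

    rel = 0.0  # reliability: forecast vs. observed frequency within each group
    res = 0.0  # resolution: observed group frequency vs. overall base rate
    for f, obs in groups.items():
        freq = sum(obs) / len(obs)
        rel += len(obs) * (f - freq) ** 2
        res += len(obs) * (freq - base_rate) ** 2
    rel /= n
    res /= n
    unc = base_rate * (1.0 - base_rate)  # uncertainty of the outcomes alone
    return bs, rel, res, unc


# Hypothetical example data (not from the study).
bs, rel, res, unc = brier_decomposition([0.9, 0.9, 0.6, 0.6, 0.2],
                                        [1, 1, 1, 0, 0])
```

With grouping by identical forecast values, the identity BS = REL - RES + UNC holds exactly, which provides a simple consistency check on the computed components.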
The simulated data (see, e.g.,
To demonstrate robustness, we utilized the same Gaussian confidence fit model (Equation 2) while simulating a confidence model that differed from the Gaussian confidence fit model. For example, we modeled the confidence function as a linear function (slope of 0.1445; i.e., σ=2) instead of a cumulative Gaussian. We also added zero-mean uniform noise, U(−0.1, 0.1), to the simulated confidence response. The confidence fits of these simulated data are similar to the earlier confidence fits (see, e.g.,
Finally, to demonstrate the flexibility of the confidence fit technique, we model the same linear confidence function from the previous paragraph, but we now add less extreme zero-mean uniform noise levels (U(−0.05, 0.05)) and fit a linear confidence function that mimics the linearity of the true confidence function used for these simulations. The fit accuracy and precision were very good (
This disclosure describes a new confidence signal detection (CSD) model (
We assumed that the binary decision variable is used to determine confidence probability judgments. Empirical data presented herein suggests that, at least for yaw rotation thresholds, confidence can correlate with the sampled decision variable used to make a decision.
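The assumption above, that a single sampled decision variable drives both the binary report and the confidence judgment, can be illustrated with the following Python sketch. The specific generative form used here is an assumption made for illustration and is not taken from the equations of this disclosure: the decision variable is modeled as the stimulus offset by μ plus Gaussian noise of standard deviation σ, and the full-range confidence is modeled as a cumulative Gaussian of the decision variable with width kσ, where k is the confidence-scaling factor. All function and variable names are hypothetical.

```python
import math
import random


def standard_normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def simulate_trial(x, mu, sigma, k, rng):
    """Return (said_positive, confidence_positive) for stimulus x.

    Assumed model (illustrative only):
      d = (x - mu) + sigma * noise, noise ~ N(0, 1)   sampled decision variable
      said_positive       = d > 0                     binary report
      confidence_positive = Phi(d / (k * sigma))      full-range confidence
    """
    d = (x - mu) + sigma * rng.gauss(0.0, 1.0)
    said_positive = d > 0
    confidence_positive = standard_normal_cdf(d / (k * sigma))
    return said_positive, confidence_positive


rng = random.Random(1)
trials = [simulate_trial(x=1.0, mu=0.5, sigma=1.0, k=1.0, rng=rng)
          for _ in range(1000)]
```

Under this sketch the probability of a positive report equals the cumulative Gaussian ϕ(x; μ, σ), and the confidence exceeds 50% for positive motion whenever the report is positive, so the binary report and the confidence rating remain mutually consistent on every trial.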
Similar benefits can also result from replacing the confidence function and confidence recordings with a magnitude estimation function accompanied by magnitude estimation recordings (e.g., I rotated +3°). Other analogous perceptual functions and associated recordings could be used. The confidence probability judgments could alternatively or additionally be replaced by another confidence assay accompanied by an appropriate model of the confidence assay.
In this disclosure, we describe improving the efficiency of psychometric parameter fits. The fitting technique can also be used to calculate a confidence-scaling factor, which may be used in experimental studies examining confidence calibration. The confidence modeling approach described herein with respect to, for example,
Human experimental data showed that stable psychometric function parameters could be estimated in as few as twenty trials, as shown in and described with respect to
The four subjects had not previously performed experiments utilizing a confidence task prior to our studies of confidence. None of the subjects was provided feedback, except during the initial practice trials that consisted of 10 trials when subjects were informed whether their responses were correct or incorrect. No specific feedback regarding confidence was provided during the course of the study. Despite intentionally limiting the training and feedback, each subject yielded a coherent data set across the six test sessions. All four subjects were experienced observers.
For our task, we minimized the potential impact of after-effects from the previous trial by waiting at least 3 seconds between the end of a stimulus and the start of the next stimulus. Therefore, providing confidence did not require substantially more time (e.g., greater than 3 seconds) to complete each trial than a confidence-agnostic binary forced-choice task for our direction-recognition task.
For knowledge tasks, investigations showed that simple and short training sessions that can provide direct confidence calibration information can lead to improved calibration and that such improvement could generalize to tasks beyond the one calibrated. In some implementations, generalized calibration can be used to help calibrate confidence prior to testing as part of the process by which subjects are taught the requisite tasks. The generalized calibration techniques can be used during, for example, clinical testing. Alongside existing measures, such as, e.g., the Brier score and its calibration and resolution components, the confidence-scaling factor described herein could be used to determine if confidence calibration is an individual trait and to train confidence calibration. The confidence modeling approach can determine psychometric fit parameters in less than hundreds of trials. For example, the approach can determine the parameters in about twenty to fifty trials as shown and described with respect to
Simulations confirmed that the confidence-based fitting technique described herein can yield accurate psychometric parameter estimates and a marked efficiency improvement. Simulations assuming (a) a confidence model that was not matched by the fitting model and (b) large additive confidence noise—likely greater than actual confidence noise—yielded psychometric function parameter estimates that were similar to the parameter estimates of the simulated psychometric function much more efficiently than confidence-agnostic psychometric fits (e.g., 20 trials as compared to 50-100 trials). Simulations thus indicate that the confidence signal detection methods can be robust to noise in the confidence function.
The confidence-based fitting method can be used for psychometric functions that range from, for example, 0 to 1, as confirmed by simulations. The fitting method can further be used for a specific vestibular direction-recognition task. Simulation results can be applicable to all tasks yielding psychometric functions that range from, for example, 0 to 1. Furthermore, the confidence-based technique can be applied to other tasks, such as, for example, detection tasks (e.g., yes/no tasks) or to two-alternative forced choice detection tasks where the subject identifies the interval (or location) when (or where) the signal occurred. These tasks can have different psychometric ranges (e.g., 0.5 to 1).
The CSD model depicted in, for example,
The computing system 1700 can include, for example, a processor 1710, a memory 1720, a storage device 1730, and an input/output device 1740. Each of the components 1710, 1720, 1730, and 1740 is interconnected using a system bus 1750. The processor 1710 is capable of processing instructions for execution within the system 1700. In one implementation, the processor 1710 is a single-threaded processor. In another implementation, the processor 1710 is a multi-threaded processor. The processor 1710 is capable of processing instructions stored in the memory 1720 or on the storage device 1730 to display graphical information for a user interface on the input/output device 1740. The processor 1710 can be operable with electrical and electromechanical components of the vestibular testing system.
The memory 1720 stores information within the system 1700. In some implementations, the memory 1720 is a non-transitory computer-readable medium. The memory 1720 can include volatile memory and/or non-volatile memory.
The storage device 1730 is capable of providing mass storage for the system 1700. In one implementation, the storage device 1730 is a non-transitory computer-readable medium. In various different implementations, the storage device 1730 may be a hard disk device, an optical disk device, or a solid state memory device. The memory 1720 and/or the storage device 1730 can store treatment parameters and parameters of the electromechanical systems of the vestibular testing system described herein. These components can also store data collected by various sensors of the vestibular testing system. The memory 1720 and/or the storage device 1730 can also store data regarding the inputs (e.g., power input) into electromechanical components of the vestibular testing system. In some cases, the memory 1720 and/or the storage device 1730 can also store data pertaining to the progress of the treatment, such as the amount of fluid delivered or the duration of treatment that has elapsed.
The input/output device 1740 provides input/output operations for the system 1700. In some implementations, the input/output device 1740 includes a keyboard and/or a pointing device. In some implementations, the input/output device 1740 includes a display unit (e.g., a touchscreen display) for displaying graphical user interfaces. In some implementations the input/output device can be configured to accept verbal (e.g., spoken) inputs. The touchscreen display device may be, for example, a capacitive display device operable by touch, or a display that is configured to accept inputs via a stylus.
The features of the computing systems described herein can be implemented in digital electronic circuitry, or in computer hardware, firmware, or in combinations of these. The features can be implemented in a computer program product tangibly embodied in an information carrier, e.g., in a machine-readable storage device, for execution by a programmable processor; and features can be performed by a programmable processor executing a program of instructions to perform functions of the described implementations by operating on input data and generating output. The described features can be implemented in one or more computer programs that are executable on a programmable system including at least one programmable processor coupled to receive data and instructions from, and to transmit data and instructions to, a data storage system, at least one input device, and at least one output device. A computer program includes a set of instructions that can be used, directly or indirectly, in a computer to perform a certain activity or bring about a certain result. A computer program can be written in any form of programming language, including compiled or interpreted languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, or other unit suitable for use in a computing environment.
Suitable processors for the execution of a program of instructions include, by way of example, both general and special purpose microprocessors, and the sole processor or one of multiple processors of any kind of computer. Generally, a processor will receive instructions and data from a read-only memory or a random access memory or both. Computers include a processor for executing instructions and one or more memories for storing instructions and data. Generally, a computer will also include, or be operatively coupled to communicate with, one or more mass storage devices for storing data files; such devices include magnetic disks, such as internal hard disks and removable disks; magneto-optical disks; and optical disks. Storage devices suitable for tangibly embodying computer program instructions and data include all forms of non-volatile memory, including by way of example semiconductor memory devices, such as EPROM, EEPROM, and flash memory devices; magnetic disks such as internal hard disks and removable disks; magneto-optical disks; and CD-ROM and DVD-ROM disks. The processor and the memory can be supplemented by, or incorporated in, ASICs (application-specific integrated circuits).
The features can be implemented in a computer system that includes a back-end component, such as a data server, or that includes a middleware component, such as an application server or an Internet server, or that includes a front-end component, such as a client computer having a graphical user interface or an Internet browser, or any combination of them. The components of the system can be connected by any form or medium of digital data communication such as a communication network. Examples of communication networks include, e.g., a LAN, a WAN, and the computers and networks forming the Internet.
The computer system can include clients and servers. A client and server are generally remote from each other and typically interact through a network, such as the described one. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
The processor 1710 carries out instructions related to a computer program. The processor 1710 may include hardware such as logic gates, adders, multipliers and counters. The processor 1710 may further include a separate arithmetic logic unit (ALU) that performs arithmetic and logical operations.
It is to be understood that while the technology has been described in conjunction with the detailed description, the foregoing description and Examples are intended to illustrate and not limit the scope defined by the appended claims. Other aspects, advantages, and modifications are within the scope of the following claims.
This application claims priority to U.S. Provisional Application 62/275,601, filed on Jan. 6, 2016, the entire content of which is incorporated herein by reference.
This work was supported in part by NIH/NIDCD grant DC04158 and NIH grant R56DC012038. The United States government may have certain rights in the invention.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2017/012456 | 1/6/2017 | WO | 00 |
Number | Date | Country |
---|---|---|
62275601 | Jan 2016 | US |