This disclosure relates generally to methodology for applying mathematical modeling techniques in the area of medical evaluation, and more specifically to methods and systems for performing a clinical assessment and for improving the reliability of a clinical assessment.
Mathematical modeling techniques are known and include disparate technologies, such as Kalman filters, which can estimate a signal by combining data from more than one source.
The present disclosure provides methods and systems which allow a user, such as a physician or other clinical care provider, to perform a clinical assessment or to improve the reliability of a clinical assessment by combining the assessment with other signals that are recorded from a patient including, but not limited to, voice or motion patterns. In various aspects, the invention allows the physician or clinical care provider to obtain a more reliable rating on a clinical rating scale.
In an embodiment, the invention provides a method for performing clinical assessment of a patient that includes determining of a base clinical assessment for the patient by generating information on a clinical rating scale. At least one objective signal is recorded, and each objective signal involves an indicator corresponding to the state of the patient or the state of the patient's environment. Each objective signal is analyzed for generating a corresponding rating on the clinical rating scale. The clinical assessment of the patient is provided by combining the information from the base clinical assessment with the information generated from analysis of each objective signal. Alternatively, the clinical assessment may be based exclusively on information generated by analysis of each objective signal.
Each objective signal may be analyzed by relating the signal to the base clinical assessment. Analyzing the objective signals includes application of a mathematical model. The mathematical model may be improved by determining at least one base clinical assessment and recording a corresponding at least one objective signal for a plurality of patients. Each base clinical assessment is obtained at the same time or at nearly the same time as the corresponding objective signal. Each objective signal is then related to a clinical state on the basis of the corresponding base clinical assessment. Alternatively, the mathematical model may be improved by determining a plurality of base clinical assessments and recording a plurality of corresponding objective signals for a specific patient. Each base clinical assessment is determined at the same time or at nearly the same time as the corresponding objective signal. Each objective signal is then related to a clinical state for the specific patient on the basis of its corresponding base clinical assessment. The mathematical model may include a regression approach. Alternatively, the mathematical model may include application of neural networks.
The clinical rating scale may be classified within one of: scales for social health, scales for psychological well-being, scales for anxiety, scales for depression, scales for mental status testing, scales for pain measurements, scales for general health status, and scales for quality of life. More specific embodiments of the clinical rating scale may include PHQ-9, visual analog scale for pain, APGAR score for neonatal health, Quality of Life scale, or HAM-D. Without limitation, the invention may be used to assess psychiatric diseases (depression, bipolar disease, schizophrenia, anxiety, etc.), endocrine diseases (diabetes, Cushing's syndrome, thyroid disorders, etc.), cardiac conditions (congestive heart disease, hypertension, peripheral vascular disease, etc.), pain disorders (chronic pain, back pain, etc.), inflammatory diseases (arthritis, inflammatory bowel disease, psoriasis, etc.), neurological conditions (epilepsy, headaches, traumatic brain injury, etc.), and rehabilitation (post cardiac bypass surgery rehabilitation, etc.).
The base clinical assessment may include assessment of the patient by a healthcare provider. The base clinical assessment may alternatively include a self-report performed by the patient.
Objective signals may be recorded periodically, to provide updates to the base clinical assessment. Objective signals may be recorded by a sensor. The objective signal may include galvanic skin conductance or a recorded speech sample from the patient. Where the objective signal is a recorded speech sample, based on the clinical rating generated for the objective signal, the patient may be subjected to an additional clinical assessment on the clinical rating scale. Where the objective signal is a speech sample, the signal may be recorded over a communication device, including a phone, and may be recorded by an Interactive Voice Response (IVR) Server.
The base clinical assessment may also be obtained from a patient over a communication device, including a phone, and may be recorded by an IVR Server.
Combining the information generated by the base clinical assessment with information generated by analysis of the objective signal may include application of a mathematical model. The applied mathematical model may include a Kalman filter.
Where the objective signal is a speech sample, it may be analyzed by applying speech analysis techniques to extract voice features. Extraction of voice features may include identification of voiced segments of a speech sample. Voice features are then extracted from voiced segments of the speech sample. Identification of voiced segments in a speech sample includes applying a two-level Hidden Markov Model. The two-level Hidden Markov Model includes use of at least one of autocorrelation, entropy, and residual amplitude structure of the speech samples and may be applied to 30 millisecond speech samples. The identification of voiced segments may be iteratively improved using the Baum-Welch Expectation Maximization technique.
Voice features extracted from a speech sample include Class I voice features and Class II voice features. Class I features include one or more of formant frequency, confidence in formant frequency, spectral entropy, value of largest autocorrelation peak, location of largest autocorrelation peak, number of autocorrelation peaks, energy in frame and time derivative of energy in frame. Class II features include one or more of average length of voiced segment, average length of speaking segment, fraction of time speaking, voicing rate, fraction speaking over, average number of short speaking segments per minute, entropy of speaking lengths and entropy of pause lengths.
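By way of a hedged illustration only, some of the Class II features listed above might be computed from a per-frame speaking mask as sketched below. The disclosure does not define these features numerically, so the definitions used here are assumptions: each entropy is taken as the Shannon entropy of the empirical distribution of segment lengths, each frame is taken to last 30 milliseconds, and the function names are hypothetical.

```python
import math
from collections import Counter
from itertools import groupby


def run_lengths(mask, value):
    """Lengths (in frames) of consecutive runs in `mask` equal to `value`."""
    return [sum(1 for _ in group) for key, group in groupby(mask) if key == value]


def length_entropy(lengths):
    """Shannon entropy (bits) of the empirical distribution of segment lengths.

    Assumption: this is one plausible reading of 'entropy of speaking lengths'.
    """
    if not lengths:
        return 0.0
    total = len(lengths)
    counts = Counter(lengths)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def class_ii_features(mask, frame_seconds=0.03):
    """Compute a few Class II style features from a boolean per-frame mask."""
    speaking = run_lengths(mask, True)
    pauses = run_lengths(mask, False)
    average = frame_seconds * sum(speaking) / len(speaking) if speaking else 0.0
    return {
        "average_speaking_segment_s": average,
        "fraction_time_speaking": sum(speaking) / len(mask) if mask else 0.0,
        "entropy_speaking_lengths": length_entropy(speaking),
        "entropy_pause_lengths": length_entropy(pauses),
    }


# Example mask: speaking runs of 4, 4 and 2 frames; pauses of 2 and 6 frames
mask = [True] * 4 + [False] * 2 + [True] * 4 + [False] * 6 + [True] * 2
features = class_ii_features(mask)
```

In this sketch the mask would come from whatever step identifies speaking or voiced frames; the features themselves depend only on run lengths, so they are insensitive to the acoustic content of each frame.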
The objective signal may be analyzed and correlated to the clinical rating scale by providing inputs from a plurality of models (m) and uniquely corresponding meta models (m′) to a neural network. Information for correlating the objective signal to the clinical rating scale is generated by the neural network on the basis of said inputs. Inputs are provided by the models (m) and meta models (m′) on the basis of voice features extracted from the objective signal. A score on the clinical rating scale is predicted by each model (m). A corresponding confidence rating is provided by each meta model (m′). The confidence rating provided by each meta model (m′) may include a higher rating when the respective model (m) is probabilistically correct, and a lower rating when the respective model (m) is probabilistically incorrect.
In various embodiments of the present invention, the method for performing clinical assessment of a patient may be provided as a computer program product having computer readable instructions embodied therein.
These and other features and advantages of the present disclosure will be apparent to those skilled in the art of statistics driven clinical assessments from a review of the following detailed descriptions along with the accompanying figures.
In the management of a patient with a particular disease or condition, a physician or other care provider often uses a standard clinical assessment rating scale such as the Hamilton Depression Rating Scale (HDRS/HAM-D) for assessing levels of depression, the APGAR score for assessing neonatal health or the Quality of Life scale for assessing a patient's functional status. A patient may also rate his or her own disease or condition through a scale such as the Patient Health Questionnaire (PHQ-9) for assessing depression or the visual analog scale for pain (which may be used by patients to self-report levels of pain). Such standard clinical assessments are often used in clinical decision making, such as in deciding to change a medication dosage or refer a patient to a different level of medical care. Without providing an exhaustive recitation, clinical rating scales may be classified, inter alia, as falling within one of: scales for social health, scales for psychological well-being, scales for anxiety, scales for depression, scales for mental status testing, scales for pain measurements, scales for general health status, and scales for quality of life.
By way of example, in the case of major depression, the HDRS and PHQ-9 are known to be correlated with the disease or symptom severity. Often after a clinical interview or patient self-assessment, a physician or other care provider will use the numbers from these scales to increase the dose of an anti-depressant, change the class of medications, request the patient to visit a specialist for evaluation, and so on. The numbers generated through these scales form an important part of the medical evaluation. However, there are two limitations to the standard use of clinical rating scales that motivate this disclosure.
Clinical rating scales that have strong subjective components suffer from poor inter-rater reliability. In other words, two physicians or other care providers may rate a patient's mood differently using a clinical rating scale, based on their subjective clinical impressions of the patient. For example, the first field in the standard HAM-D asks the interviewer to score a ‘1’ if he or she thinks the patient indicated sadness, hopelessness, helplessness, or worthlessness only on questioning, or a ‘3’ if the patient communicated these feeling states through non-verbal cues such as facial expression, posture, voice, and tendency to weep. This scoring is subject to the impression of the interviewer, and may differ between interviewers. The higher the tally of such fields, the greater is the severity of depression.
In addition, clinical rating scales are performed at discrete, and often lengthy, time intervals through the course of clinical management of a patient. For example, a patient may be diagnosed with major depression and have an HDRS before initiation of anti-depressant medications. The physician or other care giver may conduct another HDRS at the patient's next visit. This second visit may occur more than four weeks after the initial visit. During the four weeks, the only mood rating that the physician or other care giver may have for the patient would be the initial HDRS performed. This rating scale becomes a poor estimate of the patient's mood rating as time progresses, and the physician or other care giver does not have an effective method or system to improve the reliability of that initial estimate.
The present disclosure addresses these shortcomings by providing both a method to make a clinical assessment more objective by combining it with an objective measurement (e.g. voice analysis), and a method through which more frequent objective measurements may be factored in to update an older clinical assessment. The disclosure also provides a method to improve the overall reliability of a clinical assessment and a method for arriving at a clinical assessment based exclusively on an objective measurement.
If the recorded objective signal 301 is a speech sample, such sample could provide for extraction of features including the formant frequency, confidence in formant frequency, energy in frame, spectral entropy, value and location of largest autocorrelation peak, number of autocorrelation peaks, time derivative of energy in frame, and average length of voiced segment.
A formant is a resonant frequency and formant frequencies can be found by looking for peaks in the speech signal in the frequency domain. An autocorrelation can be performed to find periodicities within a signal x(t) with mean mx for all lags k=0, 1, 2 . . . N−1.
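The autocorrelation step described above can be sketched as follows. This is a minimal illustration, assuming the standard biased, mean-removed estimator normalized by the lag-zero energy; the disclosure does not reproduce its exact formula, and the function names are hypothetical.

```python
import math


def autocorrelation(x):
    """Mean-removed, normalized autocorrelation of x for lags k = 0 .. N-1.

    Assumption: the standard biased estimator is used, divided by the
    lag-zero energy so that the value at lag 0 equals 1.
    """
    n = len(x)
    mean_x = sum(x) / n
    centered = [v - mean_x for v in x]
    energy = sum(v * v for v in centered)
    if energy == 0.0:
        return [0.0] * n  # constant signal: no periodicity to measure
    return [
        sum(centered[t] * centered[t + k] for t in range(n - k)) / energy
        for k in range(n)
    ]


def largest_peak(r):
    """Return (lag, value) of the largest local maximum of r, excluding lag 0."""
    peaks = [
        (k, r[k])
        for k in range(1, len(r) - 1)
        if r[k] > r[k - 1] and r[k] >= r[k + 1]
    ]
    return max(peaks, key=lambda p: p[1]) if peaks else None


# Example: a sinusoid with a period of 20 samples
signal = [math.sin(2 * math.pi * t / 20) for t in range(200)]
r = autocorrelation(signal)
peak = largest_peak(r)  # the lag of the largest peak recovers the period
```

The location and value of the largest peak, and the number of peaks found, correspond to the autocorrelation-based features named in the text.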
Spectral entropy is a measure of the disorder of a signal in the frequency domain. To arrive at the spectral entropy of a given speech sample, first a probability function of a power spectral density is created based on a magnitude square of the Fourier coefficients. Normalization of the function when done with respect to the total power of Fourier coefficients then yields a probability function used to compute entropy. The mathematical model M0 may use these and other techniques that would be apparent to a person of skill in the art, for extracting relevant features from the recorded speech sample, or from other recorded objective signals.
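The spectral entropy computation described above can be sketched as follows. This is an illustrative sketch only: it uses a naive discrete Fourier transform over the non-redundant half of the spectrum and a base-2 logarithm, both of which are assumptions, and the function name is hypothetical.

```python
import cmath
import math


def spectral_entropy(frame):
    """Shannon entropy (bits) of the normalized power spectrum of one frame."""
    n = len(frame)
    power = []
    # Naive DFT; only bins 0 .. n//2 are needed for a real-valued frame
    for k in range(n // 2 + 1):
        coeff = sum(
            frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)
        )
        power.append(abs(coeff) ** 2)  # squared magnitude of the coefficient
    total = sum(power)
    if total == 0.0:
        return 0.0
    # Normalizing by total power yields a probability function over bins
    probs = [p / total for p in power]
    return -sum(p * math.log2(p) for p in probs if p > 0.0)


# A pure tone concentrates power in one bin, so its entropy is near zero
tone = [math.sin(2 * math.pi * 4 * t / 64) for t in range(64)]
tone_entropy = spectral_entropy(tone)

# An impulse spreads power evenly over all bins, so its entropy is maximal
impulse = [1.0] + [0.0] * 63
impulse_entropy = spectral_entropy(impulse)
```

The two examples bracket the range of the measure: an ordered, periodic frame scores low, while a frame with a flat spectrum scores the maximum for the number of bins.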
Various techniques and mechanisms for achieving the mathematical model M1 would present themselves to a person of skill in the art. In an aspect of the invention, model M1 may be implemented by application of regression. To determine which objective signals are related to a clinical rating scale, a stepwise linear regression can be performed. The goal of said linear regression is to discover the linear combinations of signals which, taken together, would predict the maximum amount of variance in the rating scales and outcomes. This procedure would produce a linear function of the signals that predicts the rating scale. To avoid overfitting and other statistical estimation problems, a cross-validation can be performed, including by way of a 5-fold, ‘leave-twenty-percent-out’ method, with decision boundaries such that the difference between classification accuracy for the training and test data is minimized.
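A forward stepwise linear regression with 5-fold cross-validation might be sketched as follows. This is an illustration under stated assumptions rather than the disclosed implementation: the stopping rule (a minimum relative drop in held-out error, `min_gain`), the interleaved fold assignment, the synthetic data, and all function names are hypothetical.

```python
import random


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda i: abs(m[i][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for i in range(col + 1, n):
            factor = m[i][col] / m[col][col]
            for j in range(col, n + 1):
                m[i][j] -= factor * m[col][j]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit(rows, y, cols):
    """Least-squares fit of y on the chosen signal columns plus an intercept."""
    design = [[1.0] + [row[c] for c in cols] for row in rows]
    p = len(cols) + 1
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(p)] for i in range(p)]
    xty = [sum(d[i] * t for d, t in zip(design, y)) for i in range(p)]
    return solve(xtx, xty)


def predict(coef, row, cols):
    return coef[0] + sum(w * row[c] for w, c in zip(coef[1:], cols))


def cv_error(rows, y, cols, folds=5):
    """Mean squared held-out error; each fold leaves out 20% of the data."""
    total = 0.0
    for f in range(folds):
        train = [i for i in range(len(rows)) if i % folds != f]
        test = [i for i in range(len(rows)) if i % folds == f]
        coef = fit([rows[i] for i in train], [y[i] for i in train], cols)
        total += sum((predict(coef, rows[i], cols) - y[i]) ** 2 for i in test)
    return total / len(rows)


def forward_stepwise(rows, y, min_gain=0.01):
    """Add one signal at a time while held-out error drops by more than min_gain."""
    selected, remaining = [], list(range(len(rows[0])))
    best = cv_error(rows, y, selected)
    while remaining:
        err, col = min((cv_error(rows, y, selected + [c]), c) for c in remaining)
        if err >= best * (1.0 - min_gain):
            break  # no remaining signal improves held-out error enough
        selected.append(col)
        remaining.remove(col)
        best = err
    return selected, fit(rows, y, selected)


# Synthetic example: the rating depends on signals 0 and 2 but not on signal 1
rng = random.Random(0)
rows = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(100)]
y = [2.0 + 3.0 * r[0] - 1.5 * r[2] + rng.gauss(0, 0.1) for r in rows]
selected, coef = forward_stepwise(rows, y)
```

Selecting on held-out error rather than training error is what guards against overfitting here; the returned coefficients define the linear function of the signals that predicts the rating scale.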
Implementation of model M1 may also be achieved by way of a neural network. The objective signals may be provided to a Multilayer Perceptron (MLP) or a “blackbox” that creates a network with a single hidden layer and corresponding weights and bias. It has been shown that a network with a single hidden layer and sufficiently many hidden units can approximate, to arbitrary accuracy, the continuous functions computed by a network with multiple hidden layers. This combination of weighted vectors provides one output that can be correlated with a rating scale. The error between the index and the rating scale indicates how much more training is required for the neural network. A threshold can be set to a 5% change wherein, if an improvement in results is greater than 5% from the previous model, said improved neural network may be used as the modified network. It is understood that the above are only some of the techniques that would present themselves to a person of skill in the art with a view to implementing model M1.
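The 5% acceptance threshold described above might be expressed as the following check. This is a sketch only: it assumes that "improvement" means the relative reduction in error against the previous model, which the disclosure does not state explicitly, and the function name is hypothetical.

```python
def accept_retrained_network(previous_error, new_error, threshold=0.05):
    """Return True if the retrained network should replace the previous one.

    Assumption: improvement is measured as the relative reduction in error
    compared with the previous model.
    """
    if previous_error <= 0.0:
        return False  # nothing left to improve on
    improvement = (previous_error - new_error) / previous_error
    return improvement > threshold


# A drop in error from 0.20 to 0.18 is a 10% improvement, so it is accepted
accepted = accept_retrained_network(0.20, 0.18)

# A drop from 0.20 to 0.195 is a 2.5% improvement, so the old network is kept
rejected = accept_retrained_network(0.20, 0.195)
```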
The result of the data analysis using M1 is a mathematical equation that relates an objective signal or set of signals at a given time point to the clinical rating scale or disease state at the same time point. By way of example, it may be computed that a model that combines the pitch and energy within a patient's voice at a given time is highly correlated with a patient's PHQ-9 or mood at the same time. As new measurements 402 are performed, the model M1 (401) is improved. Thus the model M1 provides a means to estimate the patient's clinical rating or clinical state at a given time if surrogate measures such as the pitch or galvanic skin response are available in the form of recorded objective signals. Where the objective signal is a recorded speech sample, based on the clinical rating generated for the objective signal, the patient may be subjected to an additional clinical assessment on the clinical rating scale.
The method then provides an assessment of the patient's clinical state on the clinical rating scale, on the basis of the rating generated by analysis of the objective signal or set of signals. The estimate may be based entirely on the rating generated by analysis of the objective signal or set of signals, or may combine such rating with the initial or base clinical assessments 201 performed on the patient. Another mathematical model M2 (403) may be used to combine the data of 201 and 301 to provide a more reliable estimate of the clinical assessment. Mathematical model M2 achieves this by combining the data 201 and the estimate that M1 makes of the patient's clinical state in terms of the rating scale used to create 201 (e.g. the PHQ-9) based on the data 301 using the relationship M1 derives between 301 and 201. The techniques used for implementing model M2 may include a Kalman filter.
A scenario in which this type of training could apply is as follows. A patient may use his or her cellular phone to call and perform a PHQ-9 at various time points. With each PHQ-9, the patient may also explicitly leave a voice sample on a computer, or the patient's voice from phone calls completed around the time of the PHQ-9 may be analyzed. The result will be frequent samples of voice and PHQ-9 scores obtained around the same times. The data so obtained may be used to train M1. The clinical assessment, by way of a rating on the clinical rating scale, and the patient's speech sample can be recorded over the phone by an IVR Server.
The mathematical model discussed in the preceding paragraph is further described by the following relationship:
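$$\mu = \left[\frac{\sigma_{z_2}^2}{\sigma_{z_1}^2+\sigma_{z_2}^2}\right] z_1 + \left[\frac{\sigma_{z_1}^2}{\sigma_{z_1}^2+\sigma_{z_2}^2}\right] z_2, \qquad \frac{1}{\sigma^2} = \frac{1}{\sigma_{z_1}^2} + \frac{1}{\sigma_{z_2}^2} \qquad \text{(Equation 1)}$$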
where, μ and σ are the mean and standard deviation of the Gaussian distribution 901 respectively, σz1 and σz2 are the standard deviation of 701 and 801 respectively and z1 and z2 are observations conducted close in time. Equation 1 demonstrates that σ is lower than either σz1 or σz2.
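As a hedged numerical illustration, the combination of two observations taken close in time can be sketched with the standard inverse-variance weighting of two Gaussian estimates, which is assumed here to be the relationship the text describes. The function name and the example values are hypothetical.

```python
def fuse(z1, sigma_z1, z2, sigma_z2):
    """Combine two noisy observations of the same quantity.

    Assumption: each observation is Gaussian, so each is weighted by the
    variance of the other, and the inverse variances add.
    """
    var1 = sigma_z1 ** 2
    var2 = sigma_z2 ** 2
    mu = (var2 * z1 + var1 * z2) / (var1 + var2)
    sigma = (var1 * var2 / (var1 + var2)) ** 0.5
    return mu, sigma


# Example: a PHQ-9 result of 14 (standard deviation 3) and a voice-derived
# score of 10 (standard deviation 4), both expressed on the PHQ-9 scale
mu, sigma = fuse(14.0, 3.0, 10.0, 4.0)  # mu = 12.56, sigma = 2.4
```

The combined standard deviation of 2.4 is lower than both 3 and 4, which is the property the text attributes to the combined estimate, and the combined mean lies closer to the more reliable observation.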
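The final form of the Kalman filter that may be used to implement the mathematical model M2 is:
$$\hat{x}(t_2) = \left[\frac{\sigma_{z_2}^2}{\sigma_{z_1}^2+\sigma_{z_2}^2}\right] z_1 + \left[\frac{\sigma_{z_1}^2}{\sigma_{z_1}^2+\sigma_{z_2}^2}\right] z_2 = z_1 + \left[\frac{\sigma_{z_1}^2}{\sigma_{z_1}^2+\sigma_{z_2}^2}\right][z_2 - z_1] \qquad \text{(Equation 2)}$$
where, for example, $\hat{x}(t_2)$ is an estimate of a patient's PHQ-9 score at time t2, z1 is a PHQ-9 result and z2 is a voice feature (that has been converted through mathematical model M1 and is expressed in terms of a PHQ-9 score).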
The relationship provided in Equation 3 hereinbelow relates the estimate of PHQ-9 at time t2 to the estimate of PHQ-9 at time t1, or $\hat{x}(t_1)$.
$$\hat{x}(t_2) = \hat{x}(t_1) + K(t_2)\,[z_2 - \hat{x}(t_1)] \qquad \text{(Equation 3)}$$
$$K(t_2) = \frac{\sigma_{z_1}^2}{\sigma_{z_1}^2+\sigma_{z_2}^2}$$
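The gain form of the update can be sketched as follows. This is a minimal scalar illustration, assuming a static state with no process noise between measurements and a gain equal to the prior variance divided by the sum of the prior and measurement variances; the function name and the example values are hypothetical.

```python
def kalman_update(x_prev, var_prev, z, var_z):
    """One scalar measurement update in gain form.

    Assumptions: the underlying state does not drift between measurements
    (no process noise) and both uncertainties are Gaussian.
    """
    gain = var_prev / (var_prev + var_z)
    x_new = x_prev + gain * (z - x_prev)  # move toward the new measurement
    var_new = (1.0 - gain) * var_prev  # uncertainty shrinks after each update
    return x_new, var_new, gain


# Baseline PHQ-9 of 14 with variance 9, followed by three voice-derived
# scores (already expressed on the PHQ-9 scale), each with variance 16
x, var = 14.0, 9.0
history = []
for z in [10.0, 11.0, 9.0]:
    x, var, gain = kalman_update(x, var, z, 16.0)
    history.append((x, var, gain))
```

After the first update the estimate is 12.56 with variance 5.76, and the gain falls with each subsequent update because the running estimate becomes more certain than any single new measurement. A fuller treatment would add process noise so that an old assessment loses weight as time passes.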
The outputs of m and m′ are then fed into a Neural Network 1205 that is again trained using as its inputs the outputs from all models and meta models. The neural network uses the 0-to-1 confidence rating from each meta model m′ as well as the predicted outputs of the models m to determine a final output score. Further refinements can be made such that only subsets of the training data are sent to particular models.
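To illustrate how model scores and meta-model confidences could be combined, the sketch below substitutes a simple confidence-weighted average for the trained neural network. This is a deliberately simplified stand-in and not the network described above; the function names and the example values are hypothetical.

```python
def build_network_input(scores, confidences):
    """Interleave each model's predicted score with its meta model's confidence.

    This is the kind of input vector a combining network could be trained on.
    """
    vector = []
    for score, confidence in zip(scores, confidences):
        vector.extend([score, confidence])
    return vector


def confidence_weighted_score(scores, confidences):
    """Stand-in combiner: average the scores, weighted by confidence.

    Assumption: confidences lie between 0 and 1. A trained neural network
    could learn a richer combination than this fixed rule.
    """
    total = sum(confidences)
    if total == 0.0:
        return sum(scores) / len(scores)  # no model is trusted: plain mean
    return sum(s * c for s, c in zip(scores, confidences)) / total


# Three models predict a PHQ-9 score; their meta models rate their confidence
scores = [12.0, 8.0, 15.0]
confidences = [0.9, 0.5, 0.1]
network_input = build_network_input(scores, confidences)
final_score = confidence_weighted_score(scores, confidences)
```

In this example the final score stays close to the prediction of the most trusted model and is barely moved by the prediction with a confidence of 0.1.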
The present disclosure uses the surrogate measures both in addition to and instead of the clinical ratings that are traditionally performed on or by the patient, to increase the reliability of the overall clinical assessment. For example, in one embodiment of the present disclosure a patient may perform a PHQ-9 self-report at a first clinic visit and then be asked to call into a phone system every other day and leave a voice sample, from which an algorithm computes the pitch. As described hereinabove, regular pitch measurements can be combined using a Kalman filter with the original PHQ-9 to provide an ‘updated’ PHQ-9 that gives a more reliable assessment of the patient's depression severity.
References and/or the use of the articles “a” or “an”, unless otherwise specified herein, can be understood to include references to one or more of the noun to which the articles refer. Accordingly, throughout the entirety of the present disclosure, use of the articles “a” or “an”, unless otherwise provided, is for convenience only and is not intended to limit the noun in the singular. Use of the article “the” is also for convenience, and is not intended to limit the modified noun in the singular, and/or otherwise indicate that the disclosed methods and systems are limited to the description/depiction of the modified noun.
Although the methods and systems have been described relative to a specific embodiment thereof, they are not so limited. Obviously many modifications and variations may become apparent in light of the above teachings.
In addition, the method for performing clinical assessment of a patient may be provided as a computer program product having computer readable instructions embodied therein.
Many additional changes in the details, materials, and arrangement of parts, herein described and illustrated, can be made by those skilled in the art. Accordingly, it will be understood that the methods and systems provided herein are not to be limited to the embodiments disclosed herein, can include practices otherwise than specifically described, and are to be interpreted as broadly as allowed under the law.
This application claims the benefit of U.S. Provisional Application No. 60/895,868, filed on Mar. 20, 2007, the contents of which are hereby incorporated by reference in their entirety.
Number | Date | Country
---|---|---
60895868 | Mar 2007 | US