This invention relates to measurement of physiological parameters and more particularly to a simple, low-cost method for measuring multiple physiological parameters using digital color video.
The option of monitoring a patient's physiological signals via a remote, non-contact means has promise for improving access to and enhancing the delivery of primary healthcare. Currently, proposed solutions for non-contact measurement of vital signs such as heart rate (HR) and respiratory rate (RR) include laser Doppler [1], microwave Doppler radar [2] and thermal imaging [3, 4]. The numbers in brackets refer to the references included herewith, the contents of all of which are incorporated herein by reference. Non-contact assessment of heart rate variability (HRV), an index of cardiac autonomic activity [5], presents a greater challenge and few attempts have been made [6-8]. Despite these impressive advancements, a common drawback of the above methods is that the systems are expensive and require specialist hardware.
Photoplethysmography (PPG) is a low-cost and noninvasive means of sensing a cardiovascular blood volume pulse (BVP) through variations in transmitted or reflected light [9]. Although PPG is typically implemented using dedicated light sources (e.g., red and/or infra-red wavelengths), Verkruysse et al. showed that pulse measurements from the human face are attainable with normal ambient light as the illumination source [10]. However, this study lacked rigorous physiological and mathematical models amenable to computation; it relied instead on manual segmentation and heuristic interpretation of raw images with minimal validation of performance characteristics.
According to a first aspect, the invention is a method for measuring physiological parameters. The method includes capturing a sequence of images of a human face and identifying the location of the face in a frame of the captured images and establishing a region of interest including the face or a subset thereof. Pixels in the region of interest are separated into at least two channel values forming raw traces over time. The raw traces are decomposed into at least two independent source signals. At least one of the source signals is processed to obtain a physiological parameter.
In an embodiment of this aspect of the invention, the pixels are spatially averaged in the region of interest to yield a measurement point for each of the at least two channel values for each frame. This embodiment may include detrending and normalizing the raw traces.
In another preferred embodiment of this aspect of the invention the identifying location step utilizes a boosted cascade classifier. In this embodiment, the region of interest is a box drawn around the face or a subset thereof. The traces may be approximately five seconds to fifteen minutes long. In a preferred embodiment, the detrending step is applied to the raw traces. The raw traces are normalized and in a preferred embodiment the decomposing step uses independent component analysis.
In another preferred embodiment of this aspect of the invention the processing step includes smoothing and filtering of the separated source signals. In preferred embodiments, the physiological parameters include the blood volume pulse, cardiac interbeat interval, heart rate, respiration rate or heart rate variability. It is preferred that the video be color video. The capturing step utilizes a digital camera, web cam or mobile phone camera. In a preferred embodiment, the spatially averaging step computes a spatial mean, median or mode. The heart rate variability may be determined by power spectral density estimation. Simultaneous physiological measurements may be made of multiple users.
In yet another aspect, the invention is a method for automatic measurement of physiological parameters of at least one subject from video of a body part of the subject. The method includes localization of a region of interest from frames of the video and extraction of input signals from the region of interest. The input signals are blind source separated to recover separated source signals. One or more of the separated source signals is selected and the one or more selected source signals is processed to provide a measurement of the physiological parameters. In a preferred embodiment of this aspect of the invention, the body part is a face or a subset thereof. The localization step may be based on a trained classifier.
In a preferred embodiment, the extraction of input signals from the region of interest includes separating red, green and blue channels and computing a spatial mean, median or mode of these channels for each video frame. The blind source separation may include detrending and normalizing the input signals extracted from the region of interest. It is preferred that the blind source separation incorporate independent component analysis for the separation of source signals from the detrended and normalized input signals. The separated source signals may be processed in a time window on the order of five seconds to fifteen minutes. It is also preferred that the processing of the one or more selected source signals includes moving average filtering to obtain a blood volume pulse. In this aspect of the invention, the physiological parameters include heart rate, respiratory rate and heart rate variability.
In still another aspect, the invention is a system for determining physiological parameters, including a camera for capturing video of a human face to generate at least two signals and a computer running a program operating on the signals to determine the blood volume pulse from which other physiological parameters may be determined.
The present invention thus provides a simple, low-cost method for measuring multiple physiological parameters using a basic web cam or other color digital video camera. High degrees of agreement were achieved between the measurements across all physiological parameters. The present invention has significant potential for advancing personal healthcare and telemedicine.
a is a photograph of a human face within a video frame.
b are decompositions of the face in
c are red, green and blue raw signals.
d is a schematic representation showing independent component analysis applied to separate three independent source signals.
e are graphs of the separated source signals.
a are plots of a blood volume pulse waveform using the present invention in comparison with a waveform detected by a finger BVP sensor. The selected source signal was smoothed using a five-point moving average filter and bandpass filtered, 0.7 to 4 Hz.
b are plots of interbeat intervals formed by extracting the peaks from the BVP waveforms according to an embodiment of the invention and with a finger BVP sensor.
c illustrates a normalized Lomb periodogram of the detrended interbeat intervals exhibiting a dominant HF component.
d-2f are an example recording exhibiting a dominant LF component.
a is a plot of an interbeat interval series from a webcam.
b is a plot showing a normalized Lomb periodogram showing HF power (0.15-0.4 Hz) centered at 0.23 Hz.
c is a plot of respiration signal versus time showing a respiration waveform measured by a chest belt sensor.
d is a plot of normalized power versus frequency showing a normalized Lomb periodogram showing the fundamental respiration frequency of 0.23 Hz.
a is a scatter plot comparing measurements of heart rate.
b is a scatter plot comparing measurements of high frequency power.
c is a scatter plot comparing measurements of low frequency power.
d is a scatter plot comparing measurements of the ratio of low frequency power to high frequency power.
e is a scatter plot comparing measurements of respiration rate between a web cam and reference sensors (finger BVP for HR and HRV measurements, chest belt respiration sensor for respiration rate).
Recently, the inventors herein developed a robust method for automated computation of heart rate from digital color video recordings of the human face [11]. In this patent application we extend the methodology to quantify multiple physiological parameters. Specifically, the invention disclosed herein extracts the blood volume pulse for computation of heart rate, respiration rate as well as heart rate variability.
Some of the theory on which the present invention is based will now be provided. Independent component analysis (ICA) is a relatively new technique for uncovering independent signals from a set of observations that are composed of linear mixtures of the underlying sources [12]. The underlying source signal of interest in this patent application is the blood volume pulse that propagates throughout the body. During the cardiac cycle, volumetric changes in the facial blood vessels modify the path length of the incident ambient light such that the subsequent changes in the amount of reflected light indicate the timing of cardiovascular events. By capturing a sequence of images of the facial region with a webcam, the red, green and blue (RGB) color sensors pick up a mixture of the reflected plethysmographic signal along with other sources of fluctuations in light due to artifacts. Given that hemoglobin absorptivity differs across the visible and near-infrared spectral range [13], each color sensor records a mixture of the original source signals with slightly different weights. These observed signals from the RGB color sensors are denoted by y1(t), y2(t) and y3(t) respectively, which are amplitudes of the recorded signals at time point t. We assume three underlying source signals, represented by x1(t), x2(t) and x3(t). The ICA model assumes that the observed signals are linear mixtures of the sources, that is, y(t)=Ax(t), where the column vectors y(t)=[y1(t), y2(t), y3(t)]^T, x(t)=[x1(t), x2(t), x3(t)]^T and the square 3×3 matrix A contains the mixture coefficients aij. The aim of ICA is to find a demixing matrix W that is an approximation of the inverse of the original mixing matrix A and whose output x̂(t)=Wy(t) is an estimate of the vector x(t) containing the underlying source signals. To uncover the independent sources, W must maximize the non-Gaussianity of each source.
In practice, iterative methods are used to maximize or minimize a given cost function that measures non-Gaussianity.
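As an illustration of a cost function that measures non-Gaussianity, the following minimal sketch computes excess kurtosis, one commonly used measure. The text does not state which measure was used, so the choice of kurtosis, the function name and the test signals are all assumptions made for illustration only.

```python
import math
import random


def excess_kurtosis(samples):
    """Return the excess kurtosis of a sequence (close to 0 for Gaussian data)."""
    n = len(samples)
    mean = sum(samples) / n
    centered = [s - mean for s in samples]
    variance = sum(c * c for c in centered) / n
    fourth_moment = sum(c ** 4 for c in centered) / n
    return fourth_moment / (variance ** 2) - 3.0


# Hypothetical test signals: a pulse-like 1.2 Hz sinusoid sampled at 15 Hz
# and zero-mean Gaussian noise. Neither comes from the recordings in the text.
rng = random.Random(0)
n_samples = 5000
sinusoid = [math.sin(2 * math.pi * 1.2 * k / 15.0) for k in range(n_samples)]
gaussian = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]

kurt_sinusoid = excess_kurtosis(sinusoid)  # clearly negative: strongly non-Gaussian
kurt_gaussian = excess_kurtosis(gaussian)  # close to zero
```

An iterative ICA routine would adjust the rows of W so that a measure of this kind moves as far from zero as possible for each recovered source.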
The technology disclosed herein has been evaluated at the Massachusetts Institute of Technology. Experiments included 12 participants of both genders (four females), different ages (18-31 years) and skin color. The experiments were conducted indoors and with a varying amount of ambient sunlight entering through windows as the only source of illumination. Participants were seated at a table in front of a laptop computer at a distance of approximately 0.5 m from a built-in webcam (iSight camera). During the experiments, participants were asked to keep still, breathe spontaneously and face the webcam while their video was recorded for one minute. All videos were captured in color (24-bit RGB, three channels at 8 bits/channel) at 15 frames per second with a pixel resolution of 640×480, and saved in AVI format on the laptop computer. We also recorded the blood volume pulse of the participants along with spontaneous breathing using an FDA-approved finger BVP sensor and chest belt respiration sensor (Flexcomp Infiniti by Thought Technologies Limited) respectively at a sampling rate of 256 Hz.
All of the video and physiological recordings were analyzed offline using software written in MATLAB. With reference now to
The region of interest was then separated into three RGB channels, as shown in
for each i=1, 2, 3, and where μi and σi are the mean and standard deviation of yi(t) respectively. The normalized raw traces are then decomposed into three independent source signals using ICA (
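The spatial averaging and normalization steps can be sketched as follows. This is a minimal illustration and not the MATLAB software used in the study: detrending is omitted, the normalization is assumed to be subtraction of the mean μi followed by division by the standard deviation σi (taken here as the population standard deviation), and the function names and pixel values are hypothetical.

```python
import statistics


def spatial_mean_rgb(frame):
    """Average each color channel over all pixels of one region-of-interest frame.

    `frame` is a list of rows, each row a list of (r, g, b) tuples.
    """
    pixels = [pixel for row in frame for pixel in row]
    n = len(pixels)
    return tuple(sum(p[c] for p in pixels) / n for c in range(3))


def normalize_trace(trace):
    """Scale a raw trace to zero mean and unit standard deviation."""
    mu = statistics.fmean(trace)
    sigma = statistics.pstdev(trace)  # population form; an assumption
    return [(y - mu) / sigma for y in trace]


# Hypothetical 2x2-pixel region of interest over three frames (made-up values).
frames = [
    [[(100, 150, 90), (102, 152, 92)], [(98, 148, 88), (100, 150, 90)]],
    [[(104, 156, 91), (106, 158, 93)], [(102, 154, 89), (104, 156, 91)]],
    [[(96, 147, 89), (98, 149, 91)], [(94, 145, 87), (96, 147, 89)]],
]

raw = [spatial_mean_rgb(f) for f in frames]               # one (r, g, b) point per frame
traces = [[point[c] for point in raw] for c in range(3)]  # y1(t), y2(t), y3(t)
normalized = [normalize_trace(t) for t in traces]
```

Each normalized trace then has zero mean and unit variance, which puts the three channels on a common scale before they are passed to ICA.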
The separated source signal was smoothed using a five-point moving average filter and bandpass filtered (128-point Hamming window, 0.7 to 4 Hz). To refine the BVP peak fiducial point, the signal was interpolated with a cubic spline function at a sampling frequency of 256 Hz. We developed an algorithm to detect the BVP peaks in the interpolated signal and applied it to obtain the interbeat intervals (IBIs). To avoid inclusion of artifacts such as ectopic beats or motion, the IBIs were filtered using the NC-VT (non-causal of variable threshold) algorithm [18] with a tolerance of 30%. Heart rate was calculated from the mean of the IBI time series as 60/
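The smoothing, peak detection and heart rate steps can be sketched as below. This is an illustration under stated assumptions: the bandpass filter, cubic spline interpolation and NC-VT artifact filtering are omitted, the peak detector is a simple local-maximum heuristic rather than the algorithm developed by the inventors, and the input is a synthetic sinusoid standing in for an interpolated BVP signal.

```python
import math


def moving_average(signal, width=5):
    """Centered moving average; the window shrinks at the edges."""
    half = width // 2
    smoothed = []
    for k in range(len(signal)):
        window = signal[max(0, k - half):k + half + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


def find_peaks(signal):
    """Indices of local maxima that lie above the signal mean (simple heuristic)."""
    mean = sum(signal) / len(signal)
    return [
        k for k in range(1, len(signal) - 1)
        if signal[k] > mean and signal[k - 1] < signal[k] >= signal[k + 1]
    ]


def heart_rate_bpm(peak_indices, fs):
    """Heart rate as 60 divided by the mean interbeat interval in seconds."""
    ibis = [(b - a) / fs for a, b in zip(peak_indices, peak_indices[1:])]
    return 60.0 / (sum(ibis) / len(ibis)), ibis


# Hypothetical stand-in for an interpolated BVP: a 1.25 Hz sinusoid at 256 Hz.
fs = 256.0
bvp = [math.sin(2 * math.pi * 1.25 * k / fs) for k in range(int(10 * fs))]
peaks = find_peaks(moving_average(bvp))
hr, ibis = heart_rate_bpm(peaks, fs)  # roughly 75 beats per minute
```

With a 1.25 Hz pulse the interbeat intervals are close to 0.8 s, so the computed heart rate is close to 75 beats per minute.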
Analysis of the heart rate variability was performed by power spectral density (PSD) estimation using the Lomb periodogram. The low frequency power (LF) and high frequency power (HF) were measured as the area under the PSD curve corresponding to 0.04-0.15 Hz and 0.15-0.4 Hz respectively and quantified in normalized units to minimize the effect on the values of the changes in total power. The LF component is modulated by baroreflex activity and includes both sympathetic and parasympathetic influences [19]. The HF component reflects parasympathetic influence on the heart through efferent vagal activity and is connected to respiratory sinus arrhythmia, a cardiorespiratory phenomenon characterized by interbeat interval fluctuations that are in phase with inhalation and exhalation. We also calculated the LF/HF ratio considered to mirror sympatho/vagal balance or to reflect sympathetic modulations.
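The spectral analysis can be sketched as follows, using the classic Lomb periodogram formula for unevenly sampled data. Several details are assumptions: the frequency grid, the rectangle-rule integration, the definition of normalized units as each band divided by the sum of LF and HF, and the synthetic interbeat interval series, which is modulated at 0.25 Hz purely for illustration.

```python
import math


def lomb_periodogram(times, values, freqs_hz):
    """Classic Lomb periodogram of unevenly sampled data (arbitrary scale)."""
    mean = sum(values) / len(values)
    centered = [v - mean for v in values]
    power = []
    for f in freqs_hz:
        w = 2 * math.pi * f
        # Time offset tau makes the sine and cosine terms orthogonal.
        tau = math.atan2(sum(math.sin(2 * w * t) for t in times),
                         sum(math.cos(2 * w * t) for t in times)) / (2 * w)
        cos_terms = [math.cos(w * (t - tau)) for t in times]
        sin_terms = [math.sin(w * (t - tau)) for t in times]
        c_part = (sum(x * c for x, c in zip(centered, cos_terms)) ** 2
                  / sum(c * c for c in cos_terms))
        s_part = (sum(x * s for x, s in zip(centered, sin_terms)) ** 2
                  / sum(s * s for s in sin_terms))
        power.append(0.5 * (c_part + s_part))
    return power


def band_power(freqs_hz, power, low, high):
    """Approximate area under the PSD for low <= f < high (rectangle rule)."""
    step = freqs_hz[1] - freqs_hz[0]
    return sum(p * step for f, p in zip(freqs_hz, power) if low <= f < high)


# Hypothetical interbeat intervals: 0.8 s mean with a 0.25 Hz modulation.
times, ibis, t = [], [], 0.0
for _ in range(75):
    ibi = 0.8 + 0.05 * math.sin(2 * math.pi * 0.25 * t)
    t += ibi
    times.append(t)
    ibis.append(ibi)

freqs = [round(0.04 + 0.005 * k, 3) for k in range(72)]  # 0.04 to 0.395 Hz
power = lomb_periodogram(times, ibis, freqs)
lf = band_power(freqs, power, 0.04, 0.15)
hf = band_power(freqs, power, 0.15, 0.40)
lf_nu, hf_nu = lf / (lf + hf), hf / (lf + hf)  # normalized units (assumed form)
lf_hf_ratio = lf / hf
peak_f = freqs[power.index(max(power))]
```

Because the synthetic series is modulated at 0.25 Hz, nearly all of the power falls in the HF band and the LF/HF ratio is small.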
Since the HF component is connected with breathing, the respiration rate can be estimated from the HRV power spectrum. When the frequency of respiration changes, the center frequency of the HF peak shifts in accordance with the respiration rate [20]. Thus, we calculated respiration rate in breaths per minute from the center frequency of the HF peak, f_HFpeak (in Hz), in the heart rate variability power spectral density plot derived from the webcam recordings as 60·f_HFpeak. The respiratory rate measured using the chest belt sensor was determined from the frequency corresponding to the dominant peak, f_resp-peak (in Hz), in the power spectral density plot of the recorded respiratory waveform as 60·f_resp-peak.
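The conversion from HF peak frequency to respiration rate can be sketched as below. With the peak frequency expressed in Hz (cycles per second), a rate in breaths per minute is obtained by multiplying by 60, so a peak at 0.23 Hz corresponds to about 13.8 breaths per minute. The function name and the spectrum values are hypothetical.

```python
def respiration_rate_from_hf_peak(freqs_hz, power, low=0.15, high=0.4):
    """Return (breaths per minute, peak frequency in Hz) from the HF band peak."""
    in_band = [(p, f) for f, p in zip(freqs_hz, power) if low <= f <= high]
    peak_power, peak_freq = max(in_band)
    return 60.0 * peak_freq, peak_freq


# Hypothetical HRV spectrum (made-up values). The largest value overall sits in
# the LF band, so restricting the search to the HF band matters here.
freqs = [0.05, 0.10, 0.17, 0.20, 0.23, 0.26, 0.30, 0.35]
power = [1.50, 0.40, 0.10, 0.30, 1.00, 0.40, 0.10, 0.05]

rr, f_peak = respiration_rate_from_hf_peak(freqs, power)  # about 13.8 breaths per minute
```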
Using the techniques set forth above, we extracted the blood volume pulse waveforms from the webcam recordings via ICA. A typical example of the recovered BVP recordings is shown in
We were able to determine RR from the HRV power spectrum by locating the center frequency of the HF peak.
The level of agreement between the physiological measurements made by the invention disclosed herein and by reference sensors was assessed using Pearson's correlation coefficients (n=12). Correlation scatter plots for each measured parameter are shown in
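Pearson's correlation coefficient for paired measurements can be computed as in the sketch below. The twelve heart rate pairs are made-up placeholder values chosen only to exercise the function; they are not the study data and do not reproduce any reported result.

```python
import math


def pearson_r(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Placeholder heart rates in beats per minute for n=12 hypothetical participants.
webcam_hr = [62.1, 70.4, 75.0, 81.3, 66.8, 90.2, 58.7, 73.5, 85.1, 68.0, 77.9, 64.3]
reference_hr = [61.8, 70.9, 74.6, 82.0, 66.1, 89.5, 59.3, 73.0, 85.8, 68.4, 77.2, 64.9]

r = pearson_r(webcam_hr, reference_hr)
```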
The steps of the method of an embodiment of the invention disclosed herein are shown in
On the basis of the results in Table 1, we demonstrated the feasibility of using a simple webcam to measure multiple physiological parameters. These parameters include vital signs such as heart rate and respiration rate, as well as correlates of cardiac autonomic function through heart rate variability. Our data demonstrate that there is a strong correlation between these parameters derived from the webcam recordings and standard reference sensors. Regarding the choice of measurement epoch, a recording of 1-2 minutes is needed to assess the spectral components of HRV [5] and an averaging period of 60 beats improves the confidence in the single timing measurement from the BVP waveform [9]. The face detection algorithm is subject to head rotation limits. About three axes of pitch, rotation and yaw, the limits were 32.6±4.84, 33.4±2.34 and 18.6±3.75 degrees from the frontal position.
The results set forth above should be considered in light of limitations of the present study. First of all, the webcam video sampling rate fluctuated around 15 fps due to the use of a standard PC for image acquisition, causing misalignment of the BVP peaks compared to the reference signal. The performance of the present invention could be improved if each video frame were time stamped and the signals were resampled. Performance could also be improved by (1) using a camera with a higher frame rate or one dedicated to this computation, or by (2) using multiple slow (e.g. 30 fps) cameras, slightly jittered in their time sampling synchronization offsets so that their measures may be combined to get higher temporal resolution. Second, the video sampling rate is much lower than recommended rates (greater than or equal to 250 Hz) for HRV analysis. By interpolating at 256 Hz to refine the peaks in the BVP and improve timing estimations we achieved the high correlation shown in Table 1 above. The PPG beat-to-beat variability can be affected by changes in the pulse transit time, which is related to arterial compliance and blood pressure, but it has been shown to be a good surrogate of HRV at rest [21]. A limitation of the system disclosed herein is that only three source signals can be recovered. However, our results suggest that this is sufficient to obtain accurate measurements of the BVP.
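The suggested time stamping and resampling could look like the following minimal sketch, which maps irregularly timed samples onto a uniform grid. Linear interpolation is an assumption chosen for brevity, and the function name, time stamps and sample values are hypothetical.

```python
def resample_uniform(timestamps, samples, fs):
    """Linearly interpolate irregularly timed samples onto a uniform grid at fs Hz."""
    n_out = int((timestamps[-1] - timestamps[0]) * fs) + 1
    out_t, out_y, j = [], [], 0
    for k in range(n_out):
        t = timestamps[0] + k / fs
        # Advance to the input interval [timestamps[j], timestamps[j + 1]] containing t.
        while j < len(timestamps) - 2 and timestamps[j + 1] < t:
            j += 1
        t0, t1 = timestamps[j], timestamps[j + 1]
        frac = (t - t0) / (t1 - t0)
        out_t.append(t)
        out_y.append(samples[j] + frac * (samples[j + 1] - samples[j]))
    return out_t, out_y


# Hypothetical frame time stamps in seconds, jittering around 15 frames per second.
frame_times = [0.0, 0.07, 0.13, 0.20, 0.27, 0.33, 0.41]
frame_values = [2.0 * t for t in frame_times]  # made-up ramp signal

grid_t, grid_y = resample_uniform(frame_times, frame_values, fs=15.0)
```

A linear ramp is reproduced exactly by linear interpolation, which makes the example easy to check by hand.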
It is recognized that modifications and variations of the invention disclosed herein will be apparent to those of ordinary skill in the art, and it is intended that all such modifications and variations be included within the scope of the appended claims.
[5] M. Malik, J. Bigger, A. Camm, R. Kleiger, A. Malliani, A. Moss, and P. Schwartz, “Heart rate variability: Standards of measurement, physiological interpretation, and clinical use,” Eur Heart J, vol. 17, p. 354, 1996.
This application claims priority to provisional application Ser. No. 61/316,047 filed Mar. 22, 2010, the contents of which are incorporated herein by reference in their entirety.
Number | Date | Country
---|---|---
61316047 | Mar 2010 | US