This application claims the benefit of Korean Patent Application Nos. 10-2017-0021522, filed on Feb. 17, 2017, and 10-2017-0147610, filed on Nov. 7, 2017, in the Korean Intellectual Property Office, the disclosures of which are incorporated herein in their entirety by reference.
One or more embodiments relate to a method of detecting physiological information by using a pupillary response, and a system using the method, and more particularly, to a method of detecting frequency-domain cardiac information from a pupil size variation, and a system using the method.
In vital signal monitoring (VSM), physiological information can be acquired by a sensor attached to the human body. Such physiological information includes the electrocardiogram (ECG), photoplethysmogram (PPG), blood pressure (BP), galvanic skin response (GSR), skin temperature (SKT), respiration (RSP), and electroencephalogram (EEG).
The heart and brain are two main organs of the human body, and analysis thereof provides the ability to evaluate human behavior and to obtain information that may be used in response to events and in medical diagnosis. VSM may be applicable in various fields such as ubiquitous healthcare (U-healthcare), emotional information and communication technology (e-ICT), human factors and ergonomics (HF&E), human computer interfaces (HCIs), and security systems.
Regarding ECG and EEG, sensors attached to the body are used to measure physiological signals and thus may cause inconvenience to patients. That is, the human body experiences considerable stress and inconvenience when such signals are measured with attached sensors. In addition, the attached sensor hardware imposes burdens and restrictions with respect to cost and to the movement of the subject.
Therefore, VSM technology is required that measures physiological signals by non-contact, non-invasive, and non-obtrusive methods while providing unfettered movement at low cost.
Recently, VSM technology has been incorporated into wireless wearable devices allowing for the development of portable measuring equipment. These portable devices can measure heart rate (HR) and RSP by using VSM embedded into accessories such as watches, bracelets, or glasses.
Wearable device technology is predicted to develop from portable devices to “attachable” devices shortly. It is also predicted that attachable devices will transition to “edible” devices.
VSM technology has been developed to measure physiological signals by using non-contact, non-invasive, and non-obtrusive methods that provide unfettered movement at low cost. While VSM will continue to advance technologically, innovative vision-based VSM technology also needs to be developed.
One or more embodiments include a system and method for inferring and detecting human vital signs by using a non-invasive and non-obtrusive method at low cost.
In detail, one or more embodiments include a system and method for detecting frequency-domain cardiac information by using a pupillary response or pupil size variation.
Additional aspects will be set forth in part in the description which follows and, in part, will be apparent from the description, or may be learned by practice of the presented embodiments.
According to one or more exemplary embodiments, the method of detecting frequency-domain cardiac information comprises acquiring moving images of a pupil from a subject; extracting a pupil size variation (PSV) from the moving images; extracting a heart rate variability (HRV) spectrum by performing a processing procedure including a frequency analysis of the PSV; and calculating the power of at least one of a plurality of frequency bands from the HRV spectrum.
According to one or more exemplary embodiments, a system performing the method comprises a video capturing unit configured to capture the moving images of the subject; and a computer architecture based analyzing system, including analysis tools, configured to process and analyze the moving images and to calculate at least one of the powers.
According to one or more exemplary embodiments, the processing procedure includes filtering using a band pass filter (BPF) and frequency analysis using fast Fourier transformation (FFT).
According to one or more exemplary embodiments, the plurality of frequency bands are the same as a plurality of bands extracted from electrocardiogram (ECG) signals.
According to one or more exemplary embodiments, the plurality of frequency bands include at least one of a very low frequency (VLF) in a range of 0.0033 Hz to 0.04 Hz, a low frequency (LF) in a range of 0.04 Hz to 0.15 Hz, and a high frequency (HF) in a range of 0.15 Hz to 0.4 Hz.
These and/or other aspects will become apparent and more readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings in which:
Reference will now be made in detail to embodiments, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to like elements throughout. In this regard, the present embodiments may have different forms and should not be construed as being limited to the descriptions set forth herein. Accordingly, the embodiments are merely described below, by referring to the figures, to explain aspects of the present description.
Hereinafter, a method and system for inferring and detecting physiological signals according to the present inventive concept is described with reference to the accompanying drawings.
The invention may, however, be embodied in many different forms and should not be construed as being limited to the embodiments set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of the invention to those of ordinary skill in the art. Like reference numerals in the drawings denote like elements. In the drawings, elements and regions are schematically illustrated. Accordingly, the concept of the invention is not limited by the relative sizes or distances shown in the attached drawings.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used herein, the singular forms “a”, “an” and “the” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising,” or “includes” and/or “including” when used in this specification, specify the presence of stated features, numbers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, numbers, steps, operations, elements, components, and/or groups thereof.
Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. It will be further understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and/or the present application, and will not be interpreted in an overly formal sense unless expressly so defined herein.
The embodiments described below involve processing frequency-domain cardiac parameters from a pupillary response which is obtained from video information.
The present invention, which may be sufficiently understood through the embodiments described below, involves extraction of frequency-domain cardiac parameters from the pupillary response or pupil size variation by using a vision system equipped with a video camera such as a webcam without any physical restriction or psychological pressure on the subject. In particular, the pupillary response is detected from the image information and frequency-domain cardiac parameters are extracted from the detected pupillary response.
In an experiment of the present invention, the reliability of the frequency-domain cardiac parameters extracted from the pupil size variation (PSV) acquired through moving images was compared with the ground truth signal obtained by ECG sensors.
Experiments in relation to the present invention were performed by using video equipment and a computer architecture based analyzing system for processing and analyzing the moving images, which included analysis tools provided by software. The system according to exemplary embodiments was developed using Visual C++ 2010 and OpenCV 2.4.3. The signal processing functions for fast Fourier transformation (FFT), band-pass filtering (BPF), etc. were provided by LabVIEW 2010.
In order to cause variation in a physiological state, this experiment used sound stimuli based on Russell's circumplex model (Russell, 1980). The sound stimuli included a plurality of factors, including arousal, relaxation, positive, negative, and neutral sounds. The neutral sound was defined by an absence of acoustic stimulus. The steps for selecting sound stimuli are shown in
(S11) Nine hundred sound sources were collected from the broadcast media such as advertisements, dramas, and movies.
(S12) The sound sources were then categorized into four groups (i.e., arousal, relaxation, positive, and negative). Each group comprised 10 commonly selected items based on a focus group discussion, for a total of forty sound stimuli.
(S13) These stimuli were used to conduct surveys for suitability for each emotion (i.e., A: arousal, R: relaxation, P: positive, and N: negative) based on data gathered from 150 subjects that were evenly split into 75 males and 75 females. The mean age was 27.36 years±1.66 years. A subjective evaluation was required to select each item for the four factors, which could result in duplicates of one or more of the items.
(S14) A chi-square test for goodness-of-fit was performed to determine whether each emotion sound was equally preferred. Preference for each emotion sound was equally distributed in the population (arousal: 6 items, relaxation: 6 items, positive: 8 items, and negative: 4 items) as shown in Table 1.
Table 1 shows the chi-square test results for goodness-of-fit in which the items selected for each emotion are based on comparisons of observation and expectation values.
Resurveys of the sound stimuli were conducted with the 150 subjects to rate the relation of each stimulus to each emotion on a seven-point scale, where 1 indicated strong disagreement and 7 indicated strong agreement.
Valid sounds relating to each emotion were analyzed using principal component analysis (PCA) based on Varimax (orthogonal) rotation. The analysis yielded four factors explaining the variance of the entire set of variables. After obtaining the analysis result, representative sound stimuli for each emotion were derived, as shown in Table 2.
In Table 2, bold type indicates items belonging to the same factor, italics indicate communalities below 0.5, and the thick, underlined lettering represents the representative acoustic stimulus for each emotion.
Table 2 lists the Varimax-rotated factor loadings of the individual sound items on the four derived factors. For example, the items positive 9, relaxation 2, negative 1, and arousal 1 each loaded most strongly on a different factor (.812, .684, .672, and .774, respectively); the remaining loading values are not reproduced here.
Seventy undergraduate volunteers of both genders, evenly split between males and females, ranging in age from 20 to 30 years old with a mean of 24.52 years±0.64 years participated in this experiment. All subjects had normal or corrected-to-normal vision (i.e., over 0.8), and no family or medical history of disease involving visual function, cardiovascular system, or the central nervous system. Informed written consent was obtained from each subject prior to the study. This experimental study was approved by the Institutional Review Board of Sangmyung University, Seoul, South Korea (2015-8-1).
The experiment was composed of two trials, where each trial was conducted for a duration of 5 minutes. The first trial was based on the motionlessness condition (MNC), which involves not moving or speaking. The second trial was based on a natural movement condition (NMC) involving simple conversations and slight movements. Participants repeatedly conducted the two trials, and the order was randomized across the subjects. In order to verify the difference of movement between the two conditions, this experiment quantitatively measured the amount of movement during the experiment by using webcam images of each subject. In the present invention, the moving image may include at least one pupil, that is, an image of one pupil or of both pupils.
The images were recorded at 30 frames per second (fps) with a resolution of 1920×1080 by using an HD Pro C920 camera from Logitech Inc. Movement of the upper body and face was measured based on MPEG-4 (Tekalp and Ostermann, 2000; Pandzic and Forchheimer, 2002). The movement in the upper body was extracted from the whole image based on frame differences. The upper body line was not tracked because the background was stationary.
The movement in the face was extracted from 84 MPEG-4 animation points based on frame differences by using visage SDK 7.4 software from Visage Technologies Inc. All movement data used the mean value for each subject during the experiment and were compared to assess the difference of movement between the two trials, as shown in
In
In order to cause the variation of physiological states, sound stimuli were presented to the participants during the trials. Each sound stimulus was randomly presented for 1 min, for a total of five stimuli over the 5 min trial. A reference stimulus was presented for 3 min prior to the initiation of the task. The detailed experimental procedure is shown in
The experimental procedure includes the sensor attachment S31, the measurement task S32 and the sensor removal S33 as shown in
The experiment was conducted indoors with varying illumination caused by sunlight entering through the windows. The participants gazed at a black wall at a distance of 1.5 m while sitting in a comfortable chair. Sound stimuli were presented identically in both trials by using earphones. The subjects were asked to restrict their movement and speech during the MNC trial. However, the NMC trial involved a simple conversation and slight movement by the subjects. The subjects were asked to introduce themselves to another person as part of the conversation about the sound stimuli, thereby engaging their feelings and thoughts about the sound stimuli. During the experiment, ECG signals and pupil image data were obtained.
ECG signals were sampled and recorded at a 500 Hz sampling rate through one channel with the lead-I method by an amplifier system including ECG 100C amplifiers and an MP100 power supply from BIOPAC Systems Inc. The ECG signals were digitized by an NI-DAQ-Pad 9205 from National Instruments Inc.
Pupil images were recorded at 125 fps with a resolution of 960×400 by a GS3-U3-23S6M-C infrared camera from Point Grey Research Inc.
Hereinafter, a method for extracting or constructing (recovering) vital signs from a pupillary response will be described.
The pupil detection procedure acquires moving images using the infrared video camera system as shown in
The pupil detection procedure may require following certain image processing steps since the images were captured using an infrared video camera, as shown in
Threshold = (−0.418 × Bmean + 1.051 × Bmax) + 7.973  <Equation 1>
where B denotes the brightness value, Bmean the mean brightness of the image, and Bmax the maximum brightness of the image.
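For illustration only, the following is a minimal sketch of how such an adaptive threshold could be applied to a grayscale infrared frame; the buffer layout, the function name, and the assumption that the pupil appears dark under infrared illumination are illustrative assumptions and not part of the original disclosure.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Binarize a grayscale infrared frame using the adaptive threshold of
// Equation 1: threshold = -0.418 * mean(B) + 1.051 * max(B) + 7.973,
// where B is the pixel brightness. Pixels darker than the threshold are
// kept as pupil candidates (value 255); all others are cleared.
// Assumption: dark-pupil imaging, i.e., the pupil is darker than the iris.
std::vector<uint8_t> binarizePupilCandidates(const std::vector<uint8_t>& gray)
{
    double sum = 0.0;
    uint8_t maxB = 0;
    for (uint8_t b : gray) {
        sum += b;
        maxB = std::max(maxB, b);
    }
    const double meanB = gray.empty() ? 0.0 : sum / gray.size();
    const double threshold = -0.418 * meanB + 1.051 * maxB + 7.973;

    std::vector<uint8_t> binary(gray.size(), 0);
    for (size_t i = 0; i < gray.size(); ++i) {
        binary[i] = (gray[i] < threshold) ? 255 : 0;
    }
    return binary;
}
```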
The next step to determine the pupil position involved processing the binary image by using a circular edge detection algorithm, as shown in Equation 2 (Daugman, 2004; Lee et al., 2009).
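Equation 2 itself is not reproduced here. The simplified sketch below only illustrates the general idea behind a circular edge search, namely finding the center and radius at which the brightness changes most sharply across the circle boundary; the coarse grid search, sampling step, and ring comparison are illustrative assumptions rather than the actual algorithm of Daugman (2004).

```cpp
#include <cmath>
#include <cstdint>
#include <vector>

struct Circle { int x; int y; int r; double score; };

// Mean brightness sampled along a circle of radius r centered at (cx, cy).
static double ringMean(const std::vector<uint8_t>& img, int w, int h,
                       int cx, int cy, int r)
{
    const double kPi = 3.14159265358979323846;
    double sum = 0.0;
    int n = 0;
    for (int deg = 0; deg < 360; deg += 4) {
        double a = deg * kPi / 180.0;
        int x = cx + static_cast<int>(r * std::cos(a));
        int y = cy + static_cast<int>(r * std::sin(a));
        if (x >= 0 && x < w && y >= 0 && y < h) { sum += img[y * w + x]; ++n; }
    }
    return n > 0 ? sum / n : 0.0;
}

// Coarse search over centers and radii for the circle whose boundary shows
// the largest inward-to-outward brightness increase (dark pupil against a
// brighter iris). A practical implementation would smooth the image and
// restrict the search region; this is only a simplified illustration.
Circle findPupilCircle(const std::vector<uint8_t>& img, int w, int h,
                       int rMin, int rMax)
{
    Circle best{0, 0, rMin, -1.0};
    for (int cy = rMax; cy < h - rMax; cy += 2)
        for (int cx = rMax; cx < w - rMax; cx += 2)
            for (int r = rMin; r < rMax; ++r) {
                double diff = ringMean(img, w, h, cx, cy, r + 2)
                            - ringMean(img, w, h, cx, cy, r);
                if (diff > best.score) best = {cx, cy, r, diff};
            }
    return best;
}
```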
In cases where multiple pupil position candidates are selected, the reflected light caused by the infrared lamp may be used to identify the correct one. An accurate pupil position was then obtained, including the centroid coordinates (x, y) and a diameter.
The pupil diameter data (signal), as the PSV, was resampled from 30 Hz to 1 Hz, as shown in Equation 3. The resampling procedure took the 30 pupil diameter data points per second and calculated their mean value during 1-s intervals by using a common sliding moving average technique (i.e., a window size of 1 second and a resolution of 1 second). However, non-tracked pupil diameter data caused by eye closure was not included in the resampling procedure.
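As a hedged illustration of this resampling step, the sketch below averages the pupil-diameter samples over 1-second windows (window size of 1 second, resolution of 1 second) and skips samples flagged as non-tracked during eye closure; the marker used for non-tracked samples and the handling of fully closed seconds are assumptions.

```cpp
#include <vector>

// Resample a raw pupil-diameter signal into a 1 Hz pupil size variation
// (PSV) series: the mean diameter is taken over each 1-second window.
// Samples flagged as non-tracked (e.g., during eye closure) are excluded
// from the average. Using a non-positive value as the "not tracked" marker
// is an assumption for this sketch.
std::vector<double> resamplePSV(const std::vector<double>& diameter,
                                int samplesPerSecond)
{
    std::vector<double> psv;
    for (size_t start = 0; start + samplesPerSecond <= diameter.size();
         start += samplesPerSecond) {
        double sum = 0.0;
        int valid = 0;
        for (int i = 0; i < samplesPerSecond; ++i) {
            double d = diameter[start + i];
            if (d > 0.0) { sum += d; ++valid; }   // skip non-tracked samples
        }
        // If the eye was closed for the whole second, repeat the previous
        // value (or 0 if there is none) so the series stays evenly sampled.
        if (valid > 0) psv.push_back(sum / valid);
        else psv.push_back(psv.empty() ? 0.0 : psv.back());
    }
    return psv;
}
```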
The detection of the cardiac frequency-domain indexes (parameters) is now described with reference to
The HRV indexes, such as very low frequency (VLF), low frequency (LF), high frequency (HF), VLF/HF ratio, and LF/HF ratio, are indicators of autonomic balance. The VLF band is an index of sympathetic activity, while the HF band is an index of parasympathetic activity. The LF band is a complex mixture comprising sympathetic and parasympathetic efferent and afferent activities as well as vascular system resonance (Malik, 1996; Shen et al., 2003; Reyes del Paso et al., 2013; Park et al., 2014).
Referring
The VLF, LF, and HF powers were calculated from the ratio between the total band power and the HRV index band power (VLF, LF, and HF), as shown in Equation (5). This procedure was processed by a sliding window technique (window size: 180 sec and resolution: 1 sec).
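Equation (5) is not reproduced here; the following sketch merely illustrates the spirit of this step, computing a power spectrum of a 180-second PSV window sampled at 1 Hz and expressing the VLF, LF, and HF powers relative to the total power in the 0.0033 Hz to 0.4 Hz band. The naive DFT (used here in place of the FFT routine of the original system) and the normalization details are assumptions.

```cpp
#include <cmath>
#include <vector>

struct BandPowers { double vlf; double lf; double hf; };

// Power spectrum of a signal segment via a naive DFT (illustration only).
// For a segment of N samples at fs Hz, bin k corresponds to fs * k / N Hz.
static std::vector<double> powerSpectrum(const std::vector<double>& x)
{
    const double kPi = 3.14159265358979323846;
    const size_t n = x.size();
    std::vector<double> power(n / 2 + 1, 0.0);
    for (size_t k = 0; k < power.size(); ++k) {
        double re = 0.0, im = 0.0;
        for (size_t t = 0; t < n; ++t) {
            double ang = 2.0 * kPi * k * t / n;
            re += x[t] * std::cos(ang);
            im -= x[t] * std::sin(ang);
        }
        power[k] = (re * re + im * im) / n;
    }
    return power;
}

// Sum of spectral power between lo (inclusive) and hi (exclusive), in Hz.
static double bandPower(const std::vector<double>& power, size_t n,
                        double fs, double lo, double hi)
{
    double sum = 0.0;
    for (size_t k = 1; k < power.size(); ++k) {
        double f = fs * k / n;
        if (f >= lo && f < hi) sum += power[k];
    }
    return sum;
}

// VLF, LF, and HF powers of a 180 s PSV window (1 Hz sampling), each
// expressed relative to the total power in the 0.0033-0.4 Hz band.
BandPowers hrvBandPowers(const std::vector<double>& psvWindow)
{
    const double fs = 1.0;                        // PSV is resampled to 1 Hz
    std::vector<double> p = powerSpectrum(psvWindow);
    size_t n = psvWindow.size();
    double vlf = bandPower(p, n, fs, 0.0033, 0.04);
    double lf  = bandPower(p, n, fs, 0.04, 0.15);
    double hf  = bandPower(p, n, fs, 0.15, 0.4);
    double total = vlf + lf + hf;
    if (total <= 0.0) return {0.0, 0.0, 0.0};
    return {vlf / total, lf / total, hf / total};
}
```

The VLF/HF and LF/HF ratios then follow directly from the returned components, and the 180-second window can be slid in 1-second steps to produce a time series of indexes.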
R-peaks were detected from the ECG signals, and the R-R intervals (RRI) were then calculated. The successive RRI values were resampled at 2 Hz and converted into the HRV spectrum by using the FFT, as shown in Equation (7). The HRV spectrum was categorized into frequency bands: a VLF band ranging from 0.0033 Hz to 0.04 Hz; an LF band ranging from 0.04 Hz to 0.15 Hz; and an HF band ranging from 0.15 Hz to 0.4 Hz. Then, the power of each frequency band was extracted. The VLF, LF, and HF powers were converted into activity ratios such as the VLF/HF or LF/HF ratio. The detailed procedure for processing the ECG signals is shown in
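For the ground-truth side, the following sketch illustrates one possible way to turn detected R-peak times into an RRI series evenly resampled at 2 Hz by linear interpolation, after which the same band-power computation sketched above could be applied; the R-peak detection itself is omitted, and the interpolation scheme is an assumption.

```cpp
#include <vector>

// Convert R-peak times (in seconds) into R-R intervals and resample the
// RRI series at 2 Hz by linear interpolation, so that it can be analyzed
// with the same spectral band-power routine used for the PSV.
std::vector<double> resampleRRIat2Hz(const std::vector<double>& rPeakTimes)
{
    std::vector<double> rriTime, rriValue;
    for (size_t i = 1; i < rPeakTimes.size(); ++i) {
        rriTime.push_back(rPeakTimes[i]);                      // interval end time
        rriValue.push_back(rPeakTimes[i] - rPeakTimes[i - 1]); // interval length
    }

    std::vector<double> resampled;
    if (rriTime.size() < 2) return resampled;
    const double step = 0.5;                                   // 2 Hz sampling
    size_t j = 1;
    for (double t = rriTime.front(); t <= rriTime.back(); t += step) {
        while (j + 1 < rriTime.size() && rriTime[j] < t) ++j;
        double t0 = rriTime[j - 1], t1 = rriTime[j];
        double w = (t1 > t0) ? (t - t0) / (t1 - t0) : 0.0;
        resampled.push_back(rriValue[j - 1] + w * (rriValue[j] - rriValue[j - 1]));
    }
    return resampled;
}
```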
The pupillary response was processed to extract the vital signs from the cardiac time domain index, cardiac frequency domain index, EEG spectral index, and the HEP index of the test subjects. These components were compared with each index from the sensor signals (i.e., the ground truth) based on the correlation coefficient (r) and mean error value (ME). The data were analyzed with respect to both MNC and NMC for the test subjects.
To verify the difference in the amount of movement between the two conditions of MNC and NMC, the movement data was quantitatively analyzed. The movement data followed a normal distribution based on a normality test (probability value (p)>0.05) and was analyzed by an independent t-test. A Bonferroni correction was performed for the derived statistical significances (Dunnett, 1955). The statistical significance level was controlled based on the number of individual hypotheses (i.e., α=0.05/n). The statistical significance level of the movement data was set to 0.0167 (upper body, X and Y axes in the face; α=0.05/3). The effect size based on Cohen's d was also calculated to confirm practical significance. In Cohen's d, standard values of 0.10, 0.25, and 0.40 for effect size are generally regarded as small, medium, and large, respectively (Cohen, 2013).
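As a small numerical illustration of the correction described above, the snippet below computes the Bonferroni-adjusted significance level for the three movement measures (α=0.05/3≈0.0167) and Cohen's d for two independent samples using a pooled standard deviation; the pooled-variance formulation and the toy data are assumptions, not values from the experiment.

```cpp
#include <cmath>
#include <cstdio>
#include <vector>

static double mean(const std::vector<double>& v)
{
    double s = 0.0;
    for (double x : v) s += x;
    return v.empty() ? 0.0 : s / v.size();
}

static double variance(const std::vector<double>& v, double m)
{
    double s = 0.0;
    for (double x : v) s += (x - m) * (x - m);
    return v.size() > 1 ? s / (v.size() - 1) : 0.0;
}

// Cohen's d for two independent groups using the pooled standard deviation.
double cohensD(const std::vector<double>& a, const std::vector<double>& b)
{
    double ma = mean(a), mb = mean(b);
    double pooled = std::sqrt(((a.size() - 1) * variance(a, ma) +
                               (b.size() - 1) * variance(b, mb)) /
                              (a.size() + b.size() - 2));
    return pooled > 0.0 ? (ma - mb) / pooled : 0.0;
}

int main()
{
    // Bonferroni correction for three movement measures
    // (upper body, X axis of the face, Y axis of the face).
    const double alpha = 0.05 / 3.0;            // ~0.0167
    std::printf("corrected alpha = %.4f\n", alpha);

    // Toy movement data for the two conditions (illustrative values only).
    std::vector<double> mnc = {1.1, 0.9, 1.0, 1.2};
    std::vector<double> nmc = {2.0, 2.3, 1.9, 2.1};
    std::printf("Cohen's d = %.2f\n", cohensD(mnc, nmc));
    return 0;
}
```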
Referring
The cardiac frequency domain indexes, such as VLF power, LF power, HF power, VLF/HF ratio, and LF/HF ratio, for the cardiac output were extracted from the pupillary response. These components were compared with the frequency index from the ECG signals (ground truth).
This experiment was able to determine the cardiac frequency index (i.e., VLF power, LF power, HF power, VLF/HF power ratio, and LF/HF power ratio) from the pupillary response by the entrainment of the harmonic frequency.
The cardiac HRV index range of 0.0033 Hz-0.4 Hz was closely connected with the circadian pupillary rhythm of the same frequency range. The size variation of the pupil diameter was divided into three bands: VLF (0.0033 Hz-0.04 Hz), LF (0.04 Hz-0.15 Hz), and HF (0.15 Hz-0.4 Hz).
In addition, the size variation was extracted for each power band from the ratio between the total power bands (0.0033 Hz-0.4 Hz). These were synchronized with the HRV index within the frequency band of 1 Hz. The VLF/HF and LF/HF power ratio were then calculated from the individual VLF, LF, and HF components.
In
Table 4 shows the average correlation coefficient and mean error of the cardiac frequency index in MNC (N=270, p<0.01). These results were processed by using the sliding window technique based on a window size of 180 sec and a resolution of 1 sec, using the data recorded for 300 sec. The correlation and mean error were the mean values for the 70 subjects (for one subject, N=120).
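The comparison metrics reported here can be illustrated with the sketch below, which computes the Pearson correlation coefficient and a mean error between a pupil-derived index series and the corresponding ECG-derived (ground-truth) series over one window; treating the mean error as a mean absolute difference is an assumption.

```cpp
#include <cmath>
#include <vector>

// Pearson correlation coefficient between two equally long index series.
double pearsonR(const std::vector<double>& x, const std::vector<double>& y)
{
    const size_t n = x.size();
    if (n == 0) return 0.0;
    double mx = 0.0, my = 0.0;
    for (size_t i = 0; i < n; ++i) { mx += x[i]; my += y[i]; }
    mx /= n; my /= n;
    double sxy = 0.0, sxx = 0.0, syy = 0.0;
    for (size_t i = 0; i < n; ++i) {
        sxy += (x[i] - mx) * (y[i] - my);
        sxx += (x[i] - mx) * (x[i] - mx);
        syy += (y[i] - my) * (y[i] - my);
    }
    double denom = std::sqrt(sxx * syy);
    return denom > 0.0 ? sxy / denom : 0.0;
}

// Mean error between the pupil-derived index and the ground-truth index,
// taken here as the mean absolute difference (an assumption).
double meanError(const std::vector<double>& pupil,
                 const std::vector<double>& ecg)
{
    double s = 0.0;
    for (size_t i = 0; i < pupil.size(); ++i) s += std::fabs(pupil[i] - ecg[i]);
    return pupil.empty() ? 0.0 : s / pupil.size();
}
```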
In the comparison with the ground truth in MNC, the cardiac frequency indexes from the pupillary response showed strong correlations for all parameters, with r=0.888±0.044 for VLF power; r=0.898±0.058 for LF power; r=0.896±0.054 for HF power; r=0.797±0.080 for the VLF/HF ratio; and r=0.801±0.086 for the LF/HF ratio, as shown in Table 4.
Also, the mean error was low for all parameters, with ME=0.353±0.258 for VLF power; ME=0.329±0.243 for LF power; ME=0.301±0.250 for HF power; ME=0.497±0.386 for the VLF/HF ratio; and ME=0.492±0.372 for the LF/HF ratio.
Table 5 shows an average of correlation coefficients and a mean error of the cardiac frequency index of the subjects in NMC (N=120, p<0.01). The procedure for Table 5 was performed by using the sliding window technique where the window size was 180 sec and the resolution size was 1 sec by using the recorded data for 300 s.
Comparing the results with the ground truth in NMC, the cardiac frequency indexes from the pupillary response indicate a strong correlation for all parameters, where r=0.850±0.057 for VLF power; r=0.864±0.062 for LF power; r=0.855±0.066 for HF power; r=0.784±0.073 for the VLF/HF power ratio; and r=0.791±0.077 for the LF/HF power ratio. The mean error was also low for all parameters, with ME=0.457±0.313 for the VLF power; ME=0.506±0.292 for the LF power; ME=0.546±0.435 for the HF power; ME=0.692±0.436 for the VLF/HF power ratio; and ME=0.692±0.467 for the LF/HF power ratio.
The real-time system for detecting information of the cardiac frequency domain was developed based on capturing and processing of pupil images. This system may include an infrared webcam, near IR (Infra-Red light) illuminator (IR lamp) and personal computer for analysis.
The infrared webcam was divided into two types: the fixed type, which is a common USB webcam, and the portable type, which is represented by wearable devices. The webcam was an HD Pro C920 from Logitech Inc. converted into an infrared webcam to detect the pupil area.
The IR filter inside the webcam was removed, and an IR-passing filter from Kodak Inc., used for cutting visible light, was inserted into the webcam to allow passage of IR wavelengths longer than 750 nm, as shown in
The conventional 12 mm lens of the USB webcam shown in
As described above, the present invention develops and provides an advanced method for measuring human vital signs from moving images of the pupil. Thereby, the measurement of parameters in the cardiac frequency domain can be performed by using a low-cost infrared webcam system that monitors the pupillary response.
This result was verified for both the conditions of noise (MNC and NMC) and various physiological states (variation of arousal and valence level by emotional stimuli of sound) for seventy subjects.
The research for this invention examined the variation in human physiological conditions caused by the stimuli of arousal, relaxation, positive, negative, and neutral moods during verification experiments. The method based on pupillary response according to the present invention is an advanced technique for vital sign monitoring that can measure vital signs in either static or dynamic situations.
The present invention may be applied to various industries such as U-health care, emotional information and communication technology (ICT), human factors, human computer interfaces (HCIs), and security that require vital signal monitoring (VSM) technology.
It should be understood that embodiments described herein should be considered in a descriptive sense only and not for purposes of limitation. Descriptions of features or aspects within each embodiment should typically be considered as available for other similar features or aspects in other embodiments.
While one or more embodiments have been described with reference to the figures, it will be understood by those of ordinary skill in the art that various changes in form and details may be made therein without departing from the spirit and scope of the disclosure as defined by the following claims.
Number | Date | Country | Kind |
---|---|---|---|
10-2017-0021522 | Feb 2017 | KR | national |
10-2017-0147610 | Nov 2017 | KR | national |