The present invention relates to breathing abnormalities and detection thereof. In particular, a method and apparatus are described for acquiring sounds related to breathing and for identifying breathing abnormalities based on the acquired sounds.
Acoustic signals generated by internal body organs are transmitted to the skin, causing skin vibration. The stethoscope captures body sounds by detecting skin vibration. The stethoscope is currently employed by medical professionals to aid in the diagnosis of diseases by listening to body sounds and recognizing the patterns associated with specific diseases. However, such use of the stethoscope is limited by the episodic nature of data acquisition, as well as the limits of human acoustic sensitivity and pattern recognition. The electronic stethoscope was developed to digitally amplify the acoustic signal and aid in pattern recognition, but data acquisition is still limited by its episodic nature. Due to the weight of the stethoscope, and the lack of adequate, wearable design, the electronic stethoscope is not suitable for continuous monitoring for an active user.
The advance of computer processing led to research on computerized analysis of body sounds to identify disease states. These research studies are conducted in a controlled setting, where sensors are used to capture body sounds for computerized analysis.
Yet, to date, there are no systems available to monitor body sounds in an ambulatory, uncontrolled setting because of a multitude of design obstacles.
An apparatus and method are provided for evaluating respiration. A microphone is placed in contact with a patient's skin and audio is acquired through the microphone. The acquired audio is sampled, processed and stored. At least one sound associated with respiration is identified. Abnormal respiration is identified based on frequency or duration of at least the identified sound.
The present invention is designed for the continuous acquisition of body sounds for computerized analysis. In contrast, existing devices for body sound acquisition are designed for episodic acquisition of body sounds for human hearing. The difference in intended use between the present invention and existing devices leads to design differences in construction materials, weight, and mechanisms of body sound acquisition. Specifically, existing designs typically require an operator to manually press the stethoscope against the skin for adequate acoustic signal acquisition. Such data acquisition is episodic, as it is limited by the duration an operator can manually press the stethoscope against the skin. In the present invention, the device is pressed against the skin using a mechanism such as adhesives or a clip to a piece of clothing worn by the patient. As such, data acquisition can occur continuously and independent of operator effort.
Existing mechanisms of body sound acquisition include contact microphones, electromagnetic diaphragms, and air-coupler chestpieces made of metals.
Using electronic contact microphones and electromagnetic diaphragms for body sound acquisition requires tight contact between the device and the skin. Even minimal movement between the device and the skin can distort the signal significantly. Thus the use of an adhesive or a clip as an attachment mechanism may be precluded, as these attachment mechanisms do not offer sufficient skin contact for these types of body sound acquisition mechanisms.
The use of electromagnetic diaphragms requires more battery power in the case of continuous monitoring, which renders the design less desirable in wearable devices.
Body sound acquisition using an air-coupler chestpiece is more forgiving of looser skin-device contact and unwanted movements. High density materials such as metals are used in its construction for better sound quality for human hearing. However, metallic chestpieces are too heavy for wearable applications. For example, the Littmann 3200 Electronic Stethoscope chestpiece weighs 98 grams, while an exemplary embodiment of the present invention weighs 25 grams because lightweight, lower density polymeric materials, such as acrylonitrile butadiene styrene (ABS), are used. Metals that are commonly used in chestpieces include aluminum alloy in low-cost stethoscopes and steel in premium stethoscopes. Aluminum alloys have a density of approximately 2.7 g/cm³, while steels have a density of approximately 7.8 g/cm³. In contrast, ABS has a density of approximately 1 g/cm³. The use of a lightweight, lower density air-coupler chestpiece renders sound quality relatively poor for human hearing, but more than sufficient for computerized analysis.
Additionally, an exemplary embodiment of the present invention incorporates motion sensors that acquire additional physiological data used to optimize computerized body sound analysis. The physiological data include but are not limited to the phases of respiration, i.e., inhalation and exhalation, heart rate, and the degree of chest wall expansion.
A method and apparatus enable respiration of a patient to be evaluated. In accordance with an exemplary embodiment of the present invention, evaluation of a patient may lead, for example, to detection of medical issues associated with respiration of the patient. The evaluation may also lead to detection of worsening lung function in patients. Exemplary patients include asthmatics and patients with chronic obstructive pulmonary disease (COPD).
According to one aspect of the invention, a wearable device is placed in contact with a patient's body in order to receive and process sound emanating from inside the patient's body. An exploded view of an exemplary wearable device 100 is illustrated in
Optional physical filter(s) 306 may also be included. Exemplary filters include linear continuous-time filters, among others. Exemplary filter types include low-pass, high-pass, among others. Exemplary technologies include electronic, digital, mechanical, among others. Optional filter(s) 306 may receive sound prior to digitization, after digitization, or both.
The output of electrical bus interface 350 is transmitted to data processing unit 170, which is more clearly shown in
In one exemplary embodiment of the present invention, data is transferred from memory 173 to external computer 360. This is further described below.
Operation of an exemplary embodiment of the present invention is illustrated by
At step 104, sound from chest facing microphone 305 is acquired. At optional step 106, sound from background microphone 310 is acquired. The sound optionally passes through filter 306 before being converted into electrical energy by microphone 305. After being converted to electrical energy, the sound passes through A-D converter 340 and electrical bus interface 350 before being received by digital signal processor 171. Processor 171 samples audio desirably at a minimum of 20 kHz. Sampling may occur, for example, for twenty seconds. Step 108 optionally includes the step of using the audio signals received at step 106 via microphone 310 in order to perform noise cancellation. Noise cancellation is performed using algorithms that are well known to one of ordinary skill in the art of noise cancellation.
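By way of illustration only, the optional noise cancellation of step 108 might be sketched with a least-mean-squares (LMS) adaptive filter. The description above does not name a particular algorithm, so the choice of LMS, the function name lms_noise_cancel, and the parameters taps and mu are all assumptions made for this sketch.

```python
import math


def lms_noise_cancel(primary, reference, taps=8, mu=0.01):
    """Remove from `primary` the component that is predictable from `reference`.

    primary   -- samples from the chest-facing microphone (body sound plus noise)
    reference -- samples from the background microphone (assumed to be noise only)
    taps, mu  -- hypothetical filter length and step size; not taken from the source
    """
    weights = [0.0] * taps
    history = [0.0] * taps  # most recent reference samples, newest first
    cleaned = []
    for p, r in zip(primary, reference):
        history = [r] + history[:-1]
        noise_estimate = sum(w * h for w, h in zip(weights, history))
        error = p - noise_estimate  # the error signal is the cleaned output sample
        # LMS update: nudge the weights so the estimate tracks the noise in `primary`.
        weights = [w + 2 * mu * error * h for w, h in zip(weights, history)]
        cleaned.append(error)
    return cleaned
```

In this sketch the adaptive filter learns how background noise leaks into the chest-facing channel and subtracts its estimate, so what remains approximates the body sound. A deployed device would likely tune the filter length and step size to its own microphones.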
Sampled audio data is processed at step 110. Audio data is processed in order to detect certain sounds associated with breathing (and/or associated with breathing difficulties). Processing at step 110 may include, for example, Fast Fourier Transform. Processing may also include, for example, digital low pass and/or high pass Butterworth and/or Chebyshev filters.
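As a non-limiting sketch of the digital filtering mentioned for step 110, a second-order Butterworth low-pass filter can be written as a single biquad section using the bilinear transform. The function names, the filter order, and the 2 kHz cutoff used in the usage note are assumptions for illustration; the description does not specify them.

```python
import math


def butterworth_lowpass_coeffs(cutoff_hz, sample_rate_hz):
    """Return (b0, b1, b2, a1, a2) for a 2nd-order Butterworth low-pass biquad."""
    w0 = 2.0 * math.pi * cutoff_hz / sample_rate_hz
    q = 1.0 / math.sqrt(2.0)  # Q of 1/sqrt(2) gives the Butterworth response
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha
    b0 = (1.0 - math.cos(w0)) / 2.0 / a0
    b1 = (1.0 - math.cos(w0)) / a0
    b2 = b0
    a1 = -2.0 * math.cos(w0) / a0
    a2 = (1.0 - alpha) / a0
    return b0, b1, b2, a1, a2


def lowpass(samples, cutoff_hz, sample_rate_hz):
    """Apply the biquad to a list of samples (direct form I)."""
    b0, b1, b2, a1, a2 = butterworth_lowpass_coeffs(cutoff_hz, sample_rate_hz)
    x1 = x2 = y1 = y2 = 0.0
    out = []
    for x in samples:
        y = b0 * x + b1 * x1 + b2 * x2 - a1 * y1 - a2 * y2
        x2, x1 = x1, x
        y2, y1 = y1, y
        out.append(y)
    return out
```

For example, lowpass(samples, 2000, 20000) would pass a 200 Hz component nearly unchanged while strongly attenuating an 8 kHz component. A high-pass or Chebyshev variant would use different coefficient formulas.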
At optional step 112, data is stored in memory 172.
At step 114, the processed data is evaluated by processor 171 to determine if an “abnormal” respiratory sound has been captured by microphone 305. Examples of an “abnormal” respiratory sound include a wheeze, a cough, labored breathing, or some other type of respiratory sound that is indicative of a respiratory problem. Evaluation occurs as follows. In one exemplary embodiment of the present invention, the processed data (i.e. from a transform such as a Fourier transform or a wavelet transform) results in a spectrogram. The spectrogram may correspond, for example, to the 20 seconds worth of processed data that has been stored in memory 172. The spectrogram is then evaluated using a set of “predefined mathematical features”.
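For illustration, a spectrogram of the kind referred to above can be sketched as a sequence of windowed Fourier transforms. The sketch below uses a direct discrete Fourier transform so that it is self-contained; a practical implementation would more likely use a Fast Fourier Transform library. The frame length, hop size, and Hann window are assumptions, not values taken from the description.

```python
import cmath
import math


def dft_magnitudes(frame):
    """Magnitude of each non-negative frequency bin of a real-valued frame."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def spectrogram(samples, frame_len=256, hop=128):
    """Return a list of frames, each a list of bin magnitudes (time by frequency)."""
    # Periodic Hann window to reduce spectral leakage (an assumed choice).
    window = [0.5 - 0.5 * math.cos(2 * math.pi * i / frame_len) for i in range(frame_len)]
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frame = [s * w for s, w in zip(samples[start:start + frame_len], window)]
        frames.append(dft_magnitudes(frame))
    return frames
```

Each row of the result corresponds to one time slice, and bin k corresponds to a frequency of k times the sample rate divided by the frame length.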
The “predefined mathematical features” are generated from multiple “predefined spectrograms”. Each “predefined spectrogram” is generated by processing data that is known to correspond to an irregular respiratory sound (such as a wheeze). A method of generating such a predefined spectrogram is illustrated by the flowchart diagram of
Once the raw data has been acquired from the patient (step 202), and is subject to audio processing (step 204), spectrogram feature extraction (step 206) may occur.
A set of mathematical features can be extracted from each predefined spectrogram. Mathematical feature extraction is known to one of ordinary skill in the art and is described in various publications, including 1) Bahoura, M., & Pelletier, C. (2004, September). Respiratory sounds classification using cepstral analysis and Gaussian mixture models. In Engineering in Medicine and Biology Society, 2004. IEMBS'04. 26th Annual International Conference of the IEEE (Vol. 1, pp. 9-12). IEEE; 2) Bahoura, M. (2009). Pattern recognition methods applied to respiratory sounds classification into normal and wheeze classes. Computers in biology and medicine, 39(9), 824-843; 3) Palaniappan, R., & Sundaraj, K. (2013, December). Respiratory sound classification using cepstral features and support vector machine. In Intelligent Computational Systems (RAICS), 2013 IEEE Recent Advances in (pp. 132-136). IEEE; 4) Mayorga, P., Druzgalski, C., Morelos, R. L., Gonzalez, O. H., & Vidales, J. (2010, August). Acoustics based assessment of respiratory diseases using GMM classification. In Engineering in Medicine and Biology Society (EMBC), 2010 Annual International Conference of the IEEE (pp. 6312-6316). IEEE; and 5) Chien, J. C., Wu, H. D., Chong, F. C., & Li, C. I. (2007, August). Wheeze detection using cepstral analysis in gaussian mixture models. In Engineering in Medicine and Biology Society. All of the above references are hereby incorporated by reference in their entireties.
The set of mathematical features are derived from the inherent power and/or frequency of the predefined spectrogram of data clusters using mathematical methods that include but are not limited to the following: data transforms (Fourier, wavelet, discrete cosine) and logarithmic analyses. The set of mathematical features extracted from each predefined spectrogram can vary by the method with which each feature in the set is extracted. These features may include, but are not limited to, frequency, power, pitch, tone, and shape of data waveform. See Lartillot, O., & Toiviainen, P. (2007, September). A Matlab toolbox for musical feature extraction from audio. In International Conference on Digital Audio Effects (pp. 237-244). This reference is hereby incorporated by reference in its entirety.
For example, a first set of two mathematical features are extracted from a predefined spectrogram using statistical mean and mode. A second set of two mathematical features are extracted from the same predefined spectrogram using statistical mean and entropy. The set of mathematical features can also vary by the number of features in each set of mathematical features. For example, a set of twenty mathematical features are extracted from a predefined spectrogram. In another example, a set of fifty mathematical features are extracted from the same predefined spectrogram. Additionally, the mathematical features may vary by the segment lengths of the predefined spectrogram with which the mathematical features are extracted. For example, a mathematical feature extracted from one-second segments of the predefined spectrogram using a statistical method is different from a mathematical feature extracted from five-second segments of the predefined spectrogram using the same statistical method.
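The dependence of the extracted features on the statistical method and on the segment length can be sketched as follows. This is a minimal illustration assuming that a spectrogram is held as a list of frames of magnitudes; the function names and the choice of Shannon entropy are hypothetical.

```python
import math


def mean_feature(values):
    """Statistical mean of the spectrogram magnitudes in a segment."""
    return sum(values) / len(values)


def entropy_feature(values):
    """Shannon entropy (in bits) of the magnitudes, treated as a distribution."""
    total = sum(values)
    if total == 0:
        return 0.0
    probs = [v / total for v in values if v > 0]
    return -sum(p * math.log2(p) for p in probs)


def extract_features(spectrogram, frames_per_segment):
    """Return one (mean, entropy) pair per segment of consecutive frames."""
    features = []
    last_start = len(spectrogram) - frames_per_segment
    for start in range(0, last_start + 1, frames_per_segment):
        segment = spectrogram[start:start + frames_per_segment]
        flat = [v for frame in segment for v in frame]
        features.append((mean_feature(flat), entropy_feature(flat)))
    return features
```

Calling extract_features on the same spectrogram with different values of frames_per_segment yields different feature sets, which corresponds to the one-second versus five-second example above.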
The set of mathematical methods used to extract the “predefined mathematical features” is the “pre-specified feature extraction”. In one exemplary embodiment of the present invention, the “pre-specified feature extraction” is developed using mel-frequency cepstral coefficients and is optimized using machine learning methods that include but are not limited to the following: support vector machines, decision trees, Gaussian mixture models, recurrent neural networks, semi-supervised autoencoders, restricted Boltzmann machines, convolutional neural networks, and hidden Markov chains (see above references). Each machine learning method may be used alone or in combination with other machine learning methods.
The “predefined mathematical features” are derived from multiple predefined spectrograms in the following manner. A feature extraction method, as defined above, is used to extract a set of mathematical features from each predefined spectrogram corresponding to a type of respiratory sound. Multiple features are evaluated in this manner. The features from multiple respiratory sound types are then plotted together (step 208) in order to perform cluster analysis in n-dimensional space (n being the number of features extracted). For example, if three features were extracted for analysis from each data file, each data file would correspond to one point in three-dimensional space, each axis representing the value of a particular feature. Thereafter, one example of algorithm generation attempts to find a hyperplane in this three-dimensional space that maximally separates clusters of points representing specific sound types. For example, if data points from wheeze files cluster in one corner of this three-dimensional space while those from cough files cluster in another, a plane that separates these two clusters would correspond to an algorithm that distinguishes the two and is able to classify these sound types into two groups. This analysis can be extended to as many features as needed, n, thereby moving the analysis into n-dimensional space. This allows differentiation of each sound type based on its unique feature set. The algorithm that generates outputs (sets of mathematical features) that are most similar to each other is selected as the “pre-specified algorithm” as described above. For example, ten sets of twenty statistical features are extracted from ten predefined spectrograms corresponding to wheezing using different algorithms. The algorithm that extracts ten sets of features that are the most similar to each other is selected as the “pre-specified algorithm” (step 210).
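The separation of clusters by a hyperplane can be illustrated with a simple perceptron. This is a sketch only: a perceptron finds some separating hyperplane when the clusters are linearly separable, whereas the maximal separation described above is closer to what a support vector machine provides. The function names and the feature values in the usage note are hypothetical.

```python
def train_perceptron(points, labels, epochs=100, learning_rate=0.1):
    """Find weights w and bias b such that sign(w . x + b) matches each label.

    points -- feature vectors, one per data file
    labels -- +1 for one sound type (e.g., wheeze), -1 for the other (e.g., cough)
    """
    weights = [0.0] * len(points[0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in zip(points, labels):
            score = sum(w * xi for w, xi in zip(weights, x)) + bias
            if y * score <= 0:  # point is on the wrong side of the hyperplane
                weights = [w + learning_rate * y * xi for w, xi in zip(weights, x)]
                bias += learning_rate * y
                mistakes += 1
        if mistakes == 0:  # every training point is classified correctly
            break
    return weights, bias


def classify(weights, bias, x):
    """Return +1 or -1 depending on which side of the hyperplane x falls."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else -1
```

With three features per data file, each training point is a position in three-dimensional space, and the returned weights and bias define the separating plane. The same code applies unchanged to n features.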
In an exemplary graphical representation of classification, lines represent the “pre-defined algorithm” classifying data in multiple dimensions in accordance with an exemplary embodiment of the present invention. Next, the “average” of the sets of mathematical features extracted with the “pre-specified algorithm” is selected as the “predefined mathematical features”. Here, “average” is defined by mathematical similarity between the “predefined mathematical features” and each set of mathematical features from which the “predefined mathematical features” are derived.
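One possible reading of the selection at step 210 and of the “average” described above is sketched below. It assumes that similarity is measured by Euclidean distance and that the “average” is the element-wise mean (centroid); the description leaves both choices open, and the function names are hypothetical.

```python
import math


def distance(a, b):
    """Euclidean distance between two feature sets of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def mean_pairwise_distance(feature_sets):
    """Average distance over all pairs; smaller means the sets are more similar."""
    pairs = [
        (a, b)
        for i, a in enumerate(feature_sets)
        for b in feature_sets[i + 1:]
    ]
    return sum(distance(a, b) for a, b in pairs) / len(pairs)


def select_algorithm(outputs_by_algorithm):
    """Pick the algorithm whose extracted feature sets are most alike."""
    return min(
        outputs_by_algorithm,
        key=lambda name: mean_pairwise_distance(outputs_by_algorithm[name]),
    )


def centroid(feature_sets):
    """Element-wise mean, used here as the 'average' feature set (an assumption)."""
    count = len(feature_sets)
    return [sum(column) / count for column in zip(*feature_sets)]
```

The centroid is the point that minimizes the sum of squared Euclidean distances to the feature sets, which is one way to make “average” by mathematical similarity concrete.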
Evaluation of a spectrogram against a predefined spectrogram may be performed on several bases. A spectrogram is processed by the “pre-specified feature extraction” method to generate a set of mathematical features. The set of mathematical features is then compared to sets of “predefined mathematical features”, of which each set corresponds to a specific type of sound. If the similarity between the set of mathematical features extracted from a spectrogram and the predefined mathematical features of a type of respiratory sound goes past certain thresholds, then it is determined that the corresponding type of respiratory sound has been emitted. By saying “goes past” what may be meant is going above a value. What may alternatively be meant is going below a value. Thus, by portions of the spectrogram going above or below portions of the predefined spectrogram associated with possible abnormal respiratory sounds, it is determined that an abnormal respiratory sound may have occurred.
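The threshold comparison described above might be sketched as follows, assuming cosine similarity as the similarity measure and the “going above a value” sense of “goes past”. The threshold of 0.95, the function names, and the feature values in the usage note are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two feature vectors (1.0 means same direction)."""
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (norm_a * norm_b)


def match_sound_type(features, predefined, threshold=0.95):
    """Return the best-matching sound type, or None if no similarity reaches the threshold.

    predefined -- mapping from sound type to its predefined mathematical features
    """
    best_label, best_score = None, threshold
    for label, reference in predefined.items():
        score = cosine_similarity(features, reference)
        if score >= best_score:
            best_label, best_score = label, score
    return best_label
```

For example, with predefined = {"wheeze": [1.0, 0.0, 0.5], "cough": [0.0, 1.0, 0.2]}, the feature set [0.9, 0.05, 0.5] would be matched to "wheeze", while [1.0, 1.0, 0.0] would match neither type.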
Once an irregular respiratory sound (such as a wheeze) has been identified using the “predefined mathematical features” the previous 20 (for example) minutes of accumulated raw data that has been stored in memory 172 receives “further processing.” In one exemplary embodiment of the present invention, the 20 minutes of raw data is transferred from memory 172 to external computer 360 for more robust processing. In another exemplary embodiment of the present invention, depending upon the processing power of processor 171, the 20 minutes of raw data is subjected to further processing in processor 171 without being transferred to an external computer.
The idea behind “further processing” is that a first algorithm is used to possibly identify an irregular respiratory sound and a second algorithm (more robust, i.e., requiring more significant processing than the first algorithm) is applied to the raw data to try to make a more accurate determination as to whether an irregular respiratory sound (such as a wheeze) has indeed occurred. In one exemplary embodiment of the present invention, a first algorithm generates twenty mathematical features. A second algorithm generates fifty mathematical features and is more robust. In another exemplary embodiment of the present invention, the mathematical methods used to extract each mathematical feature in the second algorithm require more processing power than the mathematical methods used in the first algorithm, so that the second algorithm is more robust. In addition to using a spectrogram with the second algorithm, other factors may also be used in the analysis. Exemplary factors include: 1) user inputs, including subjective feelings, rescue inhaler use, type and frequency of medication use, current asthma status; 2) input from sensors, which include but are not limited to accelerometers, magnetometers, and gyroscopes, about a patient's current physiological status; 3) environmental inputs available from sensors, which include but are not limited to temperature sensors and barometers; and 4) environmental inputs available from an information source such as the internet. In other words, other variables are integrated into the analysis, in place of or in addition to the variables that form the basis of the analysis of the initial processed data (the 20 seconds of data, for example, discussed above).
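The two-stage arrangement described above can be sketched as a rolling buffer of raw data with an inexpensive first check that gates a more expensive second check. The class name TwoStageMonitor and its parameters are hypothetical; with 20-second windows, keeping 60 windows corresponds to the 20 minutes of raw data given as an example.

```python
from collections import deque


class TwoStageMonitor:
    """Run a cheap first-pass check per window; escalate to a second pass on a hit."""

    def __init__(self, first_pass, second_pass, windows_to_keep=60):
        self.first_pass = first_pass      # callable: one raw window -> bool
        self.second_pass = second_pass    # callable: accumulated raw samples -> bool
        # Oldest windows are discarded automatically once the buffer is full.
        self.raw_windows = deque(maxlen=windows_to_keep)

    def process_window(self, raw_window):
        """Store the window, then return True only if both passes flag it."""
        self.raw_windows.append(raw_window)
        if not self.first_pass(raw_window):
            return False
        history = [sample for window in self.raw_windows for sample in window]
        return self.second_pass(history)
```

The second-pass callable could run on processor 171 or hand the accumulated data to external computer 360; the sketch is indifferent to where that computation occurs.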
Further processing may be performed in processor 171, external computer 360, or both, depending upon respective processing power, ability to communicate wirelessly, etc.
Thus, the further processing may include determining whether processed data has passed (i.e. above or below) boundary conditions. The boundary conditions may include one or more of any of the inputs and/or characteristics identified above. This is accomplished by pre-specified algorithms previously developed using a machine-learning approach using a deep-learning framework. This involves a multi-layer classification scheme. The variables used in the pre-specified algorithms in the external computer include, but are not limited to, the exemplary variables described above.
The “raw” data that may be stored, for example, in memory 172 provides multiple functions. For example, it provides an extended period of time for respiratory sound classification. The data may be processed into a spectrogram, and then a second algorithm may be used to analyze the spectrogram, in conjunction with other variables mentioned above. As a further example, the raw data may be used to improve the algorithm. For example, should an abnormal lung sound be recognized, it can serve as a control, and the raw data is used as a dataset to further refine (or “train”) the pre-specified algorithm.
An exemplary spectrogram based on audio data captured in accordance with an exemplary embodiment of the present invention is illustrated in
The inventors continue to refine algorithms in accordance with exemplary embodiments of the present invention. For example, multiple sound samples are obtained and classified into different lung sounds. Next, the samples (spectrograms) are input into a pre-specified classification algorithm to generate a set of mathematical features. The difference between the output of this classification algorithm and the pre-defined mathematical features is used to refine the algorithms. The goal is to ensure that the classification algorithm has the variables needed to filter out unwanted noises during feature extraction. Note that the above description is based on a well-described machine learning approach.
Next, the classification algorithm can be applied to additional samples containing both an audio spectrogram and additional user data defined as “boundary conditions” above. The machine learning approach in this case need not focus on feature extraction. Rather, this machine learning approach employs predictive statistical analysis. The basic concept remains the same: the difference between the output of the classification algorithm and the pre-defined answer is used to create and adjust the weight of variables. The goal is to make the classification algorithm generalizable across different boundary conditions.
An algorithm in accordance with an exemplary embodiment of the present invention may be based on specific approaches used to train the algorithm, and the algorithm itself.
To further clarify, in one exemplary embodiment of the present invention, a respiratory condition is detected by identifying how many times a certain type of respiratory sound occurs during a time period (“frequency”). If the number of times the sound is identified in a time period goes past a threshold, then a signal is generated to indicate that an adverse respiratory condition has been detected (or that an adverse respiratory condition has gotten better or worse). By saying “goes past a threshold” what is included is meeting the threshold, going above the threshold, or going below the threshold, depending upon what adverse respiratory conditions are desired to be detected. In a further exemplary embodiment of the present invention, the number of times a certain type of respiratory sound occurs in a first time period is compared with the number of times that type of respiratory sound occurs in a second time period (the first and second time periods may or may not be overlapping, and the first and second time periods may or may not be equal). For example, the number of respiratory sounds in a first time period may be compared with the number of respiratory sounds in a second time period greater than the first time period. Comparisons may be with regard to frequency, power, location in the time frame being evaluated, and/or other criteria. In one exemplary embodiment of the present invention, the first time period may be three hours and the second time period may be 18 hours. These time periods are merely exemplary.
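The event counting and the comparison between a first and a second time period can be sketched as follows. Event times are assumed to be expressed in hours, “goes past a threshold” is taken here to mean meeting or exceeding it, and the comparison is made on events per hour; these are assumptions for illustration, as are the function names.

```python
def count_events(event_times, window_start, window_end):
    """Number of detected sounds with window_start <= time < window_end."""
    return sum(1 for t in event_times if window_start <= t < window_end)


def goes_past_threshold(event_times, now, window_hours, threshold):
    """True if the count in the most recent window meets or exceeds the threshold."""
    return count_events(event_times, now - window_hours, now) >= threshold


def hourly_rates(event_times, now, first_hours=3, second_hours=18):
    """Return (events per hour in the first period, events per hour in the second)."""
    first = count_events(event_times, now - first_hours, now) / first_hours
    second = count_events(event_times, now - second_hours, now) / second_hours
    return first, second
```

In this sketch, a first-period rate well above the second-period rate could be used to generate the signal that a respiratory condition has worsened; the opposite relationship could indicate improvement.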
In another exemplary embodiment of the present invention, respiratory issues are identified based on frequency of the audio signal (wheeze frequency of approximately 300-400 Hz) and the number of times an event occurs (frequency of the event itself). When referring to a threshold, what is meant is the number of times an event is detected (decompensation).
In a further exemplary embodiment of the present invention, the external computer (e.g., a smartphone) modulates the frequency with which sensor 160 captures data.
The results of step 118 can be displayed and/or arranged in numerous manners. For example, it is possible to perform classification of audio data with boundaries set by user input. The classification can also be performed based on data from a sensor (e.g., a gyroscope) included in a smartphone.
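As an illustration of checking for energy in the approximately 300-400 Hz range, the Goertzel algorithm can measure signal power at selected frequencies without computing a full transform. The use of Goertzel, the 25 Hz probing step, and the function names are assumptions made for this sketch.

```python
import math


def goertzel_power(samples, sample_rate_hz, freq_hz):
    """Squared magnitude of the signal's spectrum at a single frequency."""
    coeff = 2.0 * math.cos(2.0 * math.pi * freq_hz / sample_rate_hz)
    s_prev = s_prev2 = 0.0
    for x in samples:
        s = x + coeff * s_prev - s_prev2
        s_prev2, s_prev = s_prev, s
    return s_prev2 ** 2 + s_prev ** 2 - coeff * s_prev * s_prev2


def band_power(samples, sample_rate_hz, low_hz=300, high_hz=400, step_hz=25):
    """Sum of Goertzel powers at evenly spaced frequencies across the band."""
    return sum(
        goertzel_power(samples, sample_rate_hz, f)
        for f in range(low_hz, high_hz + 1, step_hz)
    )
```

A segment whose power in this band is large relative to its power elsewhere could be counted as one candidate event, and the number of such events is then what is compared against the threshold.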
In one exemplary embodiment of the present invention, a patient is able to provide feedback (e.g., a self-assessment of the diagnosis) in order to improve accuracy of diagnosis. Regardless, historical data can be accumulated over periods of time (days, months, years) to further refine boundary conditions and models used to identify respiratory problems.
In one exemplary embodiment of the present invention, a computing device other than a smartphone may be used. Exemplary computing devices include computers, tablets, etc.
In one exemplary embodiment of the present invention, results of identification of respiratory illness, and/or changes in respiratory conditions, are provided to a patient provider. The identification and/or changes may be displayed using a variety of different user interfaces.
In one exemplary embodiment of the present invention, wearable device 100 provides an indication of remaining battery life.
In one exemplary embodiment of the present invention, near-field communication (NFC) enabled tags are used to track medication and inhaler use. An NFC-enabled tag is attached to an inhaler or a medication container. After each use of the inhaler or each dose of medication, a user taps an NFC-enabled computing device to the NFC-enabled tag. The NFC-enabled computing device then records the time at which the tap occurs, which corresponds to the timing of the use of an inhaler or administering of a medication. The NFC-enabled computing device may include, but is not limited to, the following: a mobile phone, a tablet, or part of the electronic components 130. The output of medication-use tracking is a “boundary condition” as described above.
In one exemplary embodiment of the present invention, results of identification and/or changes are pushed to a patient or to a patient provider. In another exemplary embodiment, results of identification and/or changes are pulled to a patient or to a patient provider (i.e. provided on demand).
In one exemplary embodiment of the present invention, results of identification and/or changes are provided to a patient and/or patient provider in the form of emails and/or text messages and/or other forms of electronic communication.
The sampling frequency and sampling duration set forth above are merely exemplary. In one exemplary form of the present invention, sampling frequency and/or duration may be changed.
In one exemplary embodiment of the present invention, the invention is used in combination with location technology such as GPS in order to determine the location of a patient.
This application claims priority to U.S. Provisional Application 62/439,254, which is hereby incorporated by reference in its entirety.
Number | Date | Country
---|---|---
62439254 | Dec 2016 | US