Examples described herein relate generally to systems for recognizing agonal breathing. Examples of detecting agonal breathing using a trained neural network are described.
Out-of-hospital cardiac arrest (OHCA) is a leading cause of death worldwide and in North America accounts for nearly 300,000 deaths annually. A relatively under-appreciated diagnostic element of cardiac arrest is the presence of a distinctive type of disordered breathing: agonal breathing. Agonal breathing, which arises from a brainstem reflex in the setting of severe hypoxia, appears to be evident in approximately half of cardiac arrest cases reported to 9-1-1. Agonal breathing may be characterized by a relatively short duration of collapse and has been associated with higher survival rates, though agonal breathing may also confuse the rescuer or 9-1-1 operator about the nature of the illness. Sometimes reported as “gasping” breaths, agonal respirations may hold potential as an audible diagnostic biomarker, particularly in unwitnessed cardiac arrests that occur in a private residence, the location of ⅔ of all OHCAs.
Early CPR is a core treatment, underscoring the vital importance of timely detection followed by initiation of the series of time-dependent, coordinated actions that comprise the chain of survival. Hundreds of thousands of people worldwide die annually from unwitnessed cardiac arrest, without any chance of survival, because they are unable to activate this chain of survival and receive timely resuscitation. Timely identification and detection of cardiac arrest is therefore important to the ability to provide prompt assistance.
Example systems are disclosed herein. In an embodiment of the disclosure, an example system includes a microphone configured to receive audio signals, processing circuitry, and at least one computer readable media encoded with instructions which when executed by the processing circuitry cause the system to classify an agonal breathing event in the audio signals using a trained neural network.
Additionally or alternatively, the trained neural network may be trained using audio signals indicative of agonal breathing and audio signals indicative of an ambient noise in an environment proximate the microphone.
Additionally or alternatively, the trained neural network may be trained further using audio signals indicative of non-agonal breathing.
Additionally or alternatively, the non-agonal breathing may include sleep apnea, snoring, wheezing, or combinations thereof.
Additionally or alternatively, the audio signals indicative of non-agonal breathing sounds in the environment proximate to the microphone may be identified from polysomnographic sleep studies.
Additionally or alternatively, the audio signals indicative of agonal breathing may be identified using confirmed cardiac arrest cases including actual agonal breathing events.
Additionally or alternatively, the trained neural network may be configured to distinguish between the agonal breathing event, ambient noise, and non-agonal breathing.
Additionally or alternatively, further included is a communication interface, wherein the instructions may further cause the system to request medical assistance by the communication interface or cause the system to request that an AED device be brought to a user.
Additionally or alternatively, the instructions may further cause the system to request, by a user interface, confirmation of a medical emergency prior to requesting medical assistance.
Additionally or alternatively, further included is a display to indicate the request for the confirmation of medical emergency.
Additionally or alternatively, the system may be configured to enter a wake state responsive to the agonal breathing event being classified.
Additionally or alternatively, the instructions may further cause the system to perform audio interference cancellation in the audio signals.
Additionally or alternatively, the instructions may further cause the system to reduce the audio interference transmitted by a smart device housing the microphone.
Example methods are disclosed herein. In an embodiment of the disclosure, an example method includes receiving audio signals, by a microphone, from a user, processing the audio signals by a processing circuitry, and classifying agonal breathing in the audio signals using a trained neural network.
Additionally or alternatively, further included may be training the trained neural network using audio signals indicative of agonal breathing and audio signals indicative of ambient noise in an environment proximate the microphone.
Additionally or alternatively, further included may be cancelling audio interference in the audio signals.
Additionally or alternatively, cancelling the audio interference may further include reducing interfering effects of audio transmissions produced by a smart device including the microphone.
Additionally or alternatively, further included may be requesting medical assistance when a medical emergency is indicated based at least on the audio signals indicative of agonal breathing.
Additionally or alternatively, further included may be requesting confirmation of the medical emergency prior to requesting medical assistance.
Additionally or alternatively, further included may be displaying the request for confirmation of the medical emergency.
To easily identify the discussion of any particular element or act, the most significant digit or digits in a reference number refer to the figure number in which that element is first introduced.
Certain details are set forth herein to provide an understanding of described embodiments of technology. However, other examples may be practiced without various of these particular details. In some instances, well-known circuits, control signals, timing protocols, and/or software operations have not been shown in detail in order to avoid unnecessarily obscuring the described embodiments. Other embodiments may be utilized, and other changes may be made, without departing from the spirit or scope of the subject matter presented here.
Widespread adoption of smart devices, including smartphones and smart speakers, may enable identifying agonal breathing and therefore OHCAs. In some examples, machine learning algorithms may be used to identify agonal breathing and request medical assistance, such as connecting unwitnessed cardiac arrest victims to Emergency Medical Services (EMS) and cardiopulmonary resuscitation (CPR).
Non-contact, passive detection of agonal breathing allows identification of a portion of previously unreachable victims of cardiac arrest, particularly those who experience such events in a private residence. As the US population ages and more people become at risk for OHCA, leveraging omnipresent smart hardware for monitoring of these emergent conditions can provide public health benefits. Other domains where an efficient agonal breathing classifier could have utility include unmonitored health facilities (e.g., hospital wards and elder care environments), EMS dispatch, and monitoring of people at greater than average risk, such as people at risk for opioid overdose-induced cardiac arrest and people who have survived a heart attack.
An advantage of a contactless detection mechanism is that it does not require a victim to be wearing a device while asleep in the bedroom, which can be inconvenient or uncomfortable. Such a solution can be implemented on existing wired smart speakers and as a result would not face power constraints and could scale efficiently. While examples are provided in the context of the victim asleep in the bedroom, the victim may be monitored in other areas such as a bathroom, a kitchen, a living room, a dining room, a hospital room, etc.
Examples described herein may leverage a smart device to present an accessible tool for detection of agonal breathing. Examples of systems described herein may operate by (i) receiving audio signals from a user via a microphone of the smart device; (ii) processing the audio signals; and (iii) classifying agonal breathing in the audio signals using a machine learning technique, such as a trained neural network. In some examples, no additional hardware (beyond the smart device) is used. An implemented example system demonstrated high detection accuracy across all interfering sounds when tested across multiple smart device platforms.
For example, a user may produce audio signals indicative of agonal breathing sounds which are captured by a smart device. The microphone of the smart device may passively detect the user's agonal breathing. While agonal breathing events are relatively uncommon and lack gold-standard measurements, real-world audio of confirmed cardiac arrest cases (e.g., 9-1-1 calls and actual audio from victims experiencing cardiac arrest in a controlled setting such as an intensive care unit (ICU), hospice, or a planned end-of-life event), which may include captured instances of agonal breathing, was used to train a deep neural network (DNN). The trained DNN was used to classify OHCA-associated agonal breathing instances on existing omnipresent smart devices.
Examples of trained neural networks or other systems described herein may be used without necessarily specifying a particular audio signature of agonal breathing. Rather, the trained neural networks may be trained to classify agonal breathing by training on a known set of agonal breathing episodes as well as a set of likely non-agonal breathing interference (e.g., sleep sounds, speech sounds, ambient sounds).
Examples of systems and methods described herein may be used to monitor users, such as user 102 of FIG. 1.
The user 102 of FIG. 1 may, for example, be asleep in a bedroom and may produce audio signals, including agonal breathing sounds during a cardiac arrest event.
Generally, environments may contain sources of interfering sounds, such as non-agonal breathing sounds. For example, in FIG. 1, the environment of the user 102 may include interfering sounds such as sleep sounds, speech sounds, and/or other ambient sounds.
Smart devices may be used to classify agonal breathing sounds of a user in examples described herein. In the example of FIG. 1, a smart device 112 proximate the user 102 may receive the audio signals and classify agonal breathing sounds in the received audio signals.
Once agonal breathing sounds are detected by the smart device 112, a variety of actions may be taken. In some examples, the smart device 112 may prompt the user 102 to confirm an emergency is occurring. The smart device 112 may communicate with one or more other users and/or devices responsive to an actual and/or suspected agonal breathing event (e.g., the smart device 112 may make a phone call, send a text, sound or display an alarm, or take other action).
Examples of smart devices may include processing circuitry, such as processing circuitry 206 of FIG. 2.
Examples of smart devices may include memory, such as memory 204 of FIG. 2.
The memory 204 may store executable instructions for execution by the processing circuitry 206, such as executable instructions for classifying agonal breathing 208. In this manner, techniques for classifying agonal breathing of a user 102 may be implemented herein wholly or partially in software. Examples described herein may provide systems and techniques which may be utilized to classify agonal breathing notwithstanding interfering signals which may be present.
Examples of systems described herein may utilize trained neural networks. The trained neural network 210 is shown in FIG. 2.
While a single trained neural network 210 is shown in FIG. 2, any number of trained neural networks may be used in other examples.
In some examples, the smart device 200 may be used to train the trained neural network 210. However, in some examples the trained neural network 210 may be trained by a different device. For example, the trained neural network 210 may be trained during a training process independent of the smart device 200, and the trained neural network 210 stored on the smart device 200 for use by the smart device 200 in classifying agonal breathing.
Trained neural networks described herein may generally be trained to classify agonal breathing sounds using audio recordings of known agonal breathing events and audio recordings of expected interfering sounds. For example, audio recordings of known agonal breathing events, such as 9-1-1 recordings containing agonal breathing events, may be used to train the trained neural network 210. Other examples of audio recordings of known agonal breathing events (e.g., actual agonal breathing events) may include agonal breathing events occurring in a controlled setting, such as a victim in a hospital room, in hospice, or experiencing planned end of life. In order to generate a robustly trained neural network, the recordings of known agonal breathing events may be varied in accordance with their expected variations in practice. For example, known agonal breathing audio clips may be recorded at multiple distances from a microphone and/or captured using a variety of smart devices. This may provide a set of known agonal breathing clips from various environments and/or devices. Using such a robust and/or varied data set for training a neural network may promote accurate classification of agonal breathing events in practice, when an individual may vary in their distance from the microphone and/or the microphone may be incorporated in a variety of devices which may perform differently. In some examples, known non-agonal breathing sounds may further be used to train the trained neural network 210. For example, audio signals from polysomnographic sleep studies may be used to train the trained neural network 210. The non-agonal breathing sounds may similarly be varied by recording them at various distances from a microphone, using different devices, and/or in different environments. A trained neural network 210 trained on recordings of actual agonal breathing events, such as 9-1-1 recordings of agonal breathing, and on expected interfering sounds, such as polysomnographic sleep studies, may be particularly useful, for example, for classifying agonal breathing events in a bedroom during sleep.
Examples of smart devices described herein may include a communication interface, such as communication interface 212. The communication interface 212 may include, for example, a cellular telephone connection, a Wi-Fi connection, an Internet or other network connection, and/or one or more speakers. The communication interface 212 may accordingly provide one or more outputs responsive to classification of agonal breathing. For example, the communication interface 212 may provide information to one or more other devices responsive to a classification of agonal breathing. In some examples, the communication interface 212 may be used to transmit some or all of the audio signals received by the smart device 200 so that the signals may be processed by a different computing device to classify agonal breathing in accordance with techniques described herein. However, in some examples to aid in speedy classification and preserve privacy, audio signals may be processed locally to classify agonal breathing, and actions may be taken responsive to the classification.
Examples of smart devices described herein may include one or more displays, such as display 214. The display 214 may be implemented using, for example, one or more LCD displays, one or more lights, or one or more touchscreens. The display 214 may be used, for example, to display an indication that agonal breathing has been classified in accordance with executable instructions for classifying agonal breathing 208. In some examples, a user may touch the display 214 to acknowledge, confirm, and/or deny the occurrence of agonal breathing responsive to a classification of agonal breathing.
Examples of smart devices described herein may include one or more microphones, such as microphone 202 of FIG. 2.
In some examples, smart devices described herein may include executable instructions for waking the smart device. Executable instructions for waking the smart device may be stored, for example, on memory 204. The executable instructions for waking the smart device may cause certain components of the smart device 200 to turn on, power up, and/or process signals. For example, smart speakers may include executable instructions for waking responsive to a wake word, and may process incoming speech signals only after recognizing the wake word. This waking process may cut down on power consumption and delay during use of the smart device 200. In some examples described herein, agonal breathing may be used as a wake word for a smart device. Accordingly, the smart device 200 may wake responsive to detection of agonal breathing and/or suspected agonal breathing. Following classification of agonal breathing, one or more components of the device may power on and/or conduct further processing using the trained neural network 210 to confirm and further classify an agonal breathing event and take action responsive to the agonal breathing classification.
In the example of FIG. 3, audio signals from a user 302 may be received by a microphone and processed for classification of agonal breathing.
The neural network may be trained to output probabilities (e.g., a stream of probabilities in real-time) indicative of a likelihood of agonal breathing at a particular time. The incoming audio signals may be segmented into segments of a duration relevant to agonal breathing. For example, audio signals occurring during a particular time period expected to be sufficient to capture an agonal breath may be used as segments and input to the trained neural network to classify or begin to classify agonal breathing. In some examples, a duration of 2.5 seconds may be sufficient for reliably capturing an agonal breath. In other examples, a duration of 1.5 seconds, 1.8 seconds, 2.0 seconds, 2.8 seconds, or 3.0 seconds may be sufficient.
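For purposes of illustration, a minimal segmentation sketch in Python is shown below. The 16 kHz sampling rate, mono input, and non-overlapping windows are assumptions made for illustration and are not prescribed herein.

```python
import numpy as np

def segment_audio(samples, sample_rate=16000, segment_seconds=2.5):
    """Split a mono audio stream into fixed-duration, non-overlapping
    segments. segment_seconds=2.5 follows the duration described above
    as sufficient to capture an agonal breath; other durations may be used."""
    segment_len = int(segment_seconds * sample_rate)
    n_segments = len(samples) // segment_len
    return [samples[i * segment_len:(i + 1) * segment_len]
            for i in range(n_segments)]
```

Each returned segment may then be transformed and classified as described below.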
Each segment may be transformed from the time domain into the frequency domain, for example into a spectrogram such as a log-mel spectrogram 306. The transformation may occur, for example, using one or more transforms (e.g., a Fourier transform) and may be implemented using, for example, the processing circuitry 206 of FIG. 2.
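As one illustration of this transformation, the sketch below computes a log-mel spectrogram for a single segment using the librosa library. The STFT parameters (n_fft, hop_length) and the mel-band count are illustrative assumptions not specified herein.

```python
import numpy as np
import librosa

def log_mel_spectrogram(segment, sample_rate=16000, n_mels=64):
    """Transform a time-domain audio segment into a log-mel spectrogram
    (a frequency-domain representation such as log-mel spectrogram 306)."""
    mel = librosa.feature.melspectrogram(
        y=np.asarray(segment, dtype=np.float32), sr=sample_rate,
        n_fft=1024, hop_length=512, n_mels=n_mels)
    return librosa.power_to_db(mel)  # log scaling of the mel power spectrum
```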
In some examples, in addition to agonal breathing sounds, the user 302 may produce sleep sounds such as movement in bed, breathing, snoring, and/or apnea events. While apnea events may sound similar to agonal breathing, they are physiologically different from agonal breathing. Examples of trained neural networks described herein, including trained neural network 210 of FIG. 2, may be trained to distinguish agonal breathing from such non-agonal breathing sounds.
Neural networks described herein, such as the trained neural network 210 of FIG. 2 and/or the support vector machine 308 of FIG. 3, may be used to classify the frequency-domain representations of the audio segments as indicative or not indicative of agonal breathing.
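For purposes of illustration, the sketch below trains a support vector machine over per-segment feature vectors using scikit-learn. It assumes each segment has already been reduced to an embedding vector (e.g., by a pretrained audio-embedding DNN); the particular embedding network is not specified herein, and the RBF kernel is an illustrative choice.

```python
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def train_agonal_classifier(X, y):
    """X: one embedding vector per 2.5 s audio segment.
    y: 1 for agonal breathing segments, 0 for negative (interfering) sounds."""
    clf = make_pipeline(
        StandardScaler(),
        SVC(kernel="rbf", probability=True))  # probability outputs per segment
    clf.fit(X, y)
    return clf

# clf.predict_proba(X_new)[:, 1] then yields a stream of per-segment
# agonal breathing probabilities, as described above.
```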
In the example of FIG. 4, a smart device 404 may monitor audio signals from a user 402, and classifications of agonal breathing sounds may be provided to a threshold and timing detector 410.
On average, instances of agonal breathing may be separated by a period of negative sounds (e.g., interfering sounds). In some examples, the period of time separating instances of agonal breathing sounds may be 30 seconds, although other periods may be used in other examples. The threshold and timing detector 410 may be used to detect agonal breathing sounds and reduce false positives by only classifying agonal breathing as an output when agonal breathing sounds are classified over a threshold number of times and/or within a threshold amount of time. For example, agonal breathing may only be classified as an output if it is classified by a neural network more than one time within a time frame, more than two times within a time frame, or more than another threshold number of times. Examples of time frames include 15 seconds, 20 seconds, 25 seconds, 30 seconds, 35 seconds, 40 seconds, and 45 seconds.
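One illustrative sketch of such a threshold and timing detector follows. The default values (two positive classifications within a 30-second window) are example points within the ranges described above, not prescribed values.

```python
from collections import deque

class ThresholdTimingDetector:
    """Classify agonal breathing as an output only when a threshold
    number of positive per-segment classifications occur within a
    threshold time frame, reducing false positives."""

    def __init__(self, min_positives=2, window_seconds=30.0):
        self.min_positives = min_positives
        self.window_seconds = window_seconds
        self._positive_times = deque()

    def update(self, timestamp, segment_is_positive):
        """Feed one per-segment classification; returns True when an
        agonal breathing event should be classified as an output."""
        if segment_is_positive:
            self._positive_times.append(timestamp)
        # Discard positive classifications older than the time frame.
        while self._positive_times and (
                timestamp - self._positive_times[0] > self.window_seconds):
            self._positive_times.popleft()
        return len(self._positive_times) >= self.min_positives
```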
When it is determined that the user 402 is producing agonal breathing, the smart device 404 may contact EMS 412, caregivers, or volunteer responders in the neighborhood to assist in performing CPR and/or providing any other necessary medical assistance. Additionally or alternatively, the smart device 404 may prompt the EMS 412, caregivers, or volunteer responders to bring an AED device to the user. The AED device may provide visual and/or audio prompts for operating the AED device and performing CPR.
In an example, the smart device 404 may reduce and/or prevent false alarms in which medical help is requested from EMS 412 when the user 402 is not in fact experiencing agonal breathing by sending a warning to the user 402 (e.g., by displaying an indication that agonal breathing has been classified and/or prompting the user to confirm an emergency is occurring). The smart device 404 may send a warning and seek an input other than agonal breathing sounds from the user 402 via the user interface 216. The warning may additionally be displayed on display 214. Absent a confirmation from the user 402 that the detected sounds are not indicative of agonal breathing, the communication interface 212 of the smart device 404 may seek medical assistance in some examples. In some examples, an action (e.g., seeking medical assistance) may only be taken responsive to confirmation that an emergency is occurring.
Utilizing smart devices may improve the ubiquity with which individuals may be monitored for agonal breathing events. By prompt and passive detection of agonal breathing, individuals suffering cardiac arrest may be able to be treated more promptly, ultimately improving outcomes and saving lives.
An implemented example system was used to train and validate a model for detecting unwitnessed agonal breathing in real-world sleep data. In training the neural network, agonal breathing recordings were sourced from 9-1-1 emergency calls placed between 2009 and 2017, provided by Public Health Seattle & King County, Division of Emergency Medical Services. The positive dataset included 162 calls (19 hours) that had clear recordings of agonal breathing. For each occurrence, 2.5 seconds of audio from the start of each agonal breathing instance was extracted, for a total of 236 clips of agonal breathing instances. The agonal breathing dataset was augmented by playing the recordings over the air at distances of 1, 3, and 6 m, in the presence of interference from indoor and outdoor sounds at different volumes, and with a noise cancellation filter applied. The recordings were captured on different devices, namely an Amazon Alexa, an iPhone 5s, and a Samsung Galaxy S4, to yield 7316 positive samples.
The negative dataset included 83 hours of audio data captured during polysomnographic sleep studies across 12 different patients. These audio streams include instances of hypopnea, central apnea, obstructive apnea, snoring, and breathing. The negative dataset also included interfering sounds that might be present in a bedroom while a person is asleep, specifically a podcast, a sleep soundscape, and white noise. In training the model, 1 hour of audio data from the sleep study, in addition to other interfering sounds, was used. These audio signals were played over the air at different distances and recorded on different devices to yield 7305 negative samples. The remaining 82 hours of sleep data (117,985 audio segments) were then used for validating the performance of the model.
A k-fold (k=10) cross-validation was applied and an area under the curve (AUC) of 0.9993±0.0003 was obtained. An operating point with an overall sensitivity and specificity of 97.24% (95% CI: 96.86-97.61%) and 99.51% (95% CI: 99.35-99.67%), respectively, was obtained. The k-fold (k=10) cross-validation was also executed using other machine learning classifiers, including k-nearest neighbors, logistic regression, and random forests. These classifiers achieved AUCs that were >0.98 but slightly lower than the AUC of the trained SVM. The detection algorithm can run natively in real-time on a smartphone and can classify each 2.5 s audio segment within 21 ms; on a smart speaker, the algorithm can run within 58 ms. The audio embeddings of the dataset were visualized by using t-SNE to project the features into a 2-D space.
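For purposes of illustration, a 2-D t-SNE projection of the audio embeddings may be produced with scikit-learn as sketched below; the perplexity and other settings are illustrative assumptions.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

def plot_embeddings(X, y):
    """X: embedding vectors for the dataset; y: 1 = agonal breathing,
    0 = negative sounds. Projects the features into a 2-D space."""
    y = np.asarray(y)
    coords = TSNE(n_components=2, perplexity=30, random_state=0).fit_transform(X)
    plt.scatter(coords[y == 0, 0], coords[y == 0, 1], s=4, label="negative")
    plt.scatter(coords[y == 1, 0], coords[y == 1, 1], s=4, label="agonal breathing")
    plt.legend()
    plt.show()
```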
To evaluate the false positive rate, the trained classifier was run over the full audio stream collected in the sleep lab. The sleep audio used to train each model was excluded from evaluation. Relying only on the classifier's probability outputs, a false positive rate of 0.14409% was obtained (170 of 117,985 audio segments). To reduce false positives, the classifier's predictions were passed through a frequency filter that checks whether the rate of positive predictions is within the typical frequency at which agonal breathing occurs (e.g., within a range of 3-6 agonal breaths per minute). This filter reduced the false positive rate to 0.00085% when it considered two agonal breaths within a duration of 10-20 s. When it considered a third agonal breath within a subsequent period of 10-20 s, the false positive rate reduced to 0%.
Outside of the sleep lab, real-world recordings of sleep sounds that occur within the house (e.g., snoring, breathing, movement in bed) were used to evaluate the false positive rate of the classifier. Thirty-five individuals were recruited to record themselves while sleeping using their smart devices, for a total duration of 167 hours. The recordings were manually checked to ensure the audio corresponded to sleep sounds. The classifier was retrained with an additional 5 min of data from each subject, yielding a comparable operating point with a sensitivity and specificity of 97.17% (95% CI: 96.79-97.55%) and 99.38% (95% CI: 99.20-99.56%), respectively. The false positive rate of the classifier without a frequency filter was 0.21761%, corresponding to 515 of the 236,666 audio segments (164 hours) used as test data. After applying the frequency filter, the false positive rate reached 0.00127% when considering two agonal breaths within a duration of 10-20 seconds, and 0% after considering a third agonal breath within a subsequent period of 10-20 seconds.
Audio clips of agonal breathing were played over the air from an external speaker, and the audio was captured on an Amazon Echo and an Apple iPhone 5s. The detection accuracy was evaluated using the k=10 validation folds in the dataset such that no audio file in the validation set appeared in any of the different recording conditions in the training set. Both the Echo and iPhone 5s achieved >96.63% mean accuracy at distances up to 3 meters. When the smart device was placed in a pocket, with the user supine on the ground and the speaker next to the head, a mean detection accuracy of 93.22±4.92% was achieved. Across all interfering sound classes, including indoor interfering sounds (e.g., cat, dog, air conditioner) and outdoor interfering sounds (e.g., traffic, construction, and human speech), the smart device achieved a mean detection accuracy of 96.23%.
A smart device was set to play sounds one might play to fall asleep (e.g., a podcast, a sleep soundscape, and white noise). These sounds were played at a soft (45 dBA) and a loud (67 dBA) volume while the agonal breathing audio clips were played simultaneously. When the audio cancellation algorithm was applied, the detection accuracy averaged 98.62% and 98.57% across distances and sounds for soft and loud interfering volumes, respectively.
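The audio cancellation algorithm is not detailed herein. One common approach to removing a device's own known playback from its microphone capture is an adaptive filter such as normalized least mean squares (NLMS), sketched below under that assumption; the tap count and step size are illustrative.

```python
import numpy as np

def nlms_cancel(mic, playback, taps=256, mu=0.1, eps=1e-8):
    """Adaptively filter the known playback signal (reference) and
    subtract it from the microphone capture, leaving residual room
    audio such as agonal breathing sounds."""
    w = np.zeros(taps)                         # adaptive filter weights
    out = np.zeros(len(mic))
    for n in range(taps, len(mic)):
        x = playback[n - taps:n][::-1]         # most recent reference samples
        e = mic[n] - w @ x                     # residual after cancellation
        w += mu * e * x / (x @ x + eps)        # NLMS weight update
        out[n] = e
    return out
```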
To benchmark the classifier's performance against negative audio sounds, a stream of negative sounds (snoring, a podcast, a sleep soundscape, and white noise) was played over the air and recorded on a smart device. The smart device achieved a mean detection accuracy of 99.57% at a distance of 3 m; a 100% accuracy corresponds to the classifier correctly identifying that the sounds are from the negative dataset. Across all interfering sounds, the mean detection accuracy was 99.29%.
The particulars shown herein are by way of example and for purposes of illustrative discussion of the preferred embodiments of the present invention only and are presented in the cause of providing what is believed to be the most useful and readily understood description of the principles and conceptual aspects of various embodiments of the invention. In this regard, no attempt is made to show structural details of the invention in more detail than is necessary for the fundamental understanding of the invention, the description taken with the drawings and/or examples making apparent to those skilled in the art how the several forms of the invention may be embodied in practice.
As used herein and unless otherwise indicated, the terms “a” and “an” are taken to mean “one”, “at least one” or “one or more”. Unless otherwise required by context, singular terms used herein shall include pluralities and plural terms shall include the singular.
Unless the context clearly requires otherwise, throughout the description and the claims, the words ‘comprise’, ‘comprising’, and the like are to be construed in an inclusive sense as opposed to an exclusive or exhaustive sense; that is to say, in the sense of “including, but not limited to”. Words using the singular or plural number also include the plural and singular number, respectively. Additionally, the words “herein,” “above,” and “below” and words of similar import, when used in this application, shall refer to this application as a whole and not to any particular portions of the application.
The description of embodiments of the disclosure is not intended to be exhaustive or to limit the disclosure to the precise form disclosed. While the specific embodiments of, and examples for, the disclosure are described herein for illustrative purposes, various equivalent modifications are possible within the scope of the disclosure, as those skilled in the relevant art will recognize.
Specific elements of any foregoing embodiments can be combined or substituted for elements in other embodiments. Moreover, the inclusion of specific elements in at least some of these embodiments may be optional, wherein further embodiments may include one or more embodiments that specifically exclude one or more of these specific elements. Furthermore, while advantages associated with certain embodiments of the disclosure have been described in the context of these embodiments, other embodiments may also exhibit such advantages, and not all embodiments need necessarily exhibit such advantages to fall within the scope of the disclosure.
This application claims the benefit under 35 U.S.C. § 119 of the earlier filing date of U.S. Provisional Application Ser. No. 62/782,687 filed Dec. 20, 2018, the entire contents of which are hereby incorporated by reference in their entirety for any purpose.
Filing Document: PCT/US19/67988; Filing Date: 12/20/2019; Country: WO; Kind: 00
Number: 62/782,687; Date: Dec 2018; Country: US