Aspects and embodiments of the present disclosure relate to systems and methods for detecting snoring.
Known methods for alleviating snoring include making adjustments to the surface of the bed on which a user is sleeping. Such adjustments are designed to place the user in a position known to reduce snoring.
Some methods of detecting snoring are based on the use of a single microphone. The microphone is used to capture audio in an environment in which snoring may be present. Signal processing techniques are then used to determine if the captured audio signal is consistent with snoring.
According to an aspect of the present disclosure there is provided a method for detecting a sound generated by an entity during sleep. The method comprises using a first microphone to convert a first sound into a first electrical signal; using a second microphone to convert a second sound into a second electrical signal, the first and second microphones being spatially separated; generating a third electrical signal from the first electrical signal and the second electrical signal, the third electrical signal being representative of the first sound arriving at the first microphone and the second sound arriving at the second microphone from a plurality of directions; selecting a first portion of the third electrical signal, the first portion corresponding to the first sound arriving at the first microphone and the second sound arriving at the second microphone from a direction of interest at each of a plurality of sample points; selecting a second portion of the third electrical signal, the second portion containing only components of the first portion that have a frequency within a frequency range of interest; deriving a metric from the second portion of the third electrical signal, the metric indicating if the first portion of the third electrical signal includes a component consistent with a sound generated by an entity during sleep; and generating an output if the metric indicates that the first portion of the third electrical signal includes a component consistent with the sound generated by an entity during sleep.
In one example generating the third electrical signal may include measuring a similarity between the first electrical signal and the second electrical signal.
In one example generating the third electrical signal may include cross correlating the first electrical signal and the second electrical signal.
In one example cross correlating may include using a generalized cross correlation function.
In one example using a generalized cross correlation function may include using a Fast Fourier transform to generate a Fourier transform of the first electrical signal and the second electrical signal.
In one example a Fourier transform of the first electrical signal and the second electrical signal may be generated at an interval of between 2 milliseconds and 6 milliseconds.
In one example the Fast Fourier transform may be a 256 point Fast Fourier transform.
In one example cross correlating may generate an output including correlation for a plurality of time delays in arrival between the first sound at the first microphone and the second sound at the second microphone.
In one example each of the plurality of time delays in arrival may correspond to the first sound arriving at the first microphone and the second sound arriving at the second microphone from a physical direction at a sample point.
In one example selecting the first portion of the third electrical signal may include selecting a subset of the output, the subset having time delays in arrival that correspond to a physical area of interest.
In one example the method may further comprise smoothing the subset of the output from frame to frame to reduce noise in the subset.
In one example smoothing may include using exponential smoothing.
In one example selecting the first portion of the third electrical signal may include selecting a maximum signal from the subset of the output at each sample point.
In one example the first portion of the third electrical signal may be representative of how the physical direction from which the first sound arrives at the first microphone and the second sound arrives at the second microphone changes in time.
In one example the method may further comprise normalizing the first portion of the third electrical signal.
In one example selecting the second portion of the third electrical signal may include generating a Fourier transform of the first portion of the third electrical signal.
In one example generating the Fourier transform of the first portion of the third electrical signal may include using a Fast Fourier transform.
In one example generating the Fourier transform may include generating a buffer of a magnitude Fast Fourier transform of the first portion of the third electrical signal.
In one example the buffer may be between 10 seconds and 25 seconds long.
In one example selecting the second portion of the third electrical signal may include selecting a subset of the Fourier transform, the subset corresponding to the frequency range of interest.
In one example the frequency range of interest may correspond to a characteristic frequency range of the sound generated by the entity during sleep.
In one example the sound generated by the entity during sleep may be a sound generated by the entity snoring.
In one example the frequency range of interest may correspond to a breathing rate of 1.5 seconds per breath to 6 seconds per breath.
In one example deriving the metric may include calculating a difference between a maximum value and a minimum value in the subset of the Fourier transform.
In one example the metric may vary with time.
In one example the metric may indicate that the first portion of the third electrical signal comprises a component consistent with a sound produced by an entity during sleep if the metric rises above a first threshold.
In one example the method may further comprise indicating that the first portion of the third electrical signal no longer comprises a component consistent with the sound produced by the entity during sleep if the metric subsequently falls below a second threshold.
In one example the method may further comprise defining a first direction from a point between the first microphone and second microphone towards a position of the first microphone.
In one example an angle corresponding to the direction of interest may comprise a component in the first direction.
In one example the method may further comprise selecting a third portion of the third electrical signal, the third portion corresponding to the first sound arriving at the first microphone and the second sound arriving at the second microphone from a second direction of interest.
In one example the method may further comprise defining a second direction from the point between the first and second microphones towards a position of the second microphone.
In one example an angle corresponding to the second direction of interest may comprise a component in the second direction.
According to another aspect of the present disclosure there is provided a system for detecting a sound generated by an entity during sleep. The system comprises a first microphone configured to convert a first sound into a first electrical signal; a second microphone configured to convert a second sound into a second electrical signal, the first and second microphones being spatially separated; and a processor configured to generate a third electrical signal from the first electrical signal and the second electrical signal, the third electrical signal being representative of the first sound arriving at the first microphone and the second sound arriving at the second microphone from a plurality of directions, to select a first portion of the third electrical signal, the first portion corresponding to the first sound arriving at the first microphone and the second sound arriving at the second microphone from a direction of interest at each of a plurality of sample points, to select a second portion of the third electrical signal, the second portion containing only components of the first portion that have a frequency within a frequency range of interest, to derive a metric from the second portion of the third electrical signal, the metric indicating if the first portion of the third electrical signal includes a component consistent with a sound generated by an entity during sleep, and to generate an output if the metric indicates that the first portion of the third electrical signal includes a component consistent with the sound generated by an entity during sleep.
In one example the third electrical signal may be based on a cross correlation between the first electrical signal and the second electrical signal.
In one example the cross correlation may use a generalized cross correlation function.
In one example the generalized cross correlation function may use a Fast Fourier transform to generate a Fourier transform of the first electrical signal and the second electrical signal.
In one example a Fourier transform of the first electrical signal and the second electrical signal may be generated at an interval of between 2 milliseconds and 6 milliseconds.
In one example the Fast Fourier transform may be a 256 point Fast Fourier transform.
In one example an output of the cross correlation may include correlation for a plurality of time delays in arrival between the first sound at the first microphone and the second sound at the second microphone.
In one example each of the plurality of time delays in arrival may correspond to the first sound arriving at the first microphone and the second sound arriving at the second microphone from a physical direction at a sample point.
In one example the first portion of the third electrical signal may include a subset of the output, the subset having time delays in arrival that correspond to a physical area of interest.
In one example the processor may be further configured to smooth the subset of the output from frame to frame to reduce noise in the subset.
In one example the processor may be further configured to smooth the subset of the output using exponential smoothing.
In one example the first portion of the third electrical signal may include a maximum signal from the subset of the output at each sample point.
In one example the first portion of the third electrical signal may be representative of how the physical direction from which the first sound arrives at the first microphone and the second sound arrives at the second microphone changes in time.
In one example the processor may be further configured to normalize the first portion of the third electrical signal.
In one example the second portion of the third electrical signal may be based on a Fourier transform of the first portion of the third electrical signal.
In one example the Fourier transform of the first portion of the third electrical signal may be generated using a Fast Fourier transform.
In one example the Fast Fourier transform may include generating a buffer of a magnitude Fast Fourier transform of the first portion of the third electrical signal.
In one example the buffer may be between 10 seconds and 25 seconds long.
In one example the second portion of the third electrical signal may include a subset of the Fourier transform, the subset corresponding to the frequency range of interest.
In one example the frequency range of interest may correspond to a characteristic frequency range of the sound generated by an entity during sleep.
In one example the sound generated by an entity during sleep may be a sound generated by the entity snoring.
In one example the frequency range of interest may correspond to a breathing rate of 1.5 seconds per breath to 6 seconds per breath.
In one example the metric may be based on a difference between a maximum value and a minimum value in the subset of the Fourier transform.
In one example the metric may vary with time.
In one example the metric may indicate that the first portion of the third electrical signal comprises a component consistent with a sound produced by an entity during sleep if the metric rises above a first threshold.
In one example the processor may be further configured to indicate that the first portion of the third electrical signal no longer comprises a component consistent with the sound generated by the entity during sleep if the metric subsequently falls below a second threshold.
In one example a first direction may be defined from a point between the first microphone and the second microphone towards a position of the first microphone.
In one example an angle corresponding to the direction of interest may comprise a component in the first direction.
In one example the processor may be further configured to select a third portion of the third electrical signal, the third portion corresponding to the first sound arriving at the first microphone and the second sound arriving at the second microphone from a second direction of interest.
In one example a second direction may be defined from a point between the first microphone and the second microphone towards a position of the second microphone.
In one example an angle corresponding to the second direction of interest may comprise a component in the second direction.
In one example the system may further comprise a third microphone and a fourth microphone.
According to another aspect of the present disclosure there is provided a system for detecting a sound generated by an entity during sleep. The system is configured to use a first microphone to convert a first sound into a first electrical signal; use a second microphone to convert a second sound into a second electrical signal, the first and second microphones being spatially separated; generate a third electrical signal from the first electrical signal and the second electrical signal, the third electrical signal being representative of the first sound arriving at the first microphone and the second sound arriving at the second microphone from a plurality of directions; select a first portion of the third electrical signal, the first portion corresponding to the first sound arriving at the first microphone and the second sound arriving at the second microphone from a direction of interest at each of a plurality of sample points; select a second portion of the third electrical signal, the second portion containing only components of the first portion that have a frequency within a frequency range of interest; derive a metric from the second portion of the third electrical signal, the metric indicating if the first portion of the third electrical signal includes a component consistent with a sound generated by an entity during sleep; and generate an output if the metric indicates that the first portion of the third electrical signal includes a component consistent with the sound generated by an entity during sleep.
According to another aspect of the present disclosure there is provided a system for detecting a sound generated by an entity during sleep. The system comprises a first microphone configured to convert a first sound into a first electrical signal; a second microphone configured to convert a second sound into a second electrical signal, the first and second microphones being spatially separated; a processor configured to generate a third electrical signal from the first electrical signal and the second electrical signal, the third electrical signal being representative of the first sound arriving at the first microphone and the second sound arriving at the second microphone from a plurality of directions, to select a first portion of the third electrical signal, the first portion corresponding to the first sound arriving at the first microphone and the second sound arriving at the second microphone from a direction of interest at each of a plurality of sample points, to select a second portion of the third electrical signal, the second portion containing only components of the first portion that have a frequency within a frequency range of interest, to derive a metric from the second portion of the third electrical signal, the metric indicating if the first portion of the third electrical signal includes a component consistent with a sound generated by an entity during sleep, and to generate an output if the metric indicates that the first portion of the third electrical signal includes a component consistent with the sound generated by an entity during sleep; and a bed including a moveable bed base and a moveable mattress, the moveable bed base and moveable mattress being configured to adjust their positioning in response to the output.
In one example the bed base and mattress may be configured to support the entity during sleep.
In one example the direction of interest may correspond to a location of the entity on the mattress.
In one example the bed base and mattress may be further configured to adjust a position of the entity in response to the output.
In one example the position of the entity may be adjusted by an amount that is proportional to an amount of time for which the metric has indicated that the first portion of the third electrical signal includes a component consistent with the sound generated by an entity during sleep.
In one example the position of the entity may be continually adjusted up to a maximum position or until the metric no longer indicates that the first portion of the third electrical signal includes a component consistent with the sound generated by an entity during sleep.
In one example the position of the entity may be adjusted by an amount that is proportional to a loudness of the sound generated by an entity during sleep.
In one example the bed may further comprise a headboard.
In one example the first microphone and the second microphone may be positioned on the headboard.
In one example the first microphone and the second microphone may project from the headboard.
In one example the first microphone and the second microphone may be positioned in a line of sight of the entity.
Still other aspects, embodiments, and advantages of these exemplary aspects and embodiments are discussed in detail below. Embodiments disclosed herein may be combined with other embodiments in any manner consistent with at least one of the principles disclosed herein, and references to “an embodiment,” “some embodiments,” “an alternate embodiment,” “various embodiments,” “one embodiment” or the like are not necessarily mutually exclusive and are intended to indicate that a particular feature, structure, or characteristic described may be included in at least one embodiment. The appearances of such terms herein are not necessarily all referring to the same embodiment.
Various aspects of at least one embodiment are discussed below with reference to the accompanying figures, which are not intended to be drawn to scale. The figures are included to provide illustration and a further understanding of the various aspects and embodiments, and are incorporated in and constitute a part of this specification, but are not intended as a definition of the limits of the invention. In the figures, each identical or nearly identical component that is illustrated in various figures is represented by a like numeral. For purposes of clarity, not every component may be labeled in every figure. In the figures:
Aspects and embodiments of the disclosure described herein are directed to a system and method for detecting snoring.
It is to be appreciated that embodiments of the methods and apparatuses discussed herein are not limited in application to the details of construction and the arrangement of components set forth in the following description or illustrated in the accompanying drawings. The methods and apparatuses are capable of implementation in other embodiments and of being practiced or of being carried out in various ways. Examples of specific implementations are provided herein for illustrative purposes only and are not intended to be limiting. Also, the phraseology and terminology used herein is for the purpose of description and should not be regarded as limiting. The use herein of “including,” “comprising,” “having,” “containing,” “involving,” and variations thereof is meant to encompass the items listed thereafter and equivalents thereof as well as additional items. References to “or” may be construed as inclusive so that any terms described using “or” may indicate any of a single, more than one, and all of the described terms.
Snoring is noisy breathing during sleep that results from obstructed air flow through the nose and/or mouth. Known methods for detecting snoring have relied on techniques such as formant analysis, or on computing spectrograms of an audio stream as an input to a convolutional neural network. Such methods rely on a single microphone. Although single microphone systems are able to indicate whether snoring is present in an audio signal, they do not address the location or the direction of the snoring. They are further unable to detect and distinguish multiple sources of snoring.
In summary, the inventors of the snoring detection system and method described herein have appreciated that it is advantageous to determine the direction from which snoring originates. This is achieved by using a two microphone system as opposed to a known single microphone system. The use of two microphones is significant. By capturing audio at two spatially separated microphones, information regarding the direction from which sound is arriving at the microphones can be obtained. From this directional information, it can be determined if the sound arriving at the microphones from an area of interest corresponds to a sound generated by snoring. This is achieved by assessing if a signal representing sound arriving at the microphones from the area of interest comprises a component within a frequency range that is characteristic of a sound produced by snoring.
The above-described method and system have particular applicability in the context of smart beds shared by more than one user. By positioning the two microphones such that they are able to capture audio in the surroundings of the bed, it can be determined if sound arriving at the microphones from an area of interest is consistent with a sound produced by a user snoring. Measures can then be taken to alleviate the snoring. Such measures include adjusting, by way of moveable components of the bed, the position of a user whose location on the bed corresponds to that area of interest. The snoring detection system and method are described in more detail below with reference to example embodiments.
According to some aspects of the present disclosure, a system for detecting snoring is provided that is able to determine the direction from which snoring originates.
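By way of non-limiting illustration only, a generalized cross correlation with phase transform (GCC-PHAT) of two microphone frames might be sketched as follows. The function name `gcc_phat`, the small constant added to avoid division by zero, and the sign convention of the lags are assumptions made for this sketch and are not taken from the disclosure; only the use of an FFT-based generalized cross correlation and the 256 point transform size are.

```python
import numpy as np

def gcc_phat(x1, x2, n_fft=256):
    """Return the GCC-PHAT correlation of two microphone frames.

    The result is in standard FFT order: index k holds the correlation at a
    lag of k samples for small k, and at a lag of k - n_fft samples for k
    close to n_fft. With this (assumed) sign convention, a positive lag
    means that the frame x1 is a delayed copy of the frame x2.
    """
    X1 = np.fft.fft(x1, n=n_fft)
    X2 = np.fft.fft(x2, n=n_fft)
    cross = X1 * np.conj(X2)
    # PHAT weighting: discard magnitude and keep only phase information.
    cross /= np.abs(cross) + 1e-12
    return np.real(np.fft.ifft(cross))
```

Each index of the returned array corresponds to one candidate time delay in arrival between the two microphones, so a peak at a given index suggests sound arriving from the direction associated with that delay.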
In a next step (block 290 of
From the cross correlation output, a left area of interest and a right area of interest are selected. These areas are defined by selecting a subset of lags 310 that correspond to sound arriving at the microphone array 110 from a plurality of direction angles. These direction angles are those from which a sound produced by snoring is likely to originate in the surroundings of the bed 140. The subset of lags 310 comprises the first n bins and the last N−n bins, where N is the total number of bins of the GCCPHAT output and n indicates the number of bins of interest in each direction (left and right). For clarity, the lags 310 are re-labelled −n to n. This corresponds to sound arriving broadside of the microphone array 110 +/− some angle. Positive lags 310 correspond to direction angles with a component in the left direction (forming the left area of interest). Sound coming from the left area of interest is generally referred to herein as coming from the left direction. Negative lags 310 correspond to direction angles with a component in the right direction (forming the right area of interest). Sound coming from the right area of interest is generally referred to herein as coming from the right direction. In other words, the GCCPHAT output for n bins of interest indicates the arrival strength of the detected audio signal for 2n different direction angles relative to the microphone array 110. Lags 310 corresponding to the left area of interest and right area of interest are indicated in the graph 300 of
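As a hedged illustration of the lag selection described above, the following sketch splits one frame of correlation output into a positive-lag subset and a negative-lag subset, and converts a lag to an approximate arrival angle. It assumes that the output is in standard FFT order and reads "the last N−n bins" as the bins from index N−n onward. The function names, the sample rate, the microphone spacing, and the far-field relation between lag and angle are assumptions for illustration and are not specified in the disclosure.

```python
import numpy as np

def select_lag_subsets(gcc, n):
    """Return (positive_lags, negative_lags) for one correlation frame.

    gcc: 1-D array of N correlation values in FFT order (assumed).
    n:   number of bins of interest on each side.
    """
    gcc = np.asarray(gcc)
    total = len(gcc)
    positive = gcc[:n]           # bins 0 .. n-1, lags 0 .. n-1
    negative = gcc[total - n:]   # bins N-n .. N-1, lags -n .. -1
    return positive, negative

def lag_to_angle_deg(lag_samples, sample_rate_hz, mic_spacing_m,
                     speed_of_sound_m_s=343.0):
    """Far-field estimate of the arrival angle measured from broadside."""
    delay_s = lag_samples / sample_rate_hz
    ratio = np.clip(speed_of_sound_m_s * delay_s / mic_spacing_m, -1.0, 1.0)
    return float(np.degrees(np.arcsin(ratio)))
```

For example, with a hypothetical 16 kHz sample rate and 5 cm microphone spacing, `lag_to_angle_deg(1, 16000, 0.05)` gives an angle of roughly 25 degrees from broadside, which indicates how coarse or fine the angular resolution of the selected bins would be under those assumptions.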
The data over time is then smoothed (block 360 of
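One possible form of the frame-to-frame smoothing is exponential smoothing, sketched below purely as an example. The function name `exponential_smooth` and the smoothing factor `alpha` are hypothetical; the disclosure does not state a particular smoothing factor.

```python
import numpy as np

def exponential_smooth(frames, alpha=0.1):
    """Exponentially smooth a (num_frames, num_lags) array along the frame axis.

    alpha is a hypothetical smoothing factor in (0, 1]; smaller values
    give heavier smoothing.
    """
    frames = np.asarray(frames, dtype=float)
    smoothed = np.empty_like(frames)
    smoothed[0] = frames[0]
    for t in range(1, len(frames)):
        # Blend the new frame with the running smoothed estimate.
        smoothed[t] = alpha * frames[t] + (1.0 - alpha) * smoothed[t - 1]
    return smoothed
```

Each lag bin is smoothed independently over time, so transient noise in a single frame has a reduced effect on the signal strength for that direction.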
The maximum value in the smoothed left and smoothed right subsets is then selected for each sample point. As discussed above, a peak in the signal strength map 370 at a particular instant in time indicates that sound is coming from the direction represented by that lag at that time (also referred to herein as a direction of interest). The result from the left subsets is a signal that is representative of how the directionality of sound arriving at the microphone array 110 from the left area of interest changes in time (referred to herein as the left maximum direction signal). The result from the right subsets is a signal that is representative of how the directionality of sound arriving at the microphone array 110 from the right area of interest changes in time (referred to herein as the right maximum direction signal). The left and right maximum direction signals are then normalized (block 380 of
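The selection of a maximum value per sample point, followed by normalization, might be sketched as follows. The function names are hypothetical, and the min-max scaling used here is only one possible normalization, chosen for illustration; the disclosure does not state which normalization is used.

```python
import numpy as np

def max_direction_signal(smoothed_subset):
    """Take the maximum over the lags of interest at each sample point.

    smoothed_subset: array of shape (num_frames, num_lags).
    """
    return np.max(np.asarray(smoothed_subset, dtype=float), axis=1)

def normalize(signal):
    """Scale a signal to the range [0, 1] (an assumed normalization)."""
    signal = np.asarray(signal, dtype=float)
    span = signal.max() - signal.min()
    if span == 0.0:
        return np.zeros_like(signal)
    return (signal - signal.min()) / span
```

The same two functions would be applied separately to the left subset and to the right subset to obtain one normalized maximum direction signal per area of interest.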
As described above, snoring has a characteristic frequency. This characteristic frequency is the result of a characteristic user breathing rate. In the context of the maximum direction signals 390, 400, this translates to a characteristic frequency at which the directionality of sound arriving at the microphone array 110 changes. In other words, snoring leads to ‘peaks’ in the signal strength map 370 that are captured by the maximum direction signals 390, 400. After normalization, it is determined whether there is a component consistent with the characteristic breathing rate in the normalized left maximum direction signal 400 and normalized right maximum direction signal 390. The characteristic user breathing rate is defined as being within a range of 1.5 seconds per breath to 6 seconds per breath. Although in the example discussed herein, the range of interest is chosen to be that commonly associated with snoring, in other examples the range may be chosen to detect other sounds produced by a user during sleep (e.g., due to sleep apnea or catathrenia). Determining whether the maximum direction signals 390, 400 comprise a component consistent with a sound produced by snoring requires long-term observation of the maximum direction signals 390, 400, in order to capture the frequency at which directionality is changing over time. The Fourier transform of each of the left and right normalized maximum direction signals is computed (block 410 of
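As a non-limiting sketch, a metric of the kind described in this disclosure (a difference between a maximum value and a minimum value within the frequency range of interest) could be derived from a buffered maximum direction signal as shown below. A breathing rate of 1.5 to 6 seconds per breath corresponds to approximately 0.17 Hz to 0.67 Hz. The function name `snoring_metric` and the `frame_rate_hz` parameter are assumptions for this example.

```python
import numpy as np

def snoring_metric(direction_signal, frame_rate_hz,
                   min_period_s=1.5, max_period_s=6.0):
    """Max-minus-min of the magnitude spectrum within the breathing band.

    direction_signal: buffered maximum direction signal (for example,
                      10 to 25 seconds long).
    frame_rate_hz:    number of sample points per second (assumed name).
    """
    signal = np.asarray(direction_signal, dtype=float)
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / frame_rate_hz)
    # Keep only bins whose period lies between min_period_s and max_period_s.
    band = (freqs >= 1.0 / max_period_s) & (freqs <= 1.0 / min_period_s)
    in_band = spectrum[band]
    if in_band.size == 0:
        raise ValueError("buffer too short to resolve the breathing band")
    return float(in_band.max() - in_band.min())
```

A periodic component at a breathing-like rate produces a pronounced peak within the band and therefore a large metric, whereas a signal with no such periodicity yields a relatively flat band and a small metric.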
To determine whether there is a sound produced by snoring coming from each of the left and right areas of interest, two thresholds are used. Using two thresholds provides hysteresis. The snoring metrics derived from the maximum direction signals 390, 400 of
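The two-threshold scheme can be illustrated with the following minimal sketch, in which snoring is indicated once the metric rises above a first threshold and is no longer indicated once the metric subsequently falls below a second, lower threshold. The function name and the threshold values used in the usage note are hypothetical.

```python
def detect_with_hysteresis(metrics, on_threshold, off_threshold):
    """Return one boolean per metric value indicating detected snoring.

    on_threshold:  first threshold; detection starts when it is exceeded.
    off_threshold: second threshold; detection ends when the metric
                   falls below it.
    """
    snoring = False
    states = []
    for value in metrics:
        if not snoring and value > on_threshold:
            snoring = True
        elif snoring and value < off_threshold:
            snoring = False
        states.append(snoring)
    return states
```

For instance, with a first threshold of 5 and a second threshold of 2, a metric that dips from 6 to 4 does not clear the detection, which avoids rapid toggling of the output when the metric hovers near a single threshold.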
Measures can be taken to alleviate snoring in response to detecting a sound produced by snoring. What corrective action is required is determined using the snoring metric (block 440 of
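One of the examples summarized above adjusts the position of the entity by an amount proportional to the time for which snoring has been indicated, up to a maximum position. A minimal sketch of such a rule is given below. The function name, the gain, the maximum, and the use of arbitrary position units are all assumptions for illustration and do not reflect any particular bed hardware.

```python
def target_position(snoring_seconds, gain_per_second=0.5, max_position=30.0):
    """Position (arbitrary units) proportional to snoring duration, capped.

    gain_per_second and max_position are hypothetical values.
    """
    elapsed = max(0.0, float(snoring_seconds))
    return min(max_position, gain_per_second * elapsed)
```

Under these assumed values, 10 seconds of detected snoring would request a position of 5 units, and the request would saturate at 30 units for prolonged snoring.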
According to aspects of the present disclosure, a snoring detection system 100 is provided which is able to determine the direction from which snoring originates. By detecting which direction the snoring is coming from, measures can be taken to alleviate the snoring, as discussed above.
According to aspects of the present disclosure, a snoring detection system 100 is provided which is able to indicate the direction from which snoring originates in the presence of other sources of sound. As discussed in more detail above, after isolating sound detected from the left area of interest and right area of interest, the presence of snoring is determined by seeking out a characteristic snoring frequency. By focusing on a characteristic snoring frequency, snoring can be detected even in the presence of a relatively loud non-snoring noise and in the presence of diffuse noise. Examples of such noise include noise from a television or radio, and speech.
According to aspects of the present disclosure, a snoring detection system 100 is provided that is computationally efficient. As described above, the snoring detection system 100 uses computationally efficient techniques such as FFT based cross-correlation to detect the direction of snoring. The method described in
An example second embodiment of a snoring detection system 530 according to aspects of the present disclosure is shown in
The two microphone arrays 110a-b are positioned in the headboard 150 of bed 140. A first microphone array 110a is referred to herein as the left microphone array. A second microphone array 110b is referred to herein as the right microphone array. Each microphone array 110a-b is similar to the microphone array 110 of the first embodiment described above. However, the orientation of the microphone arrays 110a-b is different. This difference in orientation is significant. Each microphone array 110a-b is oriented such that a first direction defined from the midpoint between the first and second microphones 120a-b, 130a-b of the array 110a-b towards a position of the first microphone 120a-b (indicated by arrow 532) is directed towards the base 462 of bed 140. This direction is referred to herein as the down direction. A second direction defined from the midpoint between the first and second microphones 120a-b, 130a-b of the array 110a-b towards a position of the second microphone 130a-b (indicated by arrow 534) is directed away from the base of the bed 140. This direction is referred to herein as the up direction. Such an orientation facilitates focusing the snoring detection on where a user is likely to be positioned on the bed 140.
The snoring detection method implemented by system 530 is similar to that described above with reference to the first embodiment. However, the orientation of the microphone arrays 110a-b leads to a difference in the areas of interest. Referring now to the example shown in
The data stored in the subset of lags for each of the left microphone array 110a and right microphone array 110b is processed independently as described above with reference to
The FFT of the normalized maximum direction signal is then computed for the left microphone array 110a and right microphone array 110b (block 410 of
From the Fourier transform, it is determined if a component corresponding to the characteristic breathing rate (1.5 s to 6 s per breath) is present. In the example shown in
The above process gives rise to a snoring metric for the left microphone array 110a and a snoring metric for the right microphone array 110b. As discussed above, two thresholds are used to determine if the maximum direction signals corresponding to each array comprise a component consistent with a sound produced by snoring.
Snoring detection system 530 provides the same advantages as discussed above for snoring detection system 100. This embodiment further provides the ability to focus the snoring detection on an area of interest 540a-b. This area of interest is chosen to correspond to where a user's head is likely to be positioned during use. This is, therefore, where sounds consistent with snoring are most likely to originate.
Having described above several aspects of at least one embodiment, it is to be appreciated that various alterations, modifications, and improvements will readily occur to those skilled in the art. Such alterations, modifications, and improvements are intended to be part of this disclosure and are intended to be within the scope of the invention. Accordingly, the foregoing description and drawings are by way of example only, and the scope of the invention should be determined from proper construction of the appended claims, and their equivalents.
This application claims priority under 35 U.S.C. § 119(e) to U.S. Provisional Patent Application Ser. No. 63/307,704, titled “SNORING DETECTION SYSTEM,” filed Feb. 8, 2022, the entire contents of which is incorporated herein by reference for all purposes.
Number | Date | Country
---|---|---
63307704 | Feb 2022 | US