Glass breaking audio detection has been implemented using energy detection techniques where the energy pattern is monitored over time. A typical glass breaking signal will consist of an impulse plus an exponentially decreasing tail. Prior art glass breaking detection systems range from simple acoustic energy detectors, to frequency counters, to more sophisticated spectral analysis algorithms; however, these systems generally suffer from a significant number of false positives.
What is desired, and not provided by the prior art, is a glass breakage detection system which reduces the number of false positives while increasing the probability of detecting breakage of glass.
Accordingly, it is a principal object of the present invention to overcome at least some of the disadvantages of the prior art. In one embodiment a glass breakage detection method is enabled, the method comprising: receiving a plurality of audio samples; estimating low frequency power values of the received plurality of audio samples; estimating wide band power values of the received plurality of audio samples; responsive to the estimated wide band power values, determining an amplification value; responsive to the estimated low frequency power being greater than a predetermined threshold, amplifying a function of the received plurality of audio samples by the amplification value; comparing the amplified function with a predetermined function of sound of breaking glass; and outputting an indication of the comparison.
In one embodiment, the method further comprises determining Mel-spaced band power values of the received plurality of audio samples, wherein the low frequency power value estimation is responsive to the determined Mel-spaced band power values. In another embodiment, the plurality of audio samples are received over a predetermined time period, wherein the method further comprises comparing the estimated low frequency power values of each of a plurality of portions of the predetermined time period with a predetermined threshold, and wherein the amplification is responsive to the estimated low frequency power values being greater than the predetermined threshold for more than one of the plurality of time period portions.
Independently, the embodiments provide for an alarm system, comprising: an input module arranged to: receive audio data; and sample the received audio data at a predetermined sampling rate to produce a plurality of audio samples, an impact detection module arranged to receive an output of the input module, the impact detection module arranged to: estimate low frequency power values of the received plurality of audio samples; estimate wide band power values of the received plurality of audio samples; determine, responsive to the estimated wide band power values, an amplification value for the gain module; and assert, responsive to the estimated low frequency power being greater than a predetermined threshold, an impact detection signal, a gain module, responsive to an output of the impact detection module and to the impact detection signal, the gain module arranged to receive the output of the input module and arranged to amplify a function of the received plurality of audio samples by the determined amplification value in the event that the impact detection signal has been asserted; a glass breakage detection module responsive to an output of the gain module, the glass breakage detection module arranged to compare the amplified function of the received plurality of audio samples with a predetermined function of sound of breaking glass; and an output module responsive to the glass breakage detection module arranged to output an indication of the comparison.
In one embodiment, the impact detection module is further arranged to determine Mel-spaced band power values of the received plurality of audio samples, the low frequency power value estimation responsive to the determined Mel-spaced band power values. In another embodiment, the plurality of audio samples are received over a predetermined time period and the impact detection module is further arranged to compare the estimated low frequency power values of each of a plurality of portions of the predetermined time period with a predetermined threshold, wherein the assertion of the impact detection signal is responsive to the compared estimated low frequency power values being greater than the predetermined threshold for more than one of the plurality of time period portions.
Independently, the embodiments herein provide for a multi-purpose alarm system, comprising: an input module arranged to receive audio samples; a T3/T4 detection module arranged to detect sounds of a T3 or T4 alarm within the received audio samples; a glass breakage detection module arranged to detect sounds of breaking glass within the received audio samples; a programmable sound energy detection module arranged to detect various predetermined sounds within the received audio samples; and a voice communication module arranged to provide two-way communication between a communication device and a communication network, wherein each of the T3/T4 detection module, the glass breakage detection module and the programmable sound energy detection module comprises a unique amplifier arranged to amplify the received audio samples by a predetermined respective gain.
Additional features and advantages of the invention will become apparent from the following drawings and description.
For a better understanding of the invention and to show how the same may be carried into effect, reference will now be made, purely by way of example, to the accompanying drawings in which like numerals designate corresponding elements or sections throughout.
With specific reference now to the drawings in detail, it is stressed that the particulars shown are by way of example and for purposes of illustrative discussion of the preferred embodiments of the present invention only, and are presented in the cause of providing what is believed to be the most useful and readily understood description of the principles and conceptual aspects of the invention. In this regard, no attempt is made to show structural details of the invention in more detail than is necessary for a fundamental understanding of the invention, the description taken with the drawings making apparent to those skilled in the art how the several forms of the invention may be embodied in practice. In the accompanying drawings:
Before explaining at least one embodiment of the invention in detail, it is to be understood that the invention is not limited in its application to the details of construction and the arrangement of the components set forth in the following description or illustrated in the drawings. The invention is capable of other embodiments or of being practiced or carried out in various ways. Also, it is to be understood that the phraseology and terminology employed herein is for the purpose of description and should not be regarded as limiting.
The terms “connected” or “coupled”, or any variant thereof, as used herein are not meant to be limited to a direct connection, and are meant to include any coupling or connection, either direct or indirect, and the use of appropriate resistors, capacitors, inductors and other active and non-active elements does not exceed the scope thereof.
The output of input module 20 is fed to impact detection module 30 and to gain module 40. The output of impact detection module 30 is fed to a control input of gain module 40. The output of gain module 40 and an output of memory 70 are each fed to respective inputs of glass breakage detection module 50. The output of glass breakage detection module 50 is fed to output module 60.
Input module 20 is in electrical communication with a microphone 80 and is arranged to receive audio data therefrom. Input module 20 digitally samples the received audio data from microphone 80 at a predetermined sampling rate and outputs the sampled audio data to both impact detection module 30 and gain module 40.
As will be described below, impact detection module 30 is arranged to analyze the audio data to determine whether a low frequency impact sound has been received at microphone 80. A low frequency impact sound indicates that an object has impacted glass, thereby increasing the probability that sounds of breaking glass will be detected at microphone 80. In the event that impact detection module 30 detects a low frequency impact sound, a signal is output to gain module 40. Responsive to the received signal, gain module 40 is arranged to amplify a predetermined portion of the audio data of input module 20, the amplified portion received by glass breakage detection module 50. In one embodiment, the predetermined audio data portion is 1.6 seconds of audio data. As will be described below, glass breakage detection module 50 is arranged to compare a function of the amplified audio portion with functions of known sounds of glass breaking stored on memory 70. Responsive to the comparison, glass breakage detection module 50 is arranged to determine whether the sounds received at microphone 80 include sounds of breaking glass, the determination output by output module 60 to an external network and/or to an alarm system.
The output of input module 20 is fed to power spectrum module 110 and to buffer 140. The output of power spectrum module 110 is fed to frame power detection module 120. The output of frame power detection module 120 is fed to impact decision module 125 and to gain control 130. The output of impact decision module 125 is fed to buffers 140 and 160. The output of gain control 130 is fed to a control input of amplifier 150. The output of buffer 140 is fed to amplifier 150 and the output of amplifier 150 is fed to buffer 160. The output of buffer 160 is fed to power spectrum module 170 and the output of power spectrum module 170 is fed to a first input of glass breakage decision module 180. A second input of glass breakage decision module 180 is fed from memory 70, as will be described below. The output of glass breakage decision module 180 is fed to output module 60. The output of output module 60 is in one embodiment fed to an alarm system 65.
In operation, input module 20 is arranged to receive audio data from a microphone 80. Input module 20 is arranged to sample the audio data received from microphone 80 at a predetermined sampling rate. In one embodiment, input module 20 is further arranged to filter out unwanted noise. The sampled audio data is output to power spectrum module 110 and is further output to buffer 140. Pre-emphasis module 190 of power spectrum module 110 is arranged to filter the received audio data to amplify the higher frequencies of the data. A non-limiting example of a filter frequency response of pre-emphasis module 190, with a sampling rate of 8000 Hertz, is illustrated by curve 310 in a graph of
The filtered audio data is transformed to the frequency domain by DFT module 200, utilizing a DFT, and separated into equally spaced frequency bands. Particularly, prior to the transform, the audio data is split into sample frames, with each frame consisting of 8 milliseconds of audio data. The sample frames are then overlapped. Specifically, the samples of each frame are concatenated with the samples of the previous frame. The overlapped frames are then windowed, optionally with a Hamming window. The windowed overlapped frames are then transformed to the frequency domain utilizing a DFT, optionally producing 63 equally spaced frequency bands. Mel scaling module 210 is arranged to multiply the frequency bands of DFT module 200 with a predetermined matrix to create 26 Mel-spaced band power values.
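By way of non-limiting illustration only, the framing, overlapping, windowing and transform steps described above may be sketched in Python as follows. The sketch assumes the 8000 Hertz sampling rate of the example above, so that each 8 millisecond frame holds 64 samples and each overlapped block holds 128 samples. Dropping the DC and Nyquist bins to arrive at 63 bands, the zero history before the first frame, the use of a naive DFT rather than a fast transform, and all function names are assumptions of this sketch and are not taken from the description.

```python
import cmath
import math

SAMPLE_RATE_HZ = 8000                            # sampling rate of the example above
FRAME_MS = 8                                     # frame length stated in the text
FRAME_LEN = SAMPLE_RATE_HZ * FRAME_MS // 1000    # 64 samples per frame


def hamming(n):
    """Return an n-point Hamming window."""
    return [0.54 - 0.46 * math.cos(2.0 * math.pi * k / (n - 1)) for k in range(n)]


def power_spectrum(block):
    """Naive DFT power of a real block for bins 1 .. N/2 - 1.

    Assumption: DC (bin 0) and Nyquist (bin N/2) are dropped, which leaves
    63 equally spaced bands for a 128-sample block.
    """
    n = len(block)
    bands = []
    for b in range(1, n // 2):
        acc = sum(x * cmath.exp(-2j * math.pi * b * t / n) for t, x in enumerate(block))
        bands.append(abs(acc) ** 2 / n)
    return bands


def frame_band_powers(samples):
    """Split samples into 8 ms frames, concatenate each frame with the previous
    one, window the 128-sample block and return 63 band powers per frame."""
    window = hamming(2 * FRAME_LEN)
    previous = [0.0] * FRAME_LEN                 # assumed zero history before frame 0
    result = []
    for start in range(0, len(samples) - FRAME_LEN + 1, FRAME_LEN):
        current = list(samples[start:start + FRAME_LEN])
        block = [w * x for w, x in zip(window, previous + current)]
        result.append(power_spectrum(block))
        previous = current
    return result
```

With these assumptions the band spacing is 8000/128 = 62.5 Hertz, so a 1000 Hertz tone falls in bin 16 of the transform, i.e. at index 15 of the returned list.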
The Mel-spaced band power values are received by frame power detection module 120. Frame power detection module 120 is arranged to determine the sound power over each frame period, i.e. 8 milliseconds in the example described above. Particularly, low frequency power estimation module 220 is arranged to estimate the sound power in lower frequencies and wide band power estimation module 230 is arranged to estimate the sound power over a wide frequency band. In one embodiment, wide band power estimation module 230 is arranged to determine a sum of the Mel-spaced band power values for each frame. Furthermore, low frequency power estimation module 220 is arranged to determine a weighted sum of the lower Mel-spaced band power values for each frame. In one embodiment, one of a high sensitivity and a low sensitivity setting can be used for low frequency power estimation module 220, optionally responsive to a user input. In one further embodiment, the high sensitivity low frequency power estimation is determined as:
$$P_{LF}(i) = P_{MB}(i,0) + P_{MB}(i,1) + P_{MB}(i,2) + 2\,P_{MB}(i,3) + 2\,P_{MB}(i,4) + 0.5\,P_{MB}(i,5) \tag{EQ. 1}$$
and the low sensitivity low frequency power estimation is determined as:
$$P_{LF}(i) = 0.125\,P_{MB}(i,0) + 0.125\,P_{MB}(i,1) + 0.125\,P_{MB}(i,3) \tag{EQ. 2}$$
where $P_{LF}$ is the low frequency power estimation array, $i$ is the index of each frame period and $P_{MB}$ is the Mel-spaced band power value array for each frame period.
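By way of non-limiting illustration only, the wide band sum and the two low frequency weightings of EQ. 1 and EQ. 2 may be sketched in Python as follows. The sketch takes every term of the two equations as additive, which is an assumption, and the function and constant names are hypothetical.

```python
# Band index -> weight, following EQ. 1 (high sensitivity) and EQ. 2 (low
# sensitivity). Assumption: every term in both equations is additive.
HIGH_SENSITIVITY_WEIGHTS = {0: 1.0, 1: 1.0, 2: 1.0, 3: 2.0, 4: 2.0, 5: 0.5}
LOW_SENSITIVITY_WEIGHTS = {0: 0.125, 1: 0.125, 3: 0.125}


def wide_band_power(mel_frame):
    """Sum of all Mel-spaced band power values of one frame."""
    return sum(mel_frame)


def low_frequency_power(mel_frame, high_sensitivity=True):
    """Weighted sum of the lower Mel-spaced band power values of one frame."""
    weights = HIGH_SENSITIVITY_WEIGHTS if high_sensitivity else LOW_SENSITIVITY_WEIGHTS
    return sum(weight * mel_frame[band] for band, weight in weights.items())
```

For a frame in which all 26 band power values equal 1.0, the high sensitivity estimate is the sum of the EQ. 1 weights, 7.5, and the low sensitivity estimate is the sum of the EQ. 2 weights, 0.375, which illustrates why the high sensitivity setting crosses a given threshold more readily.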
Impact decision module 125 is arranged to compare the output of low frequency power estimation module 220 for each frame with a predetermined threshold value. As described above, there are a plurality of settings for the sensitivity of low frequency power estimation module 220. When the high sensitivity is selected, the probability of the low frequency power estimation being greater than the threshold value increases, thereby reducing the chance of missing a breaking glass sound while increasing the chance of detecting a false positive. When the low sensitivity is selected, the probability of the low frequency power estimation being greater than the threshold value decreases, thereby reducing the chance of detecting a false positive while increasing the chance of missing a breaking glass sound. In the event that the low frequency power estimation is greater than the threshold value for at least a predetermined number of frames, optionally 2 out of 20 consecutive frames of a 1.6 second time period, impact decision module 125 asserts an impact detection signal indicating that an impact on glass has been detected. Particularly, the initial percussive burst of the glass breaking has significant low frequency energy that is fast decaying compared to higher portions of the sound spectra. This decay and frequency signature is recognized by the above described method of frame power detection module 120 and impact decision module 125.
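By way of non-limiting illustration only, the decision rule described above may be sketched in Python as a sliding count over consecutive frames. The default values follow the optional 2 out of 20 frames of the description; the function name, the argument names and the use of a strict "greater than" comparison are assumptions of this sketch.

```python
def impact_detected(low_freq_powers, threshold, min_hits=2, window=20):
    """Return True if the low frequency power exceeds the threshold in at
    least `min_hits` frames of any `window` consecutive frames."""
    flags = []
    hits = 0
    for i, power in enumerate(low_freq_powers):
        flag = power > threshold
        flags.append(flag)
        hits += int(flag)
        if i >= window:
            hits -= int(flags[i - window])   # frame i - window leaves the window
        if hits >= min_hits:
            return True
    return False
```

Two threshold crossings that lie 15 frames apart fall within one 20 frame window and assert the impact detection signal, whereas two crossings 30 frames apart do not.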
Responsive to the output impact detection signal, buffer 140 is arranged to feed a predetermined number of samples to amplifier 150, optionally the samples from a time period of 1.6 seconds, and buffer 160 is arranged to feed the amplified samples to power spectrum module 170 for analysis. Advantageously, analyzing whether glass has been broken only when an impact on glass has been identified increases the accuracy of detection. Additionally, the samples are amplified appropriately to increase the quality of detection, as will be described herein.
Peak detection module 240 is arranged to determine the highest value in the wide band power estimation array, i.e. from the frame exhibiting the highest power sum. Gain determination module 250 is arranged to compare the value determined by peak detection module 240 with a lookup table stored on memory 70 to determine the appropriate gain for amplifier 150. A non-limiting embodiment of such a lookup table is as follows:
For example, if the frame with the highest power sum, as determined by wide band power estimation module 230, exhibits a power sum of 6.0, gain determination module 250 is arranged to adjust the gain of amplifier 150 to a value of 11.25.
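By way of non-limiting illustration only, the peak detection and table lookup may be sketched in Python as follows. The table in this sketch is purely hypothetical: the only value taken from the description is that a peak power sum of 6.0 maps to a gain of 11.25. Every boundary, every other gain, and the assumption that larger peaks receive smaller gains are placeholders chosen for illustration.

```python
import bisect

# HYPOTHETICAL lookup table. Only the pair (peak 6.0 -> gain 11.25) comes from
# the description; all boundaries and the remaining gains are placeholders.
PEAK_UPPER_BOUNDS = [2.0, 4.0, 8.0, 16.0]      # inclusive upper bound of each row
GAINS = [45.0, 22.5, 11.25, 5.625, 1.0]        # last entry applies above 16.0


def peak_power(wide_band_powers):
    """Highest value in the wide band power estimation array."""
    return max(wide_band_powers)


def gain_for_peak(peak):
    """Gain of the first table row whose upper bound is not below the peak."""
    return GAINS[bisect.bisect_left(PEAK_UPPER_BOUNDS, peak)]
```

A sorted list of bounds with a binary search keeps the lookup independent of the number of table rows, so a real table can be substituted without changing the functions.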
The amplified samples are fed to first power spectrum module 170, via buffer 160 which is arranged to receive the amplified samples of the predetermined time period. First power spectrum module 170 is arranged to determine Mel-frequency cepstral coefficients (MFCCs) of the amplified samples. Specifically, in one embodiment, pre-emphasis module 190 is arranged to emphasize the higher frequencies of the amplified samples, as described above. DFT module 200 is arranged to transform the emphasized samples to the frequency domain and Mel scaling module 210 is arranged to scale the frequency bands to Mel-spaced frequency band power values, as described above. Logarithm module 255 is arranged to determine a logarithm of the Mel-spaced frequency band power values and a DCT is applied to the outcome by DCT module 260, thereby deriving Cepstrum values. In one embodiment, 8 Cepstrum values are derived from 26 Mel-spaced frequency band power values of Mel scaling module 210. The Cepstrum values are fed to coefficient module 275 and are additionally fed to differentiation module 270. Differentiation module 270 is arranged to determine the rate of change over time, from frame to frame, of each of the Cepstrum values. In one embodiment, differentiation module 270 is arranged to apply a digital filter which approximates the operation of a differentiator by utilizing a difference equation. In one non-limiting embodiment, the difference equation is as follows:
$$dc(i,k) = 0.0667\,c(i-4,k) + 0.0500\,c(i-3,k) + 0.0333\,c(i-2,k) + 0.0167\,c(i-1,k) - 0.0167\,c(i+1,k) - 0.0333\,c(i+2,k) - 0.0500\,c(i+3,k) - 0.0667\,c(i+4,k) \tag{EQ. 3}$$
where $i$ is the frame index, $k$ is the Cepstrum value index, $c$ is the array of Cepstrum values for each frame and $dc$ is the corresponding array of differential values.
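By way of non-limiting illustration only, the difference equation of EQ. 3 may be sketched in Python as follows. The sketch assumes the antisymmetric tap pattern in which the coefficient of magnitude 0.0333 with negative sign applies to frame i+2, and it assumes that frames beyond either end of the buffer are replaced by the nearest available frame; the edge handling and all names are assumptions and are not taken from the description.

```python
# Frame offset -> coefficient of EQ. 3 (assumed antisymmetric about offset 0).
DELTA_TAPS = {-4: 0.0667, -3: 0.0500, -2: 0.0333, -1: 0.0167,
              1: -0.0167, 2: -0.0333, 3: -0.0500, 4: -0.0667}


def delta_cepstrum(cepstra):
    """Apply EQ. 3 to a list of frames, each a list of Cepstrum values.

    Returns a list of the same shape holding the differential values.
    """
    last = len(cepstra) - 1
    deltas = []
    for i in range(len(cepstra)):
        row = []
        for k in range(len(cepstra[i])):
            acc = 0.0
            for offset, weight in DELTA_TAPS.items():
                j = min(max(i + offset, 0), last)   # assumed edge handling: clamp
                acc += weight * cepstra[j][k]
            row.append(acc)
        deltas.append(row)
    return deltas
```

Because the coefficients sum to zero, a Cepstrum value that is constant over time yields a differential value of zero. For a value that rises by one per frame the output is approximately −1.0002, i.e. with the coefficients as written the filter reports past minus future, scaled close to unity.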
Coefficient module 275 is arranged to concatenate, for each frame, the Cepstrum values with the differential values output by differentiation module 270, thereby deriving MFCCs. Memory 70 has stored thereon MFCC templates, i.e. precomputed sets of MFCCs which are generated, as described above, from sounds representing breaking glass. Glass breakage decision module 180 is arranged to compare the MFCCs received from coefficient module 275 with the MFCCs stored on memory 70. In one embodiment, a 1.6 second set of MFCCs is compared, one by one, to eight precomputed sets of MFCCs stored on memory 70.
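By way of non-limiting illustration only, the logarithm, DCT and concatenation steps for a single frame may be sketched in Python as follows. The description does not state which DCT variant is used or which 8 Cepstrum values are retained; this sketch assumes an unnormalized DCT-II and keeps the first 8 outputs, including the zeroth. The small floor applied before the logarithm and all names are likewise assumptions.

```python
import math

NUM_CEPSTRA = 8     # Cepstrum values per frame in the embodiment described above


def cepstrum(mel_powers, num_cepstra=NUM_CEPSTRA, floor=1e-10):
    """Logarithm followed by an (assumed) unnormalized DCT-II of one frame."""
    logs = [math.log(max(p, floor)) for p in mel_powers]   # floor avoids log(0)
    m = len(logs)
    return [
        sum(v * math.cos(math.pi * k * (n + 0.5) / m) for n, v in enumerate(logs))
        for k in range(num_cepstra)
    ]


def mfcc_frame(cepstrum_values, delta_values):
    """Concatenate the Cepstrum values of a frame with its differential values."""
    return list(cepstrum_values) + list(delta_values)
```

With 8 Cepstrum values and 8 differential values, each frame is thereby represented by 16 coefficients under the assumptions of this sketch.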
Specifically, in one embodiment, DTW module 280 is arranged to compare the MFCCs utilizing a dynamic time warping algorithm. In one non-limiting embodiment, the DTW algorithm implements a comparison of two matrices and outputs a scalar positive value which is lower when the two input matrices are similar. One non-limiting example of ‘C’ code is described below.
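Independently of any particular 'C' implementation, and by way of non-limiting illustration only, a dynamic time warping comparison of two MFCC matrices may be sketched in Python as follows. This is the textbook recurrence with a Euclidean distance between frames and no path constraints, which are assumptions of the sketch; it is not asserted to be the algorithm of DTW module 280. All names are hypothetical.

```python
import math


def frame_distance(a, b):
    """Euclidean distance between two MFCC frames of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(seq_a, seq_b):
    """Accumulated cost of the cheapest warping path between two sequences
    of frames. The result is non-negative and is lower for similar inputs."""
    n, m = len(seq_a), len(seq_b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = frame_distance(seq_a[i - 1], seq_b[j - 1])
            # Extend the cheapest of: insertion, deletion, or match.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]
```

The warping allows a template and a received sound that differ only in timing to score as similar: a sequence compared with a time-stretched copy of itself yields a distance of zero.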
Threshold module 290 has stored thereon predetermined thresholds for comparisons of MFCCs with the MFCCs stored on memory 70. For each comparison of DTW module 280, comparison module 300 is arranged to compare the value output by DTW module 280 with the respective predetermined threshold. In the event that at least one of the values is less than the respective predetermined threshold, glass breakage decision module 180 is arranged to output to output module 60 a signal indicating that glass has been broken. Output module 60 is arranged to output the indication to an external network and/or to alarm system 65. In one embodiment, the thresholds stored on threshold module 290 are adjustable for different sensitivity settings, in accordance with stored statistical analysis data, the sensitivity settings optionally responsive to a user input at a user sensitivity input device.
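By way of non-limiting illustration only, the decision rule described above may be sketched in Python as follows, with one threshold per stored template. The function and argument names are hypothetical.

```python
def glass_break_detected(dtw_scores, thresholds):
    """Return True if at least one DTW score is below its respective threshold.

    dtw_scores[n] is the score of the received MFCCs against template n and
    thresholds[n] is the predetermined threshold for that template.
    """
    if len(dtw_scores) != len(thresholds):
        raise ValueError("one threshold is required per template score")
    return any(score < limit for score, limit in zip(dtw_scores, thresholds))
```

A score equal to its threshold does not indicate breakage in this sketch, since the description requires the value to be less than the threshold.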
In one embodiment, glass breakage detection system 100 is set to detect breakage of laminated glass, which produces a significantly different sound than regular glass. Unique MFCCs for laminated glass are stored on memory 70 and the above method is similarly utilized for detection of laminated glass breakage and differentiating the sound of breaking laminated glass from other sounds, such as slamming doors or other household impacts.
In some cases, a rich signal (often music or a similarly pulsed non T3 alarm) can cause a false positive detection. To keep those situations from causing a false trigger, the energy out of band may be tested in accordance with an embodiment of the invention. In this embodiment, the signal power including the total power and the power in the desired band (3100 Hz and/or 520 Hz) is monitored in parallel to the PLL 430 and pattern detector 440 by out-of-band energy qualifier 450. A wideband-to-narrowband ratio is determined and output from out-of-band energy qualifier 450. The ratio represents a value between 0 and 1 and is used to adjust the output of the pattern detector 440. In a situation where there is little wideband noise, the output of out-of-band energy qualifier 450 will be closer to 1. Conversely, in a situation where a lot of wideband noise is present, the output of out-of-band energy qualifier 450 will be closer to 0 and thus will significantly lower the matching score output from pattern detector 440. This has the effect of requiring the detected signal to be very exact if there is a lot of out of band noise. The output of the out-of-band energy qualifier 450 is input into multiplier 460 along with the output of the pattern detector 440. The output of multiplier 460 represents an adjusted output of the pattern detector in view of background noise or a non T3/T4 alarm.
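By way of non-limiting illustration only, the qualification and multiplication steps may be sketched in Python as follows. The sketch assumes that the ratio is computed as the power in the desired band divided by the total power, which is consistent with a value between 0 and 1 that approaches 1 when little wideband noise is present; that formula, the handling of zero total power and all names are assumptions.

```python
def out_of_band_ratio(in_band_power, total_power):
    """Ratio in [0, 1]: near 1 when almost all power lies in the desired band."""
    if total_power <= 0.0:
        return 0.0                       # assumed: no signal, no qualification
    return min(max(in_band_power / total_power, 0.0), 1.0)


def adjusted_score(pattern_score, in_band_power, total_power):
    """Pattern detector score scaled by the out-of-band energy ratio."""
    return pattern_score * out_of_band_ratio(in_band_power, total_power)
```

For the same pattern score of 0.8, a signal with 90 percent of its power in band keeps a score of about 0.72, whereas a signal with only 10 percent in band is reduced to about 0.08 and would need a far more exact pattern match to exceed a given threshold.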
The output of multiplier 460 is input into comparator 470. The comparator 470 compares the adjusted output of the pattern detector 440 with a threshold value 472 to qualify the result of the pattern detector 440. If the adjusted output of the pattern detector 440 meets or exceeds the threshold value 472, the audible alert signal detected by microphone interface 410 is determined to be an actual T3/T4 pulse stream and the comparator 470 outputs an active high signal. However, if the adjusted output of the pattern detector 440 is lower than the threshold value 472, the audible alert signal is determined not to be a T3/T4 pulse stream and the comparator 470 outputs an active low signal.
In certain embodiments, after a single T3/T4 alarm period is detected at the output of comparator 470 by an active high signal, the alarm can be further qualified by multi-pulse qualifier 480, which checks whether subsequent alarms are present. For example, in some embodiments of the invention, N audible alarms must be detected within a predetermined time window determined by timer 482 before outputting an alarm detected signal. In the event that only a single alarm period is detected, with no subsequent alarm period within the predetermined time window, the multi-pulse qualifier 480 does not assert an alarm detected signal. This adds to the general robustness of the alarm detection accuracy. This process looks to see if more than a predetermined number of frames in a given interval resulted in assertion of an active high signal by comparator 470. If so, the host/user is alerted that a T3/T4 alarm was detected responsive to an output alarm detected signal from the multi-pulse qualifier 480. Since the output of the pattern detector 440, before comparator 470, is a score corresponding to the probability that a T3/T4 alarm was detected, these scores may be summed over time to provide a continuous multiple pulse qualification. In block 490, an interrupt or a notification is generated and output, responsive to the output alarm detected signal from the multi-pulse qualifier 480, preferably to a host system so that an action can be taken. The interrupt or notification is thus generated responsive to the asserted signal at the output of comparator 470. In certain embodiments neither multi-pulse qualifier 480 nor out-of-band energy qualifier 450 is provided. Alternately, in other embodiments, the output of pattern detector 440, appropriately buffered or amplified if required, is used as the interrupt or notification output, without requiring comparator 470 or multi-pulse qualifier 480.
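By way of non-limiting illustration only, a multiple pulse qualification of the kind described above may be sketched in Python as follows. The class name, the use of timestamps in seconds and the sliding window bookkeeping are assumptions of the sketch; the values of N and of the time window are left as parameters because the description does not fix them.

```python
from collections import deque


class MultiPulseQualifier:
    """Assert an alarm only after N detections within a sliding time window."""

    def __init__(self, required_detections, window_seconds):
        self.required = required_detections
        self.window = window_seconds
        self.times = deque()             # timestamps of recent detections

    def update(self, timestamp, comparator_high):
        """Record one comparator output and return the alarm detected signal."""
        if comparator_high:
            self.times.append(timestamp)
        # Drop detections that have fallen out of the time window.
        while self.times and timestamp - self.times[0] > self.window:
            self.times.popleft()
        return len(self.times) >= self.required
```

As a usage example, with three required detections and a 10 second window, detections at 0, 4 and 8 seconds assert the alarm on the third call, whereas two detections 20 seconds apart never do.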
Here, three user definable parameters, namely frequency bins, time duration and magnitude threshold, are set to qualify the acoustic input signal. The time domain signal is first converted to a collection of frequency bins in the frequency domain, via time to frequency conversion module 510 and frequency bin selection module 530. Responsive to a user input, selected frequencies module 520 selects which bins, typically contiguous, are to be examined, with frequency bin selection module 530 ignoring the bins not selected. The selected bins are then combined, by summing or by summing their squares, and averaged over a user defined time window at integrator 550, the user defined time window being stored on integration time module 540 and integrator 550 being responsive thereto. The resulting output energy is compared, by comparator 570, against a preset threshold output by energy threshold module 560. Should the energy in the selected bins be high enough that the average energy over the specified time interval is greater than the threshold, the energy detector signals a positive indication at the output of comparator 570. This detector can be set for broadband noise detection or single tone detection and can catch short time window or persistent signals.
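By way of non-limiting illustration only, the three user definable parameters may be sketched in Python as the arguments of a single function operating on frames that have already been converted to frequency bins. Expressing the time window as a number of frames, averaging over fewer frames until the window has filled, and all names are assumptions of this sketch.

```python
def energy_detector(frames, selected_bins, window_frames, threshold, squared=False):
    """Return one decision per frame.

    frames        : list of per-frame lists of frequency bin magnitudes
    selected_bins : indices of the bins to examine (others are ignored)
    window_frames : length of the averaging window, in frames
    threshold     : average energy above which the detector signals True
    squared       : combine bins by sum of squares instead of plain sum
    """
    history = []
    decisions = []
    for bins in frames:
        values = [bins[b] for b in selected_bins]
        energy = sum(v * v for v in values) if squared else sum(values)
        history.append(energy)
        recent = history[-window_frames:]        # assumed: partial window at start
        decisions.append(sum(recent) / len(recent) > threshold)
    return decisions
```

Selecting a single bin with a long window approximates persistent single tone detection, while selecting many bins with a short window approximates detection of short broadband bursts.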
T3/T4 alarm detection algorithm unit 612 is implemented as described above in relation to audible alarm detector 400. Glass breakage detection algorithm unit 622 is implemented as described above in relation to glass breakage detection systems 10 and 100. Energy detection algorithm unit 632 is implemented as described above in relation to programmable energy detector 500. Voice communication module 640 is implemented as a voice over internet protocol (VoIP) communications system arranged to provide full duplex two-way voice communication via a communications device, such as a desktop speaker phone.
T3/T4 alarm detection module 610, glass breakage detection module 620, energy detection module 630 and voice communication module 640 are integrated onto a single chip 650. Each of T3/T4 alarm detection module 610, glass breakage detection module 620, energy detection module 630 and voice communication module 640 may be enabled or disabled by programmable configuration registers accessible by an external host device or user interface.
In one embodiment, the firmware for each of T3/T4 alarm detection module 610, glass breakage detection module 620, energy detection module 630 and voice communication module 640 are stored individually in memory which is either integrated into chip 650, as illustrated in
In operation, sounds are received by microphone 80 and sampled and amplified by an input module 670. The output samples from input module 670 are then amplified separately by each of T3/T4 alarm detection amplifier 614, glass breakage detection amplifier 624 and energy detection amplifier 634. Each of T3/T4 alarm detection amplifier 614, glass breakage detection amplifier 624 and energy detection amplifier 634 exhibits a different gain value in accordance with the respective algorithm. The amplified audio samples are then respectively analyzed by T3/T4 alarm detection algorithm unit 612, glass breakage detection algorithm unit 622 and energy detection algorithm unit 632 to detect the relevant sounds and output an alarm signal to alarm 65 as needed.
It should be appreciated by those skilled in the art that any block diagrams herein represent conceptual views of illustrative circuitry embodying the principles of the invention. For example, a processor may be provided through the use of dedicated hardware as well as hardware capable of executing software in association with appropriate software. When provided by a processor, the functions may be provided by a single dedicated processor, by a single shared processor, or by a plurality of individual processors, some of which may be shared. Moreover, explicit use of the term “processor” should not be construed to refer exclusively to hardware capable of executing software, and may implicitly include, without limitation, digital signal processor (DSP) hardware, network processor, application specific integrated circuit (ASIC), field programmable gate array (FPGA), read only memory (ROM) for storing software, random access memory (RAM), and non-volatile storage. Other hardware, conventional and/or custom, may also be included. The functional blocks or modules illustrated herein may in practice be implemented in hardware or software running on a suitable processor.
It is appreciated that certain features of the invention, which are, for clarity, described in the context of separate embodiments, may also be provided in combination in a single embodiment. Conversely, various features of the invention which are, for brevity, described in the context of a single embodiment, may also be provided separately or in any suitable sub-combination.
Unless otherwise defined, all technical and scientific terms used herein have the same meanings as are commonly understood by one of ordinary skill in the art to which this invention belongs. Although methods similar or equivalent to those described herein can be used in the practice or testing of the present invention, suitable methods are described herein.
All publications, patent applications, patents, and other references mentioned herein are incorporated by reference in their entirety. In case of conflict, the patent specification, including definitions, will prevail. In addition, the materials, methods, and examples are illustrative only and not intended to be limiting.
It will be appreciated by persons skilled in the art that the present invention is not limited to what has been particularly shown and described herein above. Rather the scope of the present invention is defined by the appended claims and includes both combinations and sub-combinations of the various features described hereinabove as well as variations and modifications thereof which would occur to persons skilled in the art upon reading the foregoing description and which are not in the prior art.
Number | Date | Country
---|---|---
62325233 | Apr 2016 | US