Method for determining alcohol consumption, and recording medium and terminal for carrying out same

Information

  • Patent Grant
  • Patent Number
    9,934,793
  • Date Filed
    Friday, January 24, 2014
  • Date Issued
    Tuesday, April 3, 2018
Abstract
Disclosed are a method for determining whether a person is drunk after consuming alcohol by analyzing a voice in the time domain, and a recording medium and a terminal for carrying out the same. An alcohol consumption determination terminal comprises: a voice input unit for converting an input voice signal into voice frames and outputting the voice frames; a voiced/unvoiced sound analysis unit for determining whether each voice frame received from the voice input unit corresponds to a voiced sound, an unvoiced sound, or background noise; a voice frame energy detection unit for extracting the average energy of the voice frames determined to be voiced sounds by the voiced/unvoiced sound analysis unit; a section energy detection unit for detecting the average energy of sections each including a plurality of voice frames determined to be voiced sounds; and an alcohol consumption determination unit for determining whether the person is drunk by extracting the difference between the average energies of neighboring sections detected by the section energy detection unit. The voice signal is thereby analyzed in the time domain to determine whether alcohol has been consumed.
Description
TECHNICAL FIELD

The present invention relates to a method of determining whether a person is drunk after consuming alcohol using voice analysis in the time domain, and a recording medium and terminal for carrying out the same.


BACKGROUND ART

Although there may be differences among individuals, a drunk driving accident is likely to happen when a driver is half-drunk or drunk. Methods of measuring drunkenness include measuring the concentration of alcohol in exhaled air with a breathalyzer equipped with an alcohol sensor and measuring the concentration of alcohol in the bloodstream using a laser. The former method is usually used for cracking down on drunk driving. When a driver refuses a breath test, the Widmark equation may be used to estimate the blood alcohol concentration from blood collected with the driver's consent.


A technology that determines whether a driver has consumed alcohol and controls the starting device of a vehicle in order to prevent drunk driving has been commercialized, and some vehicles equipped with it are already on the market. The technology enables or disables starting of the vehicle through a detection device, equipped with an alcohol sensor, attached to the starting device; this is a field in which much research is being conducted by domestic and foreign automotive manufacturers. Because these methods use an alcohol sensor, they can measure the concentration of alcohol relatively accurately. In an environment with high humidity and dust, however, such as a vehicle interior, the alcohol sensor has low accuracy and fails frequently, and it also has a short lifetime. Accordingly, when the sensor is built into an electronic device, the device must be repaired whenever the sensor is replaced.


DISCLOSURE
Technical Problem

An aspect of the present invention is directed to a method of determining whether a person is drunk after consuming alcohol using voice analysis in the time domain, and a recording medium and terminal for carrying out the same.


Technical Solution

According to an aspect of the present invention, an alcohol consumption determination method includes: converting a received voice signal into a plurality of voice frames and extracting average energy for each of the voice frames; dividing the plurality of voice frames into sections with a predetermined length and extracting average energy for the voice frames included in each of the sections; and comparing the average energy between a plurality of neighboring sections to determine whether alcohol has been consumed.


The converting of a received voice signal into a plurality of voice frames and the extracting of average energy for each of the voice frames may include determining whether each of the plurality of voice frames corresponds to a voiced sound, an unvoiced sound, or background noise and extracting average energy for each voice frame corresponding to the voiced sound.


The comparing of the average energy between a plurality of neighboring sections to determine whether alcohol has been consumed may include setting the neighboring sections to overlap either partially or not at all, extracting average energy for voice frames included in each of the sections, and determining whether a person is drunk after consuming alcohol according to a difference in the extracted average energy.


The comparison of the average energy between a plurality of neighboring sections to determine whether alcohol has been consumed may include determining that alcohol has been consumed when a difference in average energy between the plurality of neighboring sections is less than a predetermined threshold and determining that alcohol has not been consumed when the difference is greater than the predetermined threshold.


According to an embodiment of the present invention, an alcohol consumption determination terminal includes: a voice input unit configured to convert a received voice signal into voice frames and output the voice frames; a voiced/unvoiced sound analysis unit configured to determine whether each of the voice frames corresponds to a voiced sound, an unvoiced sound, or background noise; a voice frame energy detection unit configured to extract average energy of a voice frame that is determined as a voiced sound by the voiced/unvoiced sound analysis unit; a section energy detection unit configured to detect average energy for a section in which a plurality of voice frames determined as voiced sounds are included; and an alcohol consumption determination unit configured to compare average energy between neighboring sections detected by the section energy detection unit to determine whether alcohol has been consumed.


The voiced/unvoiced sound analysis unit may receive a voice frame, extract predetermined features from the voice frame, and determine whether the voice frame corresponds to a voiced sound, an unvoiced sound, or background noise according to the extracted features.


The alcohol consumption determination unit may include a storage unit configured to pre-store a threshold to determine whether alcohol has been consumed and a difference calculation unit configured to calculate a difference in average energy between neighboring sections.


The difference calculation unit may detect an average energy difference between neighboring sections that are set to partially overlap with each other or may detect an average energy difference between neighboring sections that are set not to overlap with each other.


The voice input unit may receive the voice signal through a microphone provided therein or receive the voice signal from a remote site to generate the voice frame.


According to an embodiment of the present invention, there is provided a computer-readable recording medium having recorded thereon a computer program for determining whether a person is drunk after consuming alcohol by using the above-described alcohol consumption determination terminal.


Advantageous Effects

As described above, according to an aspect of the present invention, whether alcohol has been consumed may be determined by analyzing an input voice in the time domain.





BRIEF DESCRIPTION OF DRAWINGS


FIG. 1 is a control block diagram of an alcohol consumption determination terminal according to an embodiment of the present invention.



FIG. 2 is a view for describing a concept in which voice signals are converted into voice frames by a voice input unit included in the alcohol consumption determination terminal according to an embodiment of the present invention.



FIG. 3 is a control block diagram of a voiced/unvoiced sound analysis unit included in the alcohol consumption determination terminal according to an embodiment of the present invention.



FIG. 4 is a view for describing a section setting operation of a voice frame energy detection unit included in the alcohol consumption determination terminal according to an embodiment of the present invention.



FIGS. 5A to 5C are views for describing a section setting operation of a section energy detection unit included in the alcohol consumption determination terminal according to an embodiment of the present invention.



FIG. 6 is a control block diagram of an alcohol consumption determination unit included in the alcohol consumption determination terminal according to an embodiment of the present invention.



FIG. 7 is a control flowchart showing an alcohol consumption determination method according to an embodiment of the present invention.





MODES FOR CARRYING OUT THE INVENTION

Hereinafter, exemplary embodiments of the present invention will be described in detail with reference to the accompanying drawings. In assigning reference numerals to the elements of each figure, it should be noted that like reference numerals denote like elements across the figures wherever possible.



FIG. 1 is a control block diagram of an alcohol consumption determination terminal according to an embodiment of the present invention.


An alcohol consumption determination terminal 100 may include a voice input unit 110 configured to convert received voice signals into voice frames and output the voice frames, a voiced/unvoiced sound analysis unit 120 configured to analyze whether each of the voice frames is associated with a voiced sound or an unvoiced sound, a voice frame energy detection unit 130 configured to detect energy for the voice frame, a section energy detection unit 140 configured to detect energy for a section in which a plurality of voice frames are included, and an alcohol consumption determination unit 150 configured to determine whether alcohol has been consumed using the energy for the section in which the voice frames are included.


The voice input unit 110 may receive a person's voice, convert the received voice into voice data, convert the voice data into voice frames in units of frames, and output the voice frames.


The voiced/unvoiced sound analysis unit 120 may receive a voice frame, extract predetermined features from the voice frame, and analyze whether the voice frame is associated with a voiced sound, an unvoiced sound, or noise according to the extracted features.


The voiced/unvoiced sound analysis unit 120 may determine whether the voice frame corresponds to a voiced sound, an unvoiced sound, or background noise according to a recognition result obtained as described below. The voiced/unvoiced sound analysis unit 120 may separate and output the voice frame as a voiced sound, an unvoiced sound, or background noise according to a result of the determination.


The voice frame energy detection unit 130 may calculate average energy for each voice frame determined as a voiced sound. The average energy of a frame ending at sample n is calculated by summing the squares of the N samples from sample n−N+1 to sample n and dividing by N; a detailed description thereof will be provided below.


The section energy detection unit 140 may detect average energy for a section with a predetermined length. The section energy detection unit 140 detects average energy for each of the two neighboring sections.


The alcohol consumption determination unit 150 may calculate a difference in average energy between the two neighboring sections and may determine whether alcohol has been consumed according to the calculated difference.


The alcohol consumption determination unit 150 may compare an average energy difference between the two neighboring sections before drinking and an average energy difference between the two neighboring sections after drinking to determine whether alcohol has been consumed. Here, the average energy difference between the two neighboring sections before drinking may be preset as a threshold and applied in all cases. The threshold may be an optimal value that is set experimentally or customized in advance.


When a person is drunk, his or her ability to control vocal volume is reduced. Because the person cannot speak smoothly and rhythmically by varying energy, he or she makes consecutive pronunciations at a loud volume, or pronounces loudly where a lower volume is called for. Thus, whether alcohol has been consumed can be determined from the difference in energy change over a certain section.


When an energy difference between neighboring sections in a voice frame is smaller than a certain threshold, the alcohol consumption determination unit 150 may determine that alcohol has been consumed.



FIG. 2 is a view for describing a concept in which voice signals are converted into voice frames by a voice input unit included in the alcohol consumption determination terminal according to an embodiment of the present invention.


Typically, an analog voice signal is sampled 8,000 times per second at a resolution of 16 bits (65,536 levels) and converted into voice data.


The voice input unit 110 may convert received voice signals into voice data and convert the voice data into voice frame data in units of frames. Here, each voice frame contains 256 samples.
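The framing step described above can be sketched as follows; the function name and the use of NumPy are illustrative choices, not part of the patent.

```python
import numpy as np

FRAME_SIZE = 256  # samples per voice frame, as stated above

def to_frames(samples, frame_size=FRAME_SIZE):
    """Split a 1-D array of samples into consecutive voice frames.

    Trailing samples that do not fill a whole frame are dropped.
    """
    n_frames = len(samples) // frame_size
    return samples[:n_frames * frame_size].reshape(n_frames, frame_size)

# One second of audio sampled at 8000 Hz yields 31 full 256-sample frames.
signal = np.zeros(8000, dtype=np.int16)
frames = to_frames(signal)
```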


As shown in FIG. 2, the voice data is composed of a plurality of voice frames (n=the number of frames, n=1, 2, 3, . . . ) according to an input voice.


The voice input unit 110 generates a voice frame and then sends information regarding the voice frame to the voiced/unvoiced sound analysis unit 120.



FIG. 3 is a control block diagram of a voiced/unvoiced sound analysis unit included in the alcohol consumption determination terminal according to an embodiment of the present invention.


The voiced/unvoiced sound analysis unit 120 may include a feature extraction unit 121 configured to receive a voice frame and extract predetermined features from the voice frame, a recognition unit 122 configured to yield a recognition result for the voice frame, a determination unit 123 configured to determine whether the received voice frame is associated with a voiced sound or an unvoiced sound or whether the received voice frame is caused by background noise, and a separation and output unit 124 configured to separate and output the voice frame according to a result of the determination.


When the voice frame is received through the voice input unit 110, the feature extraction unit 121 may extract features such as periodic characteristics of harmonics or root mean square energy (RMSE) or zero-crossing count (ZC) of a low-band voice signal energy area from the received voice frame.
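A minimal sketch of the two features named above, assuming frames arrive as numeric arrays; the function names are illustrative.

```python
import numpy as np

def rmse(frame):
    """Root mean square energy of one frame."""
    frame = np.asarray(frame, dtype=np.float64)
    return float(np.sqrt(np.mean(frame ** 2)))

def zero_crossing_count(frame):
    """Count sign changes between consecutive samples."""
    signs = np.sign(np.asarray(frame))
    return int(np.sum(signs[:-1] != signs[1:]))

# A frame that alternates between +1 and -1 crosses zero at every step,
# while a constant frame never crosses zero.
alternating = np.array([1, -1] * 128)  # 256 samples
constant = np.ones(256)
```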


Generally, the recognition unit 122 may be composed of a neural network. A neural network is useful for analyzing non-linear problems, that is, complicated problems that cannot be solved mathematically, and is therefore suitable for analyzing a voice signal and determining whether it is a voiced sound, an unvoiced sound, or background noise. The recognition unit 122, composed of such a neural network, may assign predetermined weights to the features extracted by the feature extraction unit 121 and may yield a recognition result for the voice frame through the calculation process of the neural network. Here, the recognition result is a value obtained by combining the calculation elements according to the weights assigned to the features of each voice frame.
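As a rough illustration of the recognition step, the sketch below uses a single weighted layer in place of a full neural network; the weights are random placeholders rather than trained values, and the feature-to-class mapping is an assumption made only for this example.

```python
import numpy as np

# The weights below are random placeholders, not trained values.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 3))  # 3 input features -> 3 output classes
b = np.zeros(3)

CLASSES = ("voiced", "unvoiced", "noise")

def recognize(features):
    """Return the class whose weighted score is highest."""
    scores = W @ np.asarray(features, dtype=np.float64) + b
    return CLASSES[int(np.argmax(scores))]

label = recognize([0.8, 0.1, 0.3])
```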


The determination unit 123 may determine whether the received voice signal corresponds to a voiced sound or an unvoiced sound according to the above-described recognition result, that is, the value calculated by the recognition unit 122. The separation and output unit 124 may separate and output the voice frame as a voiced sound, an unvoiced sound, or background noise according to a result of the determination of the determination unit 123.


Meanwhile, since a voiced sound is distinctly different from an unvoiced sound and from background noise in various features, it is relatively easy to identify, and several well-known techniques exist for doing so. For example, a voiced sound has periodic characteristics in which harmonics repeat at certain intervals, while background noise has no harmonics. An unvoiced sound, on the other hand, has harmonics with only weak periodicity: whereas the harmonics of a voiced sound repeat within one frame, the voiced-sound characteristics of an unvoiced sound, such as harmonics, appear only every certain number of frames, that is, they appear weakly.



FIG. 4 is a view for describing a section setting operation of a voice frame energy detection unit included in the alcohol consumption determination terminal according to an embodiment of the present invention.


The voice frame energy detection unit 130 may calculate average energy for a voice frame determined as a voiced sound. The average energy of a frame ending at sample n is calculated by summing the squares of the N samples from sample n−N+1 to sample n and dividing by N, as follows:










En = (1/N)·Σ(m=n−N+1 to n) s²(m)  [Equation 1]

Average energy for each of the voice frames determined as voiced sounds may be calculated through Equation 1.
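Equation 1 can be sketched in code as follows; the function name and the 0-based indexing are illustrative choices.

```python
import numpy as np

def frame_average_energy(s, n, N=256):
    """Equation 1: En = (1/N) * sum of s(m)^2 for m = n-N+1 .. n.

    `s` is the sample sequence, `n` a 0-based end index, N the frame length.
    """
    window = np.asarray(s[n - N + 1 : n + 1], dtype=np.float64)
    return float(np.sum(window ** 2) / N)

# A frame of constant amplitude 2 has average energy 2^2 = 4.
samples = np.full(256, 2.0)
energy = frame_average_energy(samples, n=255, N=256)
```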



FIGS. 5A to 5C are views for describing a section setting operation of a section energy detection unit included in the alcohol consumption determination terminal according to an embodiment of the present invention.


The section energy detection unit 140 may divide a plurality of voice frames determined as voiced sounds into predetermined sections and may detect average energy for the voice frames included in each of the predetermined sections, that is, average section energy. Since the voice frame energy detection unit 130 calculates average energy for each of the voice frames determined as voiced sounds, the section energy detection unit 140 may detect average section energy using the average energy.


As shown in FIG. 5A, the section energy detection unit 140 may detect average energy for a section with a predetermined length (i.e., section 1). The section energy detection unit 140 may find the average section energy using the following equation:










Ed = (1/Fn)·Σ(k=1 to Fn) En(k)  [Equation 2]

where Fn is the number of voice frames in the section, and En(k) is the average energy of the k-th voice frame.
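Equation 2 reduces to the mean of the per-frame average energies within one section; a minimal sketch, with an illustrative function name:

```python
def section_average_energy(frame_energies):
    """Equation 2: Ed = (1/Fn) * sum of En(k) for k = 1 .. Fn,
    i.e. the mean of the per-frame average energies in one section."""
    return sum(frame_energies) / len(frame_energies)

Ed = section_average_energy([4.0, 6.0, 8.0])
```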


The section energy detection unit 140 may detect average energy for two neighboring sections using the above-described method. Here, the neighboring sections may be set so that their voice frames partially overlap, as shown in FIG. 5B, or so that one section begins at the frame immediately following the last voice frame of the other, as shown in FIG. 5C.
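The two section layouts of FIGS. 5B and 5C can be sketched with a single helper, where an `overlap` parameter selects between them; the function and parameter names are illustrative, not from the patent.

```python
def neighboring_sections(frame_energies, length, overlap=0):
    """Return (section1, section2) pairs of per-frame energies.

    With overlap=0 the sections are back-to-back (FIG. 5C style);
    with 0 < overlap < length the second section re-uses the last
    `overlap` frames of the first (FIG. 5B style).
    """
    step = length - overlap
    i = 0
    pairs = []
    while i + step + length <= len(frame_energies):
        pairs.append((frame_energies[i:i + length],
                      frame_energies[i + step:i + step + length]))
        i += step
    return pairs

energies = list(range(8))
disjoint = neighboring_sections(energies, length=2, overlap=0)
```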



FIG. 6 is a control block diagram of an alcohol consumption determination unit included in the alcohol consumption determination terminal according to an embodiment of the present invention.


The alcohol consumption determination unit 150 may include a difference calculation unit 151 configured to calculate a difference in average energy between two neighboring sections and a storage unit 152 configured to prestore a threshold used to determine whether alcohol has been consumed.


The difference calculation unit 151 may calculate the difference between the average section energies transmitted from the section energy detection unit 140 by using the following equation:

ER=α·(Ed1−Ed2)−β  [Equation 3]

where Ed1 is the average energy of any one section including a plurality of voice frames, Ed2 is the average energy of a neighboring section, and α and β are constants that may be predetermined so that the average energy difference is easy to recognize.
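Equation 3 as code; the default values of α and β below are placeholders, since the patent leaves them as predetermined constants.

```python
def energy_difference(Ed1, Ed2, alpha=1.0, beta=0.0):
    """Equation 3: ER = alpha * (Ed1 - Ed2) - beta.

    alpha and beta are tuning constants; the defaults here are
    placeholders, not values specified by the patent.
    """
    return alpha * (Ed1 - Ed2) - beta

ER = energy_difference(6.0, 4.5)
```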


In the above embodiments, a difference in average energy between the two neighboring sections has been used. However, it will be appreciated that the average energy may be compared by calculating an average energy ratio between two sections according to an embodiment of the present invention. That is, an embodiment of the present invention may include all methods of comparing average energy between two sections to determine whether alcohol has been consumed.



FIG. 7 is a control flowchart showing an alcohol consumption determination method according to an embodiment of the present invention.


The voice input unit 110 may receive a voice from the outside. The voice may be received through a microphone (not shown) included in the alcohol consumption determination terminal 100 or may be transmitted from a remote site. Although not shown in the above embodiment, it will be appreciated that a communication unit may be provided to receive a signal transmitted from a remote site or to send calculated information to the outside (200).


The voice input unit 110 may convert the received voice into voice data and convert the voice data into voice frame data. The voice input unit 110 may generate a plurality of voice frames for the received voice and transmit the generated voice frames to the voiced/unvoiced sound analysis unit 120 (210).


The voiced/unvoiced sound analysis unit 120 may receive the voice frames, extract predetermined features from each of the voice frames, and determine whether the voice frame corresponds to a voiced sound, an unvoiced sound, or background noise according to the extracted features. The voiced/unvoiced sound analysis unit 120 may extract voice frames corresponding to voiced sounds among the plurality of voice frames that are received (220, 230, and 240).


The voice frame energy detection unit 130 detects average energy for each of the voice frames determined as voiced sounds (250).


The section energy detection unit 140 detects average energy for each of the two neighboring sections. The alcohol consumption determination unit 150 may calculate a difference in average energy between the two neighboring sections and may compare the calculated difference with a predetermined threshold to determine whether alcohol has been consumed. The alcohol consumption determination unit 150 may determine that alcohol has been consumed when the difference in average energy between the two neighboring sections is less than the threshold and may determine that alcohol has not been consumed when the difference in average energy between the two neighboring sections is greater than the threshold (260, 270, 280, and 290).
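The decision rule above can be sketched as follows; taking the absolute difference and the sample threshold are assumptions made for this illustration, since the patent states only that a difference below a predetermined (experimentally set) threshold indicates consumption.

```python
def is_drunk(Ed1, Ed2, threshold):
    """Flag alcohol consumption when the average-energy difference
    between two neighboring sections falls below the threshold.
    Using the absolute difference is an assumption of this sketch."""
    return abs(Ed1 - Ed2) < threshold

# Similar section energies are flagged; clearly different ones are not.
flagged = is_drunk(5.0, 5.2, threshold=1.0)
not_flagged = is_drunk(5.0, 9.0, threshold=1.0)
```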


In the above method, whether alcohol has been consumed is determined by calculating a difference in average energy between the two neighboring sections. It will be appreciated that a method of calculating and comparing differences in average energy between four sections or another number of sections may be used instead of the two neighboring sections. In addition, it will be appreciated that all methods of comparing average energy among a plurality of sections (e.g., a method of calculating a relative ratio of average energy between two neighboring sections rather than the difference in average energy between the two sections) are included.


Furthermore, it will be appreciated that the alcohol consumption determination method performed by the above-described alcohol consumption determination terminal 100 may be implemented in a computer-readable recording medium having a program recorded thereon.


Although the present invention has been described with reference to exemplary embodiments thereof, it should be understood that numerous other modifications and variations can be made without departing from the spirit and scope of the present invention by those skilled in the art. It is obvious that the modifications and variations fall within the spirit and scope thereof.

Claims
  • 1. A method for determining whether alcohol is consumed by a person in a vehicle, the method comprising: converting a voice signal received from said person in the vehicle into a plurality of voice frames; extracting predetermined features from a voice frame among the plurality of voice frames; determining, based on the predetermined features, whether said voice frame is from a voiced sound, an unvoiced sound, or background noise; extracting a first average energy for each of the voice frames that is determined as the voiced sound, wherein the first average energy is calculated by summing squares of N samples from energy n-N+1 to energy n and dividing by N; dividing the plurality of voice frames that is determined as the voiced sound into sections with a predetermined length; calculating a second average energy of the first average energy in each of the sections; computing a difference of the second average energy between neighboring sections, wherein the neighboring sections do not overlap one another; determining that alcohol is consumed by said person when the difference is less than a predetermined threshold; and enabling or disabling the vehicle based on the determination.
  • 2. The method of claim 1, wherein the predetermined features comprise root mean square energy (RMSE), or zero-crossing count (ZC) of a low-band voice signal energy area.
  • 3. The method of claim 1, wherein the extracting the first average energy for each of the voice frames comprises extracting the first average energy for each voice frame corresponding to the voiced sound.
  • 4. The method of claim 1, wherein determining that alcohol is consumed by said person comprises: identifying a section and one or more neighboring sections thereof, computing the difference of the second average energy between the identified sections, and determining whether alcohol is consumed by said person according to the computed difference of the second average energy.
  • 5. The method of claim 4, further comprising: determining that alcohol is not consumed by said person when the difference is greater than the predetermined threshold.
  • 6. The method of claim 1 further comprising receiving the voice signal which is transmitted from a remote site.
  • 7. The method of claim 1 wherein computing the difference of the second average energy between neighboring sections is calculated by the following equation, ER=α·(Ed1−Ed2)−β
  • 8. A non-transitory computer-readable recording medium having a computer program recorded thereon for determining whether alcohol is consumed by a person in a vehicle, the method comprising: converting a voice signal received from said person in the vehicle into a plurality of voice frames; extracting predetermined features from a voice frame among the plurality of voice frames; determining, based on the predetermined features, whether said voice frame is from a voiced sound, an unvoiced sound, or background noise; extracting a first average energy for each of the voice frames that is determined as the voiced sound, wherein the first average energy is calculated by summing squares of N samples from energy n-N+1 to energy n and dividing by N; dividing the plurality of voice frames that is determined as the voiced sound into sections with a predetermined length; calculating a second average energy of the first average energy in each of the sections; computing a difference of the second average energy between neighboring sections, wherein the neighboring sections do not overlap one another; determining that alcohol is consumed by said person when the difference is less than a predetermined threshold; and enabling or disabling the vehicle based on the determination.
  • 9. The non-transitory computer-readable recording medium of claim 8, wherein the predetermined features comprise root mean square energy (RMSE), or zero-crossing count (ZC) of a low-band voice signal energy area.
  • 10. The non-transitory computer-readable recording medium of claim 8, wherein the extracting the first average energy for each of the voice frames comprises extracting the first average energy for each voice frame corresponding to the voiced sound.
  • 11. The non-transitory computer-readable recording medium of claim 8, wherein determining that alcohol is consumed by said person comprises: identifying a section and one or more neighboring sections thereof, computing the difference of the second average energy between the identified sections, and determining whether alcohol is consumed by said person according to the computed difference of the second average energy.
  • 12. The non-transitory computer-readable recording medium of claim 8, further comprising: determining that alcohol is not consumed by said person when the difference is greater than the predetermined threshold.
  • 13. The non-transitory computer-readable recording medium of claim 8 further comprising receiving the voice signal which is transmitted from a remote site.
  • 14. The non-transitory computer-readable recording medium of claim 8 wherein computing the difference of the second average energy between neighboring sections is calculated by the following equation, ER=α·(Ed1−Ed2)−β
Priority Claims (1)
Number Date Country Kind
10-2014-0008741 Jan 2014 KR national
PCT Information
Filing Document Filing Date Country Kind
PCT/KR2014/000726 1/24/2014 WO 00
Publishing Document Publishing Date Country Kind
WO2015/111771 7/30/2015 WO A
US Referenced Citations (55)
Number Name Date Kind
5776055 Hayre Jul 1998 A
5913188 Tzirkel-Hancock Jun 1999 A
5983189 Lee Nov 1999 A
6006188 Bogdashevsky Dec 1999 A
6151571 Pertrushin Nov 2000 A
6205420 Takagi Mar 2001 B1
6275806 Pertrushin Aug 2001 B1
6446038 Bayya Sep 2002 B1
6748301 Ryu Jun 2004 B1
7283962 Meyerhoff Oct 2007 B2
7925508 Michaelis Apr 2011 B1
7962342 Coughlan Jun 2011 B1
8478596 Schultz Jul 2013 B2
8775184 Deshmukh Jul 2014 B2
8793124 Hidaka Jul 2014 B2
8938390 Xu Jan 2015 B2
9058816 Lech Jun 2015 B2
9659571 Van Der Schaar May 2017 B2
9672809 Togawa Jun 2017 B2
9715540 Deshmukh Jul 2017 B2
20020010587 Pertrushin Jan 2002 A1
20020194002 Petrushin Dec 2002 A1
20030069728 Tato Apr 2003 A1
20040167774 Shrivastav Aug 2004 A1
20050075864 Kim Apr 2005 A1
20050102135 Goronzy May 2005 A1
20070071206 Gainsboro Mar 2007 A1
20070124135 Schultz May 2007 A1
20070192088 Oh Aug 2007 A1
20070213981 Meyerhoff Sep 2007 A1
20070288236 Kim Dec 2007 A1
20080037837 Noguchi Feb 2008 A1
20090265170 Irie Oct 2009 A1
20100010689 Yasushi et al. Jan 2010 A1
20110035213 Malenovsky Feb 2011 A1
20110105857 Zhang May 2011 A1
20110282666 Washio Nov 2011 A1
20120089396 Patel Apr 2012 A1
20120116186 Shrivastav May 2012 A1
20120262296 Bezar Oct 2012 A1
20130006630 Hayakawa Jan 2013 A1
20130253933 Maruta Sep 2013 A1
20140122063 Gomez Vilda May 2014 A1
20140188006 Alshaer Jul 2014 A1
20140244277 Krishna Rao Aug 2014 A1
20140379348 Sung Dec 2014 A1
20150127343 Mullor et al. May 2015 A1
20150142446 Gopinathan May 2015 A1
20150257681 Shuster et al. Sep 2015 A1
20150310878 Bronakowski Oct 2015 A1
20150351663 Zigel Dec 2015 A1
20160027450 Gao Jan 2016 A1
20160155456 Wang Jun 2016 A1
20160379669 Bae et al. Dec 2016 A1
20170004848 Bae et al. Jan 2017 A1
Foreign Referenced Citations (14)
Number Date Country
1850328 Oct 2007 EP
2003-36087 Feb 2003 JP
2010-015027 Jan 2010 JP
5017534 Sep 2012 JP
10-1997-0038004 Jul 1997 KR
10-0201256 Jun 1999 KR
10-0206205 Jul 1999 KR
1999-0058415 Jul 1999 KR
2004-0033783 Apr 2004 KR
10-0497837 Jun 2005 KR
10-0664271 Jan 2007 KR
10-2009-0083070 Aug 2009 KR
10-2012-0074314 Jul 2012 KR
2012014301 Feb 2012 WO
Non-Patent Literature Citations (19)
Entry
Bocklet, Tobias, Korbinian Riedhammer, and Elmar Nöth. “Drink and Speak: On the automatic classification of alcohol intoxication by acoustic, prosodic and text-based features.” Twelfth Annual Conference of the International Speech Communication Association. 2011.
Geumran Baek et al. “A Study on Voice Sobriety Test Algorithm in a Time-Frequency Domain,” International Journal of Multimedia & Ubiquitous Engineering, vol. 8, No. 5, pp. 395-402, Sep. 2013.
Lee, Won Hui et al. “Valid-frame Distance Deviation of Drunk and non-Drunk Speech,” The Journal of Korea Information and Communications Society (winter) 2014, pp. 876-877, Jan. 2014.
Jung, Chan Joong et al. “A Study on Detecting Decision Parameter about Drinking in Time Domain,” The Journal of Korea Information and Communications Society (winter) 2014, pp. 784-785, Jan. 2014.
Geumran Baek et al. “A Study on Judgment of Intoxication State Using Speech,” Information and Telecommunication Department, Soongsil University, pp. 277-282.
Seong Geon Bae, Dissertation for Ph.D, “A study on Improving Voice Surveillance System Against Drunk Sailing”. Information and Communication Engineering Dept., Soongsil University, Republic of Korea. Dec. 2013. (English Abstract at pp. x-xii).
Seong-Geon Bae et al. “A Study on Personalized Frequency Bandwidth of Speech Signal using Formant to LPC,” The Journal of Korean Institute of Communications and Information Sciences (winter), 2013, pp. 669-670.
Seong-Geon Bae et al. “A Study on Drinking Judgement Method of Speech Signal Using the Formant Deviation in the Linear Prediction Coefficient,” The Journal of Korean Institute of Communications and Information Sciences (winter), 2013, pp. 667-668.
Tae-Hun Kim et al. “Drinking Speech System”, Department of Information Communication, Sang Myung University, pp. 257-262.
Lee, Won-Hee et al. “A Study on Drinking Judgement using Differential Signal in Speech Signal,” The Journal of Korea Information and Communications Society (winter) 2014, pp. 878-879, Jan. 2014.
See-Woo Lee, “A Study on Formant Variation with Drinking and Nondrinking Condition,” Department of Information & Telecommunication Engineering, Sangmyung University, vol. 10, No. 4, pp. 805-810, 2009.
Chan Joong Jung et al. “A Study on Drunken Decision using Spectral Envelope Changes,” Korea Institute of Communications and Information Sciences, Winter Conference, vol. 2013, No. 1 (2013), pp. 674-675.
Chan Joong Jung et al. “Speech Sobriety Test Based on Formant Energy Distribution,” International Journal of Multimedia and Ubiquitous Engineering, vol. 8, No. 6 (2013), pp. 209-216.
Baumeister, Barbara, Christian Heinrich, and Florian Schiel. “The influence of alcoholic intoxication on the fundamental frequency of female and male speakers.” The Journal of the Acoustical Society of America 132.1 (2012): 442-451.
Schuller, Bjorn W., et al. “The INTERSPEECH 2011 Speaker State Challenge.” INTERSPEECH. 2011.
Hollien, Harry, et al. “Effects of ethanol intoxication on speech suprasegmentals.” The Journal of the Acoustical Society of America 110.6 (2001): 3198-3206.
Kim, Jonathan, Hrishikesh Rao, and Mark Clements. “Investigating the use of formant based features for detection of affective dimensions in speech.” Affective Computing and Intelligent Interaction (2011): 369-377.
Broad, David J., and Frantz Clermont. “Formant estimation by linear transformation of the LPC cepstrum.” The Journal of the Acoustical Society of America 86.5 (1989).
Sato, Nobuo, and Yasunari Obuchi. “Emotion recognition using mel-frequency cepstral coefficients.” Information and Media Technologies 2.3 (2007): 835-848.
Related Publications (1)
Number Date Country
20170004848 A1 Jan 2017 US