(1) Field of the Invention
The present invention relates to a sound recognition device which discriminates between a periodic sound, such as engine sound or voice, and an aperiodic sound, such as wind noise, rain sound, or background noise, to determine a frequency signal of the periodic or aperiodic sound.
(2) Description of the Related Art
The following sound recognition technologies have conventionally been employed.
Japanese Unexamined Utility Model Application Publication No. 5-92767 discloses a technology to sense a nearby vehicle present around a user's vehicle by detecting a sound of the nearby vehicle. This technology is referred to as a first conventional technology hereafter. The first conventional technology uses a spectral subtraction method (referred to as the SS method hereafter) to eliminate an engine sound of the user's vehicle and ambient noise. Then, this technology senses the nearby vehicle on the basis of power of a sound signal from which the noises have been eliminated, and detects a direction of the nearby vehicle on the basis of an arrival time difference between the engine sounds received by microphones.
Moreover, Japanese Patent No. 4310371, for example, discloses a technology related to eliminating noises such as wind noise. This technology is referred to as a second conventional technology. The second conventional technology focuses on differences of temporal fluctuations in the phases of sound signals, and accordingly discriminates between a periodic sound, such as voice, and an aperiodic sound, such as wind noise.
The first conventional technology uses the SS method to eliminate the noises. In the SS method, frequency analysis is performed on the sound included in a certain period of time, and then the noise power estimated for each obtained frequency is subtracted to extract the target sound included in that period. For this purpose, the noises need to be estimated beforehand. When the ambient noise has steady power, the noise can be estimated and thus eliminated. However, an unsteady noise, such as wind noise, fluctuates in power over time. The SS method is not robust against such an unsteady noise, and cannot accurately discriminate between the wind noise and the vehicle sound.
The second conventional technology recognizes a periodic sound on the basis of characteristics that the periodic sound, such as an engine sound, is approximately constant in frequency and is constant in phase with respect to the time.
When the vehicle is running at a constant speed and the number of engine revolutions is constant (meaning that the frequency of the engine sound is constant with respect to the time), the periodic sound can be recognized.
However, when the number of engine revolutions fluctuates according to acceleration or deceleration of the vehicle, the recognition accuracy needs to be improved so as to respond to the temporal fluctuations in frequency. In particular, in the case of, for example, an application for detecting a vehicle present in a blind spot of the user's vehicle, it is important to accurately detect, for supporting safer driving, an accelerating vehicle which may cause a serious accident with a high probability.
The present invention is conceived in view of the stated problem, and has an object to provide a sound recognition device which discriminates between a periodic sound, such as engine sound or voice, and an aperiodic sound, such as wind noise, rain sound, or background noises, to determine a frequency signal of the periodic or aperiodic sound, and to provide particularly a sound recognition device which accurately recognizes the periodic sound fluctuating in frequency over time.
In order to achieve the aforementioned object, the sound recognition device according to an aspect of the present invention is a sound recognition device including: a frequency analysis unit which analyzes a frequency signal of a sound signal; a phase curve calculation unit which calculates a phase curve approximating temporal fluctuations of a phase of the frequency signal; an error calculation unit which calculates an error between the phase curve and the phase of the frequency signal; and a sound signal recognition unit which recognizes whether or not the sound signal is a signal of a periodic sound, based on the calculated error, wherein the phase curve is expressed by a quadratic polynomial in which a value of the phase is a variable.
When the frequency fluctuates over time, the phase also fluctuates over time. The temporal phase fluctuations can be represented by a phase curve. Based on the error with respect to the phase curve, the sound signal can be determined as being of a periodic sound or not. As a result, the sound recognition device can discriminate between a periodic sound, such as engine sound or voice, and an aperiodic sound, such as wind noise, rain sound, or background noise, to determine a frequency signal of the periodic or aperiodic sound. In particular, the sound recognition device can accurately recognize the periodic sound fluctuating in frequency over time.
When the frequency fluctuations of the sound signal can be expressed by a linear equation, the phase fluctuations can be expressed by a quadratic polynomial. Thus, the phase curve can be expressed using a curve represented by a quadratic polynomial, so that the phase fluctuations can be expressed with accuracy.
Preferably, the sound recognition device may further include a phase modification unit which modifies a phase which is different from a predetermined number of phases, by adding ±2 π*m (radian), where m is a natural number, to the phase to reduce a difference between the phase and the predetermined number of phases.
With this, the phase which is significantly shifted with respect to the phases at other times can be modified, so that the sound recognition can be performed with accuracy.
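The modification by ±2 π*m can be illustrated with a minimal Python sketch. It assumes that the "predetermined number of phases" are the n phases immediately preceding the phase under modification and that their arithmetic mean serves as the reference; the function names and the default n=5 are hypothetical and chosen only for illustration.

```python
import math

def pull_toward_reference(phase, reference):
    """Add the multiple of 2*pi that brings `phase` closest to `reference`."""
    m = round((reference - phase) / (2 * math.pi))  # m may be 0 (no change)
    return phase + 2 * math.pi * m

def modify_outlier(phases, index, n=5):
    """Shift phases[index] by +/-2*pi*m toward the mean of the n preceding phases.

    Assumption: the reference is the plain mean of up to n earlier phases.
    """
    neighbours = phases[max(0, index - n):index]
    if not neighbours:
        return phases[index]
    mean = sum(neighbours) / len(neighbours)
    return pull_toward_reference(phases[index], mean)

# A phase of 0.1 rad that follows phases near 6 rad is moved to 0.1 + 2*pi.
print(modify_outlier([6.0, 6.1, 6.2, 6.1, 6.0, 0.1], 5))
```

A phase that already lies close to its neighbours is returned unchanged, because the rounded multiple m is then zero.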
Moreover, the sound recognition device may further include a phase modification unit which modifies the phase of the frequency signal by adding ±2 π*m (radian), where m is a natural number, to the phase to include the phase within an angular range, the modification being performed for each of different angular ranges, wherein the phase curve calculation unit calculates the phase curve for each of the angular ranges, the error calculation unit calculates the error for each of the angular ranges, the phase modification unit further selects one of the angular ranges in which the error is a minimum, and the sound signal recognition unit recognizes whether or not the sound signal is the signal of the periodic sound, based on the error in the selected angular range.
With this, the phase which is significantly shifted with respect to the phases at other times can be modified, so that the sound recognition can be performed with accuracy.
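The selection among angular ranges can be sketched in Python as follows. This is an illustration under stated assumptions, not the claimed implementation: the candidate ranges [0, 2π) and [−π, π), the mean absolute error as the error measure, and the use of NumPy's least-squares `polyfit` for the quadratic phase curve are all choices made for this example.

```python
import numpy as np

def wrap_to_range(phases, start):
    """Map each phase into [start, start + 2*pi) by adding a multiple of 2*pi."""
    return start + np.mod(phases - start, 2 * np.pi)

def mean_fit_error(times, phases):
    """Mean absolute error between the phases and a fitted quadratic curve."""
    coeffs = np.polyfit(times, phases, 2)
    return float(np.mean(np.abs(np.polyval(coeffs, times) - phases)))

def select_angular_range(times, phases, starts=(0.0, -np.pi)):
    """Return (start, error) of the candidate range with the smallest error."""
    candidates = [(mean_fit_error(times, wrap_to_range(phases, s)), s)
                  for s in starts]
    error, start = min(candidates)
    return start, error

# Phases drifting through 0 rad look discontinuous in [0, 2*pi)
# but are smooth in [-pi, pi), so the second range is selected.
times = np.arange(6) * 0.016
raw = np.mod(np.array([-0.2, -0.1, 0.0, 0.1, 0.2, 0.3]), 2 * np.pi)
print(select_angular_range(times, raw))
```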
More preferably, the frequency analysis unit may analyze the frequency signal for each of a plurality of sound signals received, respectively, by a plurality of microphones arranged at a distance from each other, and the sound recognition device may further include a direction detection unit which detects a sound source direction of the periodic sound on the basis of an arrival time difference between the sound signals received by the microphones, when the sound signal recognition unit recognizes that the sound signal received by at least one of the microphones is the signal of the periodic sound.
When the periodic sound is recognized, a direction of an approaching vehicle is detected from the arrival time difference between the sound signals received by the microphones. Thus, the direction of the approaching vehicle can be accurately detected without being influenced by the noises.
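As a hedged illustration of how an arrival time difference may be converted into a direction, the following Python sketch assumes a far-field (plane wave) source, two microphones, and a speed of sound of 343 m/s. The present description does not specify this formula; the function name, the parameters, and the model are assumptions made for this example only.

```python
import math

def arrival_angle(delay_s, mic_spacing_m, speed_of_sound=343.0):
    """Angle (radians) from the broadside direction of a two-microphone pair.

    Assumes a plane wave: sin(angle) = c * delay / spacing.
    """
    ratio = speed_of_sound * delay_s / mic_spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # guard against measurement noise
    return math.asin(ratio)

# A delay of half the maximum possible delay corresponds to about 30 degrees.
print(math.degrees(arrival_angle(0.5 / 343.0, 1.0)))
```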
It should be noted that the present invention can be implemented not only as a sound recognition device including the characteristic units as described above, but also as a sound recognition method having, as steps, the characteristic processing units included in the sound recognition device. Also, the present invention can be implemented as a computer program causing a computer to execute the characteristic steps included in the sound recognition method. It should be obvious that such a computer program can be distributed via a recording medium such as a Compact Disc-Read Only Memory (CD-ROM) or via a communication network such as the Internet.
The sound recognition device according to the present invention is capable of discriminating between a periodic sound, such as engine sound or voice, and an aperiodic sound, such as wind noise, rain sound, or background noises, to determine a frequency signal of the periodic or aperiodic sound. In particular, the present invention can provide a sound recognition device which accurately recognizes the periodic sound fluctuating in frequency over time.
The disclosure of Japanese Patent Application No. 2010-025930 filed on Feb. 8, 2010 including specification, drawings and claims is incorporated herein by reference in its entirety.
The disclosure of PCT application No. PCT/W2011/000036 filed on Jan. 7, 2011, including specification, drawings and claims is incorporated herein by reference in its entirety.
These and other objects, advantages and features of the invention will become apparent from the following description thereof taken in conjunction with the accompanying drawings that illustrate a specific embodiment of the invention. In the Drawings:
The present invention focuses on the characteristics of the temporal frequency fluctuations of a periodic sound such as engine sound or voice. The inventors of the present invention analyzed the sound-generating mechanism and actually collected sound data. As a result, the inventors made a new finding that the temporal frequency fluctuations of the periodic sound in a time-frequency domain can be approximated by a piecewise linear function. From this new finding, the inventors further found that the temporal phase fluctuations of a sound whose frequency fluctuations are piecewise-linearly approximated can be modeled by a curve. Thus, the periodic sound can be recognized with accuracy even when the frequency fluctuates over time. It should be noted that the periodic sound in the present invention refers to a sound whose phase is constant or whose phase fluctuations are cyclic.
Here, the term “phase” used in the present invention is defined with reference to
Moreover, (b) of
The result obtained by this process is shown in (c) of
It should be noted that, in sound signal processing such as the Fast Fourier Transform (FFT), it is common to perform the convolution process while the base waveform is being shifted in the direction of the time axis. In that case, the phase can be modified later to be converted into the phase defined in the present invention. The explanation is given as follows, with reference to the drawings.
Moreover, (b) of
The result obtained by this process is shown in (c) of
Next, an explanation is given about temporal fluctuations in the frequency of the engine sound. The frequency of the engine sound fluctuates as the number of engine revolutions fluctuates over time.
In an engine, a predetermined number of cylinders make piston motion to cause revolutions to a powertrain. The engine sound from the vehicle includes: a sound dependent on the engine revolutions; and a fixed vibration sound and an aperiodic sound which are independent of the engine revolutions. In particular, the sound mainly detected from the outside of the vehicle is the periodic sound dependent on the engine revolutions. In the following embodiments, the periodic sound dependent on the engine revolutions is extracted as the engine sound.
It can be seen from dashed-line circles 501, 502, and 503 in
f(t)=At+f0 (Equation 1)
To be more specific, the frequency f at a time t can be linearly approximated using a line segment which increases or decreases from an initial value f0 in proportion to the time t (i.e., a proportionality coefficient A) in a predetermined time period. For example, when the vehicle is accelerating, the number of engine revolutions generally increases almost linearly. In a period B showing the frequency fluctuations of when the vehicle is accelerating, the frequency increases, that is, rises to the right. During the period B, the number of engine revolutions is increasing, meaning that the vehicle is accelerating. Thus, the frequency of this engine sound can be approximated by a piecewise linear function where a slope A is positive. When the vehicle is decelerating, the number of engine revolutions decreases linearly. In a period A showing the frequency fluctuations of when the vehicle is decelerating, the frequency decreases, that is, falls to the right. Thus, the frequency of this engine sound can be approximated by a piecewise linear function where the slope A is negative. When the vehicle is running at a constant speed, the number of engine revolutions remains constant. In a period C showing the frequency fluctuations of when the vehicle is running at the constant speed, the frequency remains approximately constant. Thus, the frequency of this engine sound can be approximated by a piecewise linear function where the slope A is zero.
When the frequency f is expressed by Equation 1 above, the phase ψ at the time t can be expressed as follows.
ψ(t)=2π∫f(t)dt=2π∫(At+f0)dt=πAt^2+2πf0t+ψ0 (Equation 2)
In Equation 2, ψ0 in the third term on the right-hand side indicates an initial phase, and the second term (2 π f0 t) indicates that the phase advances at an angular frequency 2 π f0 in proportion to the time t. Also, the first term (π A t^2) indicates that the phase can be approximated by a quadratic curve.
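The relation between Equation 1 and Equation 2 can be checked numerically. The following Python sketch integrates the linear frequency f(t)=At+f0 with the trapezoidal rule and compares the result with the closed form of Equation 2. The function names and the sample values of A and f0 are hypothetical and serve only as an illustration.

```python
import numpy as np

def phase_closed_form(t, slope, f0, psi0=0.0):
    """Equation 2: psi(t) = pi*A*t^2 + 2*pi*f0*t + psi0."""
    return np.pi * slope * t**2 + 2 * np.pi * f0 * t + psi0

def phase_by_integration(t, slope, f0, psi0=0.0):
    """psi(t) = psi0 + 2*pi * integral of f from t[0] to t (trapezoidal rule)."""
    f = slope * t + f0                       # Equation 1
    steps = 0.5 * (f[1:] + f[:-1]) * np.diff(t)
    return psi0 + 2 * np.pi * np.concatenate(([0.0], np.cumsum(steps)))

t = np.linspace(0.0, 0.1, 101)               # 100 ms, starting at t = 0
a, f0 = 50.0, 100.0                          # illustrative: 50 Hz/s sweep from 100 Hz
print(np.allclose(phase_by_integration(t, a, f0), phase_closed_form(t, a, f0)))
```

Because the trapezoidal rule is exact for a linear integrand, the two results agree to within rounding error.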
As described above, the temporal phase fluctuations of the periodic sound, such as an engine sound, can be modeled by a curve. On the other hand, the temporal phase fluctuations of the aperiodic sound, such as wind noise, are random and show no periodicity, meaning that these fluctuations cannot be approximated by a quadratic curve. The inventors of the present invention noted the difference of the temporal phase fluctuations between the periodic sound and the aperiodic sound. That is, the inventors found out that a frequency signal of the periodic or aperiodic sound can be determined by discriminating between the periodic sound, such as the engine sound, which shows change in the periodicity and the aperiodic sound, such as wind noise, rain sound, or background noise. In particular, an application for detecting a vehicle present in a blind spot, for example, can instantaneously detect an accelerating vehicle.
A relation between the fluctuations in the number of engine revolutions and the phase of the engine sound is analyzed as follows.
In
It should be noted that, when the frequency of a target sound is constant and the frequency of a base waveform is low, the phase gradually delays. However, since the amount of decrease is constant, the phase linearly decreases. On the other hand, when the frequency of the target sound is constant and the frequency of the base waveform is high, the phase gradually advances. However, since the amount of increase is constant, the phase linearly increases.
In
In
The following is a description of the embodiments according to the present invention, with reference to the drawings.
A noise elimination device in the first embodiment is described as follows.
In
The microphone 2400 collects a mixed sound 2401 from the outside. The mixed sound 2401 includes an engine sound of a vehicle and wind noise.
Receiving the mixed sound 2401, the DFT analysis unit 2402 performs the Fourier transform processing on the mixed sound 2401 to obtain a frequency signal of the mixed sound 2401 for each of frequency bands.
It should be noted that, instead of the Fourier transform processing, the DFT analysis unit 2402 may perform the frequency conversion according to a different method of processing, such as the fast Fourier transform processing, the discrete cosine transform processing, or the wavelet transform processing.
The number of frequency bands included in the frequency signal obtained by the DFT analysis unit 2402 is represented as M and a number identifying a frequency band is represented as a symbol j (j=1 to M).
The noise elimination processing unit 1504 includes a phase modification unit 1501(j) (j=1 to M), a sound determination unit 1502(j) (j=1 to M), and a sound extraction unit 1503(j) (j=1 to M). That is to say, the phase modification unit, the sound determination unit, and the sound extraction unit are provided for each of the frequency bands. The phase modification unit 1501(j) (j=1 to M) corresponds to a phase modification unit described in the claims set forth below. The sound extraction unit 1503(j) (j=1 to M) corresponds to a sound signal recognition unit in the claims set forth below.
The phase modification unit 1501(j) (j=1 to M) includes an M number of phase modification units, and a j-th phase modification unit 1501(j) executes processing for a j-th frequency band. In the present specification, the same processing is performed for the other frequency bands by the corresponding units having reference numbers assigned as above.
Supposing that a phase of the frequency signal at a time t is represented as ψ (t) (radian), the phase modification unit 1501(j) (j=1 to M) makes a phase modification to the frequency signal of the frequency band j obtained by the DFT analysis unit 2402. To be more specific, the phase ψ (t) of the frequency signal at the time t is modified to ψ′ (t)=mod 2 π (ψ(t)−2 π f t) (where f is the analysis-target frequency).
The sound determination unit 1502(j) (j=1 to M) calculates a phase curve (an approximate curve) by approximating temporal phase fluctuations using a phase-modified signal at an analysis-target time in a predetermined period, and then calculates an error between the calculated phase curve and the phase at the analysis-target time. Here, a phase distance (i.e., the error between the phase curve and the phase at the analysis-target time) is calculated using ψ′ (t).
Then, finally, on the basis of the error (i.e., the phase distance) calculated by the sound determination unit 1502(j) (j=1 to M), the sound extraction unit 1503(j) (j=1 to M) extracts, as an extracted sound, a frequency signal whose error is equal to or smaller than a threshold.
These processes are performed while the predetermined period is being shifted in the direction of the time axis. Accordingly, a frequency signal 2408 of the extracted sound can be extracted for each time-frequency domain.
The sound determination unit 1502(j) (j=1 to M) includes a frequency signal selection unit 1600(j) (j=1 to M), a phase distance determination unit 1601(j) (j=1 to M), and a phase curve calculation unit 1602(j) (j=1 to M). The phase distance determination unit 1601(j) (j=1 to M) corresponds to an error calculation unit in the claims set forth below.
The frequency signal selection unit 1600(j) (j=1 to M) selects frequency signals which are to be used for calculating a phase curve and phase distances, from among the frequency signals, in the predetermined period, to which the phase modification unit 1501(j) (j=1 to M) has made phase modifications.
The phase curve calculation unit 1602(j) (j=1 to M) calculates, as a quadratic curve, a phase form which fluctuates over time, using the modified phase ψ′ (t) of the frequency signal selected by the frequency signal selection unit 1600(j) (j=1 to M). Following this, the phase distance determination unit 1601(j) (j=1 to M) determines a phase distance between the phase curve calculated by the phase curve calculation unit 1602(j) (j=1 to M) and the modified phase at the analysis-target time.
It should be noted that essential components in the present invention are the DFT analysis unit 2402 and the sound extraction unit 1503(j) shown in
Next, an operation performed by the noise elimination device 1500 configured as described thus far is explained.
In the following, the j-th frequency band is described. The same processing is performed for the other frequency bands. Here, the explanation is given, as an example, about the case where a center frequency and an analysis-target frequency of the frequency band agree with each other.
The analysis-target frequency refers to a frequency f as in ψ′ (t)=mod 2 π (ψ(t)−2 π f t) used in calculating the phase distance. The noise elimination device 1500 determines whether or not a to-be-extracted sound exists in the frequency f.
As another method, the to-be-extracted sound may be determined using a plurality of frequencies included in the frequency band as the analysis-target frequencies. In such a case, whether or not the to-be-extracted sound exists in the frequencies around the center frequency can be determined.
The microphone 2400 collects the mixed sound 2401 from the outside and then outputs the collected mixed sound 2401 to the DFT analysis unit 2402 (step S200).
Receiving the mixed sound 2401, the DFT analysis unit 2402 performs the Fourier transform processing on the mixed sound 2401 to obtain a frequency signal of the mixed sound 2401 for each frequency band j (step S300).
Next, supposing that the phase of the frequency signal at the time t is represented as ψ (t) (radian), the phase modification unit 1501(j) (j=1 to M) makes a phase modification to the phase ψ (t) of the frequency signal obtained by the DFT analysis unit 2402 to convert the phase ψ (t) into the phase ψ′ (t)=mod 2 π (ψ(t)−2 π f t) (where f is the analysis-target frequency) for each frequency band j (step S1700(j)).
The following explains a reason why the phase is used in the present invention and also describes an example of a phase modification method.
In
Then, the frequency signal is obtained for each of the times while the time shift is being executed as shown by t1, t2, t3, and so on in (a) of
In
In
Suppose that a real part of the frequency signal is represented as x (t) and that an imaginary part of the frequency signal is represented as y (t). In this case, the phase ψ (t) and the magnitude (power) P (t) are expressed as follows.
ψ(t)=mod 2π(arctan(y(t)/x(t))) (Equation 3)
P(t)=√(x(t)^2+y(t)^2) (Equation 4)
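Equations 3 and 4 can be illustrated with a short Python sketch. As an assumption made for this example, the two-argument `math.atan2(y, x)` is used in place of arctan(y/x) so that the quadrant is resolved over the full 0 to 2 π range and a zero real part does not cause a division by zero. The function name is hypothetical.

```python
import math

def phase_and_magnitude(x, y):
    """Phase in [0, 2*pi) and magnitude of a frequency signal x + j*y."""
    phase = math.atan2(y, x) % (2 * math.pi)   # Equation 3, quadrant-aware
    magnitude = math.hypot(x, y)               # Equation 4
    return phase, magnitude

print(phase_and_magnitude(-1.0, 0.0))   # phase of pi, magnitude of 1
print(phase_and_magnitude(0.0, -2.0))   # phase of 3*pi/2, magnitude of 2
```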
In the above equations, “t” represents a time corresponding to the frequency. Here, a vehicle engine sound of when a noise such as a wind noise is present is explained, with reference to
In
With this being the situation, the engine sound is extracted using the temporal phase fluctuations in the present invention. Firstly, phase characteristics of the engine sound are explained.
In an engine, a predetermined number of cylinders make piston motion to cause revolutions to a powertrain. The engine sound from the vehicle includes: a sound dependent on the engine revolutions; and a fixed vibration sound or an aperiodic sound which is independent of the engine revolutions. In particular, the sound mainly detected from the outside of the vehicle is the periodic sound dependent on the engine revolutions. In the present invention, this periodic sound dependent on the engine revolutions is extracted as the engine sound.
It can be seen from the dashed-line circles 501, 502, and 503 in
When the frequency f is expressed by Equation 1 above, the phase ψ at the time t can be expressed by Equation 2 above.
Next, the phase modification process to ease the approximation performed on the temporal phase fluctuations is explained.
The phase modification is made to convert the phase ψ (t) of the frequency signal shown in (c) of
Firstly, the phase modification unit 1501(j) determines a reference time. Here, (a) of
Next, the phase modification unit 1501(j) determines a plurality of times of the frequency signals to which phase modifications are to be made. In this example, five times (t1, t2, t3, t4, and t5) indicated by open circles in (a) of
Here, note that the phase of the frequency signal at the reference time t0 is expressed as follows.
ψ(t0)=mod 2π(arctan(y(t0)/x(t0))) (Equation 5)
Also note that the phases of the to-be-modified frequency signals at the five times are expressed as follows.
ψ(ti)=mod 2π(arctan(y(ti)/x(ti))) (i=1,2,3,4,5) (Equation 6)
Each of the phases before the modifications is indicated by X in (a) of
P(ti)=√(x(ti)^2+y(ti)^2) (i=1,2,3,4,5) (Equation 7)
ψ′(ti) (i=0,1,2,3,4,5)
In (b) of
Δψ=2πf(t2−t0) (Equation 8)
Thus, in order to modify this phase difference caused by a time difference between the phases at the times t0 and t2 in (a) of
ψ′(t0)=ψ(t0) (Equation 9)
ψ′(ti)=mod 2π(ψ(ti)−2πf(ti−t0)) (i=1,2,3,4,5) (Equation 10)
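The modification of Equations 9 and 10 can be sketched in Python as follows. The function name and the sample values are hypothetical. The example shows that, for a tone whose frequency equals the analysis-target frequency f, the modified phases at all times become equal to the phase at the reference time t0.

```python
import math

def modify_relative_to_reference(times, phases, f, ref_index=0):
    """Apply psi'(ti) = mod 2*pi (psi(ti) - 2*pi*f*(ti - t0)) to every sample.

    For ti == t0 the subtracted term is zero, which reproduces Equation 9
    as long as psi(t0) already lies in [0, 2*pi).
    """
    two_pi = 2 * math.pi
    t0 = times[ref_index]
    return [(psi - two_pi * f * (t - t0)) % two_pi
            for t, psi in zip(times, phases)]

f = 100.0                                   # analysis-target frequency in Hz
times = [0.010, 0.012, 0.0145, 0.019]       # t0 followed by later times
raw = [(1.0 + 2 * math.pi * f * (t - times[0])) % (2 * math.pi) for t in times]
print(modify_relative_to_reference(times, raw, f))  # all values close to 1.0
```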
The phases of the frequency signals obtained as a result of the phase modifications are indicated by X in (b) in
Returning to
Firstly, the frequency signal selection unit 1600(j) selects the frequency signals which are to be used by the phase curve calculation unit 1602(j) for calculating the phase curve, from among the frequency signals, in the predetermined period, to which the phase modification unit 1501(j) has made the phase modifications (step S1800(j)). In this example, the analysis-target time is t0, and the phase curve is calculated from the phases of the frequency signals at the times t1 to t5 with respect to the phase at the time t0. Here, the number of frequency signals (six signals in total at the times t0 to t5) used for calculating the phase curve is equal to or greater than a predetermined value. This is because it would be difficult to determine the regularity of the temporal phase fluctuations when the number of frequency signals selected for the phase curve calculation is small. The time length of the predetermined period may be determined on the basis of characteristics of the temporal phase fluctuations of the extracted sound.
Next, the phase curve calculation unit 1602(j) calculates the phase curve (step S1801(j)). Note that the phase curve is calculated via approximation according to, for example, a quadratic polynomial expressed by Equation 11 as follows.
Ψ(t)=A2 t^2+A1 t+A0 (Equation 11)
Moreover, coefficients in the above equations are expressed as follows.
Returning to
E0=|Ψ(t0)−ψ′(t0)| (Equation 20)
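The curve calculation of Equation 11 and the error of Equation 20 can be sketched in Python. As an assumption for this example, the coefficients A2, A1, and A0 are obtained with NumPy's least-squares `polyfit` rather than with closed-form expressions; the function names and the sample phases are hypothetical.

```python
import numpy as np

def fit_phase_curve(times, phases):
    """Least-squares coefficients (A2, A1, A0) of the quadratic phase curve."""
    return np.polyfit(times, phases, 2)

def phase_error(coeffs, t, phase):
    """Absolute difference between the phase curve at time t and the phase."""
    return abs(np.polyval(coeffs, t) - phase)

# Six modified phases (times t0 to t5) that follow an exact quadratic.
times = np.arange(6) * 0.016
phases = 400.0 * times**2 + 20.0 * times + 0.3
coeffs = fit_phase_curve(times, phases)
print(coeffs)                                   # close to [400, 20, 0.3]
print(phase_error(coeffs, times[0], phases[0])) # close to zero
```

A phase that deviates from the curve, as would be expected for an aperiodic sound, yields a correspondingly larger error.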
It should be noted that the analysis-target point may be excluded in calculating the form of the phase, and that a phase difference between the calculated form and the analysis-target point may be calculated. With this method, when a noise shifted significantly from the calculated form is included in the analysis-target point, the form can be approximated more accurately.
It should be noted that, in the present example, the phase form is calculated from the phases at the times t1 to t5 with respect to the phase at the analysis-target time t0. For example, when the time t2 is an analysis target time (in other words, the time t2 is set as a time t0′), a phase curve may be newly calculated from phases at times t1′, t2′, t3′, t4′, and t5′ to calculate an error. Alternatively, the phase curve which has been already calculated from the phases at the times t0 to t5 may be used for calculating the error. To be more specific, the error calculated using the already-calculated phase curve is expressed as follows.
Ei=|Ψ(ti)−ψ′(ti)| (Equation 21)
With this method, the number of times to calculate the phase curve is reduced, so that the amount of calculation can be accordingly reduced. Moreover, a predetermined period may be set as an analysis target, and it may be determined, on the basis of an average of errors, whether all of the frequency signals included in the analysis-target period have errors. For example, the average of the errors may be expressed as follows.
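As a hedged illustration of a determination based on averaged errors, the following Python sketch assumes that the average is a plain arithmetic mean of the errors Ei of Equation 21 over the analysis-target period, and that the period is judged to contain the periodic sound when this mean does not exceed a threshold. The threshold value of 0.5 rad, the function names, and the sample phases are hypothetical.

```python
import numpy as np

def mean_phase_error(times, phases):
    """Arithmetic mean of |curve(ti) - phase(ti)| for a fitted quadratic curve."""
    coeffs = np.polyfit(times, phases, 2)
    return float(np.mean(np.abs(np.polyval(coeffs, times) - phases)))

def is_periodic(times, phases, threshold=0.5):
    """Assumed decision rule: periodic when the mean error is within threshold."""
    return mean_phase_error(times, phases) <= threshold

times = np.arange(6) * 0.016
smooth = 400.0 * times**2 + 20.0 * times + 0.3          # engine-like phases
erratic = np.array([0.1, 5.9, 2.0, 4.5, 0.7, 3.3])      # wind-like phases
print(is_periodic(times, smooth), is_periodic(times, erratic))
```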
It should be noted that the analysis-target period may be variable depending on circumstances. For example, the analysis-target period may be set shorter around an intersection where vehicles are likely to suddenly accelerate or decelerate, and may be longer where acceleration or deceleration is relatively unlikely to happen.
Returning to
In (a) of
In
In (b) of
In (c) of
In (d) of
In (e) of
As described thus far, the wind noise and the engine sound can be discriminated on the basis of the calculated curve and the error with respect to the curve.
Analysis conditions are that: frequency analyses are performed at 256 points (32 ms) of each of the sounds sampled at 8 kHz; and a phase curve calculation is performed using 768 points as a period (96 ms). Then, the average and distribution of the errors with respect to the phase curve are calculated. As shown in
In
Note that the phase modification unit 1501(j) may further perform the following process during the phase modification. When the following phase modification process is further performed, processes including calculating a phase curve and calculating errors with respect to the phase curve are also performed. Thus, the phase modification unit 1501(j) performs the following process, referring to as necessary the calculation results given by the sound determination unit 1502(j).
In (a) of
For example, the phase may be modified using an N number of phases which are present before, after, or before and after the present phase. Suppose, as an example, that an average of the phases at the times t1 to t5 (N=5) shown in (b) of
Next, the phase ψ (t6) at the time t6 is modified to a value such that an error between the phase at the time t6 and the average phase ψ becomes smaller. In the case shown in (b) of
In
It should be noted that the phase modification method is not limited to the method described thus far. For example, the phase curve may be firstly calculated, and then the phase modification using ±2 π may be performed on each point at which an error with respect to the curve is significant. Alternatively, the range of possible angles for the phase may be modified. The explanation is presented as follows, with reference to the drawing.
In
As described thus far, the present embodiment can discriminate between the periodic sound, such as engine sound or voice, and the aperiodic sound, such as wind noise, rain sound, or background noise, for each time-frequency domain, so as to determine a frequency signal of the periodic or aperiodic sound. The present embodiment can accurately recognize especially the periodic sound, such as the engine sound, which fluctuates in frequency over time in the time-frequency domain. In particular, an application for detecting a vehicle present in a blind spot can accurately detect an accelerating vehicle which may cause a serious accident with a high probability.
The following is a description of a vehicle detection device in the second embodiment. The vehicle detection device in the second embodiment determines a frequency signal of an engine sound (i.e., a to-be-extracted sound) from each of mixed sounds received by a plurality of microphones, calculates an arrival direction of an approaching vehicle from a sound arrival time difference, and informs a driver about the direction and presence of the approaching vehicle.
In
The vehicle detection processing unit 4101 includes a phase modification unit 4102(j) (j=1 to M), a sound determination unit 4103(j) (j=1 to M), and a sound extraction unit 4104(j) (j=1 to M).
In
The microphone 4107(1) shown in
The DFT analysis unit 1100 performs the discrete Fourier transform processing on the mixed sound 2401(1) and the mixed sound 2401(2) to obtain the respective frequency signals of the mixed sound 2401(1) and the mixed sound 2401(2). In this example, the time window width for the DFT is 256 points (32 ms). Hereinafter, the number of frequency bands obtained by the DFT analysis unit 1100 is represented as M and a number specifying a frequency band is represented as a symbol j (j=1 to M). In this example, a frequency band from 10 Hz to 500 Hz where an engine sound of a vehicle exists is divided into 10-Hz bands (M=50) to obtain the frequency signal.
Supposing that a phase of a frequency signal at a time t is ψ (t) (radian), the phase modification unit 4102(j) (j=1 to M) modifies the phase ψ (t) of the frequency signal of the frequency band j (j=1 to M) obtained by the DFT analysis unit 1100 to a phase ψ″ (t)=mod 2 π (ψ (t)−2 π f′ t) (where f′ is a frequency of the frequency band). In the present example, the phase ψ (t) is modified using the frequency f′ of the frequency band where the frequency signal is obtained, instead of using the analysis-target frequency.
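As a minimal sketch of the phase modification ψ″ (t)=mod 2 π (ψ (t)−2 π f′ t), the following Python function may be considered. The function name modify_phase and its parameter names are hypothetical and are introduced here for illustration only.

```python
import math

TWO_PI = 2.0 * math.pi


def modify_phase(psi, t, band_freq_hz):
    """Return psi''(t) = mod 2*pi (psi(t) - 2*pi*f'*t).

    psi          : phase of the frequency signal at time t, in radians
    t            : time in seconds
    band_freq_hz : frequency f' of the frequency band
    """
    return (psi - TWO_PI * band_freq_hz * t) % TWO_PI
```

For a sound whose frequency equals f′, the phase ψ (t) advances by 2 π f′ t, so the modified phase stays constant over time. A frequency that drifts from f′, as with an accelerating engine, leaves a modified phase that changes smoothly, which is what the phase curve is later fitted to.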
The sound determination unit 4103(j) (j=1 to M) calculates the phase curve from the phase-modified frequency signal at an analysis-target time in a predetermined period, and then determines a to-be-extracted sound on the basis of the calculated phase curve. Here, the number of frequency signals used for calculating a phase distance is equal to or greater than a first threshold. In the present example, the predetermined period is 96 ms. Also, the phase distance is calculated using ψ″ (t). The sound determination unit 4103(j) (j=1 to M) performs the same processing as the processing performed by the sound determination unit 1502(j) (j=1 to M) in the first embodiment. Therefore, the detailed description is not repeated here.
The sound determination unit 4103(j) (j=1 to M) includes a phase distance determination unit 4200(j) (j=1 to M), a phase curve calculation unit 4201(j) (j=1 to M), and a frequency signal selection unit 4202(j) (j=1 to M).
The frequency signal selection unit 4202(j) (j=1 to M) selects frequency signals which are to be used for calculating a phase curve and phase distances, from among the frequency signals, in the predetermined period, to which the phase modification unit 4102(j) (j=1 to M) has made phase modifications. The frequency signal selection unit 4202(j) (j=1 to M) performs the same processing as the processing performed by the frequency signal selection unit 1600(j) (j=1 to M) in the first embodiment. Therefore, the detailed description is not repeated here.
The phase curve calculation unit 4201(j) (j=1 to M) calculates, as a curve, a phase form which fluctuates over time, using the modified phase ψ″ (t) of the frequency signal. The phase curve calculation unit 4201(j) (j=1 to M) performs the same processing as the processing performed by the phase curve calculation unit 1602(j) (j=1 to M) in the first embodiment. Therefore, the detailed description is not repeated here.
The phase distance determination unit 4200(j) (j=1 to M) determines whether a phase distance with respect to the phase curve calculated by the phase curve calculation unit 4201(j) (j=1 to M) is equal to or smaller than a second threshold. To be more specific, the phase curve calculation is performed using 768 points as a period (96 ms), and the phase distance is calculated. The phase distance determination unit 4200(j) (j=1 to M) employs the same methods for calculating the phase curve and phase distance as those employed by the phase distance determination unit 1601(j) (j=1 to M) in the first embodiment. Therefore, the detailed description is not repeated here.
Next, the sound extraction unit 4104(j) (j=1 to M) extracts the engine sound on the basis of the phase distance determined by the sound determination unit 4103(j) (j=1 to M). To be more specific, the threshold of error is set at 20 degrees, and then a sound having an error equal to or smaller than the threshold is extracted as the engine sound. The sound extraction unit 4104(j) (j=1 to M) performs the same processing as the sound extraction unit 1503(j) (j=1 to M) in the first embodiment. Therefore, the detailed description is not repeated here. It should be noted that, when the engine sound is extracted, the sound extraction unit 4104(j) (j=1 to M) also outputs a sound detection flag 4105.
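The threshold decision described above may be sketched as follows in Python, purely for illustration. The definition of the phase distance belongs to the first embodiment and is not reproduced in this passage, so the sketch assumes that the error is the wrapped angular difference between the modified phase and the value of the phase curve at the same time. The names wrapped_difference and is_engine_sound are hypothetical.

```python
import math

ERROR_THRESHOLD_RAD = math.radians(20.0)  # threshold of error from the text


def wrapped_difference(a, b):
    """Smallest absolute angular difference between a and b, in radians."""
    d = (a - b) % (2.0 * math.pi)
    return min(d, 2.0 * math.pi - d)


def is_engine_sound(modified_phase, curve_phase,
                    threshold=ERROR_THRESHOLD_RAD):
    """True when the error is equal to or smaller than the threshold.

    Assumption: the error is the wrapped difference between the modified
    phase and the phase curve value; the embodiment may define it otherwise.
    """
    return wrapped_difference(modified_phase, curve_phase) <= threshold
```

Wrapping the difference keeps two phases on either side of the 0 and 2 π boundary from being treated as far apart.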
Returning to
Suppose that a spacing between the microphone 4107 (1) and the microphone 4107 (2) is d (m). Also suppose that an engine sound is detected from an angle θ (radian) with respect to the driver's vehicle. In this case, the angle θ (radian) can be expressed by Equation 23 as follows, where a sound arrival time difference is represented as Δt (s) and a sound speed is represented as c (m/s).
θ = sin⁻¹ (Δt · c / d) (Equation 23)
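As an illustrative sketch, Equation 23 may be evaluated as follows in Python. The function name arrival_angle is hypothetical, and the default sound speed of 340 m/s is an assumption that the description does not specify. The clamping of the argument to the range from −1 to 1 is also an addition for numerical safety, since measurement error can make Δt · c slightly exceed d.

```python
import math


def arrival_angle(delta_t_s, mic_spacing_m, sound_speed_m_s=340.0):
    """Angle theta (radian) from Equation 23: theta = asin(delta_t * c / d).

    delta_t_s       : sound arrival time difference between the microphones (s)
    mic_spacing_m   : spacing d between the two microphones (m)
    sound_speed_m_s : sound speed c (m/s); 340.0 is an assumed default
    """
    ratio = delta_t_s * sound_speed_m_s / mic_spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # keep asin within its domain
    return math.asin(ratio)
```

An arrival time difference of zero yields an angle of zero, and the sign of Δt distinguishes the two sides of the microphone pair.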
Finally, the presentation unit 4106 connected to the vehicle detection device 4100 informs the driver about the direction of the nearby vehicle detected by the direction detection unit 4108. For example, the presentation unit 4106 may show, on a display, the direction from which the nearby vehicle is approaching.
The vehicle detection device 4100 and the presentation unit 4106 perform these processes while the predetermined period is being shifted in the direction of the time axis.
Next, an operation performed by the vehicle detection device 4100 configured as described thus far is explained.
In the following, the j-th frequency band (where the frequency is f′) is described.
Firstly, each of the microphone 4107 (1) and the microphone 4107 (2) receives the mixed sound 2401 from the outside, and sends the received mixed sound to the DFT analysis unit 1100 (step S201).
Receiving the mixed sound 2401 (1) and the mixed sound 2401 (2), the DFT analysis unit 1100 performs the discrete Fourier transform processing on the mixed sound 2401 (1) and the mixed sound 2401 (2) to obtain the respective frequency signals of the mixed sound 2401 (1) and the mixed sound 2401 (2) (step S300).
Supposing that a phase of a frequency signal at a time t is ψ (t) (radian), the phase modification unit 4102(j) modifies the phase ψ (t) of the frequency signal of the frequency band j (the frequency f′) obtained by the DFT analysis unit 1100 to a phase ψ″ (t)=mod 2 π (ψ (t)−2 π f′ t) (where f′ is the frequency of the frequency band) (step S4300(j)).
Next, the sound determination unit 4103(j) (the phase distance determination unit 4200(j)) determines the analysis-target frequency f, for each of the mixed sound 2401 (1) and the mixed sound 2401 (2), using the phase ψ″ (t) of the phase-modified frequency signals in the predetermined period. Here, the number of phase-modified signals is equal to or greater than the first threshold. Also, the first threshold is represented by a value which corresponds to 80% of the frequency signals at the times in the predetermined period. Then, the sound determination unit 4103(j) (the phase distance determination unit 4200(j)) calculates the phase distance using the determined analysis-target frequency f (step S4301(j)).
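The condition on the first threshold may be sketched as follows in Python, for illustration only. The function name enough_signals and its parameters are hypothetical. The sketch assumes that the count of frequency signals at the times in the predetermined period is known and that the selected signals are simply counted against 80% of it.

```python
FIRST_THRESHOLD_RATIO = 0.8  # 80% of the frequency signals in the period


def enough_signals(selected_count, signals_in_period,
                   ratio=FIRST_THRESHOLD_RATIO):
    """True when the number of phase-modified frequency signals selected for
    the calculation is equal to or greater than the first threshold."""
    return selected_count >= ratio * signals_in_period
```

When this condition is not met, too few frequency signals remain in the predetermined period for the phase curve and the phase distance to be calculated reliably.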
The process performed in step S4301(j) is described in detail with reference to
Following this, the phase curve calculation unit 4201(j) calculates the phase curve (step S1801(j)).
Next, the phase distance determination unit 4200(j) calculates the phase distance between the form calculated by the phase curve calculation unit 4201(j) and the modified phase at the analysis-target time (step S1802(j)).
Returning to
The direction detection unit 4108 identifies the direction in which the nearby vehicle is present, for the time-frequency domain of the engine sound extracted by the sound extraction unit 4104(j), and the presentation unit 4106 informs the driver about the direction of the nearby vehicle detected by the direction detection unit 4108 (step S4304).
As described thus far, when the engine sound is extracted, the vehicle detection device in the second embodiment identifies the direction of the vehicle on the basis of the arrival time difference of the engine sound. Thus, the direction of the vehicle can be accurately detected without any influence from the noises.
Although the noise elimination device and the vehicle detection device in the embodiments according to the present invention have been described, the present invention is not limited to these embodiments.
In the above embodiments, the engine sound is extracted as an example. Note that the extraction target in the present invention is not limited to the engine sound. The present invention is applicable in any case as long as the sound is periodic like a human voice, an animal sound, or a motor sound.
In the above embodiments, the sound extraction unit determines, for each frequency signal, whether the signal represents a periodic sound or a noise. However, the sound extraction unit may perform this determination for each predetermined period, and thus may determine whether the frequency signals included in the predetermined period represent a periodic sound or a noise. For example, referencing to
Also, to be more specific, each of the above-described devices may be a computer system configured with a microprocessor, a ROM, a RAM, a hard disk drive, a display unit, a keyboard, a mouse, and so forth. The RAM or the hard disk drive stores a computer program. The microprocessor operates according to the computer program, so that functions of the components included in the computer system are carried out. Here, note that the computer program includes a plurality of instruction codes indicating instructions to be given to the computer so as to achieve a specific function.
Moreover, some or all of the components included in each of the above-described devices may be realized as a single system Large Scale Integration (LSI). The system LSI is a super multifunctional LSI manufactured by integrating a plurality of components onto a single chip. To be more specific, the system LSI is a computer system configured with a microprocessor, a ROM, a RAM, and so forth. The RAM stores a computer program. The microprocessor operates according to the computer program, so that a function of the system LSI is carried out.
Furthermore, some or all of the components included in each of the above-described devices may be implemented as an IC card or a standalone module that can be inserted into and removed from the corresponding device. The IC card or the module is a computer system configured with a microprocessor, a ROM, a RAM, and so forth. The IC card or the module may include the aforementioned super multifunctional LSI. The microprocessor operates according to the computer program, so that a function of the IC card or the module is carried out. The IC card or the module may be tamper resistant.
Also, the present invention may be the methods described above. Each of the methods may be a computer program implemented by a computer, or may be a digital signal of the computer program.
Moreover, the present invention may be the aforementioned computer program or digital signal recorded on a computer-readable recording medium, such as a flexible disk, a hard disk, a CD-ROM, an MO, a DVD, a DVD-ROM, a DVD-RAM, a Blu-ray Disc (BD) (registered trademark), or a semiconductor memory. Also, the present invention may be the digital signal recorded on such a recording medium.
Furthermore, the present invention may be the aforementioned computer program or digital signal transmitted via a telecommunication line, a wireless or wired communication line, a network represented by the Internet, or data broadcasting.
Also, the present invention may be a computer system including a microprocessor and a memory. The memory may store the aforementioned computer program and the microprocessor may operate according to the computer program.
Moreover, by transferring the recording medium having the aforementioned program or digital signal recorded thereon or by transferring the aforementioned program or digital signal via the aforementioned network or the like, the present invention may be implemented by a different independent computer system.
Furthermore, the above embodiments and variations may be combined.
Although only some exemplary embodiments of this invention have been described in detail above, those skilled in the art will readily appreciate that many modifications are possible in the exemplary embodiments without materially departing from the novel teachings and advantages of this invention. Accordingly, all such modifications are intended to be included within the scope of this invention.
The present invention is applicable to a sound recognition device capable of discriminating, for each time-frequency domain, between a periodic sound, such as engine sound, and an aperiodic sound, such as wind noise, rain sound, or background noise, to determine a frequency signal of the periodic or aperiodic sound, and also applicable to a vehicle detection device capable of detecting a direction of a vehicle on the basis of a recognized periodic sound.
Number | Date | Country | Kind |
---|---|---|---|
2010-025930 | Feb 2010 | JP | national |
This is a continuation application of PCT application No. PCT/JP2011/000036 filed on Jan. 7, 2011, designating the United States of America.
 | Number | Date | Country |
---|---|---|---|
Parent | PCT/JP2011/000036 | Jan 2011 | US |
Child | 13282902 | | US |