The present invention relates to a technique for detecting a pitch (or fundamental frequency) of an audio or sound signal.
Heretofore, there have been proposed various techniques for detecting a pitch of an audio or sound signal. Japanese Patent Application Laid-open Publication No. SHO-61-26089 discloses an example technique, where a pitch is detected from a sound signal having passed through a low-pass filter and where the cutoff frequency of the low-pass filter is variably controlled in accordance with a result of the pitch detection. The pitch detection technique disclosed in the No. SHO-61-26089 publication can advantageously detect a pitch of a sound signal with a high accuracy because the intensities of peaks of the sound signal other than the peak corresponding to the pitch are controlled.
However, with the technique disclosed in the No. SHO-61-26089 publication, where the cutoff frequency of the low-pass filter is changed instantaneously to a frequency corresponding to the detected pitch of the sound signal at a predetermined time point after the pitch detection, pitches detected before and after the change of the cutoff frequency tend to become unstable.
In view of the foregoing, it is an object of the present invention to detect a pitch of a sound signal with a high accuracy and in a stable manner.
In order to accomplish the above-mentioned object, the present invention provides an improved pitch detection apparatus, which comprises: a band-pass filter which suppresses frequency components of a sound signal that are lower than a low-side cutoff frequency and that are higher than a high-side cutoff frequency; a pitch detection section which detects a pitch of the sound signal having been processed by the band-pass filter; a target setting section which, in accordance with the pitch detected by the pitch detection section, variably sets a low-side target value lower than the detected pitch and a high-side target value higher than the detected pitch; and a filter control section which not only causes the low-side cutoff frequency to approach the low-side target value over time (i.e., with the passage of time) but also causes the high-side cutoff frequency to approach the high-side target value over time.
According to the present invention, the low-side target value and the high-side target value are variably set in accordance with a detected pitch of a sound signal. Once the low-side target value and the high-side target value are changed, the low-side cutoff frequency and the high-side cutoff frequency, which determine the pass band of the band-pass filter, are caused to approach the changed low-side target value and the changed high-side target value, respectively, progressively over time, rather than being switched instantaneously to the changed low-side and high-side target values. In this way, the pass band of the band-pass filter can be smoothly (i.e., not rapidly) variably controlled in response to pitch change of the sound signal that is an object of pitch detection.
According to another aspect of the present invention, there is provided an improved pitch detection apparatus, which comprises: a holding section which time-serially holds a sound signal; a band-pass filter which suppresses frequency components of the sound signal that are outside a pass band; a pitch detection section which detects a pitch of the sound signal, having been processed by the band-pass filter, for each of predetermined time frames; a control section which variably sets the pass band of the band-pass filter in accordance with the pitch detected by the pitch detection section; and an output control section which normally supplies sound signals of the individual time frames to the band-pass filter with a first cyclic period. Once a state of the pitch detection by the pitch detection section changes, in a given one of the time frames, from a state where no pitch could be detected to another state where a pitch could be detected, the output control section supplies, in time-serial order, sound signals of the given time frame and a plurality of previous time frames, preceding the given time frame, from the holding section to the band-pass filter with a second cyclic period shorter than the first cyclic period, so that a pitch detection operation is performed again on the sound signals of the plurality of time frames by the pitch detection section.
According to the other aspect, once the state of the pitch detection by the pitch detection section changes, in a given time frame, from the state where no pitch could be detected (i.e., non-pitch-detectable state) to the other state where a pitch could be detected (i.e., pitch-detectable state), the pitch detection operation (i.e., band-pass filtering operation) is performed again on the sound signals of the plurality of previous time frames, for which no pitch could be detected, using a pass band optimally set in correspondence with the given time frame for which a pitch could be detected. Thus, the present invention can accurately and stably detect a pitch of the sound signal in an in-between (or state change) period when the non-pitch-detectable state changes to the pitch-detectable state.
The present invention may be constructed and implemented not only as the apparatus invention as discussed above but also as a method invention. Also, the present invention may be arranged and implemented as a software program for execution by a processor such as a computer or DSP, as well as a storage medium storing such a software program. In this case, the program may be provided to a user in the storage medium and then installed into a computer of the user, or delivered from a server apparatus to a computer of a client via a communication network and then installed into the client's computer. Further, the processor used in the present invention may comprise a dedicated processor with dedicated logic built in hardware, not to mention a computer or other general-purpose type processor capable of running a desired software program.
The following will describe embodiments of the present invention, but it should be appreciated that the present invention is not limited to the described embodiments and various modifications of the invention are possible without departing from the basic principles. The scope of the present invention is therefore to be determined solely by the appended claims.
For better understanding of the object and other features of the present invention, its preferred embodiments will be described hereinbelow in greater detail with reference to the accompanying drawings, in which:
As shown in
The arithmetic processing device 12 functions as a plurality of components, such as a signal segmentation section 22, band-pass filter 24, pitch detection section 26 and control section 30, by executing the programs stored in the storage device 14. There may be employed an alternative construction where an electronic circuit (DSP) dedicated to processing of a sound signal A0 implements the individual components of the arithmetic processing device 12, or where the individual components of the arithmetic processing device 12 are provided distributively on a plurality of integrated circuits.
The signal segmentation section 22 of
The band-pass filter 24 generates a sound signal A1 by attenuating frequency components, outside its pass band B, of the sound signal A0 having been subjected to the processing by the signal segmentation section 22. The pass band B is a frequency band between a low-side cutoff frequency FC_L and a high-side cutoff frequency FC_H. Namely, the band-pass filter 24 suppresses frequency components of the sound signal A0 which are lower than the low-side cutoff frequency FC_L or higher than the high-side cutoff frequency FC_H. The low-side cutoff frequency FC_L and the high-side cutoff frequency FC_H are variably set under control of the control section 30, as will be later described in detail. The band-pass filter 24 may comprise a high-pass filter having the low-side cutoff frequency FC_L as its cutoff frequency, and a low-pass filter having the high-side cutoff frequency FC_H as its cutoff frequency. Note that there may be employed an alternative construction where the signal segmentation section 22 segments the sound signal A1, having been processed by the band-pass filter 24, into unit segments U.
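As a rough illustration of the high-pass/low-pass cascade mentioned above, the following Python sketch builds a band-pass filter from a high-pass stage at FC_L and a low-pass stage at FC_H. The first-order (one-pole) sections, the class names and the `set_band` method are assumptions made for this sketch only; the description does not specify a filter order or design.

```python
import math


class OnePoleLowPass:
    """First-order low-pass: y[n] = y[n-1] + a * (x[n] - y[n-1])."""

    def __init__(self, cutoff_hz, sample_rate):
        self.sample_rate = sample_rate
        self.y = 0.0
        self.set_cutoff(cutoff_hz)

    def set_cutoff(self, cutoff_hz):
        # Common one-pole smoothing coefficient for the given cutoff.
        self.a = 1.0 - math.exp(-2.0 * math.pi * cutoff_hz / self.sample_rate)

    def process(self, x):
        self.y += self.a * (x - self.y)
        return self.y


class BandPass:
    """High-pass at fc_l followed by low-pass at fc_h (hypothetical design)."""

    def __init__(self, fc_l, fc_h, sample_rate):
        self._low_tracker = OnePoleLowPass(fc_l, sample_rate)
        self._low_pass = OnePoleLowPass(fc_h, sample_rate)

    def set_band(self, fc_l, fc_h):
        # Lets a controller move the pass band while the filter is running.
        self._low_tracker.set_cutoff(fc_l)
        self._low_pass.set_cutoff(fc_h)

    def process(self, x):
        # High-pass output = input minus its low-passed version.
        high_passed = x - self._low_tracker.process(x)
        return self._low_pass.process(high_passed)
```

With a pass band of, say, 200 Hz to 800 Hz, a 400 Hz tone passes this cascade with far less attenuation than a 20 Hz or 8 kHz tone, although first-order slopes are gentle and a practical implementation would likely use steeper sections.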
The pitch detection section 26 detects a pitch PA of the sound signal A1, having been processed by the band-pass filter 24, for each of the unit segments U. For each of the unit segments U of the sound signal A1 for which no pitch PA has been detected (such as a unit segment U of an unvoiced sound, or a silent unit segment U, which has no clear harmonic structure), a result indicating “no pitch has been detected” (or non-pitch-detectable state) is output.
The pitch PA can be calculated as a logarithmic value in cents, as defined in Mathematical Expression (1) below. Coefficient F0 in Mathematical Expression (1) represents a minimum value of possible frequencies (Hz) which the sound signal A1 is assumed to have, and this coefficient F0 is set at an appropriate value in accordance with a characteristic of a sound generation source (such as a musical instrument or a human). In the case of a sound signal A0 obtained by sampling a performance tone of a guitar, for example, the coefficient F0 is set at 8.1757989 Hz. Further, a coefficient FP in Mathematical Expression (1) represents a pitch (fundamental frequency) in hertz (Hz) of the sound signal A1.
$$PA = 1200.0 \cdot \log_{2}\left(\frac{FP}{F0}\right)\ \text{[cent]} \qquad (1)$$
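Mathematical Expression (1) and its inverse can be sketched in Python as follows. The function names are hypothetical, and the default F0 is simply the guitar example value quoted in the text.

```python
import math

F0_HZ = 8.1757989  # coefficient F0 from the guitar example in the text


def hz_to_cents(fp_hz, f0_hz=F0_HZ):
    """Pitch PA in cents for a fundamental frequency FP in Hz, per Expression (1)."""
    return 1200.0 * math.log2(fp_hz / f0_hz)


def cents_to_hz(pa_cents, f0_hz=F0_HZ):
    """Inverse of Expression (1): frequency in Hz for a pitch in cents."""
    return f0_hz * 2.0 ** (pa_cents / 1200.0)
```

Doubling the frequency adds 1200 cents (one octave). With this F0, a 440 Hz tone comes out at approximately 6900 cents.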
Any suitable conventionally-known technique may be employed for detecting a pitch PA of a sound signal A1. For example, there may be employed a method where reference values attenuating over time from the intensities of individual peaks of the sound signal A1 are compared with the signal values of the sound signal A1, extreme values in the trajectory of the greater of the two are detected as peaks of the sound signal A1, and a pitch PA is then detected from the intervals between the peaks (e.g., the method disclosed in Japanese Patent Application Laid-open Publication No. SHO-61-44330). Also suitable for detecting a pitch PA of a sound signal A1 are a zero crossing method, where a pitch PA is detected on the basis of intervals between zero crossover points at which the intensity of the sound signal A1 changes across zero, and an autocorrelation method, where a pitch PA is detected on the basis of a section where autocorrelation values of the sound signal A1 become greatest (i.e., the pitch period of the sound signal A1).
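The autocorrelation approach mentioned above can be sketched as follows. This is a minimal illustration, not the method of any cited publication: the function name, the search range (`fmin_hz`, `fmax_hz`) and the `threshold` used to report "no pitch detected" are all assumptions, and no protection against octave errors is included.

```python
import math


def detect_pitch_autocorr(segment, sample_rate,
                          fmin_hz=60.0, fmax_hz=1000.0, threshold=0.5):
    """Return an estimated fundamental frequency in Hz, or None if no clear pitch."""
    n = len(segment)
    if n < 2 or sum(v * v for v in segment) == 0.0:
        return None  # silent segment: non-pitch-detectable state
    min_lag = max(1, int(sample_rate / fmax_hz))
    max_lag = min(n - 1, int(sample_rate / fmin_hz))
    best_lag, best_value = None, 0.0
    for lag in range(min_lag, max_lag + 1):
        a = segment[:n - lag]
        b = segment[lag:]
        norm = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
        if norm == 0.0:
            continue
        # Normalised autocorrelation, bounded to the range [-1, 1].
        value = sum(x * y for x, y in zip(a, b)) / norm
        if value > best_value:
            best_lag, best_value = lag, value
    if best_lag is None or best_value < threshold:
        return None
    # The lag with the greatest autocorrelation is taken as the pitch period.
    return sample_rate / best_lag
```

The resolution is limited to whole-sample lags, so the estimate for a short segment is only approximate; interpolation around the peak would refine it.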
The control section 30 variably controls the pass band B (determined by the low-side cutoff frequency FC_L and high-side cutoff frequency FC_H) of the band-pass filter 24, and it includes a target setting section 32 and a filter control section 34. The target setting section 32 variably sets a target value of the low-side cutoff frequency FC_L (hereinafter referred to as “low-side target value”) and a target value of the high-side cutoff frequency FC_H (hereinafter referred to as “high-side target value”) in accordance with the pitch PA detected by the pitch detection section 26.
As shown in
$$FT_{L} = PA - OFST_{L} \qquad (2a)$$
$$FT_{H} = PA + OFST_{H} \qquad (2b)$$
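Mathematical Expressions (2a) and (2b) amount to subtracting and adding fixed offsets in the cent domain. A minimal Python sketch follows; the function name and the default offset values (600 and 2400 cents) are hypothetical placeholders chosen only so that the high-side offset exceeds the low-side offset.

```python
def target_band(pa_cents, ofst_l=600.0, ofst_h=2400.0):
    """Return (FT_L, FT_H) in cents for a detected pitch PA in cents.

    The default offsets are illustrative assumptions, not values from the text.
    """
    ft_l = pa_cents - ofst_l  # Expression (2a)
    ft_h = pa_cents + ofst_h  # Expression (2b)
    return ft_l, ft_h
```

For a detected pitch of 6900 cents, these placeholder offsets give a target band from 6300 to 9300 cents.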
The predetermined offset values OFST_L and OFST_H are selected, for example, in accordance with a characteristic of a sound generation source of a sound signal A0 (such as a type or tone color of a musical instrument). A guitar tone, for example, has the characteristic that its overtone components (particularly the second overtone) are greater in intensity than the component of the pitch (fundamental frequency) PA. Thus, the predetermined offset value OFST_H is set at a greater value (cent value) than the predetermined offset value OFST_L so that the target band BT includes frequencies of the second and third overtones corresponding to the assumed pitch PA of the sound signal A1. Consequently, as shown in
The filter control section 34 of
Upon start of the process of
If the pitch detection section 26 has detected (or could detect) a pitch PA (YES determination at step S1), the control section 30 further determines, at step S3, whether the detected pitch PA is different, i.e., has changed, from a pitch PA in the immediately preceding unit segment U. More specifically, the control section 30 determines that the detected pitch PA in the current unit segment U has changed from the pitch PA in the immediately preceding unit segment U, if the absolute value of a difference between the pitch PA in the current unit segment U and the pitch PA in the immediately preceding unit segment U is greater than a predetermined value; otherwise, the control section 30 determines that the detected pitch PA in the current unit segment U has not changed from the pitch PA in the immediately preceding unit segment U. Affirmative (i.e., YES) determination is also made at step S3 when no pitch PA was detected in the immediately preceding unit segment U.
With a YES determination at step S3, the target setting section 32 updates the target band BT (i.e., low-side target value FT_L and high-side target value FT_H) in accordance with the detected pitch PA, at step S4. Namely, the target setting section 32 sets a low-side target value FT_L and high-side target value FT_H by performing the arithmetic operations of Mathematical Expressions (2a) and (2b) on the detected pitch PA in the current unit segment U. Thus, the low-side target value FT_L and high-side target value FT_H are updated each time the sound signal A0 changes in pitch PA.
Following step S4, the filter control section 34 at step S5 updates the low-side cutoff frequency FC_L and high-side cutoff frequency FC_H so that the pass band B of the band-pass filter 24 approaches the target band BT updated at step S4. If, on the other hand, the pitch PA detected by the pitch detection section 26 in the current unit segment U has not changed from the pitch PA in the immediately preceding unit segment U (NO determination at step S3), the filter control section 34 goes to step S5, without updating the target band BT (step S4), to update (or interpolate between) the low-side cutoff frequency FC_L and high-side cutoff frequency FC_H. The operation at step S5 will be detailed below.
Let's assume a case where a pitch PA1 is detected in the unit segment U3 (YES determination at step S3) as shown in
As set forth above, each time the pitch PA of the sound signal A0 changes, the pass band B (low-side cutoff frequency FC_L and high-side cutoff frequency FC_H) is caused to approach over time the target band BT corresponding to the changed pitch PA. Then, once a state where no pitch PA is detected (i.e., non-pitch-detectable state) occurs (NO determination at step S1), the pass band B is initialized to the initial band B0.
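The per-segment control flow summarized above (steps S1, S3, S4 and S5) can be sketched in Python as follows. This is an illustration under stated assumptions: all quantities are expressed in cents, the cutoffs move by a fixed step `delta` per unit segment, and the class and field names are hypothetical. Resetting the target band together with the pass band when no pitch is detected is also an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


def _approach(current, target, delta):
    """Move current toward target by at most delta, without overshooting."""
    if abs(target - current) <= delta:
        return target
    return current + delta if target > current else current - delta


@dataclass
class BandControl:
    initial_band: Tuple[float, float]  # initial band B0 as (low, high), in cents
    ofst_l: float                      # OFST_L in cents
    ofst_h: float                      # OFST_H in cents
    delta: float                       # change per unit segment, in cents
    change_threshold: float            # pitch difference treated as a change
    fc_l: float = 0.0
    fc_h: float = 0.0
    ft_l: float = 0.0
    ft_h: float = 0.0
    prev_pitch: Optional[float] = None

    def __post_init__(self):
        self._reset()

    def _reset(self):
        self.fc_l, self.fc_h = self.initial_band
        self.ft_l, self.ft_h = self.initial_band
        self.prev_pitch = None

    def update(self, pitch):
        """Process one unit segment; pitch is in cents, or None if undetected."""
        if pitch is None:  # NO at step S1: initialize the pass band to B0
            self._reset()
            return self.fc_l, self.fc_h
        changed = (self.prev_pitch is None
                   or abs(pitch - self.prev_pitch) > self.change_threshold)
        if changed:  # YES at step S3, so update the target band (step S4)
            self.ft_l = pitch - self.ofst_l
            self.ft_h = pitch + self.ofst_h
        # Step S5: move the cutoffs one step toward the target band.
        self.fc_l = _approach(self.fc_l, self.ft_l, self.delta)
        self.fc_h = _approach(self.fc_h, self.ft_h, self.delta)
        self.prev_pitch = pitch
        return self.fc_l, self.fc_h
```

Calling `update` once per unit segment narrows the pass band gradually around a steady pitch and widens it back to the initial band as soon as a segment yields no pitch.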
In the above-described embodiment, the pass band B of the band-pass filter 24 is variably set in accordance with a pitch PA of a sound signal A0. Namely, the pitch detection is performed after frequency components (e.g., noise components) of the sound signal A0 that diverge from the pitch PA are suppressed using the varied pass band B. Thus, the instant embodiment can detect a pitch PA of a sound signal A0 with a high accuracy as compared to the construction where the pass band B is fixed or the band-pass filter 24 is omitted. In the case of a tone of a musical instrument, such as a guitar or piano, whose tone generation source is a string, there is a noticeable tendency that its intensity attenuates immediately after the tone generation so that noise is emphasized relatively. Thus, the first embodiment can effectively achieve the advantageous benefit that it can detect a pitch PA with a high accuracy while reducing influences of noise, particularly in a case where a pitch PA of a tone generated from a tone generation source in the form of a string is to be detected.
Further, because the instant embodiment changes the pass band B of the band-pass filter 24 progressively over time toward the target band BT, a pitch PA of a sound signal A0 can be detected in a stable manner as compared to the construction where the pass band B is changed instantaneously to the target band BT.
The following describes a second embodiment of the present invention, with reference to
As shown in
Once the signal SATK is supplied from the attack detection section 42, i.e., once an attack of the sound signal A0 is detected, the control section 30 initializes the pass band B of the band-pass filter 24 to the initial band B0. In the second embodiment, the same operations as those at and after step S3 of
In the above-described first embodiment, where the pass band B is initialized in response to non-detection of any pitch PA, the pass band B of the band-pass filter 24 may sometimes be initialized at a time point delayed from an attack of a sound signal A0. If the initialization of the pass band B is delayed like this, a pitch PA may sometimes not be accurately detected in a case where components of pitches PA in unit segments from the attack of the sound signal A0 to the initialization (i.e., expansion) of the pass band B are located outside the narrower pass band B before being initialized (and thus these components are suppressed by the band-pass filter 24). However, in the second embodiment, where the pass band B is initialized in response to detection of an attack of a sound signal A0, it is possible to promptly initialize the pass band B without waiting for the result of the detection (i.e., presence or absence of a detected pitch PA) by the pitch detection section 26. Thus, the second embodiment can detect a pitch PA of a sound signal A0 (particularly, a pitch PA near the attack of the sound signal A0) with a high accuracy as compared to the first embodiment of the present invention.
The holding section 52 is a FIFO (First-In-First-Out) type delay buffer (register or memory) that sequentially holds a plurality of (i.e., N) unit segments U of a sound signal A0, output from the signal segmentation section 22, in the same order as the unit segments U are supplied from the signal segmentation section 22. Although the holding section 52 is shown as a separate component from the storage device 14 in the figure, a storage area of the storage device 14 may be used as the holding section 52.
The output control section 54 selectively acquires any one of the N unit segments U. The unit segment U which the output control section 54 acquires from the holding section 52 (i.e., readout position of the holding section 52) is variably controlled. Thus, the holding section 52 and the output control section 54 function as a delay circuit for imparting a variable delay amount D to the individual unit segments U. Namely, the operation of the output control section 54 acquiring the latest (first-stage) unit segment U from among the N unit segments U corresponds to operation of a delay circuit whose delay amount D is set at a minimum value (zero), while the operation of the output control section 54 acquiring the oldest (N-th-stage) unit segment U from among the N unit segments U corresponds to operation of the delay circuit whose delay amount D is set at a maximum value N.
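The combination of the holding section 52 and a variable readout position can be sketched as a bounded FIFO with indexed reads. The class and method names below are hypothetical, and the sketch indexes the delay from 0 (latest unit segment) to N-1 (oldest), a zero-based simplification of the delay range described in the text.

```python
from collections import deque


class HoldingSection:
    """FIFO buffer that keeps the N most recent unit segments."""

    def __init__(self, n):
        self._buf = deque(maxlen=n)  # the oldest entry is dropped automatically

    def push(self, unit_segment):
        self._buf.append(unit_segment)

    def read(self, delay):
        """Return the unit segment delayed by `delay` positions (0 = latest)."""
        if not 0 <= delay < len(self._buf):
            raise IndexError("delay exceeds the number of held unit segments")
        return self._buf[-1 - delay]
```

Reading with a delay of zero behaves like a pass-through, while reading with the largest valid delay returns the oldest segment still held.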
The adjustment section 56 adjusts the sound signal intensity of the unit segment U acquired by and output from the output control section 54. For example, the adjustment section 56 may be in the form of a multiplier for multiplying the signal value of the sound signal A0 by a variable adjustment value M. The sound signal A0 adjusted by the adjustment section 56 is supplied to the band-pass filter 24. Control of the adjustment value M will be described later.
Once the pitch detection section 26 detects a pitch PA[Uk] of the unit segment Uk, the target setting section 32 of the control section 30 calculates a target band BT (i.e., low-side target value FT_L and high-side target value FT_H) by performing the arithmetic operations of Mathematical Expressions (2a) and (2b) above on the detected pitch PA[Uk]. Further, the filter control section 34 sets the target band BT, set by the target setting section 32 in accordance with the detected pitch PA[Uk], into the band-pass filter 24 as the pass band B. Namely, whereas the above-described first and second embodiments are constructed to cause the pass band B to approach the target band BT progressively over time, the third embodiment is constructed to set the pass band B at the target band BT (i.e., set the target band BT as the pass band B) immediately after the detection of the pitch PA[Uk].
Once the pass band B is set at the target band BT, the output control section 54 sets the delay amount D at the maximum value N (i.e., delay amount D corresponding to the N-th-stage unit segment U). Then, in a time period TR following the setting of the target band BT and having a time length equal to or smaller than the cyclic period t1 (this time period will hereinafter be referred to as “re-processing time period TR”), the output control section 54, while sequentially reducing the delay amount D to the minimum value (zero) with a cyclic period t2 (e.g., t2=t1/N) shorter than the cyclic period t1, sequentially acquires, from the holding section 52, unit segments U corresponding to the delay amounts D and outputs the acquired unit segments U to the adjustment section 56. Thus, as shown in
The band-pass filter 24, whose pass band B has been controlled to match the target band BT, sequentially processes the N unit segments U output from the holding section 52 at the N-fold (N-times higher) speed, and then the pitch detection section 26, as shown in
The delay amount D decreases to zero at the end point of the re-processing time period TR. After elapse of the re-processing time period TR, the filtering (with the target band BT) by the band-pass filter 24 and the pitch detection by the pitch detection section 26 are performed sequentially on unit segments U (following the unit segment Uk+1) supplied sequentially from the signal segmentation section 22 with the cyclic period t1, in the same way as before the start of the re-processing time period TR. Operation performed in response to change in the pitch PA after the elapse of the re-processing time period TR is similar to that described above with reference to
The above-described third embodiment of the present invention, where the pass band B of the band-pass filter 24 is variably set in accordance with a pitch PA of a sound signal A0, can detect a pitch PA of a sound signal A0 with a high accuracy in the same manner as the first embodiment. Further, because the third embodiment is constructed to perform the filtering, using the target band BT corresponding to the pitch PA, and pitch detection (re-detection of a pitch) on previous unit segments having been subjected to the filtering and pitch detection using the initial band B0, the third embodiment can advantageously detect pitches PA of the individual unit segments U in a stable manner, despite the construction that the pass band B of the band-pass filter 24 is changed instantaneously to the target band BT corresponding to the detected pitch PA. Further, because individual unit segments are output from the holding section 52 at the N-fold speed within the re-processing time period TR, pitches PA can be detected, with no delay, for unit segments U to be newly supplied to the holding section 52 after the lapse of the re-processing time period TR.
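The re-processing pass of the third embodiment can be outlined as below. This is only a sketch: the function name is hypothetical, `filter_and_detect` stands in for the band-pass filter 24 followed by the pitch detection section 26, and the schedule assumes t2 = t1/N, which the text gives as one example.

```python
def reprocess_held_segments(held_segments, t1, target_band, filter_and_detect):
    """Re-run filtering and pitch detection on held unit segments, oldest first.

    Returns a list of (start_time_offset, detection_result) pairs, where the
    offsets are spaced by t2 = t1 / N so that all N segments fit within t1.
    """
    n = len(held_segments)
    if n == 0:
        return []
    t2 = t1 / n  # shorter cyclic period used during the re-processing period TR
    results = []
    for step, segment in enumerate(held_segments):
        # Every held segment is processed with the same, newly set target band.
        results.append((step * t2, filter_and_detect(segment, target_band)))
    return results
```

Because the N segments are consumed N times faster than they arrive, the whole pass completes within one normal cyclic period, which matches the no-delay property described above.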
Further, because the instant embodiment lowers a signal value of the sound signal A0 in accordance with an adjustment value M at the beginning of the re-processing time period TR, it can advantageously suppress discontinuity of the waveform of the sound signal A0 at the start point of the re-processing time period TR. However, if discontinuity of the waveform of the sound signal A0 does not present any particular problem, then the adjustment section 56 of
Note that, whereas
The above-described embodiments may be modified variously. Specific examples of such modifications are as follows. Two or more selected ones of the following examples may be combined as necessary.
(1) Modification 1:
Whereas each of the above-described embodiments has been described above as setting the bandwidth of the target band BT at the fixed value (OFST_L+OFST_H), the bandwidth of the target band BT may be variably controlled, for example, in accordance with a detected pitch PA. For example, the target band BT may be set at a wider bandwidth as the detected pitch PA becomes higher.
(2) Modification 2:
Whereas each of the above-described embodiments is constructed to initialize the pass band B of the band-pass filter 24 in response to non-detection of any pitch PA, i.e. non-pitch-detectable state (first embodiment) or in response to detection of an attack of a sound signal A0 (second embodiment), the present invention is not so limited; for example, the pass band B of the band-pass filter 24 may be initialized to the initial band B0 in response to detection of a release (fall) of a sound signal A0.
(3) Modification 3:
Whereas each of the first and second embodiments has been described above as causing the low-side cutoff frequency FC_L and high-side cutoff frequency FC_H to approach the low-side target value FT_L and high-side target value FT_H, respectively, by varying the low-side cutoff frequency FC_L and high-side cutoff frequency FC_H by the predetermined value Δ at a time, the way for causing the pass band B of the band-pass filter 24 to approach the target band BT is not so limited; for example, there may be employed a construction where a low-side cutoff frequency FC_L and high-side cutoff frequency FC_H at each intermediate time point in a predetermined time period are controlled (or interpolated) in such a manner that the pass band B of the band-pass filter 24 can approach the target band BT within the predetermined time period. Therefore, in this case, a minimum unit change amount of the low-side cutoff frequency FC_L and high-side cutoff frequency FC_H need not be of a fixed value Δ.
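As one possible reading of Modification 3, the cutoff frequency can be interpolated so that it reaches its target within a fixed number of updates, whatever the distance to be covered. The linear interpolation and the function name below are assumptions of this sketch; the text does not prescribe an interpolation curve.

```python
def interpolated_cutoffs(start, target, num_steps):
    """Return num_steps cutoff values moving linearly from start to target.

    The step size depends on the distance to cover, so it is not a fixed value.
    """
    if num_steps < 1:
        raise ValueError("num_steps must be at least 1")
    values = [start + (target - start) * k / num_steps
              for k in range(1, num_steps)]
    values.append(target)  # land exactly on the target at the final step
    return values
```

The same function would be applied separately to FC_L (toward FT_L) and to FC_H (toward FT_H), so both edges of the pass band arrive at the target band at the same time.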
The present application is based on, and claims priority to, Japanese Patent Application No. 2008-289974 filed on Nov. 12, 2008. The disclosure of the priority application, in its entirety, including the drawings, claims, and the specification thereof, is incorporated herein by reference.
Number | Date | Country | Kind |
---|---|---|---|
2008-289974 | Nov 2008 | JP | national |