The present invention relates to an audio signal processing apparatus, an audio signal processing method, an audio signal processing program, and a computer-readable recording medium that reproduce sound with sound effects added by processing an audio signal. However, utilization of the present invention is not limited to the above-mentioned audio signal processing apparatus, audio signal processing method, audio signal processing program, and computer-readable recording medium.
Acoustic equipment that reproduces sound with added sound effects by processing a multi-channel audio signal is in wide use. For example, there is a technology that analyzes the contents of a piece of music and automatically sets an equalizer to optimal equalization characteristics in the acoustic equipment. In this technology, when the music conforms to a pattern of hand clapping at the beginning and at the end, the music is judged to be recorded live and the equalizer is set for a live recording (see, for example, Patent Document 1).
Patent Document 1: Japanese Patent Application Laid-Open Publication No. 2001-85962
However, surround components in 5.1-channel and similar formats, apart from sound that is deliberately localized at the rear, often include uncorrelated signals that mimic the ambience of a live music hall. Typical sound processing by an equalizer or a reverberator, when applied to the music itself, makes the sound unnatural. For this reason, there has been a strong desire to apply such processing only to the components that convey the ambience of a live music hall. One example of the problem is that, because the original purpose of the equalizer is to adjust the transfer characteristics from a speaker to a listener, no consideration has been given to adding sound effects to components other than the music.
An audio signal processing apparatus according to the invention of claim 1 includes a cutout unit that cuts out audio signals of plural channels by time frame; a correlation calculating unit that calculates a correlation value between respective signals of the plural channels included in a predetermined time frame cut out by the cutout unit; a spectrum calculating unit that calculates spectrum information indicative of spectrum characteristics with respect to a signal of a predetermined channel cut out by the cutout unit; a coefficient calculating unit that calculates a coefficient to be multiplied by the signal of the predetermined channel, based on the correlation value calculated by the correlation calculating unit and the spectrum information calculated by the spectrum calculating unit; and an assigning unit that multiplies the coefficient calculated by the coefficient calculating unit by the signal of the predetermined channel and assigns the multiplied signal to other channels than the predetermined channel.
An audio signal processing method according to the invention of claim 10 includes a cutout step of cutting out audio signals of plural channels by time frame; a correlation calculating step of calculating a correlation value between respective signals of the plural channels included in a predetermined time frame cut out at the cutout step; a spectrum calculating step of calculating spectrum information indicative of spectrum characteristics with respect to a signal of a predetermined channel cut out at the cutout step; a coefficient calculating step of calculating a coefficient to be multiplied by the signal of the predetermined channel, based on the correlation value calculated at the correlation calculating step and the spectrum information calculated at the spectrum calculating step; and an assigning step of multiplying the coefficient calculated at the coefficient calculating step by the signal of the predetermined channel and assigning the multiplied signal to other channels than the predetermined channel.
An audio signal processing program according to the invention of claim 11 causes a computer to execute the audio signal processing method according to claim 10.
A computer-readable recording medium according to the invention of claim 12 stores therein the audio signal processing program according to claim 11.
Exemplary embodiments of an audio signal processing apparatus, an audio signal processing method, an audio signal processing program, and a computer-readable recording medium according to the present invention will be described below with reference to the accompanying drawings.
The cutout unit 101 cuts out audio signals of plural channels by a time frame. The cutout unit 101 is also capable of cutting out the audio signals of plural channels by windowing the audio signals in a time scale. The correlation calculating unit 102 calculates a correlation value between the respective signals of the plural channels included in a predetermined time frame cut out by the cutout unit 101. The spectrum calculating unit 103 calculates spectrum information indicative of spectrum characteristics with respect to the signal of a predetermined channel cut out by the cutout unit 101.
The coefficient calculating unit 104 calculates a coefficient to be multiplied by the signal of the predetermined channel, based on the correlation value calculated by the correlation calculating unit 102 and the spectrum information calculated by the spectrum calculating unit 103. The coefficient calculating unit 104 is also capable of calculating a value inversely proportional to the correlation value as such a coefficient. The assigning unit 105 multiplies the coefficient calculated by the coefficient calculating unit 104 by the signal of the predetermined channel and assigns this multiplied signal to channels other than the predetermined channel.
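As an illustration only, the correlation value and a coefficient inversely proportional to it may be sketched as follows. The description does not fix a formula for the correlation value, so this sketch assumes a normalized zero-lag cross-correlation; the function names, the use of the absolute value, and the small constant `eps` are hypothetical and introduced solely to keep the example well defined.

```python
import math


def correlation(fsl, fsr):
    """Normalized zero-lag cross-correlation of two equal-length frames.

    Assumption: the text only says "a correlation value"; this particular
    definition is one common choice, not necessarily the one intended.
    """
    if len(fsl) != len(fsr):
        raise ValueError("frames must have the same length")
    num = sum(a * b for a, b in zip(fsl, fsr))
    den = math.sqrt(sum(a * a for a in fsl) * sum(b * b for b in fsr))
    return num / den if den > 0.0 else 0.0


def inverse_coefficient(rho, eps=1e-3):
    """Value inversely proportional to |rho|.

    eps is a hypothetical guard against division by zero when the
    channels are fully uncorrelated.
    """
    return 1.0 / (abs(rho) + eps)
```

With this sketch, two identical frames give a correlation value of 1 and therefore a small coefficient, whereas uncorrelated frames give a correlation value near 0 and a large coefficient.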
The spectrum calculating unit 103 is capable of calculating a spectral range of the signal of the predetermined channel. In this case, the coefficient calculating unit 104 is also capable of calculating a value proportional to the value obtained by dividing the spectral range by the time length of the time frame as the coefficient. The coefficient calculating unit 104 is also capable of calculating, as the coefficient, a value proportional to the total value obtained by adding a value inversely proportional to the time from the starting point of the time frame and a value inversely proportional to the time to the ending point of the time frame.
The spectrum calculating unit 103 is capable of calculating a spectrum of the signal of the predetermined channel. In this case, the coefficient calculating unit 104 is also capable of calculating, as the coefficient, a value inversely proportional to a difference of the spectrum in the signal of the predetermined channel from a target spectrum.
The audio signals of the plural channels may include signals of a front left channel, a front right channel, a center channel, a surround left channel, and a surround right channel, respectively. In this case, when the coefficient calculating unit 104 calculates the coefficient with respect to the surround left channel, the assigning unit 105 may assign the signal to the front left channel, the front right channel, the center channel, and the surround right channel, respectively. In this case, when the coefficient calculating unit 104 calculates the coefficient with respect to the surround right channel, the assigning unit 105 may also assign the signal to the front left channel, the front right channel, the center channel, and the surround left channel, respectively.
The coefficient calculating unit 104 calculates the coefficient based on the correlation value calculated by the correlation calculating unit 102 and the spectrum information calculated by the spectrum calculating unit 103 (step S204). This coefficient is the coefficient to be multiplied by the signal of the predetermined channel. The assigning unit 105 multiplies the coefficient calculated by the coefficient calculating unit 104 by the signal of the predetermined channel and assigns this multiplied signal to channels other than the predetermined channel (step S205).
The embodiment described above enables assignment of a particular component to another channel according to the correlation between the channels and the spectrum characteristics. For example, a component other than the music may be extracted out of the surround component. For example, by assigning a component other than the music to the front channel, the ambience of listening to live music and being surrounded by hand clapping may be given to the listener.
A DSP (Digital Signal Processor) 302 receives the digital signal from the sound source 301 as a source and adds sound effects thereto. Here, the DSP 302 exchanges information about the sound source 301 with a microcomputer 303 and, depending on contents thereof, may change the contents of the processing. The DSP 302 internally calculates a processing coefficient in accordance with acoustic properties of the sound source 301 and the information obtained from the microcomputer 303. This audio signal processing apparatus usually uses signal processing such as that of an equalizer and a reverberator. However, these methods, which use a fixed coefficient irrespective of the kind of music, cannot necessarily reproduce sound in a manner suited to the characteristics of the music.
A D/A converter 304 converts the signal output from the DSP 302 to an analog signal. The converted analog signal is amplified by an amplifier 305 and is acoustically reproduced by a speaker 306.
As described above, in the audio signal processing apparatus, the signal from the sound source 301 is received by the DSP 302, which performs the signal processing of the signal in cooperation with the microcomputer 303. This signal-processed signal is converted by the D/A converter 304 to an analog signal and is acoustically reproduced by the amplifier 305 and the speaker 306.
Firstly, a surround left (SLin) component and a surround right (SRin) component are input to a coefficient controller 401. Next, the coefficient controller 401 analyzes the surround left (SLin) component and the surround right (SRin) component. The coefficient controller 401, based on results of analysis, calculates distribution amounts aSL and aSR to other channels. Outputs from the coefficient controller 401 are updated by analyzing the sound components, as required.
Multiplying units 402 and 403 multiply the calculated distribution amounts aSL and aSR by the surround components. The distribution amount aSL is multiplied by the surround left component and the distribution amount aSR is multiplied by the surround right component. Then, with effects (F) of the equalizer, reverberator, etc., added at a filter 404, the multiplied distribution amounts are distributed to other channels.
For example, when the distribution amount aSL of the surround left component is calculated, the component is distributed to the front left channel (Lin), the front right channel (Rin), the center channel (Cin), and the surround right channel (SRin). Likewise, when the distribution amount aSR of the surround right component is calculated, the component is distributed to the front left channel (Lin), the front right channel (Rin), the center channel (Cin), and the surround left channel (SLin). As a result of the distribution, the signals of the front left channel (Lout), the front right channel (Rout), the center channel (Cout), the surround left channel (SLout), and the surround right channel (SRout) are output, respectively.
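The distribution described above may be sketched, for a single sample per channel, as follows. This is a minimal sketch under stated assumptions: the distributed components are assumed to be simply added to the inputs of the receiving channels, the direct surround signals are assumed to pass through unchanged, and the `effect` argument is a hypothetical stand-in for the equalizer or reverberator effect (F), defaulting to no effect.

```python
def distribute(frame, a_sl, a_sr, effect=lambda x: x):
    """Distribute scaled surround components to the other channels.

    frame: dict with keys "L", "R", "C", "SL", "SR" (one sample each).
    a_sl, a_sr: distribution amounts for surround left and surround right.
    effect: hypothetical placeholder for the sound effect (F).
    Assumption: distributed components are summed with the channel inputs.
    """
    from_sl = effect(a_sl * frame["SL"])  # goes to L, R, C and SR
    from_sr = effect(a_sr * frame["SR"])  # goes to L, R, C and SL
    return {
        "L": frame["L"] + from_sl + from_sr,
        "R": frame["R"] + from_sl + from_sr,
        "C": frame["C"] + from_sl + from_sr,
        "SL": frame["SL"] + from_sr,
        "SR": frame["SR"] + from_sl,
    }
```

In this sketch the surround left component never returns to the surround left output, which keeps the component with the sound effect added and the direct sound on separate speakers.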
Configuring the DSP 302 as shown in
The time frame cutout units 502 and 512 window surround signals SLin(n) and SRin (n), respectively, in a time scale and cut out signals FSL and FSR, respectively. Here, the frame length of the cutout signals FSL and FSR is given as fftlen.
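For illustration, cutting out frames of length fftlen by windowing may be sketched as below. The window shape and the hop between successive frames are not specified in the description; the Hann window and the `hop` parameter used here are assumptions, and the function names are hypothetical.

```python
import math


def hann(n):
    """Hann window of length n (assumed window shape; requires n >= 2)."""
    if n < 2:
        raise ValueError("window length must be at least 2")
    return [0.5 - 0.5 * math.cos(2.0 * math.pi * i / (n - 1)) for i in range(n)]


def cut_frames(signal, fftlen, hop):
    """Cut a signal into windowed frames of length fftlen.

    hop is a hypothetical parameter giving the number of samples between
    the starts of successive frames.
    """
    window = hann(fftlen)
    frames = []
    for start in range(0, len(signal) - fftlen + 1, hop):
        frames.append([signal[start + i] * window[i] for i in range(fftlen)])
    return frames
```

Applying the same function to SLin(n) and SRin(n) would yield the frame pairs corresponding to FSL and FSR.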
A correlation calculating unit 520 calculates a correlation value ρ of the cutout signals FSL and FSR. Meanwhile, spectral range calculating units 530 and 531 calculate spectral ranges WSL and WSR, respectively, of the cutout signals FSL and FSR. The spectral range calculating units 530 and 531 calculate the spectral ranges WSL and WSR by counting the number of spectral lines whose amplitude exceeds a certain threshold in the amplitude spectrum obtained by applying an FFT to the signal sequence. The spectral ranges WSL and WSR approach the length fftlen for a wide-band signal such as white noise, and may therefore be considered an index of whiteness. A coefficient calculating unit 550 calculates the coefficient values aSL and aSR for assignment to other channels from the time t in one track, obtained from a timer 540, in addition to the correlation value ρ and the spectral ranges WSL and WSR. For example, equations (1) and (2) are used as calculating equations.
The intent of these equations includes the following three points: (1) In the case of a signal of a narrow bandwidth, the coefficients aSL and aSR are made smaller and conversely, in the case of a signal of a wide bandwidth, the coefficients aSL and aSR are made greater. (2) When the correlation is small, the coefficients aSL and aSR are made greater and conversely, when the correlation is great, the coefficients aSL and aSR are made smaller. (3) As the time is closer to the start or the end of the track, the coefficients aSL and aSR are made greater. Conversely, around the center of the track, the coefficients aSL and aSR are made smaller. Tend represents the time length of one piece of music.
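The three points above may be illustrated by the following sketch. Equations (1) and (2) themselves are not reproduced in this text, so the sketch does not claim to be those equations: it assumes that the three factors are simply multiplied, uses a naive DFT in place of an FFT for self-containment, and introduces the hypothetical guard constant `eps`. All function and parameter names are illustrative.

```python
import cmath


def spectral_range(frame, threshold):
    """Count spectral lines whose amplitude exceeds the threshold.

    A naive DFT is used here instead of an FFT; the result is the same
    amplitude spectrum, only computed more slowly.
    """
    n = len(frame)
    count = 0
    for k in range(n):
        x = sum(frame[i] * cmath.exp(-2j * cmath.pi * k * i / n) for i in range(n))
        if abs(x) > threshold:
            count += 1
    return count


def coefficient(w, fftlen, rho, t, t_end, eps=1e-3):
    """Illustrative coefficient reflecting the three stated intents.

    Assumption: the factors are combined by multiplication; eps is a
    hypothetical guard against division by zero.
    """
    bandwidth = w / fftlen                            # (1) wider band, larger value
    decorrelation = 1.0 / (abs(rho) + eps)            # (2) lower correlation, larger value
    edge = 1.0 / (t + eps) + 1.0 / (t_end - t + eps)  # (3) larger near start or end
    return bandwidth * decorrelation * edge
```

For instance, a single impulse has a flat amplitude spectrum, so its spectral range equals the frame length, whereas a constant frame has only one line above the threshold.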
These equations utilize the characteristic that a signal such as hand clapping, with “wide bandwidth” and “low correlation between channels”, is present “at the end or beginning of a piece of music”. By distributing as much of this kind of signal as possible to other channels, the ambience of being surrounded by hand clapping may be reproduced.
In equations (1) and (2), the left-hand side is a quantity proportional to the right-hand side. Because tastes differ, with some listeners preferring to concentrate on the music and others preferring to give weight to the ambience, only the ratio of distribution is calculated by these equations. Thereafter, at the stage where the sound effects are added, the amount of distribution may be determined according to the taste of the user.
The output multiplied by the coefficient is output from the other channels. For example, the surround left signal (SLin) multiplied by the coefficient is output from speakers other than the SL speaker. By outputting the signals with the sound effects added and a direct sound component through separate speakers, coloration is reduced as much as possible. Outputting the sound from various directions also has the secondary effect of producing a more natural and expansive sound.
Then, the multiplying units 402 and 403 multiply the coefficients aSL and aSR by the surround signals SLin(n) and SRin(n), respectively (step S606). The multiplied signals are filtered by the filters 404 and 405 (step S607), the obtained signals are assigned to other channels (step S608), and the sequence of processing ends.
Configuration may be such that the output of the calculated coefficient is filtered by a smoothing filter such as a low-pass filter. Since the correlation value, a spectrum pattern, etc., vary at every moment, variation of the coefficient actually is considerably large. For this reason, the energy of the signal to be assigned to other channels, if directly applied, has a wide range of variation and large dispersion, resulting in an unstable signal level. By smoothing the output of the coefficient, the variation of the coefficient becomes smooth and the instability is eliminated.
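As one possible illustration of such smoothing, a first-order (one-pole) low-pass filter applied to the sequence of per-frame coefficients may be sketched as follows. The choice of a one-pole filter and the smoothing factor `alpha` are assumptions; the description only states that a smoothing filter such as a low-pass filter may be used.

```python
def smooth(coeffs, alpha=0.1):
    """One-pole low-pass smoothing of a coefficient sequence.

    alpha is a hypothetical smoothing factor in (0, 1]; smaller values
    smooth more strongly. The first output equals the first input.
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    out = []
    state = None
    for c in coeffs:
        state = c if state is None else state + alpha * (c - state)
        out.append(state)
    return out
```

A coefficient sequence that alternates between 0 and 1 from frame to frame would, after this smoothing, vary over a much narrower range, which stabilizes the level of the signal assigned to other channels.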
Although the coefficient is described above as being generated with respect to the two surround channels, left and right, the coefficient may also be generated with respect to the two front channels, or with respect to the four channels of the front left and right channels and the surround left and right channels. In the case of 2 channels, such as in a CD, the coefficient is generated with respect to the one pair of right and left channels. While it is generally said that components other than the music, such as hand clapping, are put in the surround components, such components are frequently put in the front components as well. By also monitoring the signals of components other than the surround components, a reproduction method that is rich in variety is enabled.
Configuration may be such that the coefficients and content of processing with respect to the signals FSL and FSR are changed depending on the outputting speaker 306. By changing the coefficient for each outputting speaker 306 and making the signals less correlative, more expansive expression of the sound field may be achieved.
The time frame cutout units 502 and 512 window the surround signals SLin(n) and SRin(n), respectively, in a time scale and cut out the signals FSL and FSR with the frame length of fftlen, respectively.
The correlation calculating unit 520 calculates the correlation value ρ of the cutout signals FSL and FSR. On the other hand, spectrum calculating units 601 and 611 calculate spectra SSL and SSR, respectively, of the cutout signals FSL and FSR. A coefficient calculating unit 620 calculates the coefficient values aSL and aSR for assignment to other channels from the correlation value ρ and the spectra SSL and SSR. For example, equations (3) and (4) are used as calculating equations.
The intent of these equations includes the following two points: (1) When the spectrum is distant from a spectrum target, the coefficients aSL and aSR are made smaller and conversely, when the spectrum is close to the spectrum target, the coefficients aSL and aSR are made greater. (2) When the correlation is small, the coefficients aSL and aSR are made greater and conversely, when the correlation is great, the coefficients aSL and aSR are made smaller.
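The two points above may be illustrated by the following sketch. Equations (3) and (4) are not reproduced in this text, so this is not a statement of those equations: the Euclidean distance between amplitude spectra, the multiplication of the two factors, and the guard constant `eps` are all assumptions, and the function names are hypothetical.

```python
import math


def spectrum_distance(spectrum, target):
    """Euclidean distance between a spectrum and a target spectrum.

    Assumption: the text only speaks of a "difference" from the target;
    the Euclidean distance is one possible measure.
    """
    if len(spectrum) != len(target):
        raise ValueError("spectra must have the same length")
    return math.sqrt(sum((s - t) ** 2 for s, t in zip(spectrum, target)))


def coefficient_from_spectrum(spectrum, target, rho, eps=1e-3):
    """Illustrative coefficient: larger when the spectrum is close to the
    target and when the inter-channel correlation rho is small.

    eps is a hypothetical guard against division by zero.
    """
    distance = spectrum_distance(spectrum, target)
    return 1.0 / ((distance + eps) * (abs(rho) + eps))
```

In such a sketch, the target spectrum might represent, for example, hand clapping, so that frames resembling the target receive a larger coefficient.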
Instead of calculating the spectral range by the spectral range calculating units 530 and 531, configuration may be such that the spectrum calculating units 601 and 611, using an FFT spectrum, calculate the spectrum in such a manner that higher weighting is given when the spectrum is close to a particular spectrum. In this example, in consideration of the audio signal of a television, etc., which is not divided by track, the time information is not used. Of course, in the case of the package medium such as the DVD, the time information may be inserted as in the calculating method of the first embodiment.
In this case, there are a number of sounds that give the ambience of being present at an event, such as the yells of cheering, etc., and cheering trumpets while watching a baseball game in addition to hand clapping. This example, by focusing only on the sound of a characteristic spectrum, also enables giving the ambience of being surrounded by a sound source.
The examples described above analyze the sound source with the two channel signals used as a pair, thereby enabling extraction of components other than the music and increasing the ambience of being present at the event. The sound effects are not limited to those of the equalizer. The sound effects may more suitably be used in combination with the effect of creating the ambience of the event by the reverberator, etc.
Generally, in the sound components of the 5.1 channel, etc., other than the sound to be definitely oriented at the rear, non-correlated signals are often inserted to give the ambience of a live music hall. Accordingly, by examining the correlation of surround components of two channels, desired sound may be oriented at the front. The calculation based on the spectral range, the time information, and the correlation value enables enhanced accuracy.
Typical sound processing by the equalizer or reverberator, when applied to the music itself, makes the sound unnatural at times. In contrast, these examples enable processing only the component that gives the ambience of a live music hall.
Conventionally, the object of the equalizer is to arrange the transfer characteristics from a speaker to a listener. The embodiment aims mainly at adding sound effects to components other than the music. However, application of the embodiment is not limited to the equalizer. For more realistic ambience of the event, it may be conceivable to combine the equalizer with, for example, a reverberator control, etc.
The above embodiment may be applied to home or car audio equipment (especially, surround sound reproducing equipment), television sets (especially, those compliant with terrestrial broadcasting and surround sound reproduction), and auxiliary music equipment for concert halls and live music halls.
The audio signal processing method explained in the present embodiment can be implemented by a computer such as a personal computer and a workstation executing a program that is prepared in advance. The program is recorded on a computer-readable recording medium such as a hard disk, a flexible disk, a CD-ROM, an MO, and a DVD, and is executed by being read out from the recording medium by a computer. The program can be a transmission medium that can be distributed through a network such as the Internet.
The above embodiments enable adding the sound effects only to the components other than the music, out of the sound components, for example, in a DVD live music disk, etc. By designing the coefficient controller so as to extract components that have a high probability of being other than the music and assigning such components to the front channels, the atmosphere of, for example, listening to live music while surrounded by hand clapping may be enjoyed. Also, in the television broadcasting of a baseball game, by reproducing the sounds characteristic of a cheering section (for example, cheering trumpet sounds and shouts of joy) at a slightly higher volume from the surroundings, the atmosphere of watching the baseball game in the midst of the cheering crowd may be enjoyed.
Number | Date | Country | Kind |
---|---|---|---|
2005-194413 | Jul 2005 | JP | national |

Filing Document | Filing Date | Country | Kind | 371c Date |
---|---|---|---|---|
PCT/JP2006/311947 | 6/14/2006 | WO | 00 | 12/27/2007 |