The present technology relates to an audio-signal processing device, an audio-signal processing method, a program, and a recording medium. In particular, the present technology relates to an audio-signal processing device, an audio-signal processing method, a program, and a recording medium that can be applied to a headphone device, a speaker device, and so on that reproduce 2-channel stereo audio signals.
When audio signals are supplied to speakers and are reproduced, the sound image is localized in front of a listener. In contrast, when the same audio signals are supplied to a headphone device and are reproduced, the sound image is localized within the head of the listener to thereby create a significantly unnatural sound field. In order to correct the unnatural sound field in the sound-field localization by the headphone device, for example, Japanese Unexamined Patent Application Publication No. 2006-14218 discloses a headphone device adapted to achieve natural out-of-head sound-image localization as if audio signals were reproduced from actual speakers. In the headphone device, impulse responses from an arbitrary speaker position to both ears of a listener are measured or calculated and digital filters or the like are used to convolve the impulse responses with audio signals and the resulting audio signals are reproduced.
Now, a description will be given of an impulse response for sound-image localization for a headphone device. As illustrated in
In multi-channel reproduction, estimated channel layout may vary depending on the format of a compressed audio stream. For example, 7.1-channel audio signals may contain 2-channel audio signals for left and right front high channels or may contain 2-channel audio signals for left and right back surround channels in addition to general 5.1 channels.
It is desirable to perform sound-image localization processing in a favorable manner and to reduce the amount of memory.
According to an embodiment of the present technology, there is provided an audio-signal processing device. The audio-signal processing device includes: a decoding unit configured to decode a compressed audio stream to obtain audio signals for a predetermined number of channels; a signal processing unit configured to generate 2-channel audio signals including left-channel audio signals and right-channel audio signals, on the basis of the predetermined-number-of-channels audio signals obtained by the decoding unit, wherein the signal processing unit uses digital filters to convolve impulse responses for paths from sound-source positions of the channels to a left ear of a listener with the corresponding predetermined-number-of-channels audio signals and adds results of the convolutions for the channels to generate the left-channel audio signals and uses digital filters to convolve impulse responses for paths from the sound-source positions of the channels to a right ear of the listener with the corresponding predetermined-number-of-channels audio signals and adds results of the convolutions for the channels to generate the right-channel audio signals; and a coefficient setting unit configured to set filter coefficients corresponding to the impulse responses for the digital filters in the signal processing unit, on the basis of format information of the compressed audio stream.
In the present technology, the decoding unit decodes the compressed audio stream to obtain audio signals for the predetermined number of channels. Examples of the audio signals include 2-channel audio signals, 5.1-channel audio signals, and 7.1-channel audio signals. On the basis of the predetermined-number-of-channels audio signals, the signal processing unit generates 2-channel audio signals including the left-channel audio signals and the right-channel audio signals.
In this case, in the signal processing unit, digital filters are used to convolve impulse responses for paths from sound-source positions of the channels to a left ear of a listener with the corresponding predetermined-number-of-channels audio signals and results of the convolutions for the channels are added to generate the left-channel audio signals. Similarly, in the signal processing unit, digital filters are used to convolve impulse responses for paths from the sound-source positions of the channels to a right ear of the listener with the corresponding predetermined-number-of-channels audio signals and results of the convolutions for the channels are added to generate the right-channel audio signals.
A coefficient setting unit sets filter coefficients corresponding to the impulse responses for the digital filters in the signal processing unit, on the basis of format information of the compressed audio stream. For example, filter coefficients corresponding to an estimated channel layout determined by the format information are set for the digital filters for the channels indicated by decode-mode information of the decoding unit.
For example, when the format information indicates 5.1-channel audio signals, filter coefficients corresponding to the estimated channel layout are set for the digital filters for 6-channel audio signals. Also, for example, when the format information indicates 7.1-channel audio signals (including front high or back surround channel audio signals), filter coefficients corresponding to the estimated channel layout are set for the digital filters for 8-channel audio signals.
Thus, in the present technology, on the basis of the format information of the compressed audio stream, filter coefficients corresponding to the impulse responses are set for the digital filters in the signal processing unit. Thus, even when the format of the compressed audio stream Ast is changed, 2-channel stereo audio signals with which sound-image localization for each channel can be performed in a favorable manner can be obtained from audio signals for a predetermined number of channels.
In the present technology, at least one of the digital filters in the signal processing unit may be used to process the audio signals for multiple ones of the predetermined number of channels. The at least one digital filter used to process the audio signals for the multiple channels may process front high audio signals included in 7.1-channel audio signals or back surround audio signals included in 7.1-channel audio signals. Since the at least one of the digital filters is used to process the audio signals for multiple ones of the predetermined number of channels, the circuit scale of the signal processing unit can be reduced.
According to another embodiment of the present technology, there is provided an audio-signal processing device. The audio-signal processing device includes: a decoding unit configured to decode a compressed audio stream to obtain audio signals for a predetermined number of channels; and a signal processing unit configured to generate 2-channel audio signals including left audio signals and right audio signals, on the basis of the predetermined-number-of-channels audio signals obtained by the decoding unit. The signal processing unit uses digital filters to convolve impulse responses for paths from sound-source positions of the channels to a left ear of a listener with the corresponding predetermined-number-of-channels audio signals and adds results of the convolutions for the channels to generate the left audio signals and uses digital filters to convolve impulse responses for paths from the sound-source positions of the channels to a right ear of the listener with the corresponding predetermined-number-of-channels audio signals and adds results of the convolutions for the channels to generate the right audio signals. In the signal processing unit, the digital filters for processing at least the audio signals for a low-frequency enhancement channel are implemented by infinite impulse response filters.
In the present technology, the decoding unit decodes the compressed audio stream to obtain audio signals for the predetermined number of channels. Examples of the audio signals include 2-channel audio signals, 5.1-channel audio signals, and 7.1-channel audio signals. On the basis of the predetermined-number-of-channels audio signals, the signal processing unit generates 2-channel audio signals including the left audio signals and the right audio signals.
In this case, in the signal processing unit, digital filters are used to convolve impulse responses for paths from sound-source positions of the channels to a left ear of a listener with the corresponding predetermined-number-of-channels audio signals and results of the convolutions for the channels are added to generate the left audio signals. Similarly, in the signal processing unit, digital filters are used to convolve impulse responses for paths from the sound-source positions of the channels to a right ear of the listener with the corresponding predetermined-number-of-channels audio signals and results of the convolutions for the channels are added to generate the right audio signals.
In the signal processing unit, the digital filters for processing at least the audio signals (sub-woofer signals) for a low-frequency enhancement channel are implemented by IIR (infinite impulse response) filters. In this case, for example, the digital filters for processing the audio signals for the other channels may be implemented by FIR (finite impulse response) filters.
In the present technology, since the digital filters for processing at least the audio signals (sub-woofer signals) for the low-frequency enhancement channel are implemented by IIR filters, the amounts of memory and computation for processing the low-frequency enhancement channel audio signals can be reduced.
According to another embodiment of the present technology, there is provided an audio-signal processing device. The audio-signal processing device includes: a decoding unit configured to decode a compressed audio stream to obtain audio signals for a predetermined number of channels; and a signal processing unit configured to generate 2-channel audio signals including left audio signals and right audio signals, on the basis of the predetermined-number-of-channels audio signals obtained by the decoding unit. The signal processing unit uses digital filters to convolve impulse responses for paths from sound-source positions of the channels to a left ear of a listener with the corresponding predetermined-number-of-channels audio signals and adds results of the convolutions for the channels to generate the left audio signals, and uses digital filters to convolve impulse responses for paths from the sound-source positions of the channels to a right ear of the listener with the corresponding predetermined-number-of-channels audio signals and adds results of the convolutions for the channels to generate the right audio signals. In the signal processing unit, the filter coefficient set for the digital filter for processing audio signals for a front high channel is data obtained by combination of actual-sound-field data and anechoic-room data.
In the present technology, the decoding unit decodes the compressed audio stream to obtain audio signals for the predetermined number of channels. Examples of the audio signals include 2-channel audio signals, 5.1-channel audio signals, and 7.1-channel audio signals. On the basis of the predetermined-number-of-channels audio signals, the signal processing unit generates 2-channel audio signals including the left audio signals and the right audio signals.
In this case, in the signal processing unit, digital filters are used to convolve impulse responses for paths from sound-source positions of the channels to a left ear of a listener with the corresponding predetermined-number-of-channels audio signals and results of the convolutions for the channels are added to generate the left audio signals. Similarly, in the signal processing unit, digital filters are used to convolve impulse responses for paths from the sound-source positions of the channels to a right ear of the listener with the corresponding predetermined-number-of-channels audio signals and results of the convolutions for the channels are added to generate the right audio signals.
In this case, in the signal processing unit, the filter coefficient set for the digital filter for processing audio signals for a front high channel is data obtained by combination of actual-sound-field data and anechoic-room data. For example, the actual-sound-field data may include speaker characteristics of the front channel and reverberation-part data of the front channel.
In the present technology, in the signal processing unit, the filter coefficient set for the digital filter for processing audio signals for a front high channel is data obtained by combination of actual-sound-field data and anechoic-room data. Thus, for example, even for a typical 5.1 channel layout in an actual sound field, filter coefficients for front high channels of 7.1 channels can be easily obtained.
According to still another embodiment of the present technology, there is provided an audio-signal processing device. The audio-signal processing device includes: a decoding unit configured to decode a compressed audio stream to obtain audio signals for a predetermined number of channels; a signal processing unit configured to generate 2-channel audio signals including left-channel audio signals and right-channel audio signals, on the basis of the predetermined-number-of-channels audio signals obtained by the decoding unit, wherein the signal processing unit uses digital filters to convolve impulse responses for paths from sound-source positions of the channels to a left ear of a listener with the corresponding predetermined-number-of-channels audio signals and adds results of the convolutions for the channels to generate the left-channel audio signals and uses digital filters to convolve impulse responses for paths from the sound-source positions of the channels to a right ear of the listener with the corresponding predetermined-number-of-channels audio signals and adds results of the convolutions for the channels to generate the right-channel audio signals, and the convolutions by the digital filters are performed in a frequency domain; a coefficient holding unit configured to hold time-series coefficient data as filter coefficients corresponding to the impulse responses; and a coefficient setting unit configured to read the time-series coefficient data held by the coefficient holding unit, transform the time-series coefficient data into frequency-domain data, and set the frequency-domain data for the digital filters.
In the present technology, the decoding unit decodes the compressed audio stream to obtain audio signals for the predetermined number of channels. Examples of the audio signals include 2-channel audio signals, 5.1-channel audio signals, and 7.1-channel audio signals. On the basis of the predetermined-number-of-channels audio signals, the signal processing unit generates 2-channel audio signals including the left-channel audio signals and the right-channel audio signals.
In this case, in the signal processing unit, digital filters are used to convolve impulse responses for paths from sound-source positions of the channels to a left ear of a listener with the corresponding predetermined-number-of-channels audio signals and results of the convolutions for the channels are added to generate the left audio signals. Similarly, in the signal processing unit, digital filters are used to convolve impulse responses for paths from the sound-source positions of the channels to a right ear of the listener with the corresponding predetermined-number-of-channels audio signals and results of the convolutions for the channels are added to generate the right audio signals.
In this case, the convolutions by the digital filters are performed in a frequency domain. Actual-time coefficient data (time-series coefficient data) are stored as the filter coefficients corresponding to the impulse responses. The coefficient setting unit reads the actual-time coefficient data from the coefficient holding unit, transforms the actual-time coefficient data into frequency-domain data, and sets the frequency-domain data for the digital filters.
In the present technology, the time-series coefficient data are held as the filter coefficients corresponding to the impulse responses, the time-series coefficient data are transformed into frequency-domain data, and the frequency-domain data are set for the digital filters. Accordingly, it is possible to reduce the amount of memory that holds the filter coefficients.
According to the present technology, it is possible to perform sound-image localization processing in a favorable manner and it is also possible to reduce the amount of memory.
A mode (herein referred to as an “embodiment”) for implementing the present disclosure will be described below. A description below is given in the following sequence:
1. First Embodiment
2. Modification
<1. First Embodiment>
[Example of Configuration of Audio Signal Processing Device]
The control unit 101 includes a microcomputer to control operations of the individual elements in the audio-signal processing device 100. The input terminal 102 is a terminal for inputting a compressed audio stream Ast. The decoding unit 103 decodes the compressed audio stream Ast to obtain audio signals for a predetermined number of channels. Examples of the audio signals include 2-channel audio signals, 5.1-channel audio signals, and 7.1-channel audio signals.
As illustrated in
The decoder 103a in the decoding unit 103 performs the decode processing in a mode corresponding to the format of the compressed audio stream Ast. The decoding unit 103 sends this format information and decode-mode information to the control unit 101. Under the control of the control unit 101 based on the format information, for example, the post decoder 103b converts the 2-channel audio signals, obtained from the decoder 103a, to 5.1-channel or 7.1-channel audio signals or converts the 5.1-channel audio signals, obtained from the decoder 103a, to 7.1-channel audio signals.
The 2-channel audio signals contain audio signals for 2 channels including a left-front channel (FL) and a right-front channel (FR). The 5.1-channel audio signals contain audio signals for 6 channels including a left-front channel (FL), a right-front channel (FR), a center channel (C), a left-rear channel (SL), a right-rear channel (SR), and a low-frequency enhancement channel (LFE).
The 7.1-channel audio signals contain 2-channel audio signals in addition to 6-channel audio signals that are similar to the above-described 5.1-channel audio signals. In accordance with the format of the compressed audio stream Ast or as a result of the processing of the post decoder 103b, the 2-channel audio signals contained in the 7.1-channel audio signals are, for example, 2-channel audio signals for a left front high channel (HL) and a right front high channel (HR) or for a left back surround channel (BL) and a right back surround channel (BR).
The signal processing unit 105 is implemented by, for example, a DSP (digital signal processor), and generates left-channel audio signals SL and right-channel audio signals SR to be supplied to a headphone device 200, on the basis of the predetermined-number-of-channels audio signals obtained by the decoding unit 103. Signal lines for the audio signals for the 8 channels of the 7.1 channels are prepared between an output side of the decoding unit 103 and an input side of the signal processing unit 105.
When 2-channel or 6-channel audio signals are output from the decoding unit 103, only signal lines for the corresponding channels are used to send the audio signals from the decoding unit 103 to the signal processing unit 105.
When the format of the compressed audio stream Ast is a 7.1-channel format and 8-channel audio signals are output from the decoding unit 103, all of the prepared signal lines are used to send the audio signals from the decoding unit 103 to the signal processing unit 105. In this case, the 2-channel audio signals for the left-front high channel (HL) and the right-front high channel (HR) and the 2-channel audio signals for the left-back surround channel (BL) and the right-back surround channel (BR) are sent through the same signal lines.
The signal processing unit 105 uses digital filters to convolve impulse responses for the paths from the sound-source positions of the channels to the left ear of a listener with the corresponding predetermined-number-of-channels audio signals and adds results of the convolutions for the channels to thereby generate the left-channel audio signals SL. Similarly, the signal processing unit 105 uses digital filters to convolve impulse responses for the paths from the sound-source positions of the channels to the right ear of the listener with the corresponding predetermined-number-of-channels audio signals and adds results of the convolutions for the channels to thereby generate the right-channel audio signals SR.
FIR filters 51-2L and 51-2R are digital filters for processing the right-front channel (FR) audio signals. The FIR filter 51-2L convolves an impulse response for the path from the sound-source position of the right-front channel (FR) to the left ear of the listener with the right-front channel (FR) audio signals. The FIR filter 51-2R convolves an impulse response for the path from the sound-source position of the right-front channel (FR) to the right ear of the listener with the right-front channel (FR) audio signals.
FIR filters 51-3L and 51-3R are digital filters for processing the center channel (C) audio signals. The FIR filter 51-3L convolves an impulse response for the path from the sound-source position of the center channel (C) to the left ear of the listener with the center channel (C) audio signals. The FIR filter 51-3R convolves an impulse response for the path from the sound-source position of the center channel (C) to the right ear of the listener with the center channel (C) audio signals.
FIR filters 51-4L and 51-4R are digital filters for processing the left-rear channel (SL) audio signals. The FIR filter 51-4L convolves an impulse response for the path from the sound-source position of the left-rear channel (SL) to the left ear of the listener with the left-rear channel (SL) audio signals. The FIR filter 51-4R convolves an impulse response for the path from the sound-source position of the left-rear channel (SL) to the right ear of the listener with the left-rear channel (SL) audio signals.
FIR filters 51-5L and 51-5R are digital filters for processing the right-rear channel (SR) audio signals. The FIR filter 51-5L convolves an impulse response for the path from the sound-source position of the right-rear channel (SR) to the left ear of the listener with the right-rear channel (SR) audio signals. The FIR filter 51-5R convolves an impulse response for the path from the sound-source position of the right-rear channel (SR) to the right ear of the listener with the right-rear channel (SR) audio signals.
FIR filters 51-6L and 51-6R are digital filters for processing the audio signals for the left-front high channel (HL) or the left-back surround channel (BL). The FIR filter 51-6L convolves an impulse response for the path from the sound-source position of the left-front high channel (HL) or the left-back surround channel (BL) to the left ear of the listener with the left-front high channel (HL) or the left-back surround channel (BL) audio signals. The FIR filter 51-6R convolves an impulse response for the path from the sound-source position of the left-front high channel (HL) or the left-back surround channel (BL) to the right ear of the listener with the left-front high channel (HL) or the left-back surround channel (BL) audio signals.
FIR filters 51-7L and 51-7R are digital filters for processing the audio signals for the right-front high channel (HR) or the right-back surround channel (BR). The FIR filter 51-7L convolves an impulse response for the path from the sound-source position of the right-front high channel (HR) or the right-back surround channel (BR) to the left ear of the listener with the right-front high channel (HR) or the right-back surround channel (BR) audio signals. The FIR filter 51-7R convolves an impulse response for the path from the sound-source position of the right-front high channel (HR) or the right-back surround channel (BR) to the right ear of the listener with the right-front high channel (HR) or the right-back surround channel (BR) audio signals.
IIR filters 51-8L and 51-8R are digital filters for processing the low-frequency enhancement channel (LFE) audio signals (subwoofer signals). The IIR filter 51-8L convolves an impulse response for the path from the sound-source position of the low-frequency enhancement channel (LFE) to the left ear of the listener with the low-frequency enhancement channel (LFE) audio signals. The IIR filter 51-8R convolves an impulse response for the path from the sound-source position of the low-frequency enhancement channel (LFE) to the right ear of the listener with the low-frequency enhancement channel (LFE) audio signals.
An adder 52L adds signals output from the FIR filters 51-1L, 51-2L, 51-3L, 51-4L, 51-5L, 51-6L, and 51-7L and a signal output from the IIR filter 51-8L to generate left-channel audio signals SL and outputs the left-channel audio signals SL to the output terminal 106L. An adder 52R adds signals output from the FIR filters 51-1R, 51-2R, 51-3R, 51-4R, 51-5R, 51-6R, and 51-7R and a signal output from the IIR filter 51-8R to generate right-channel audio signals SR and outputs the right-channel audio signals SR to the output terminal 106R.
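The filter-and-add structure described above can be sketched as follows. This is a minimal illustration, not the device's implementation: it uses direct time-domain convolution for every channel (including LFE), and the function names `convolve`, `add_into`, and `binaural_downmix` as well as the dictionary layout are assumptions made for the example.

```python
from typing import Dict, List, Sequence, Tuple


def convolve(signal: Sequence[float], impulse_response: Sequence[float]) -> List[float]:
    """Plain time-domain linear convolution of a signal with an impulse response."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for n, x in enumerate(signal):
        for k, h in enumerate(impulse_response):
            out[n + k] += x * h
    return out


def add_into(acc: List[float], samples: Sequence[float]) -> None:
    """Accumulate samples into acc, growing acc with zeros if needed (role of an adder)."""
    if len(samples) > len(acc):
        acc.extend([0.0] * (len(samples) - len(acc)))
    for i, s in enumerate(samples):
        acc[i] += s


def binaural_downmix(
    channels: Dict[str, Sequence[float]],
    responses: Dict[str, Tuple[Sequence[float], Sequence[float]]],
) -> Tuple[List[float], List[float]]:
    """Generate left and right output signals from per-channel input signals.

    channels:  channel name -> samples (hypothetical layout for this sketch)
    responses: channel name -> (response to left ear, response to right ear)
    """
    left: List[float] = []
    right: List[float] = []
    for name, samples in channels.items():
        h_left, h_right = responses[name]
        add_into(left, convolve(samples, h_left))    # path to the left ear
        add_into(right, convolve(samples, h_right))  # path to the right ear
    return left, right
```

Each input channel contributes to both outputs, which is why every channel needs a pair of filters (one per ear) before the two adders.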
As illustrated in FIG. 3, in the signal processing unit 105, the digital filters for processing the audio signals S-LFE for the low-frequency enhancement channel (LFE) are implemented by the IIR filters 51-8L and 51-8R and the digital filters for processing the audio signals SA for the other channels are implemented by the FIR filters 51-L and 51-R. As illustrated in
However, when the FIR filters 51-8L′ and 51-8R′ are used, the tap length increases and the amounts of memory and computation also increase because of the low frequency of the audio signals S-LFE for the low-frequency enhancement channel (LFE). In contrast, when the IIR filters 51-8L and 51-8R are used, the low frequency can be enhanced with high accuracy and the amounts of memory and computation can be reduced. It is, therefore, preferable that the IIR filters 51-8L and 51-8R be used to constitute the digital filters for processing the audio signals S-LFE for the low-frequency enhancement channel (LFE).
A flowchart in
An addition output of the adder 84 is supplied to an output terminal 87. The addition output is also delayed by a delay circuit 85a and is then supplied to the adder 84 via a coefficient multiplier 86a. An output of the delay circuit 85a is delayed by a delay circuit 85b and is then supplied to the adder 84 via a coefficient multiplier 86b. The adder 84 performs processing for adding the supplied signals to obtain an addition output.
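The feedback path described above (two cascaded delays whose outputs are scaled by the elements 86a and 86b and returned to the adder 84) can be sketched as a second-order recursive section. This is only an illustration under stated assumptions: the class name `FeedbackSection`, the coefficient names `c1` and `c2`, and the treatment of the feed-forward signal arriving at the adder as a single input `v` are all hypothetical.

```python
from typing import Iterable, List


class FeedbackSection:
    """Second-order feedback section: y[n] = v[n] + c1*y[n-1] + c2*y[n-2]."""

    def __init__(self, c1: float, c2: float) -> None:
        self.c1 = c1    # assumed coefficient applied after the first delay (85a)
        self.c2 = c2    # assumed coefficient applied after the second delay (85b)
        self.d1 = 0.0   # state of the first delay: previous output y[n-1]
        self.d2 = 0.0   # state of the second delay: output before that, y[n-2]

    def step(self, v: float) -> float:
        """Process one sample arriving at the adder and return the addition output."""
        y = v + self.c1 * self.d1 + self.c2 * self.d2
        self.d2 = self.d1  # the second delay takes the first delay's old output
        self.d1 = y        # the first delay takes the new addition output
        return y

    def process(self, samples: Iterable[float]) -> List[float]:
        return [self.step(v) for v in samples]
```

Only two state values and two coefficients are kept, yet the response to a single impulse decays indefinitely. A long low-frequency response is therefore represented without a long tap buffer, which is the memory advantage attributed to the IIR filters.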
Referring back to
The coefficient setting unit 104 has a coefficient holding unit 104a and an FFT (Fast Fourier Transform) unit 104b. The coefficient holding unit 104a holds actual-time coefficient data (time-series coefficient data) as the filter coefficients corresponding to the impulse responses. The FFT unit 104b reads the actual-time coefficient data held by the coefficient holding unit 104a, transforms the actual-time coefficient data into frequency-domain data, and sets the frequency-domain data for the digital filters in the signal processing unit 105. Although not described above, each digital filter in the signal processing unit 105 performs the impulse-response convolution in a frequency domain.
Coefficient data 52-2L and 52-2R represent coefficient data FR-L and FR-R to be set for the FIR filters 51-2L and 51-2R, respectively, in the signal processing unit 105. Coefficient data 52-3L and 52-3R represent coefficient data C-L and C-R to be set for the FIR filters 51-3L and 51-3R, respectively, in the signal processing unit 105. Coefficient data 52-4L and 52-4R represent coefficient data SL-L and SL-R to be set for the FIR filters 51-4L and 51-4R, respectively, in the signal processing unit 105.
Coefficient data 52-5L and 52-5R represent coefficient data SR-L and SR-R to be set for the FIR filters 51-5L and 51-5R, respectively, in the signal processing unit 105. Coefficient data 52-6La and 52-6Ra represent coefficient data HL-L and HL-R to be set for the FIR filters 51-6L and 51-6R, respectively, in the signal processing unit 105. Coefficient data 52-7La and 52-7Ra represent coefficient data HR-L and HR-R to be set for the FIR filters 51-7L and 51-7R, respectively, in the signal processing unit 105.
Coefficient data 52-6Lb and 52-6Rb represent coefficient data BL-L and BL-R to be set for the FIR filters 51-6L and 51-6R, respectively, in the signal processing unit 105. Coefficient data 52-7Lb and 52-7Rb represent coefficient data BR-L and BR-R to be set for the FIR filters 51-7L and 51-7R, respectively, in the signal processing unit 105. Coefficient data 52-8L and 52-8R represent coefficient data LF-L and LF-R to be set for the IIR filters 51-8L and 51-8R, respectively, in the signal processing unit 105.
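As a rough illustration of holding time-series coefficients and transforming them before use, the sketch below zero-pads a stored impulse response, transforms it once, and then filters a block of samples by multiplication in the frequency domain. It is not the device's implementation: the radix-2 FFT, the power-of-two block size, and the names `to_frequency_domain` and `filter_block` are assumptions for the example, and block-to-block overlap handling for a continuous stream is omitted.

```python
import cmath
from typing import List, Sequence


def fft(x: Sequence[complex]) -> List[complex]:
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out: List[complex] = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out


def ifft(spectrum: Sequence[complex]) -> List[complex]:
    """Inverse FFT computed with the conjugate trick."""
    n = len(spectrum)
    conj = fft([complex(v).conjugate() for v in spectrum])
    return [v.conjugate() / n for v in conj]


def next_pow2(n: int) -> int:
    p = 1
    while p < n:
        p *= 2
    return p


def to_frequency_domain(time_coeffs: Sequence[float], fft_size: int) -> List[complex]:
    """Role of the coefficient setting step: zero-pad the stored time series and transform it."""
    padded = list(time_coeffs) + [0.0] * (fft_size - len(time_coeffs))
    return fft(padded)


def filter_block(block: Sequence[float], freq_coeffs: Sequence[complex]) -> List[float]:
    """Convolve one block with the impulse response by multiplying spectra."""
    n = len(freq_coeffs)
    spectrum = fft(list(block) + [0.0] * (n - len(block)))
    product = [a * b for a, b in zip(spectrum, freq_coeffs)]
    return [v.real for v in ifft(product)]
```

If the FFT size is at least `len(block) + len(time_coeffs) - 1`, the result equals the time-domain linear convolution up to rounding, so the stored time series and the frequency-domain data describe the same filter.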
In step ST12, the coefficient setting unit 104 determines whether or not audio signals (audio data) for the back surround channels are included. When audio signals for the back surround channels are included, the process proceeds to step ST13 in which the coefficient setting unit 104 sets a set of coefficients for the back surround channels for the corresponding digital filters (FIR filters). Thereafter, in step ST14, the coefficient setting unit 104 unmutes the signal processing unit (DSP) 105.
When it is determined in step ST12 that audio signals for the back surround channels are not included, that is, when audio signals for the front high channels are included, the process proceeds to step ST15 in which the coefficient setting unit 104 sets a set of coefficients for the front high channels for the digital filters (FIR filters). Thereafter, in step ST14, the coefficient setting unit 104 unmutes the signal processing unit (DSP) 105.
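The branch in steps ST12 to ST15 can be sketched as below. The sketch is illustrative only: the `Dsp` class is a hypothetical stand-in for the signal processing unit 105, the coefficient data are represented by their labels rather than by real values, and the assumption that the unit is muted before the coefficients are changed is inferred from the unmute step.

```python
from typing import Dict

# Coefficient labels for the shared FIR filters, keyed by filter reference numeral.
COEFFICIENT_SETS: Dict[str, Dict[str, str]] = {
    "back_surround": {"51-6L": "BL-L", "51-6R": "BL-R", "51-7L": "BR-L", "51-7R": "BR-R"},
    "front_high": {"51-6L": "HL-L", "51-6R": "HL-R", "51-7L": "HR-L", "51-7R": "HR-R"},
}


class Dsp:
    """Hypothetical stand-in for the signal processing unit (DSP) 105."""

    def __init__(self) -> None:
        self.muted = True  # assumption: muted while coefficients are being changed
        self.coefficients: Dict[str, str] = {}

    def set_coefficients(self, filter_id: str, data: str) -> None:
        self.coefficients[filter_id] = data

    def unmute(self) -> None:
        self.muted = False


def configure_shared_filters(dsp: Dsp, has_back_surround: bool) -> None:
    # ST12: decide which pair of extra channels the stream carries.
    key = "back_surround" if has_back_surround else "front_high"
    # ST13 (back surround) or ST15 (front high): load the matching set.
    for filter_id, data in COEFFICIENT_SETS[key].items():
        dsp.set_coefficients(filter_id, data)
    # ST14: unmute once the new coefficients are in place.
    dsp.unmute()
```

Because the same four filters serve either channel pair, only the coefficient set changes between the two branches.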
In step ST22, the coefficient setting unit 104 determines whether or not filter coefficients are to be set for the FIR filters for processing the audio signals for the front high channels (HL and HR) or the back surround channels (BL and BR). When the format of the output of the decoding unit 103 is a 7.1-channel format and it is determined in step ST22 that filter coefficients are to be set for the FIR filters, the process proceeds to step ST23 in which the coefficient setting unit 104 sets filter coefficients for the digital filters for processing the audio signals for the channels including the front high channels (HL and HR) or the back surround channels (BL and BR). Thereafter, in step ST24, the coefficient setting unit 104 unmutes the signal processing unit (DSP) 105.
When the format of the output of the decoding unit 103 is a 5.1-channel format and it is determined in step ST22 that filter coefficients are not to be set for the FIR filters, the process proceeds to step ST25 in which the coefficient setting unit 104 sets filter coefficients for the digital filters for processing the audio signals for the channels of the general 5.1 channels, other than the front high channels (HL and HR) or the back surround channels (BL and BR). Thereafter, in step ST24, the coefficient setting unit 104 unmutes the signal processing unit (DSP) 105.
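The decision in step ST22 amounts to choosing which channels receive filter coefficients for a given output format. A minimal sketch follows. The function name `channels_to_configure`, the string format labels, and the `extra_pair` argument are assumptions for illustration, and only the two formats discussed in these steps are handled.

```python
from typing import List

GENERAL_5_1 = ("FL", "FR", "C", "SL", "SR", "LFE")
EXTRA_PAIRS = {
    "front_high": ("HL", "HR"),
    "back_surround": ("BL", "BR"),
}


def channels_to_configure(output_format: str, extra_pair: str = "front_high") -> List[str]:
    """Return the channels whose digital filters get coefficients."""
    if output_format == "7.1":
        # ST22 yes -> ST23: general 5.1 channels plus one extra pair.
        return list(GENERAL_5_1) + list(EXTRA_PAIRS[extra_pair])
    if output_format == "5.1":
        # ST22 no -> ST25: general 5.1 channels only.
        return list(GENERAL_5_1)
    raise ValueError("format not covered by this sketch: " + output_format)
```

For a 5.1-channel output, the filters reserved for the extra pair receive no coefficients, so less data has to be loaded before the unit is unmuted.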
As illustrated in
In the present embodiment, however, it is preferable to employ a configuration in which the coefficient holding unit 104a holds the time-series coefficient data as the filter coefficients, the time-series coefficient data are transformed into frequency-domain data, and the frequency-domain data are set for the digital filters 51-L and 51-R. The reason is that holding the time-series coefficient data as the filter coefficients makes it possible to reduce the amount of memory in the coefficient holding unit 104a, compared to a case in which the frequency-domain data are held as the filter coefficients.
When the time-series coefficient data are held in the coefficient holding unit 104a as the filter coefficients, part of the time-series coefficient data can be shared by the multiple channels and the amount of memory in the coefficient holding unit 104a can be further reduced. FIG. 18 illustrates one example in which the coefficient holding unit 104a holds time-series coefficient data to be shared by multiple channels.
Time-series coefficient data A is, for example, direct-sound part data for a first channel, such as a front channel (a front low channel), and time-series coefficient data B is, for example, direct-sound part data for a second channel, such as a front high channel. Time-series coefficient data C is reverberation part (indirect-sound part) data shared by those two channels.
That is, for setting the filter coefficients for the digital filters 51-L and 51-R with respect to the first channel, the coefficient setting unit 104 obtains the time-series coefficient data A and C from the coefficient holding unit 104a, uses the FFT unit 104b to transform the time-series coefficient data A and C into frequency-domain data, and sets the frequency-domain data for the digital filters 51-L and 51-R. On the other hand, for setting the filter coefficients for the digital filters 51-L and 51-R with respect to the second channel, the coefficient setting unit 104 obtains the time-series coefficient data B and C from the coefficient holding unit 104a, uses the FFT unit 104b to transform the time-series coefficient data B and C into frequency-domain data, and sets the frequency-domain data for the digital filters 51-L and 51-R.
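A minimal sketch of this sharing is given below. It assumes that the direct-sound part and the reverberation part are simply joined in time to form one impulse response, which the description does not state explicitly. The sample values, the `holding_unit` dictionary, and the naive `dft` function standing in for the FFT unit 104b are hypothetical.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform; a stand-in for the FFT unit 104b."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


# Hypothetical toy data held as time-series coefficient data.
holding_unit = {
    "A": [1.0, 0.5],             # direct-sound part, first channel (front)
    "B": [0.8, 0.3],             # direct-sound part, second channel (front high)
    "C": [0.2, 0.1, 0.05, 0.0],  # reverberation part shared by both channels
}


def frequency_coefficients(direct_key):
    """Join a direct-sound part with the shared reverberation and transform it."""
    impulse_response = holding_unit[direct_key] + holding_unit["C"]
    return dft(impulse_response)


first_channel = frequency_coefficients("A")    # uses A and C
second_channel = frequency_coefficients("B")   # uses B and C

# Stored values with sharing (A + B + C) and without sharing (A + C, B + C).
stored_shared = sum(len(v) for v in holding_unit.values())
stored_separate = 2 * (len(holding_unit["A"]) + len(holding_unit["C"]))
```

With these toy lengths, sharing C stores 8 values instead of 12; the saving grows with the length of the reverberation part, which is typically much longer than the direct-sound part.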
Although the above description has been given of an example in which the time-series coefficient data are shared by multiple channels, the present technology is not limited thereto. For example, with respect to one channel, the arrangement may be such that direct-sound part data are independently held so as to correspond to multiple formats of the compressed audio stream Ast and common data is used for reverberation part (indirect-sound part) data. In such a case, when the format of the compressed audio stream Ast is changed, the coefficient setting unit 104 can deal with the change by transforming only direct-sound part data corresponding to the changed format of the compressed audio stream Ast into frequency-domain data and setting the frequency-domain data for the digital filters.
In step ST43, the coefficient setting unit 104 uses the FFT unit 104b to transform the direct-sound part data into frequency-domain data and sets the frequency-domain data for the digital filters 51-L and 51-R. As a result, in step ST44, the digital filters 51-L and 51-R can convolve the post-change impulse responses in a frequency domain.
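One way to transform only the direct-sound part on a format change is to exploit the linearity of the Fourier transform, as sketched below. This is an assumption about how the parts are combined, not a statement of the embodiment: the reverberation spectrum is computed once and cached, and only the zero-padded direct-sound part is transformed again. The format names, lengths, and sample values are hypothetical.

```python
import cmath


def dft(x):
    """Naive discrete Fourier transform; a stand-in for the FFT unit 104b."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
            for k in range(n)]


def pad(x, before, total):
    """Place x at offset `before` inside a zero-filled list of length `total`."""
    return [0.0] * before + list(x) + [0.0] * (total - before - len(x))


FILTER_LEN = 8   # hypothetical total impulse-response length
DIRECT_LEN = 3   # hypothetical length of the direct-sound part

# Direct-sound part data held per stream format (hypothetical values).
direct_by_format = {
    "format_1": [1.0, 0.4, 0.1],
    "format_2": [0.7, 0.6, 0.2],
}
reverb = [0.2, 0.1, 0.05, 0.02, 0.01]   # common reverberation part

# Transformed once; it does not depend on the stream format.
reverb_spectrum = dft(pad(reverb, DIRECT_LEN, FILTER_LEN))


def on_format_change(fmt):
    """Transform only the direct-sound part and add the cached reverb spectrum."""
    direct_spectrum = dft(pad(direct_by_format[fmt], 0, FILTER_LEN))
    return [d + r for d, r in zip(direct_spectrum, reverb_spectrum)]
```

Because the transform is linear, `on_format_change("format_2")` equals the transform of the full impulse response `direct_by_format["format_2"] + reverb`, up to rounding.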
Now, a description will be given of one example of a scheme for creating time-series coefficient data for the front high channels. This scheme utilizes actual-measurement data of the front channels (the front low channels). First, as illustrated in
In this measurement, time-series coefficient data corresponding to the impulse responses, which are to be set for the digital filters (FIR filters) 51-LL and 51-LR for processing the audio signals S-FL for the front channels (the front low channels), can be obtained as illustrated in
Next, as illustrated in
The direct-sound coefficient data includes speaker characteristics SPa and transfer functions La and Ra. Since the speaker characteristics SPa are known, the transfer functions La and Ra can be obtained from the measured direct-sound coefficient data. The speaker characteristics SPa can be normalized as illustrated in
Final time-series coefficient data to be set for the digital filters (FIR filters) 51-HL and 51-HR for processing front high channel audio signals S-FH are generated based on the above-described actual-measurement data and the anechoic-room data. Thus, the generated time-series coefficient data is a combination of the actual-sound-field data and the anechoic-room data.
In this case, as illustrated in
Similarly, as illustrated in
Creating the time-series coefficient data for the front high channels by the scheme described above makes it possible to obtain, for example, filter coefficients (time-series coefficient data) for the front high channels of 7.1 channels even when the actual sound field has only a general 5.1-channel layout. In this case, the conditions of the sound field the listener wishes to reproduce are maintained, and the relationship between the left channels and the right channels is that obtained in the anechoic room. Accordingly, it is possible to provide faithful sound-image localization and also to reproduce the reverberation of the sound field the listener wishes to reproduce.
Creating the time-series coefficient data for the front high channels by the scheme described above also makes it possible to share the speaker characteristics SPr between the time-series coefficient data to be set for the digital filters 51-HL and 51-HR. This can reduce the difference between the sound of the left channels and the sound of the right channels and can thus significantly reduce the user's sense of discomfort in the sound-image localization. The left and right channels may also share the reverberation-part (indirect-sound part) data. In such a case, the amount of memory in the coefficient holding unit 104a can be reduced.
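The combination of anechoic-room data and actual-sound-field data for the direct-sound part can be sketched in the frequency domain, where a convolution becomes a bin-by-bin product. This is only one plausible reading of the scheme: it assumes that the measured anechoic direct sound is the product of SPa and the transfer function, and that SPr is the speaker characteristic taken from the actual sound field. The two-bin spectra and the function names are hypothetical.

```python
def remove_speaker(measured_direct, speaker):
    """Divide out a known speaker characteristic, bin by bin."""
    return [m / s for m, s in zip(measured_direct, speaker)]


def apply_speaker(transfer, speaker):
    """Impose a speaker characteristic on a transfer function, bin by bin."""
    return [t * s for t, s in zip(transfer, speaker)]


# Hypothetical two-bin spectra, chosen only to keep the arithmetic easy to follow.
SPa = [1.0 + 0j, 0.5 + 0j]             # speaker characteristic in the anechoic room
measured_left = [2.0 + 0j, 2.0 + 0j]   # anechoic direct sound, speaker to left ear
SPr = [0.5 + 0j, 1.0 + 0j]             # assumed: speaker characteristic, actual sound field

La = remove_speaker(measured_left, SPa)   # transfer function to the left ear
direct_HL = apply_speaker(La, SPr)        # direct-sound part for the digital filter 51-HL
```

The same SPr would be applied to the right-ear transfer function Ra for the digital filter 51-HR, which is what allows the speaker characteristic to be shared between the left and right channels. A practical version would also need to guard against near-zero bins in SPa before dividing.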
The time-series coefficient data to be set for the digital filters 51-HL and 51-HR illustrated in
An operation of the audio-signal processing device 100 illustrated in
Audio signals for a predetermined number of channels (e.g., 2 channels, 6 channels, or 8 channels), the audio signals being obtained by the decoding unit 103, are supplied to the signal processing unit 105 through corresponding dedicated signal lines. Under the control of the control unit 101, the coefficient setting unit 104 sets filter coefficients corresponding to an estimated-channel layout for the digital filters in the signal processing unit 105, on the basis of the decode-mode information of the decoding unit 103. That is, filter coefficients corresponding to the estimated channel positions determined by the decode-mode information are set for the digital filters for the channels indicated by the decode-mode information.
The signal processing unit 105 generates left-channel audio signals SL and right-channel audio signals SR to be supplied to the headphone device 200, on the basis of predetermined-number-of-channels audio signals obtained by the decoding unit 103. In this case, digital filters convolve impulse responses for paths from the sound-source positions of the channels to the left ear of a listener with the corresponding predetermined-number-of-channels audio signals and the results of the convolutions for the channels are added to generate the left-channel audio signals SL. Similarly, digital filters convolve impulse responses for paths from the sound-source positions of the channels to the right ear of the listener with the corresponding predetermined-number-of-channels audio signals and the results of the convolutions for the channels are added to generate the right-channel audio signals SR.
The left-channel audio signals SL generated by the signal processing unit 105 are output from the output terminal 106L. The right-channel audio signals SR generated by the signal processing unit 105 are output from the output terminal 106R. The audio signals SL and SR are supplied to the headphone device 200 and are reproduced.
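The convolve-and-add operation that produces SL and SR can be sketched as below. For brevity the sketch convolves directly in the time domain, whereas the embodiment performs the convolution in the frequency domain; the two give the same result. The channel names, sample values, and impulse responses are hypothetical toy data.

```python
def convolve(signal, impulse_response):
    """Direct-form linear convolution of two sample lists."""
    out = [0.0] * (len(signal) + len(impulse_response) - 1)
    for i, s in enumerate(signal):
        for j, h in enumerate(impulse_response):
            out[i + j] += s * h
    return out


def binaural_downmix(channels, responses):
    """Mix all channels down to two ear signals.

    channels:  channel name -> list of samples
    responses: channel name -> (response to left ear, response to right ear)
    """
    length = max(
        len(channels[name]) + max(len(responses[name][0]), len(responses[name][1])) - 1
        for name in channels
    )
    left = [0.0] * length
    right = [0.0] * length
    for name, samples in channels.items():
        h_left, h_right = responses[name]
        for k, v in enumerate(convolve(samples, h_left)):
            left[k] += v     # add this channel's contribution to SL
        for k, v in enumerate(convolve(samples, h_right)):
            right[k] += v    # add this channel's contribution to SR
    return left, right


# Hypothetical two-channel example.
channels = {"FL": [1.0, 0.0], "FR": [0.0, 1.0]}
responses = {
    "FL": ([1.0, 0.5], [0.25, 0.0]),
    "FR": ([0.25, 0.0], [1.0, 0.5]),
}
SL, SR = binaural_downmix(channels, responses)
# SL == [1.0, 0.75, 0.0] and SR == [0.25, 1.0, 0.5]
```

Each additional input channel adds one pair of convolutions, so the cost scales with the number of channels indicated by the decode-mode information.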
As described above, the audio-signal processing device 100 illustrated in
In the audio-signal processing device 100 illustrated in
In the audio-signal processing device 100 illustrated in
In the audio-signal processing device 100 illustrated in
Thus, according to the present technology, even when the format of the compressed audio stream Ast is changed, 2-channel stereo audio signals with which sound-image localization for each channel can be performed in a favorable manner can be obtained from audio signals for a predetermined number of channels. According to the present technology, it is possible to reduce the amounts of memory and computation for processing audio signals for the bass-dedicated channels. In addition, according to the present technology, for example, even for only a general 5.1 channel layout in an actual sound field, the filter coefficients for the front high channels of 7.1 channels can be easily obtained. According to the present technology, it is possible to reduce the amount of memory that holds the filter coefficients.
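The saving for the bass-dedicated channel follows from the use of infinite impulse response filters there, as configuration (8) below states. The sketch shows why a recursive filter needs little memory: two coefficients and one stored sample produce an impulse response that never ends. The first-order form and the coefficient values are hypothetical and are not taken from the embodiment.

```python
def one_pole_iir(samples, b=0.5, a=0.5):
    """First-order recursive low-pass filter: y[n] = b * x[n] + a * y[n - 1]."""
    out = []
    previous = 0.0   # the only state the filter has to keep
    for x in samples:
        previous = b * x + a * previous
        out.append(previous)
    return out


impulse = [1.0, 0.0, 0.0, 0.0]
response = one_pole_iir(impulse)
# response == [0.5, 0.25, 0.125, 0.0625]
```

An FIR filter would need one stored coefficient per sample of the response it reproduces, so a long, slowly decaying low-frequency response is far cheaper to approximate recursively.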
<2. Modification>
The embodiment described above is an example in which 2-channel audio signals for driving the headphone device are generated from multi-channel audio signals. The present technology is not limited to the headphone device and can also be applied to a case in which, for example, 2-channel audio signals for driving 2-channel speakers arranged adjacent to the listener are generated.
The present technology may be configured as described below.
(1) An audio-signal processing device including:
a decoding unit configured to decode a compressed audio stream to obtain audio signals for a predetermined number of channels;
a signal processing unit configured to generate 2-channel audio signals including left-channel audio signals and right-channel audio signals, on a basis of the predetermined-number-of-channels audio signals obtained by the decoding unit,
a coefficient setting unit configured to set filter coefficients corresponding to the impulse responses for the digital filters in the signal processing unit, on a basis of format information of the compressed audio stream.
(2) The audio-signal processing device according to (1), wherein, the coefficient setting unit sets, for the digital filters for the channels indicated by decode-mode information of the decoding unit, filter coefficients corresponding to an estimated channel layout determined by the format information.
(3) The audio-signal processing device according to (1) or (2), wherein at least one of the digital filters in the signal processing unit is used to process the audio signals for multiple ones of the predetermined number of channels.
(4) The audio-signal processing device according to (3), wherein the at least one digital filter used to process the audio signals for the multiple channels processes front high audio signals included in 7.1-channel audio signals or back surround audio signals included in 7.1-channel audio signals.
(5) An audio-signal processing method including:
decoding a compressed audio stream to obtain audio signals for a predetermined number of channels;
generating 2-channel audio signals including left audio signals and right audio signals, on a basis of the predetermined-number-of-channels audio signals obtained in the decoding,
setting filter coefficients corresponding to the impulse responses for the digital filters, on a basis of format information of the compressed audio stream.
(6) A program for causing a computer to execute an audio signal processing method including:
decoding a compressed audio stream to obtain audio signals for a predetermined number of channels;
generating 2-channel audio signals including left audio signals and right audio signals, on a basis of the predetermined-number-of-channels audio signals obtained in the decoding,
setting filter coefficients corresponding to the impulse responses for the digital filters, on a basis of format information of the compressed audio stream.
(7) A recording medium storing a program for causing a computer to execute an audio signal processing method including:
decoding a compressed audio stream to obtain audio signals for a predetermined number of channels;
generating 2-channel audio signals including left audio signals and right audio signals, on a basis of the predetermined-number-of-channels audio signals obtained in the decoding,
setting filter coefficients corresponding to the impulse responses for the digital filters, on a basis of format information of the compressed audio stream.
(8) An audio-signal processing device including:
a decoding unit configured to decode a compressed audio stream to obtain audio signals for a predetermined number of channels; and
a signal processing unit configured to generate 2-channel audio signals including left audio signals and right audio signals, on a basis of the predetermined-number-of-channels audio signals obtained by the decoding unit;
wherein the signal processing unit
wherein, in the signal processing unit, the digital filters for processing at least the audio signals for a low-frequency enhancement channel are implemented by infinite impulse response filters.
(9) An audio-signal processing method including:
decoding a compressed audio stream to obtain audio signals for a predetermined number of channels; and
generating 2-channel audio signals including left audio signals and right audio signals, on a basis of the predetermined-number-of-channels audio signals obtained in the decoding,
(10) A program for causing a computer to execute an audio signal processing method including:
decoding a compressed audio stream to obtain audio signals for a predetermined number of channels; and
generating 2-channel audio signals including left audio signals and right audio signals, on a basis of the predetermined-number-of-channels audio signals obtained in the decoding,
(11) A recording medium storing a program for causing a computer to execute an audio signal processing method including:
decoding a compressed audio stream to obtain audio signals for a predetermined number of channels; and
generating 2-channel audio signals including left audio signals and right audio signals, on a basis of the predetermined-number-of-channels audio signals obtained in the decoding,
(12) An audio-signal processing device including:
a decoding unit configured to decode a compressed audio stream to obtain audio signals for a predetermined number of channels; and
a signal processing unit configured to generate 2-channel audio signals including left audio signals and right audio signals, on a basis of the predetermined-number-of-channels audio signals obtained by the decoding unit;
wherein, in the signal processing unit, the filter coefficient set for the digital filter for processing audio signals for a front high channel is data obtained by combination of actual-sound-field data and anechoic-room data.
(13) The audio-signal processing device according to (12), wherein the actual-sound-field data includes a speaker characteristic of a front channel and data of reverberation part of the front channel.
(14) An audio-signal processing method including:
decoding a compressed audio stream to obtain audio signals for a predetermined number of channels; and
generating 2-channel audio signals including left audio signals and right audio signals, on a basis of the predetermined-number-of-channels audio signals obtained in the decoding,
(15) A program for causing a computer to execute an audio signal processing method including:
decoding a compressed audio stream to obtain audio signals for a predetermined number of channels; and
generating 2-channel audio signals including left audio signals and right audio signals, on a basis of the predetermined-number-of-channels audio signals obtained in the decoding,
(16) A recording medium storing a program for causing a computer to execute an audio signal processing method including:
decoding a compressed audio stream to obtain audio signals for a predetermined number of channels; and
generating 2-channel audio signals including left audio signals and right audio signals, on a basis of the predetermined-number-of-channels audio signals obtained in the decoding,
(17) An audio-signal processing device including:
a decoding unit configured to decode a compressed audio stream to obtain audio signals for a predetermined number of channels;
a signal processing unit configured to generate 2-channel audio signals including left-channel audio signals and right-channel audio signals, on a basis of the predetermined-number-of-channels audio signals obtained by the decoding unit,
a coefficient holding unit configured to hold time-series coefficient data as filter coefficients corresponding to the impulse responses; and
a coefficient setting unit configured to read the time-series coefficient data held by the coefficient holding unit, transform the time-series coefficient data into frequency-domain data, and set the frequency-domain data for the digital filters.
(18) An audio-signal processing method including:
decoding a compressed audio stream to obtain audio signals for a predetermined number of channels;
generating 2-channel audio signals including left-channel audio signals and right-channel audio signals, on a basis of the predetermined-number-of-channels audio signals obtained in the decoding,
reading time-series coefficient data held by a coefficient holding unit, transforming the time-series coefficient data into frequency-domain data, and setting the frequency-domain data for the digital filters.
(19) A program for causing a computer to execute an audio signal processing method including:
decoding a compressed audio stream to obtain audio signals for a predetermined number of channels;
generating 2-channel audio signals including left-channel audio signals and right-channel audio signals, on a basis of the predetermined-number-of-channels audio signals obtained in the decoding,
reading time-series coefficient data held by a coefficient holding unit, transforming the time-series coefficient data into frequency-domain data, and setting the frequency-domain data for the digital filters.
(20) A recording medium storing a program for causing a computer to execute an audio signal processing method including:
decoding a compressed audio stream to obtain audio signals for a predetermined number of channels;
generating 2-channel audio signals including left-channel audio signals and right-channel audio signals, on a basis of the predetermined-number-of-channels audio signals obtained in the decoding,
reading time-series coefficient data held by a coefficient holding unit, transforming the time-series coefficient data into frequency-domain data, and setting the frequency-domain data for the digital filters.
The present disclosure contains subject matter related to that disclosed in Japanese Priority Patent Application JP 2011-223485 filed in the Japan Patent Office on Oct. 7, 2011, the entire contents of which are hereby incorporated by reference.
It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and alterations may occur depending on design requirements and other factors insofar as they are within the scope of the appended claims or the equivalents thereof.
Number | Date | Country | Kind |
---|---|---|---|
2011-223485 | Oct 2011 | JP | national |
Number | Name | Date | Kind |
---|---|---|---|
5404406 | Fuchigami et al. | Apr 1995 | A |
6928179 | Yamada et al. | Aug 2005 | B1 |
6961632 | Hashimoto et al. | Nov 2005 | B2 |
7466831 | Magrath | Dec 2008 | B2 |
7720240 | Wang | May 2010 | B2 |
8243969 | Breebaart et al. | Aug 2012 | B2 |
8285556 | Jung et al. | Oct 2012 | B2 |
8873761 | Fukui et al. | Oct 2014 | B2 |
20070154019 | Kim | Jul 2007 | A1 |
20090046864 | Mahabub et al. | Feb 2009 | A1 |
20120213375 | Mahabub et al. | Aug 2012 | A1 |
Number | Date | Country |
---|---|---|
2006-014218 | Jan 2006 | JP |
Number | Date | Country | |
---|---|---|---|
20130089209 A1 | Apr 2013 | US |