The present technology relates to a signal processing apparatus, a signal processing method, and a program, and in particular, relates to a signal processing apparatus capable of appropriately improving sound quality of an audio signal produced by, for example, decimating a portion of frequency components, a signal processing method, and a program.
When an audio signal is transmitted or recorded in a recording medium, the audio signal is encoded to reduce the amount of data of the audio signal.
When an audio signal is encoded, the amount of data of the audio signal is reduced by deleting, for example, a portion of frequency components from among frequency components of high frequencies.
As a result, a signal obtained by decoding encoded data obtained by encoding an audio signal lacks high-frequency components of the original sound, which is the audio signal before encoding, so that the ambience is damaged and the sound becomes muffled, leading to lower sound quality.
Thus, a method of reproducing a signal of high sound quality by extending the frequency band (generating frequency components of high frequencies) based on frequency components of low frequencies of a signal obtained by decoding encoded data is proposed (see, for example, Japanese Patent Application Laid-Open No. 2008-139844).
Meanwhile, there is demand for technology capable of appropriately improving sound quality of an audio signal created by decimating a portion (at several frequencies) of frequency components of an original sound by using, for example, a masking effect.
The present technology is developed in view of the above circumstances and can appropriately improve sound quality of an audio signal created by decimating a portion (in several frequencies) of frequency components.
A signal processing apparatus and a program according to an aspect of the present technology are a signal processing apparatus and a program causing a computer to function as a signal processing apparatus, including a filter unit that filters an audio signal created by decimating a portion of frequency components by an all-pass filter and outputs a filtering result thereof as improvement components to improve sound quality of the audio signal and an adder that generates an improved sound in which the sound quality of the audio signal is improved by adding the improvement components to the audio signal.
A signal processing method according to an aspect of the present technology is a signal processing method including the steps of filtering an audio signal created by decimating a portion of frequency components by an all-pass filter, outputting a filtering result thereof as improvement components to improve sound quality of the audio signal, and generating an improved sound in which the sound quality of the audio signal is improved by adding the improvement components to the audio signal.
According to an aspect of the present technology, an audio signal created by decimating a portion of frequency components is filtered by an all-pass filter and a filtering result thereof is output as improvement components to improve sound quality of the audio signal. Then, an improved sound in which the sound quality of the audio signal is improved is generated by adding the improvement components to the audio signal.
The signal processing apparatus may be an independent apparatus or an internal block constituting one apparatus.
The program can be provided by transmission via a transmission medium or recording in a recording medium.
According to an aspect of the present technology, sound quality of an audio signal created by decimating a portion of frequency components can appropriately be improved.
Hereinafter, preferred embodiments of the present disclosure will be described in detail with reference to the appended drawings. Note that, in this specification and the appended drawings, structural elements that have substantially the same function and structure are denoted with the same reference numerals, and repeated explanation of these structural elements is omitted.
In
The acquisition unit 21 acquires encoded data created by encoding an audio signal of a music piece, the sound of a TV broadcast program, or the like from a recording medium or transmission medium and supplies the encoded data to the decoder 22.
That is, the acquisition unit 21 has a drive into which, for example, an optical disc (for example, a Blu-ray (registered trademark) disc) or a memory card (for example, a Memory Stick (registered trademark)) can be inserted. The acquisition unit 21 acquires encoded data recorded in a recording medium by reproducing (reading) the encoded data from the recording medium inserted into the drive and supplies the data to the decoder 22.
The acquisition unit 21 also has, for example, a network card and a tuner. The acquisition unit 21 acquires encoded data transmitted via a transmission medium such as the Internet, a terrestrial signal, or a satellite wave by receiving the encoded data and supplies the encoded data to the decoder 22.
The encoded data acquired by the acquisition unit 21 is obtained by, for example, encoding that performs at least processing to decimate a portion of frequency components of an original sound, which is an original audio signal.
In encoding of an original sound, frequency components whose removal is considered less likely to be perceived by listeners (frequency components that are harder for listeners to hear due to the masking effect) are decimated by using, for example, the masking effect.
Encoding methods of the above original sound include, for example, AAC (Advanced Audio Coding), mp3 (MPEG Audio Layer 3), AC3 (Audio Codec 3), and dts (Digital Theater Systems).
The decoder 22 decodes the encoded data supplied from the acquisition unit 21 and supplies a resultant audio signal (hereinafter, also called a decoded output sound) to the signal processing unit 23.
The signal processing unit 23 performs sound quality improvement processing to improve sound quality and other signal processing on the decoded output sound from the decoder 22 and outputs a resultant audio signal to the speaker 24. Whether to perform the sound quality improvement processing may be set, for example, in accordance with a user's operation.
The speaker 24 outputs (a sound corresponding to) the audio signal from the signal processing unit 23.
The control unit 25 controls each block constituting the audio player.
As described with reference to
Even if the masking effect is used, a portion of frequency components (in several frequencies) of the original sound is decimated and thus, if a listener hears the decoded output sound as it is, the listener may feel dissatisfied.
To prevent the listener from feeling dissatisfied with sound quality, it is necessary to perform some kind of sound quality improvement processing to improve sound quality on the decoded output sound.
In
However, to recognize frequencies at which frequency components are decimated from codec information, it becomes necessary to interpret different code information for each encoding method.
In addition, in sound quality improvement processing that estimates the amplitudes of decimated frequency components by considering harmonic components, an envelope, and the like of the decoded output sound and interpolates the frequency components on the frequency axis, adverse effects frequently appear, such as the decoded output sound after the sound quality improvement processing becoming an unnatural sound or a sound with extraneous accompanying sounds.
Thus, the signal processing unit 23 in
In
The decoded output sound from the decoder 22 (
The filter unit 31 uses an all-pass filter to filter the decoded output sound from the decoder 22, that is, an audio signal (linear PCM (Pulse Code Modulation)) created by decimating a portion (in several places) of frequency components, and outputs the filtering result as improvement components to improve sound quality of the decoded output sound. Improvement components output by the filter unit 31 are supplied to the amplifier 32.
The amplifier 32 amplifies (attenuates) the improvement components from the filter unit 31 by a factor of α, where α is a MIX coefficient whose value lies in the range 0<α<1, and supplies the components to the adder 33.
The adder 33 generates and outputs an improved sound obtained by improving sound quality of a decoded output sound by adding improvement components from the amplifier 32 to the decoded output sound from the decoder 22. That is, the adder 33 adds the decoded output sound and (α-multiplied) improvement components and outputs the addition result as an improved sound obtained by improving sound quality of the decoded output sound.
In step S11, the filter unit 31 generates improvement components by filtering a decoded output sound from the decoder 22 using an all-pass filter and supplies the improvement components to the amplifier 32 before the processing proceeds to step S12.
In step S12, the amplifier 32 adjusts the gain (amplitude) of the improvement components from the filter unit 31 to α times and supplies the gain-adjusted improvement components to the adder 33 before the processing proceeds to step S13.
In step S13, the adder 33 generates and outputs an improved sound by adding the improvement components from the amplifier 32 to the decoded output sound from the decoder 22.
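The flow of steps S11 to S13 can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the function names `improve` and `one_sample_delay` are hypothetical, and the one-sample delay is merely a trivial stand-in for the all-pass filter of the filter unit 31 (a pure delay also passes all frequencies with unchanged amplitude).

```python
def improve(decoded, filt, alpha=0.5):
    """Steps S11 to S13: filter, scale by alpha, add back to the input."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha is expected to satisfy 0 < alpha < 1")
    components = filt(decoded)                        # step S11 (filter unit 31)
    scaled = [alpha * c for c in components]          # step S12 (amplifier 32)
    return [x + c for x, c in zip(decoded, scaled)]   # step S13 (adder 33)


def one_sample_delay(signal):
    """Hypothetical stand-in for the all-pass filter: a pure delay of one sample."""
    return [0.0] + list(signal[:-1])


print(improve([1.0, 2.0, 3.0], one_sample_delay, alpha=0.5))  # [1.0, 2.5, 4.0]
```

Passing the filter in as a callable keeps the sketch independent of any particular all-pass realization.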
In
If a (digital) signal to be filtered by an all-pass filter is called an input signal and a (digital) signal obtained by filtering the input signal by the all-pass filter is called an output signal, the input signal is supplied to the adder 41.
The adder 41 adds the input signal and a signal supplied from the amplifier 45 and outputs a resultant added value. The added value output by the adder 41 is supplied to the delay unit 42 and the amplifier 44.
The delay unit 42 includes, for example, a plurality of registers and outputs the added value from the adder 41 after a delay amount (time) corresponding to a tap number n, which is the number of registers constituting the delay unit 42, as a delayed signal. The delayed signal output from the delay unit 42 is supplied to the adder 43 and the amplifier 45.
The adder 43 adds the delayed signal from the delay unit 42 and a signal supplied from the amplifier 44 and outputs a resultant added value as an output signal.
The amplifier 44 amplifies (attenuates) the added value from the adder 41 by g times (0<g<1) and supplies the amplified added value to the adder 43.
The amplifier 45 amplifies (attenuates) the delayed signal from the delay unit 42 by −g times and supplies the amplified delayed signal to the adder 41.
The all-pass filter as the filter unit 31 configured as described above allows an input signal in all frequency bands to pass and changes only the phase thereof. Therefore, an output signal output from the filter unit 31 is, for example, a signal having the same amplitude characteristics as an input signal and different phase characteristics from the input signal.
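The structure described above (adder 41, delay unit 42, adder 43, amplifiers 44 and 45) can be written as a short sample-by-sample loop. The sketch below is an illustration, not a definitive implementation: the function name `allpass`, the intermediate name `v` for the output of the adder 41, and the values n=2 and g=0.5 are assumptions chosen for readability. With these definitions the structure corresponds to the transfer function (g + z^-n) / (1 + g z^-n), whose impulse response should carry unit energy if the amplitude characteristics are indeed flat.

```python
def allpass(x, n, g):
    """All-pass section: adder 41 -> delay unit 42 -> adder 43, gains g and -g."""
    buf = [0.0] * n                  # delay unit 42: n registers, initially empty
    y = []
    for t, sample in enumerate(x):
        delayed = buf[t % n]         # output of delay unit 42, i.e. v[t - n]
        v = sample - g * delayed     # adder 41: input + (-g) * delayed (amplifier 45)
        y.append(delayed + g * v)    # adder 43: delayed + g * v (amplifier 44)
        buf[t % n] = v               # store v[t]; it is read again n samples later
    return y


impulse = [1.0] + [0.0] * 399
h = allpass(impulse, n=2, g=0.5)     # n and g are illustrative values only
print(h[:5])                         # [0.5, 0.0, 0.75, 0.0, -0.375]
print(sum(c * c for c in h))         # approximately 1.0 (flat amplitude response)
```

The circular buffer plays the role of the registers of the delay unit 42, so the delay amount equals the tap number n.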
In the sound quality improvement apparatus, improvement components are generated by processing on a time axis of filtering a decoded output sound (
As a result, a signal correlated with the decoded output sound (naturally distorted components) is obtained as improvement components.
Then, in the sound quality improvement apparatus, the improvement components are amplified (attenuated) by α (less than 1) times by the amplifier 32, and the attenuated improvement components are added to the decoded output sound by the adder 33 to determine an improved sound.
That is, the sound quality improvement apparatus generates an improved sound in
The all-pass filter as the filter unit 31 allows an input signal in all frequency bands to pass and changes only the phase thereof and thus, in a steady state, no frequency component that is not present in the decoded output sound, which is an input signal of the all-pass filter, appears in improvement components, which are an output signal of the all-pass filter.
However, frequency components that are not present in the decoded output sound appear in (α multiplied) improvement components in
It is possible to verify that frequency components of the sine wave are distorted regarding the output signal in the transition segment b1 of
In the transition segment b1, as described above, frequency components of the sine wave are distorted as shown in
Then, frequency components appearing at surrounding frequencies of frequency components of the sine wave significantly contribute to improvement of sound quality of the decoded output sound as improvement components.
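The difference between the transition segment and the steady state can be reproduced numerically. The following sketch is an assumption-laden illustration, not the measurement shown in the drawings: the sampling rate of 48 kHz, the 1 kHz sine wave, the tap number 97, and the gain 0.6484 are chosen here only as example values, and `allpass` is a hypothetical helper implementing the structure of the adders 41 and 43, the delay unit 42, and the amplifiers 44 and 45. While the delay unit is still empty, the output is simply g times the input, so the output amplitude differs from the steady-state amplitude of 1.

```python
import math


def allpass(x, n, g):
    """All-pass section with delay n and gain g (hypothetical helper)."""
    buf = [0.0] * n
    y = []
    for t, sample in enumerate(x):
        delayed = buf[t % n]         # v[t - n] from the delay unit
        v = sample - g * delayed     # feedback path with gain -g
        y.append(delayed + g * v)    # feedforward path with gain g
        buf[t % n] = v
    return y


fs, f, n, g = 48000, 1000.0, 97, 0.6484      # assumed example values
x = [math.sin(2 * math.pi * f * t / fs) for t in range(8000)]
y = allpass(x, n, g)

early_peak = max(abs(s) for s in y[:n])      # transition: delay unit still empty
late_peak = max(abs(s) for s in y[-480:])    # steady state: amplitude preserved
print(early_peak, late_peak)
```

The abrupt change in amplitude at the start of the signal is what spreads energy to frequencies around that of the sine wave during the transition.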
Because it is necessary to add improvement components to a decoded output sound temporally as close as possible to the decoded output sound used for filtering by the all-pass filter to generate improvement components, the delay amount corresponding to the tap number n of the delay unit 42 constituting the all-pass filter (
Thus, the delay amount of the delay unit 42 (
It is possible to verify that compared with the original sound in
It is also possible to verify that for the improved sound in
According to the sound quality improvement apparatus in
That is, if, for example, an improved sound is generated by interpolating energy into the decoded output sound on a frequency axis, the sound balance of the improved sound may be lost or the improved sound may be an unnatural sound.
On the other hand, when improvement components obtained by filtering a decoded output sound by an all-pass filter are added to the decoded output sound (on a time axis), the sound balance of the improved sound is not lost and the improved sound will not be an unnatural sound.
According to the sound quality improvement apparatus in
Further, with the envelope of the improved sound being restored (put in order), the localization of a sound image becomes clear so that a wide sound field (particularly surround) close to the original sound can be obtained.
Moreover, the sound quality improvement processing by the sound quality improvement apparatus in
Further, the sound quality improvement processing by the sound quality improvement apparatus in
In
In the sound quality improvement apparatus in
Therefore, asymmetric processing is performed on the L channel decoded output sound and the R channel decoded output sound in the sound quality improvement apparatus in
That is, in
In the sound quality improvement apparatus in
The amplifier 51L amplifies the R channel decoded output sound by K (for example, 0.1) times and supplies the amplified R channel decoded output sound to the adder 52L.
The adder 52L adds the R channel decoded output sound from the amplifier 51L to the L channel decoded output sound and supplies the resultant added value to the all-pass filter 54L1 in the first stage of an all-pass filter block 54L in which the all-pass filters 54L1 to 54L3 are cascade-connected.
The all-pass filter 53L1 is an all-pass filter in the first stage of the all-pass filter block 53L in which the all-pass filters 53L1 to 53L3 are cascade-connected and filters the L channel decoded output sound to supply the filtering result to the all-pass filter 53L2 in the subsequent stage.
The all-pass filters 53L1 to 53L3, the all-pass filters 53R1 to 53R3, the all-pass filters 54L1 to 54L3, and the all-pass filters 54R1 to 54R3 are configured in the same manner as the all-pass filter as the filter unit 31 shown in
In
This also applies to blocks representing the all-pass filters 53Ri, 54Li, 54Ri.
Therefore, in
Also in
The all-pass filter 53L2 filters the filtering result from the all-pass filter 53L1 in the previous stage to supply the filtering result to the all-pass filter 53L3 in the subsequent stage.
The all-pass filter 53L3 filters the filtering result from the all-pass filter 53L2 in the previous stage to supply the filtering result to the adder 55L.
The all-pass filter 54L1 filters the added value from the adder 52L to supply the filtering result to the all-pass filter 54L2 in the subsequent stage.
The all-pass filter 54L2 filters the filtering result from the all-pass filter 54L1 in the previous stage to supply the filtering result to the all-pass filter 54L3 in the subsequent stage.
The all-pass filter 54L3 filters the filtering result from the all-pass filter 54L2 in the previous stage to supply the filtering result to the adder 55L.
The adder 55L adds the filtering result from the all-pass filter 53L3 and the filtering result from the all-pass filter 54L3 to supply the resultant added value to the amplifier 56L as improvement components.
The amplifier 56L amplifies improvement components from the adder 55L by α (for example, 0.1) times and supplies the amplified improvement components to the adder 57L.
The adder 57L adds the improvement components from the amplifier 56L to the L channel decoded output sound and outputs the resultant added value as an L channel improved sound.
The amplifier 51L, the adder 52L, (the all-pass filters 53L1 to 53L3 constituting) the all-pass filter block 53L, (the all-pass filters 54L1 to 54L3 constituting) the all-pass filter block 54L, and the adder 55L correspond to the filter unit 31 in
If the adder 52L, the all-pass filter blocks 53L and 54L, and the adder 55L corresponding to the filter unit 31 are collectively called a corresponding filter unit, then in the corresponding filter unit the L channel decoded output sound, which is the audio signal of one of the two channels (the L channel and the R channel), is filtered by the all-pass filter block 53L.
Also in the corresponding filter unit, the R channel decoded output sound output by the amplifier 51L as an audio signal of the other channel is added to the L channel decoded output sound by the adder 52L to cause a crosstalk and a resultant crosstalk signal is filtered by the all-pass filter block 54L.
Then, the filtering result of the L channel decoded output sound by the all-pass filter block 53L and the filtering result of the crosstalk signal by the all-pass filter block 54L are added by the adder 55L and the resultant added value is output as improvement components of the L channel decoded output sound.
The amplifier 51R, the adder 52R, the all-pass filters 53R1 to 53R3 constituting the all-pass filter block 53R, the all-pass filters 54R1 to 54R3 constituting the all-pass filter block 54R, the adder 55R, the amplifier 56R, and the adder 57R perform the same processing as the amplifier 51L to the adder 57L, except that the R channel decoded output sound is used instead of the L channel decoded output sound.
In
On the other hand, the delay amount n and the gain g of the all-pass filter 53Ri constituting the all-pass filter block 53R that filters the R channel decoded output sound are N#(i+3) and G#(i+3) respectively and the delay amount n and the gain g of the all-pass filter 54Ri constituting the all-pass filter block 54R that filters a crosstalk signal caused by a crosstalk of the L channel decoded output sound to the R channel decoded output sound are N#i and G#i respectively.
In
Further, in
Therefore, in
For example, 0.6484, 0.6016, and 0.5391 can be adopted as gains G#1, G#2, and G#3 respectively and, for example, the same values as those of the gains G#1, G#2, and G#3 can be adopted for gains G#4, G#5, and G#6 respectively.
For example, 97 taps (samples), 61 taps, and 43 taps can be adopted as delay amounts (tap number) N#1, N#2, and N#3 respectively and, for example, 89 taps, 67 taps, and 41 taps can be adopted as delay amounts N#4, N#5, and N#6.
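Using the example gains and delay amounts above, the two-channel processing can be sketched as follows. This is a hedged illustration under several assumptions: the names `allpass`, `cascade`, `improve_stereo`, `SET_A`, and `SET_B` are hypothetical, K=0.1 and α=0.1 are the example values mentioned for the amplifiers 51L and 56L, and the assignment of N#1 to N#3 to the blocks 53L and 54R and of N#4 to N#6 to the blocks 54L and 53R follows the description of the delay sums in this specification.

```python
def allpass(x, n, g):
    """Single all-pass section with delay n and gain g (hypothetical helper)."""
    buf = [0.0] * n
    y = []
    for t, sample in enumerate(x):
        delayed = buf[t % n]
        v = sample - g * delayed
        y.append(delayed + g * v)
        buf[t % n] = v
    return y


def cascade(x, stages):
    """Cascade-connected all-pass filters, e.g. 53L1 -> 53L2 -> 53L3."""
    for n, g in stages:
        x = allpass(x, n, g)
    return x


G = [0.6484, 0.6016, 0.5391] * 2     # G#1..G#6 (G#4..G#6 equal G#1..G#3)
N = [97, 61, 43, 89, 67, 41]         # N#1..N#6
SET_A = list(zip(N[:3], G[:3]))      # used by blocks 53L and 54R
SET_B = list(zip(N[3:], G[3:]))      # used by blocks 54L and 53R


def improve_stereo(left, right, k=0.1, alpha=0.1):
    def channel(own, other, direct_stages, cross_stages):
        crosstalk = [o + k * p for o, p in zip(own, other)]  # amplifier 51, adder 52
        direct = cascade(own, direct_stages)                 # block 53
        cross = cascade(crosstalk, cross_stages)             # block 54
        # adder 55 sums both results, amplifier 56 scales, adder 57 adds back
        return [o + alpha * (d + c) for o, d, c in zip(own, direct, cross)]

    out_l = channel(left, right, SET_A, SET_B)
    out_r = channel(right, left, SET_B, SET_A)
    return out_l, out_r


left = [1.0] + [0.0] * 255           # impulse on the L channel only
right = [0.0] * 256
out_l, out_r = improve_stereo(left, right)
print(out_l[0], out_r[0])
```

Because the two channels use different delay sets for their direct and crosstalk paths, the processing of the L channel and the R channel is asymmetric even though the gains coincide.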
Incidentally, one frame of AAC has 1024 samples and one frame of mp3 has 576 samples. One frame of AC3 has 768 samples at 48 kHz/384 kbps, which is the standard rate of DVD, and one frame of dts used by DVD has 512 samples.
If, for example, 97 taps, 61 taps, and 43 taps described above are adopted as the delay amounts N#1, N#2, and N#3 respectively, the sum total N#1+N#2+N#3 of the delay amounts of the all-pass filter blocks 53L and 54R (201 taps) is a time equal to or less than the length of the frame regardless of the encoding method.
Similarly, if 89 taps, 67 taps, and 41 taps described above are adopted as the delay amounts N#4, N#5, and N#6 respectively, the sum total N#4+N#5+N#6 of the delay amounts of the all-pass filter blocks 54L and 53R (197 taps) is a time equal to or less than the length of the frame regardless of the encoding method.
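The comparison between the delay sums and the frame lengths is simple arithmetic and can be checked directly. The snippet below only restates the figures given in this specification; the dictionary names `FRAME_SAMPLES` and `DELAY_SUMS` are hypothetical.

```python
# Frame lengths in samples, as stated in this specification
FRAME_SAMPLES = {"AAC": 1024, "mp3": 576, "AC3": 768, "dts": 512}

# Sums of the example delay amounts (in taps, i.e. samples)
DELAY_SUMS = {
    "N#1+N#2+N#3": 97 + 61 + 43,
    "N#4+N#5+N#6": 89 + 67 + 41,
}

for name, total in DELAY_SUMS.items():
    fits = all(total <= frame for frame in FRAME_SAMPLES.values())
    print(name, total, fits)
# N#1+N#2+N#3 201 True
# N#4+N#5+N#6 197 True
```

Both sums stay well below the shortest listed frame of 512 samples.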
Incidentally, the delay amounts and gains of the all-pass filters 53L, 53R, 54L, 54R are not limited to the above values. This also applies to the gains K of the amplifiers 51L, 51R and the gains α of the amplifiers 56L, 56R.
In
Further, in
Also in
If the all-pass filter block 53L is formed by cascade-connecting a plurality of all-pass filters (this also applies to the all-pass filter blocks 53R, 54L, 54R), improvement components in which distortion is more uniformly spread in a transition period can be obtained.
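One way to get a feel for this effect is to compare impulse responses. The sketch below is an illustration under assumptions, not a reproduction of the drawings: it counts the non-zero taps in the first 600 samples of the impulse response of a single all-pass filter (delay 97, gain 0.6484) and of a three-stage cascade using the example delays 97, 61, and 43 with the example gains. The helper names `allpass`, `cascade`, and `nonzero_taps` are hypothetical. A denser impulse response suggests that the transient is distributed over more time instants.

```python
def allpass(x, n, g):
    """Single all-pass section with delay n and gain g (hypothetical helper)."""
    buf = [0.0] * n
    y = []
    for t, sample in enumerate(x):
        delayed = buf[t % n]
        v = sample - g * delayed
        y.append(delayed + g * v)
        buf[t % n] = v
    return y


def cascade(x, stages):
    """Apply several all-pass sections one after another."""
    for n, g in stages:
        x = allpass(x, n, g)
    return x


def nonzero_taps(h, eps=1e-12):
    """Count samples of an impulse response whose magnitude exceeds eps."""
    return sum(1 for c in h if abs(c) > eps)


impulse = [1.0] + [0.0] * 599
single = allpass(impulse, 97, 0.6484)
triple = cascade(impulse, [(97, 0.6484), (61, 0.6016), (43, 0.5391)])
print(nonzero_taps(single), nonzero_taps(triple))
```

The single filter only responds at multiples of 97 samples, whereas the cascade responds at many combinations of the three delays.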
That is,
The input into the all-pass filter 53L1 is a sine wave shown in
From
In
The sound quality improvement apparatus in
However, the sound quality improvement apparatus in
The amplifiers 61L, 62R output a signal input thereinto after amplifying the signal by α1 times.
The amplifiers 62L, 61R output a signal input thereinto after amplifying the signal by α2 times.
If the gain α1 of the amplifiers 61L, 62R and the gain α2 of the amplifiers 62L, 61R match at α, the sound quality improvement apparatus in
In the sound quality improvement apparatus in
In
The sound quality improvement apparatus in
The sound quality improvement apparatus in
Therefore, like in
Further, in the sound quality improvement apparatus in
This also applies to the R channel.
In
The sound quality improvement apparatus in
However, the sound quality improvement apparatus in
In
In
The sound quality improvement apparatus in
The sound quality improvement apparatus in
In the sound quality improvement apparatus in
Also, the R channel decoded output sound is amplified by K2 times by the amplifier 51L and supplied to the adder 52L. The adder 52L causes a crosstalk by adding the R channel decoded output sound from the amplifier 51L to the L channel decoded output sound and supplies the resultant crosstalk signal to the all-pass filter block 54L via the amplifier 62L.
On the other hand, the L channel decoded output sound is amplified by K2 times by the amplifier 81R and supplied to the adder 71R. The adder 71R causes a crosstalk by adding the L channel decoded output sound from the amplifier 81R to the R channel decoded output sound and supplies the resultant crosstalk signal to the all-pass filter block 53R via the amplifier 61R.
The L channel decoded output sound is amplified by K1 times by the amplifier 51R and supplied to the adder 52R. The adder 52R causes a crosstalk by adding the L channel decoded output sound from the amplifier 51R to the R channel decoded output sound and supplies the resultant crosstalk signal to the all-pass filter block 54R via the amplifier 62R.
Subsequently, processing similar to that in
The above sequence of processing can be performed by hardware or software. If the sequence of processing is to be performed by software, a program constituting the software is installed on a general-purpose computer.
The program may be recorded in a hard disk 105 or a ROM 103 as a recording medium contained in the computer in advance.
Alternatively, the program may be stored (recorded) in a removable recording medium 111. The removable recording medium 111 can be provided as so-called package software. As the removable recording medium 111, for example, a flexible disk, CD-ROM (Compact Disc Read Only Memory), MO (Magneto Optical) disk, DVD (Digital Versatile Disc), magnetic disk, and semiconductor memory can be cited.
In addition to the above installation of the program from the removable recording medium 111 to the computer, the program can also be installed in the contained hard disk 105 by downloading the program to the computer via a communication network or broadcasting network. That is, the program can be transferred to the computer, for example, from a download site via an artificial satellite for digital satellite broadcasting wirelessly or via a network such as a LAN (Local Area Network) and the Internet by wire.
The computer contains a CPU (Central Processing Unit) 102, and an input/output interface 110 is connected to the CPU 102 via a bus 101.
If an instruction is input into the CPU 102 via the input/output interface 110 by the user operating an input unit 107 or the like, the CPU 102 executes the program stored in the ROM (Read Only Memory) 103 according to the instruction. Alternatively, the CPU 102 loads the program stored in the hard disk 105 into a RAM (Random Access Memory) 104 and executes the program.
Accordingly, the CPU 102 performs processing according to the above flow chart or processing performed according to the configuration of the above block diagram. Then, for example, the CPU 102 outputs the processing result from an output unit 106 via the input/output interface 110 or transmits the processing result from a communication unit 108 and further causes the hard disk 105 to record the processing result if necessary.
Incidentally, the input unit 107 is constituted of a keyboard, mouse, microphone or the like. The output unit 106 is constituted of an LCD (Liquid Crystal Display), speaker or the like.
Processing performed by the computer according to a program does not necessarily have to be executed chronologically in the order described as a flow chart. That is, processing performed by the computer according to a program includes processing performed in parallel or individually (for example, parallel processing or processing by an object).
Moreover, a program may be executed by one computer (processor) or by a plurality of computers in a distributed manner. Further, a program may be transferred to a remote computer to be executed there.
It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and alterations may occur depending on design requirements and other factors insofar as they are within the scope of the appended claims or the equivalents thereof.
Additionally, the present technology may also be configured as below.
[1] A signal processing apparatus, comprising:
a filter unit that filters an audio signal created by decimating a portion of frequency components by an all-pass filter and outputs a filtering result thereof as improvement components to improve sound quality of the audio signal; and
an adder that generates an improved sound in which the sound quality of the audio signal is improved by adding the improvement components to the audio signal.
[2] The signal processing apparatus according to [1],
wherein the audio signal is obtained by decoding encoded data obtained by encoding that performs at least processing to decimate the portion of frequency components of an original sound.
[3] The signal processing apparatus according to [2],
wherein the all-pass filter includes a delay unit that delays a signal, and
a delay amount of the delay unit is a time period equal to or less than a length of a frame to be a unit of the processing in the encoding of the original sound.
[4] The signal processing apparatus according to any one of [1] to [3],
wherein the filter unit filters the audio signal by a plurality of cascade-connected all-pass filters.
[5] The signal processing apparatus according to any one of [1] to [4],
wherein the filter unit filters a first channel audio signal among two channel audio signals by the all-pass filter and also filters a crosstalk signal obtained by causing a crosstalk of a second channel audio signal to the first channel audio signal by the all-pass filter, adds the filtering result of the first channel audio signal to the filtering result of the crosstalk signal, and outputs an added value as the improvement components to improve the sound quality of the first channel audio signal.
[6] The signal processing apparatus according to [5],
wherein asymmetric processing is performed on the audio signals of the two channels.
[7] A signal processing method, comprising:
filtering an audio signal created by decimating a portion of frequency components by an all-pass filter and outputting a filtering result thereof as improvement components to improve sound quality of the audio signal; and
generating an improved sound in which the sound quality of the audio signal is improved by adding the improvement components to the audio signal.
[8] A program causing a computer to function as:
a filter unit that filters an audio signal created by decimating a portion of frequency components by an all-pass filter and outputs a filtering result thereof as improvement components to improve sound quality of the audio signal; and
an adder that generates an improved sound in which the sound quality of the audio signal is improved by adding the improvement components to the audio signal.
The present disclosure contains subject matter related to that disclosed in Japanese Priority Patent Application JP 2011-141566 filed in the Japan Patent Office on Jun. 27, 2011, the entire content of which is hereby incorporated by reference.
Number | Date | Country | Kind |
---|---|---|---|
2011-141566 | Jun 2011 | JP | national |