This application is based upon and claims the benefit of priority of the prior Japanese Patent Application Nos. 2017-199673, filed on Oct. 13, 2017, and 2017-147119, filed on Jul. 28, 2017, the entire contents of which are incorporated herein by reference.
The embodiments discussed herein are related to an audio encoding apparatus and the like.
In recent years, a technology called spectral band replication (SBR) has been used for, for example, television broadcasting, radio broadcasting, Internet radio, and music distribution. SBR is an encoding technology that compresses and expands sound signals such as voice and music.
An encoding apparatus that performs a coding based on the SBR and a decoding apparatus in the related art will be described.
The low-frequency signal extraction unit 11 is a processing unit that acquires a sound signal from an external device and extracts a low-frequency signal of the sound signal. The low-frequency signal extraction unit 11 outputs the low-frequency signal to the low-frequency encoding unit 12.
The low-frequency encoding unit 12 is a processing unit that generates a “low-frequency code” by encoding the low-frequency signal. For example, the low-frequency encoding unit 12 performs an encoding based on an advanced audio coding (AAC). The low-frequency encoding unit 12 outputs a low-frequency code to the multiplexing unit 15.
The high-frequency information extraction unit 13 is a processing unit that acquires a sound signal from an external device and extracts high-frequency information based on the sound signal. The high-frequency information extraction unit 13 outputs the high-frequency information to the high-frequency encoding unit 14.
The high-frequency information includes an envelope power, a tone frequency, and a frequency resolution. The envelope power represents an envelope in the high-frequency of the frequency spectrum of the sound signal and corresponds to, for example, an envelope power 6a in
The tone frequency indicates the frequency at which a tone is present. For example, a tone is a component whose power value protrudes from the surrounding spectrum. In the example illustrated in
The high-frequency encoding unit 14 is a processing unit that generates a “high-frequency code” by encoding high-frequency information. The high-frequency encoding unit 14 outputs the high-frequency code to the multiplexing unit 15.
The multiplexing unit 15 is a processing unit that generates a stream by multiplexing the low-frequency code and the high-frequency code. The multiplexing unit 15 transmits the stream to the decoding apparatus via a network.
The demultiplexing unit 21 is a processing unit that acquires a stream from the encoding apparatus 10 and separates the acquired stream into a low-frequency code and a high-frequency code. The demultiplexing unit 21 outputs the low-frequency code to the low-frequency decoding unit 22 and outputs the high-frequency code to the high-frequency decoding unit 24.
The low-frequency decoding unit 22 is a processing unit that extracts a low-frequency signal by decoding the low-frequency code. The low-frequency decoding unit 22 outputs the low-frequency signal to the high-frequency generation unit 23.
The high-frequency generation unit 23 is a processing unit that generates a high-frequency signal by replicating the waveform of the low-frequency signal to a high-frequency side. The high-frequency generation unit 23 outputs the signal information including the low-frequency signal and the high-frequency signal to the high-frequency shaping unit 25.
The high-frequency decoding unit 24 is a processing unit that extracts high-frequency information by decoding the high-frequency code. The high-frequency decoding unit 24 outputs the high-frequency information to the high-frequency shaping unit 25. As described above, the high-frequency information includes an envelope power, a tone frequency, and a frequency resolution.
The high-frequency shaping unit 25 is a processing unit that shapes the high-frequency signal of the signal information based on the high-frequency information. The high-frequency shaping unit 25 outputs the shaped signal information to an external device.
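The replication and shaping performed on the decoding side can be illustrated with a minimal sketch. It assumes the spectrum is held as a list of per-bin power values and that the frequency resolution is expressed as a fixed number of bins per envelope band; the function names and the parameter bins_per_band are hypothetical and do not appear in the specification.

```python
def replicate_low_to_high(low_power, num_high_bins):
    """Fill the high band by copying low-band bins upward (cyclically)."""
    if not low_power:
        raise ValueError("low_power must not be empty")
    return [low_power[i % len(low_power)] for i in range(num_high_bins)]


def shape_high_band(high_power, envelope_power, bins_per_band):
    """Scale each envelope band so its mean power matches the envelope power.

    bins_per_band is an assumed stand-in for the frequency resolution:
    a larger value means a coarser envelope.
    """
    shaped = []
    for band, target in enumerate(envelope_power):
        start = band * bins_per_band
        segment = high_power[start:start + bins_per_band]
        if not segment:
            break
        mean = sum(segment) / len(segment)
        scale = target / mean if mean > 0.0 else 0.0
        shaped.extend(p * scale for p in segment)
    return shaped


# Usage: two low-band bins are replicated into four high-band bins,
# then shaped with two envelope values (two bins per envelope band).
high = replicate_low_to_high([4.0, 2.0], 4)
print(shape_high_band(high, [6.0, 1.5], 2))
```

Because one envelope value covers several bins, a coarse resolution cannot place a narrow peak at its exact frequency, which is the situation the later description of the boundary tone relies on.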
Step S11 of
Step S12 of
Related technologies are disclosed in, for example, International Publication Pamphlet No. WO 2014/199632 and Japanese Laid-Open Patent Publication No. 2016-173597.
According to an aspect of the invention, an audio encoding apparatus includes a memory, and a processor coupled to the memory and the processor configured to determine whether a tone is included in a boundary between a low-frequency that is a frequency bandwidth below a predetermined frequency of an input signal and a high-frequency that is a frequency bandwidth above the predetermined frequency of the input signal, suppress a tone in one of the low-frequency and the high-frequency, encode the input signal having the low-frequency to generate a low-frequency code, encode the input signal having the high-frequency to generate a high-frequency code, and generate an encoded stream by multiplexing the low-frequency code and the high-frequency code.
The object and advantages of the invention will be realized and attained by means of the elements and combinations particularly pointed out in the claims.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory and are not restrictive of the invention, as claimed.
In the above-described technology in the related art, there is a problem that the sound quality of a sound signal deteriorates.
For example, there may be a case where, when a tone is at a boundary between the low-frequency and the high-frequency, the resolution on the high-frequency side is coarse, and tones are generated at a frequency shifted from the low-frequency at the time of decoding. When the tones are generated at a frequency shifted from the low-frequency, two adjacent tones are generated, and a vibration is generated to deteriorate sound quality.
For example, no vibration is generated in the input sound itself, but there is one tone at the boundary between the low-frequency and the high-frequency. Here, as described in
Step S22 will be described. The high-frequency shaping unit 25 of the decoding apparatus 20 shapes the high-frequency signal based on envelope information 9. For example, when the resolution is rough, the envelope information 9 is adjusted so that the value of the boundary becomes larger due to the influence of the tone 36a and the value of the right end side becomes smaller. Thus, the power value 35b is shaped to a power value 35b′, which is the same size as the tone 36a, and the tone 36b is shaped to the power value 36b′. Of these tones 35b′ and 36b′, the tone 36a and the power value 35b′ become vibration components, and the sound quality is deteriorated.
Hereinafter, an embodiment of a technology capable of suppressing the deterioration of the sound quality of a sound signal will be described in detail with reference to the accompanying drawings. However, the present disclosure is not limited to this embodiment.
The audio encoding apparatus 100 is a device that acquires a sound signal from an external device and encodes the sound signal. For example, when the audio encoding apparatus 100 detects that the tone is at the boundary between the low-frequency and the high-frequency, the audio encoding apparatus 100 suppresses one of the tones on a low-frequency side and a high-frequency side, and multiplexes the low-frequency code and the high-frequency code to generate a stream. The audio encoding apparatus 100 transmits the stream to the decoding apparatus 20. The stream corresponds to an encoded stream.
The decoding apparatus 20 is a device that receives a stream from the audio encoding apparatus 100 and decodes the stream. The description of the decoding apparatus 20 is the same as that of the decoding apparatus 20 described with reference to
The low-frequency signal extraction unit 110 is a processing unit that acquires a sound signal from an external device and extracts a low-frequency signal included in the low-frequency of the sound signal. The low-frequency signal extraction unit 110 outputs the low-frequency signal to the low-frequency correction unit 140. The upper limit frequency of the low-frequency is set in advance by an administrator.
The high-frequency information extraction unit 120 is a processing unit that acquires a sound signal from an external device and extracts high-frequency information from the high-frequency of the sound signal. The high-frequency information extraction unit 120 outputs the high-frequency information to the high-frequency correction unit 160. The high-frequency information includes an envelope power, a tone frequency, and a frequency resolution. The lower limit frequency of the high-frequency is set in advance by the administrator. Further, the lower limit frequency of the high-frequency may be lower than the upper limit frequency of the low-frequency.
For example, the high-frequency information extraction unit 120 converts the sound signal into a frequency spectrum, and extracts the shape of the envelope on the high-frequency side of the frequency spectrum as an envelope power. The high-frequency information extraction unit 120 extracts, as a tone frequency, a frequency at which the power is equal to or greater than a threshold value in the high-frequency of the frequency spectrum. The frequency resolution is configured to be set in advance.
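The tone frequency extraction described above can be sketched as a threshold test over the high-frequency bins of a power spectrum. The sketch assumes the spectrum is given as a list of per-bin power values with a uniform bin spacing; the function name and the parameters bin_hz, high_start_hz, and threshold are hypothetical.

```python
def extract_tone_frequencies(power_spectrum, bin_hz, high_start_hz, threshold):
    """Return the frequencies in the high band whose power is >= threshold."""
    tones = []
    for k, power in enumerate(power_spectrum):
        freq = k * bin_hz
        # Only bins at or above the lower limit of the high-frequency count.
        if freq >= high_start_hz and power >= threshold:
            tones.append(freq)
    return tones


# Usage: the peak at 1000 Hz lies in the low band and is ignored;
# the peak at 4000 Hz lies in the high band and is reported.
spectrum = [1.0, 9.0, 1.0, 0.5, 8.0, 0.2]
print(extract_tone_frequencies(spectrum, 1000.0, 3000.0, 5.0))
```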
The determination unit 130 is a processing unit that acquires a sound signal from an external device and determines whether the tone is included in the boundary between the low-frequency and the high-frequency of the sound signal. In addition, when it is determined that the tone is included in the boundary, the determination unit 130 determines whether the low-frequency tone or the high-frequency tone is suppressed. The boundary between the low-frequency and the high-frequency is a bandwidth between the upper limit of the low-frequency and the lower limit of the high-frequency. Further, a vertical width of the bandwidth between the upper limit of the low-frequency and the lower limit of the high-frequency may be provided. For example, the “width between the lower limit of the boundary bandwidth −ε and the upper limit of the boundary bandwidth +ε” may be used.
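The boundary bandwidth with the margin ε can be sketched as follows. The sketch assumes frequencies in hertz and takes the smaller and larger of the two limits as the edges of the boundary, since the lower limit of the high-frequency may be lower than the upper limit of the low-frequency; the function names are hypothetical.

```python
def boundary_band(low_upper_hz, high_lower_hz, epsilon_hz=0.0):
    """Return (lower, upper) edges of the boundary band widened by epsilon."""
    lower = min(low_upper_hz, high_lower_hz) - epsilon_hz
    upper = max(low_upper_hz, high_lower_hz) + epsilon_hz
    return lower, upper


def in_boundary(freq_hz, band):
    """True when freq_hz falls inside the (lower, upper) boundary band."""
    lower, upper = band
    return lower <= freq_hz <= upper


# Usage: overlapping limits at 5800 Hz and 6000 Hz with a 100 Hz margin.
band = boundary_band(6000.0, 5800.0, 100.0)
print(band, in_boundary(5750.0, band))
```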
The BPF 131 is a filter that passes a sound signal near a boundary between a low-frequency and a high-frequency band of the sound signal. The sound signal that passes through the BPF 131 is output to the tone detection unit 132.
Here, as an example, a BPF 131 is used to extract a sound signal near a boundary from the sound signal, but the present invention is not limited thereto. For example, a sound signal near the boundary may be extracted using a fast Fourier transform (FFT), a modified discrete cosine transform (MDCT), or a quadrature mirror filter (QMF) conversion.
The tone detection unit 132 is a processing unit that determines whether a tone is included in a sound signal near the boundary. For example, the tone detection unit 132 calculates a numerical value indicating a tone characteristic based on the sound signal near the boundary, and determines that the tone is included when the numerical value indicating the tone characteristic is equal to or larger than a threshold value. In the following description regarding the tone detection unit 132, a sound signal near the boundary is simply expressed as a sound signal. The tone detection unit 132 detects the presence or absence of a tone by performing a first tone detection processing or a second tone detection processing.
An example of the first tone detection processing will be described. The tone detection unit 132 calculates the reciprocal of the flatness of the power spectrum of the sound signal as a number T1 indicating the tone characteristic based on equation (1). As the number T1 becomes smaller, the frequency spectrum of the sound signal becomes flatter and a tone is less likely to be included. In equation (1), X(ω) denotes the power of the sound signal corresponding to a frequency ω.
When the number T1 is larger than a threshold value TH1, the tone detection unit 132 determines that the tone is included in the sound signal. In the meantime, when the number T1 is not larger than the threshold value TH1, the tone detection unit 132 determines that the tone is not included in the sound signal.
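The first tone detection processing can be illustrated with a minimal sketch. Equation (1) is not reproduced in this text, so the sketch assumes the common definition of spectral flatness (geometric mean of the power spectrum divided by its arithmetic mean) and takes T1 as its reciprocal; this definition, the function names, and the floor parameter are assumptions and may differ from equation (1).

```python
import math


def tone_characteristic_t1(power_spectrum, floor=1e-12):
    """Reciprocal of spectral flatness: arithmetic mean / geometric mean.

    Assumed definition; a flat spectrum gives a value near 1, a peaky
    spectrum gives a larger value.
    """
    if not power_spectrum:
        raise ValueError("power_spectrum must not be empty")
    # Clamp to a small floor so the logarithm is defined for zero power.
    powers = [max(p, floor) for p in power_spectrum]
    arithmetic_mean = sum(powers) / len(powers)
    geometric_mean = math.exp(sum(math.log(p) for p in powers) / len(powers))
    return arithmetic_mean / geometric_mean


def has_tone_t1(power_spectrum, th1):
    """Tone is judged present when T1 is larger than the threshold TH1."""
    return tone_characteristic_t1(power_spectrum) > th1


# Usage: a flat spectrum versus a spectrum with one protruding bin.
print(has_tone_t1([1.0, 1.0, 1.0, 1.0], 2.0))
print(has_tone_t1([100.0, 1.0, 1.0, 1.0], 2.0))
```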
An example of the second tone detection processing will be described. The tone detection unit 132 obtains an autocorrelation R(j) at a value x(i) of the sound signal at time i with respect to the time domain of the sound signal based on equations (2) and (3a), and calculates the maximum value of the autocorrelation R(j) as a number T2 indicating the tone characteristic. When the number T2 is larger than a threshold value TH2, the tone detection unit 132 determines that the tone is included in the sound signal. In the meantime, when the number T2 is not larger than the threshold value TH2, the tone detection unit 132 determines that the tone is not included in the sound signal.
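The second tone detection processing can be sketched in the same way. The equations for the autocorrelation are not reproduced in this text, so the sketch assumes an energy-normalized autocorrelation R(j) = Σ x(i)x(i+j) / Σ x(i)² evaluated over a range of lags j; the normalization, the lag range, and the function names are assumptions.

```python
import math


def autocorrelation(x, lag):
    """Energy-normalized autocorrelation of x at the given lag (assumed form)."""
    energy = sum(v * v for v in x)
    if energy == 0.0:
        return 0.0
    return sum(x[i] * x[i + lag] for i in range(len(x) - lag)) / energy


def tone_characteristic_t2(x, min_lag, max_lag):
    """Maximum of R(j) over min_lag <= j <= max_lag."""
    if not 1 <= min_lag <= max_lag < len(x):
        raise ValueError("lags must satisfy 1 <= min_lag <= max_lag < len(x)")
    return max(autocorrelation(x, j) for j in range(min_lag, max_lag + 1))


def has_tone_t2(x, th2, min_lag, max_lag):
    """Tone is judged present when T2 is larger than the threshold TH2."""
    return tone_characteristic_t2(x, min_lag, max_lag) > th2


# Usage: a sine with a period of 8 samples is strongly periodic,
# while a single impulse has no correlation at any nonzero lag.
sine = [math.sin(2.0 * math.pi * i / 8.0) for i in range(256)]
impulse = [1.0] + [0.0] * 255
print(has_tone_t2(sine, 0.9, 2, 32))
print(has_tone_t2(impulse, 0.9, 2, 32))
```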
The tone detection unit 132 performs the first tone detection processing or the second tone detection processing, and when it is determined that there is a tone, the tone detection unit 132 outputs information on the presence of a tone to the correction determination unit 133. Further, the tone detection unit 132 outputs the tone power to the low-frequency correction unit 140 and the high-frequency correction unit 160. Tone power is the power of the tones that are present at the boundary between the low-frequency and the high-frequency.
In the meantime, when the tone detection unit 132 determines that there is no tone, the tone detection unit 132 outputs information on the absence of a tone to the correction determination unit 133.
The correction determination unit 133 is a processing unit that acquires an encoding condition upon acquiring, from the tone detection unit 132, information indicating that a tone is present, and determines, based on the encoding condition, whether to suppress the low-frequency tone or the high-frequency tone of the sound signal. The encoding condition includes, for example, information on an encoding bit rate. The information on the encoding condition may be input by the administrator or may be set in the correction determination unit 133 in advance.
The correction determination unit 133 determines that the encoding condition is a high rate when the value of the bit rate included in the encoding condition is equal to or larger than the threshold value. When it is determined that the encoding condition is a high rate, the correction determination unit 133 determines that the high-frequency tone is suppressed, and outputs a control signal to the high-frequency correction unit 160.
The correction determination unit 133 determines that the encoding condition is a low rate when the value of the bit rate included in the encoding condition is less than the threshold value. When it is determined that the encoding condition is a low rate, the correction determination unit 133 determines that the low-frequency tone is suppressed, and outputs the control signal to the low-frequency correction unit 140.
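The decision made by the correction determination unit 133 can be sketched as a single comparison. The function name, the return values, and the example threshold of 64000 bits per second are hypothetical; the specification does not give a concrete threshold value.

```python
def select_correction(tone_present, bit_rate_bps, rate_threshold_bps):
    """Choose which band to correct when a tone sits at the boundary.

    Returns "high" for a high rate (bit rate >= threshold), "low" for a
    low rate, and None when no tone is present and no control signal is sent.
    """
    if not tone_present:
        return None
    if bit_rate_bps >= rate_threshold_bps:
        return "high"  # control signal goes to the high-frequency correction
    return "low"       # control signal goes to the low-frequency correction


# Usage with an assumed threshold of 64000 bps.
print(select_correction(True, 96000, 64000))
print(select_correction(True, 32000, 64000))
print(select_correction(False, 96000, 64000))
```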
Referring back to
When the control signal is not received from the determination unit 130, the low-frequency correction unit 140 outputs the low-frequency signal received from the low-frequency signal extraction unit 110 to the low-frequency encoding unit 150 as it is.
The switch 141 is a switch that switches the path of the low-frequency signal according to the control signal acquired from the determination unit 130. When the switch 141 does not receive a control signal, the switch 141 connects a terminal 141a and a terminal 141b, thereby passing through the low-frequency signal as it is. When the switch 141 receives the control signal, the switch 141 connects the terminal 141a and the terminal 141c, thereby inputting the low-frequency signal to the tone suppression unit 144.
The suppression gain calculation unit 142 is a processing unit that calculates a gain for suppressing the tone of the low-frequency signal below a dynamic masking threshold value. The dynamic masking threshold value is a threshold value determined by a set of the frequency at which the suppression target tone is present and the tone power.
The dynamic masking threshold value for a tone 65A is a threshold value 66. Since the tone power of the tone 65A is above the threshold value 66, the sound of the tone 65A is heard. In the meantime, when the tone power of the tone 65A is suppressed and corrected to a tone 65B, the tone power becomes less than the threshold value 66, and the sound of the tone 65B is not heard.
The dynamic masking threshold value for a tone 65C is a threshold value 67. Since the tone power of the tone 65C is above the threshold value 67, the sound of the tone 65C is heard. In the meantime, when the tone power of the tone 65C is suppressed and corrected to a tone 65D, the tone power becomes less than the threshold value 67, and the sound of the tone 65D is not heard.
The suppression gain calculation unit 142 refers to a table that associates the tone frequency, the tone power, and the dynamic masking threshold value with each other to specify the dynamic masking threshold value. For example, the frequency of the tone is set to the frequency at the boundary between the low-frequency and the high-frequency. The suppression gain calculation unit 142 compares the tone power with the dynamic masking threshold value to specify a suppression gain at which the tone power is less than the dynamic masking threshold value. The suppression gain calculation unit 142 outputs the suppression gain to the smoothing unit 143.
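The table lookup and gain calculation can be illustrated with a minimal sketch. It assumes powers and thresholds are expressed in decibels, that the table is a list of (frequency, tone power, threshold) entries searched for the nearest match, and that a small margin is subtracted so the result falls strictly below the threshold; the table contents, the margin, and the function names are all hypothetical.

```python
def lookup_masking_threshold(table, freq_hz, power_db):
    """Return the threshold of the table entry nearest to (freq_hz, power_db).

    Each entry is (frequency_hz, tone_power_db, threshold_db). Frequency is
    matched first, then tone power (an assumed matching rule).
    """
    entry = min(table, key=lambda e: (abs(e[0] - freq_hz), abs(e[1] - power_db)))
    return entry[2]


def suppression_gain_db(power_db, threshold_db, margin_db=1.0):
    """Gain in dB (<= 0) that brings power_db below threshold_db."""
    if power_db < threshold_db:
        return 0.0  # already below the threshold, nothing to suppress
    return threshold_db - margin_db - power_db


# Usage with made-up table values.
table = [(6000.0, 60.0, 40.0), (6000.0, 80.0, 55.0), (8000.0, 60.0, 35.0)]
threshold = lookup_masking_threshold(table, 6000.0, 78.0)
gain = suppression_gain_db(78.0, threshold)
print(threshold, gain, 78.0 + gain < threshold)
```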
The smoothing unit 143 is a processing unit that outputs a suppression gain that gradually increases to the tone suppression unit 144 in order to smoothly suppress the tone component of the low-frequency signal. For example, the smoothing unit 143 gradually increases the suppression gain from the initial value, and finally adjusts the magnitude of the suppression gain to the magnitude of the suppression gain notified from the suppression gain calculation unit 142.
The tone suppression unit 144 is a processing unit that suppresses the tone of the boundary by multiplying the tone component by the suppression gain acquired from the smoothing unit 143 and corrects the low-frequency signal. The tone suppression unit 144 outputs the corrected low-frequency signal to the low-frequency encoding unit 150.
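The smoothing and suppression steps can be sketched as a per-frame ramp followed by a multiplication. The sketch assumes the suppression gain is handled as an attenuation amount in decibels that starts at an initial value of 0 dB and grows by a fixed step per frame until it reaches the target; the step size, the frame-based update, and the function names are assumptions.

```python
def smoothed_attenuation_db(target_db, num_frames, step_db):
    """Per-frame attenuation that ramps from 0 dB up to target_db."""
    values = []
    current = 0.0
    for _ in range(num_frames):
        # Increase gradually, but never beyond the notified target.
        current = min(current + step_db, target_db)
        values.append(current)
    return values


def suppress_tone(tone_amplitude, attenuation_db):
    """Multiply the tone component by the linear gain for the attenuation."""
    return tone_amplitude * 10.0 ** (-attenuation_db / 20.0)


# Usage: ramp toward 24 dB of attenuation in 8 dB steps over five frames,
# then apply each frame's attenuation to a tone of amplitude 1.0.
ramp = smoothed_attenuation_db(24.0, 5, 8.0)
print(ramp)
print([suppress_tone(1.0, a) for a in ramp])
```

Ramping the attenuation rather than applying the full value at once avoids an abrupt change in the tone level between frames.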
As illustrated in the frequency spectrum 70a, there is a tone 71a at the boundary. The dynamic masking threshold value corresponding to the tone 71a is set to a dynamic masking threshold value 72. The tone suppression unit 144 corrects the tone 71a to a tone 71b by giving a suppression gain such that the tone 71a is less than the dynamic masking threshold value 72. As a result, the tone 71b is less than the dynamic masking threshold value 72 and is not heard, so that deterioration of the sound quality of the sound signal may be suppressed.
Referring back to
The high-frequency correction unit 160 is a processing unit that corrects the high-frequency information by suppressing the envelope power of the boundary included in the high-frequency information when the control signal is received from the determination unit 130. The high-frequency correction unit 160 outputs the corrected high-frequency information to the high-frequency encoding unit 170.
When the control signal is not received from the determination unit 130, the high-frequency correction unit 160 outputs the high-frequency information acquired from the high-frequency information extraction unit 120 to the high-frequency encoding unit 170 as it is.
The switch 161 is a switch that switches the path of the high-frequency information according to the control signal obtained from the determination unit 130. When the switch 161 does not receive the control signal, the switch 161 connects a terminal 161a and a terminal 161b, thereby passing through the high-frequency information as it is. When the switch 161 receives the control signal, the switch 161 connects the terminal 161a and the terminal 161c, thereby inputting the high-frequency information to the tone suppression unit 164.
The suppression gain calculation unit 162 is a processing unit that calculates a gain that suppresses the envelope power (tone power) at the boundary included in the high-frequency information to the dynamic masking threshold value or less. The dynamic masking threshold value is determined by the frequency of the boundary and the envelope power of the boundary.
The suppression gain calculation unit 162 specifies the dynamic masking threshold value by referring to a table that associates the frequency of the boundary, the envelope power of the boundary, and the dynamic masking threshold value with each other. The suppression gain calculation unit 162 compares the envelope power at the boundary with the dynamic masking threshold value to specify the suppression gain at which the envelope power is less than the dynamic masking threshold value. The suppression gain calculation unit 162 outputs the suppression gain to the smoothing unit 163.
The smoothing unit 163 is a processing unit that outputs a suppression gain that gradually increases to the tone suppression unit 164 in order to smoothly suppress the value of the envelope power. For example, the smoothing unit 163 gradually increases the suppression gain from the initial value, and finally adjusts the magnitude of the suppression gain to the magnitude of the suppression gain notified from the suppression gain calculation unit 162.
The tone suppression unit 164 is a processing unit that corrects the high-frequency information by multiplying the envelope power of the boundary by the suppression gain acquired from the smoothing unit 163. By suppressing the envelope power of the boundary, the tone of the boundary decoded by the decoding apparatus 20 is less than the dynamic masking threshold value. The tone suppression unit 164 outputs the corrected high-frequency information to the high-frequency encoding unit 170. Of the envelope power, the tone frequency, and the frequency resolution included in the high-frequency information, the tone suppression unit 164 corrects only the envelope power and does not correct the tone frequency or the frequency resolution.
For example, the dynamic masking threshold corresponding to an envelope power 76a near the boundary 77 is set to a dynamic masking threshold value 78. The tone suppression unit 164 corrects the high-frequency information by generating an envelope power 76b which suppresses the envelope power 76a so that the envelope power 76a of the boundary 77 becomes less than the dynamic masking threshold value 78. Since the envelope power 76b is less than the dynamic masking threshold value 78, the tone component of the boundary which is decoded based on the envelope power 76b is suppressed.
Referring back to
Next, the processing procedure of the determination unit 130 of the audio encoding apparatus 100 according to the first embodiment will be described.
The determination unit 130 determines whether the tone characteristic T is larger than the threshold value TH (operation S102). In the operation S102, the determination unit 130 compares the tone characteristic T1 with the threshold value TH1 when the tone characteristic T1 is calculated. When the tone characteristic T2 is calculated, the determination unit 130 compares the tone characteristic T2 with the threshold value TH2.
When it is determined that the tone characteristic T is larger than the threshold value TH (“YES” in the operation S102), the determination unit 130 determines that a tone is present (operation S104). In the meantime, when it is determined that the tone characteristic T is not larger than the threshold value TH (“NO” in the operation S102), the determination unit 130 determines that no tone is present (operation S103). The determination unit 130 calculates the tone power (operation S105).
When it is determined that the tone detection result indicates the presence of a tone (“YES” in the operation S201), the determination unit 130 determines whether the bit rate of the encoding condition is equal to or greater than a predetermined value (operation S203). When it is determined that the bit rate of the encoding condition is equal to or greater than the predetermined value (“YES” in the operation S203), the determination unit 130 outputs a control signal indicating that a high-frequency correction is performed to the high-frequency correction unit 160 (operation S204).
When it is determined that the bit rate of the encoding condition is not equal to or greater than the predetermined value (“NO” in the operation S203), the determination unit 130 outputs a control signal indicating that a low-frequency correction is performed to the low-frequency correction unit 140 (operation S205).
Next, an example of the processing procedure of the audio encoding apparatus 100 according to the first embodiment will be described.
The low-frequency signal extraction unit 110 of the audio encoding apparatus 100 extracts a low-frequency signal from the sound signal (operation S302). The high-frequency information extraction unit 120 of the audio encoding apparatus 100 extracts high-frequency information from the sound signal (operation S303).
The determination unit 130 of the audio encoding apparatus 100 determines the presence or absence of a tone at the boundary. When the tone is present, the determination unit 130 determines whether the low-frequency or the high-frequency is to be corrected (operation S304).
The low-frequency correction unit 140 of the audio encoding apparatus 100 corrects the low-frequency signal when it is determined that the low-frequency is corrected (operation S305). The high-frequency correction unit 160 of the audio encoding apparatus 100 corrects the envelope power of the high-frequency information when it is determined that the high-frequency is corrected (operation S306).
The low-frequency encoding unit 150 of the audio encoding apparatus 100 encodes the low-frequency signal to generate a low-frequency code (operation S307). The high-frequency encoding unit 170 of the audio encoding apparatus 100 encodes the high-frequency information to generate a high-frequency code (operation S308).
The multiplexing unit 180 of the audio encoding apparatus 100 generates a stream by multiplexing the low-frequency code and the high-frequency code (operation S309). The multiplexing unit 180 transmits the stream to the decoding apparatus 20 (operation S310).
Next, the effect of the audio encoding apparatus 100 according to the first embodiment will be described. The audio encoding apparatus 100 suppresses one of the tones on the low-frequency side or the high-frequency side when the tone is detected at the boundary between the low-frequency and the high-frequency and then generates a stream obtained by multiplexing the low-frequency code and the high-frequency code. Thus, deterioration of the sound quality of the sound signal may be suppressed.
For example, the audio encoding apparatus 100 detects that the tone is at the boundary and suppresses the tone of the low-frequency signal, so that, for example, the tone 32a in
The audio encoding apparatus 100 determines whether the low-frequency tone or the high-frequency tone is suppressed by comparing the bit rate of the encoding condition with the threshold value and suppresses the tone of the bandwidth according to the determination result. As a result, it is possible to make a correction in the bandwidth with poor sound quality, depending on the bit rate. For example, when the bit rate is high, since the sound quality of the high-frequency is poor, the high-frequency is corrected. In the meantime, when the bit rate is low, since the sound quality of the low-frequency is poor, the low-frequency is corrected.
A spectrum 81b and a time waveform 82b are the spectrum and the time waveform related to a signal that is obtained by decoding the stream encoded by the encoding apparatus 10 in the related art by the decoding apparatus 20. A spectrum 81c and a time waveform 82c are the spectrum and the time waveform related to a signal that is obtained by decoding the stream encoded by the audio encoding apparatus 100 by the decoding apparatus 20.
The horizontal axis of the spectra 81a to 81c corresponds to time, and the vertical axis corresponds to frequency. Further, the spectra 81a to 81c represent the magnitude of the power value by brightness: a bright part represents a large power, while a dark part represents a small power. The horizontal axis of the time waveforms 82a to 82c corresponds to time, and the vertical axis corresponds to amplitude.
A comparison of the spectra 81a to 81c and of the time waveforms 82a to 82c shows that the encoding by the audio encoding apparatus 100 may suppress the fluctuation and the deterioration of the sound quality compared with the technology in the related art.
The audio encoding apparatus 100 illustrated in
For example, when the audio encoding apparatus 100 includes the low-frequency correction unit 140 and does not include the high-frequency correction unit 160, the low-frequency correction unit 140 corrects the low-frequency signal every time the tone of the boundary is detected. In the meantime, when the audio encoding apparatus 100 does not include the low-frequency correction unit 140 and includes the high-frequency correction unit 160, the high-frequency correction unit 160 corrects the envelope power of the high-frequency information every time the tone of the boundary is detected. With this configuration, it is possible to save the hardware resources of the audio encoding apparatus 100 and suppress the deterioration of the sound signal.
The determination unit 210 is a processing unit that acquires a sound signal from an external device and determines whether the tone is included in the boundary between the low-frequency and the high-frequency of the sound signal. Further, when the determination unit 210 determines that the tone is included in the boundary, the determination unit 210 outputs the control signal and the tone power to the input signal correction unit 220. The processing by which the determination unit 210 determines whether the tone is included in the boundary is the same as the processing of the determination unit 130 described in the first embodiment.
The input signal correction unit 220 is a processing unit that corrects the sound signal by suppressing the tone component of the boundary included in the sound signal when a control signal is received from the determination unit 210. The input signal correction unit 220 outputs the corrected sound signal to the low-frequency signal extraction unit 110.
The switch 221 is a switch that switches the path of the sound signal according to the control signal obtained from the determination unit 210. When the switch 221 does not receive a control signal, the switch 221 connects a terminal 221a and a terminal 221b, thereby passing through the sound signal as it is. When the switch 221 receives the control signal, the switch 221 connects the terminal 221a and the terminal 221c, thereby inputting the sound signal to the tone suppression unit 224.
The suppression gain calculation unit 222 is a processing unit that calculates a gain for suppressing the tone located at the boundary of the sound signal below the dynamic masking threshold value. The suppression gain calculation unit 222 outputs the suppression gain to the smoothing unit 223. A processing of calculating the suppression gain by the suppression gain calculation unit 222 corresponds to a processing of the suppression gain calculation unit 142 illustrated in the first embodiment.
The smoothing unit 223 is a processing unit that outputs a suppression gain that gradually increases to the tone suppression unit 224 in order to smoothly suppress the tone component of the sound signal. For example, the smoothing unit 223 gradually increases the suppression gain from the initial value, and finally adjusts the magnitude of the suppression gain to the magnitude of the suppression gain notified from the suppression gain calculation unit 222.
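As a minimal sketch of the smoothing described above, the following Python fragment ramps a gain from an initial value to the value notified from the suppression gain calculation unit 222. The function name, the linear shape of the ramp, and the number of steps are assumptions made for illustration only; the embodiment does not specify how the gain is increased.

```python
def smoothed_gains(target_gain, initial_gain=0.0, num_steps=4):
    """Return a list of gains that moves from initial_gain to target_gain.

    Assumption: a linear ramp over num_steps updates; the final entry is
    exactly the gain notified from the suppression gain calculation unit.
    """
    if num_steps < 1:
        raise ValueError("num_steps must be at least 1")
    step = (target_gain - initial_gain) / num_steps
    # Intermediate values of the ramp.
    gains = [initial_gain + step * i for i in range(1, num_steps)]
    # The last value is set explicitly so that rounding cannot miss the target.
    gains.append(target_gain)
    return gains
```

Each returned value would be handed to the tone suppression unit 224 in turn, so that the tone component is attenuated progressively rather than abruptly.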
The tone suppression unit 224 is a processing unit that suppresses the tone of the boundary by multiplying the tone component at the boundary of the sound signal by the suppression gain acquired from the smoothing unit 223, thereby correcting the sound signal. The tone suppression unit 224 outputs the corrected sound signal to the low-frequency signal extraction unit 110.
Referring back to
Next, the effect of the audio encoding apparatus 200 according to the second embodiment will be described. When the tone is detected at the boundary between the low-frequency and the high-frequency, the audio encoding apparatus 200 suppresses the tone at the boundary of the sound signal and then generates a stream in which the low-frequency code and the high-frequency code are multiplexed. As a result, deterioration of the sound quality of the sound signal may be suppressed. In addition, since the tone of the original sound signal is suppressed, it is possible to skip the processing of determining whether the low-frequency tone or the high-frequency tone is to be suppressed, so that the processing load may be reduced. It also makes it possible to save hardware resources.
The descriptions of the low-frequency signal extraction unit 110, the high-frequency information extraction unit 120, the high-frequency encoding unit 170, and the multiplexing unit 180 are the same as those of the low-frequency signal extraction unit 110, the high-frequency information extraction unit 120, the high-frequency encoding unit 170, and the multiplexing unit 180 described in the first embodiment, respectively.
The correction control unit 310 is a processing unit that limits a bandwidth to be encoded when encoding the low-frequency signal. The correction control unit 310 is an example of an encoding unit. With respect to the third embodiment, in the following description, the bandwidth to be encoded when encoding the low-frequency signal is expressed as an “encoding target bandwidth.”
For example, the default encoding target bandwidth is an encoding target bandwidth 87a. The correction control unit 310 corrects the encoding target bandwidth 87a to an encoding target bandwidth 87b. For example, the encoding target bandwidth 87b corresponds to a case where the upper limit of the encoding target bandwidth 87a is shifted toward the low-frequency by one sub-band. The correction control unit 310 outputs information of the corrected encoding target bandwidth to the low-frequency encoding unit 320.
The low-frequency encoding unit 320 is a processing unit that acquires a low-frequency signal from the low-frequency signal extraction unit 110 and generates a low-frequency code by encoding the low-frequency signal into a bit string. The low-frequency encoding unit 320 outputs the low-frequency code to the multiplexing unit 180. Further, the low-frequency encoding unit 320 encodes a low-frequency signal that is included in the encoding target bandwidth 87b received from the correction control unit 310. Since the encoding target bandwidth 87b does not include the tone 86a at the boundary 86, the tone 86a is not included in the low-frequency code, and as a result, the deterioration of the sound quality may be suppressed.
Next, the effect of the audio encoding apparatus 300 according to the third embodiment will be described. When the low-frequency signal is encoded, the audio encoding apparatus 300 performs an encoding on the sound signal of the encoding target bandwidth excluding a boundary where the tone is present. This makes it possible to suppress the deterioration of the sound quality since the tone of the boundary is not included in the low-frequency signal.
The descriptions of the low-frequency signal extraction unit 110, the low-frequency encoding unit 150, the high-frequency encoding unit 170, and the multiplexing unit 180 are the same as those of the low-frequency signal extraction unit 110, the low-frequency encoding unit 150, the high-frequency encoding unit 170, and the multiplexing unit 180 described in the first embodiment, respectively.
The correction control unit 302 is a processing unit that limits a target bandwidth when encoding a high-frequency signal. The correction control unit 302 is an example of an encoding unit. With respect to the fourth embodiment, in the following description, the bandwidth to be used when encoding a high-frequency signal is expressed as an “encoding target bandwidth.”
For example, the default encoding target bandwidth is an encoding target bandwidth 89a. The correction control unit 302 corrects the encoding target bandwidth 89a to an encoding target bandwidth 89b. For example, the encoding target bandwidth 89b corresponds to a case where the lower limit of the encoding target bandwidth 89a is shifted toward the high-frequency by one sub-band. The correction control unit 302 outputs information of the corrected encoding target bandwidth to the high-frequency information extraction unit 303.
The high-frequency information extraction unit 303 is a processing unit that acquires a sound signal from an external device and extracts high-frequency information from the high-frequency of the sound signal (an encoding target bandwidth 89b illustrated in
Next, the effect of the audio encoding apparatus 301 according to the fourth embodiment will be described. When the high-frequency signal is encoded, the audio encoding apparatus 301 encodes the sound signal of the encoding target bandwidth excluding a boundary where the tone is present. This makes it possible to suppress deterioration of the sound quality since the tone of the boundary is not included in the high-frequency signal.
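The band limiting of the third and fourth embodiments may be illustrated by the following Python sketch. It assumes, for illustration only, that an encoding target bandwidth is represented as a pair of inclusive sub-band indices (lower, upper); this representation and the function names are hypothetical and are not taken from the embodiments.

```python
def limit_low_band(target, shift=1):
    """Third embodiment: move the upper limit toward the low-frequency.

    target is an assumed (lower, upper) pair of inclusive sub-band indices.
    """
    lower, upper = target
    if upper - shift < lower:
        raise ValueError("shift would leave an empty bandwidth")
    return (lower, upper - shift)


def limit_high_band(target, shift=1):
    """Fourth embodiment: move the lower limit toward the high-frequency."""
    lower, upper = target
    if lower + shift > upper:
        raise ValueError("shift would leave an empty bandwidth")
    return (lower + shift, upper)
```

With a shift of one sub-band, the sub-band adjacent to the boundary, where the tone is present, is excluded from the low-frequency encoding (87a to 87b) or from the high-frequency information extraction (89a to 89b).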
The descriptions of the low-frequency signal extraction unit 110, the high-frequency information extraction unit 120, the determination unit 130, the low-frequency correction unit 140, the low-frequency encoding unit 150, the high-frequency encoding unit 170, and the multiplexing unit 180 are the same as those of the respective processing units illustrated in
The high-frequency correction unit 410 is a processing unit that corrects high-frequency information by correcting the tone frequency included in the high-frequency information when a control signal is received from the determination unit 130. For example, the information of the tone frequency includes information on the presence or absence of a tone for a plurality of high-frequency bandwidths divided according to the resolution. When the presence or absence of the tone in the bandwidth corresponding to the boundary is indicated as “presence,” the high-frequency correction unit 410 corrects the presence or absence of the tone in the bandwidth corresponding to the boundary to “absence.”
The switch 411 is a switch that switches the path of the high-frequency information according to the control signal acquired from the determination unit 130. When the switch 411 does not receive a control signal, a terminal 411a and a terminal 411b are connected to each other to allow the high-frequency information to pass therethrough. When the control signal is received, the switch 411 inputs the high-frequency information to the additional tone suppression unit 412 by connecting the terminal 411a and the terminal 411c.
The additional tone suppression unit 412 is a processing unit that corrects the tone frequency included in the high-frequency information.
For example, the tone frequency is information that indicates whether there is a tone in the corresponding bandwidth by “0” or “1,” and the fineness of the divided bandwidths depends on the frequency resolution. When there is a tone, “1” is set for the block of the corresponding bandwidth, and when there is no tone, “0” is set for the block of the corresponding bandwidth.
Tone frequencies 95a and 95b illustrated in
When the block 21 of the tone frequency 95a is set to “1,” the additional tone suppression unit 412 generates the tone frequency 95b by correcting the block 21 to “0.” The additional tone suppression unit 412 outputs the high-frequency information including the corrected tone frequency 95b, the envelope power, and the frequency resolution to the high-frequency encoding unit 170.
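The correction of the tone frequency described above may be sketched as follows. The sketch assumes that the tone frequency is held as a list of 0/1 flags, one per divided bandwidth, and that the position of the block corresponding to the boundary is known; the function name and the index argument are hypothetical (the numeral 21 in the description is a reference numeral, not a list index).

```python
def clear_boundary_tone(tone_flags, boundary_block):
    """Return a copy of tone_flags with the boundary block set to 0.

    tone_flags: list of 0/1 values, 1 meaning a tone is present in the block.
    boundary_block: assumed index of the block that corresponds to the boundary.
    """
    corrected = list(tone_flags)  # leave the input flags untouched
    if corrected[boundary_block] == 1:
        corrected[boundary_block] = 0
    return corrected
```

Only the flag of the boundary block is changed; flags of the other bandwidths, the envelope power, and the frequency resolution are passed on unchanged.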
Next, the effect of the audio encoding apparatus 400 according to the fifth embodiment will be described. When the tone is present at the boundary, the audio encoding apparatus 400 corrects the tone frequency of the high-frequency information so that the tone is not present at the boundary. This makes it possible to suppress the deterioration of the sound quality because no tone is generated at the boundary of the high-frequency signal that is decoded based on the corrected high-frequency information.
The processing of the audio encoding apparatuses 100 to 400 illustrated in the first to fifth embodiments is an example. Other processing of the audio encoding apparatus will now be described, using the block diagram of the audio encoding apparatus 100 illustrated in
The determination unit 130 of the audio encoding apparatus 100 may compare the error power of the low-frequency with the error power of the high-frequency to determine whether the low-frequency tone or the high-frequency tone is suppressed.
For example, a low-frequency signal of a sound signal (original sound) is referred to as a first low-frequency signal, and a low-frequency signal obtained by decoding the low-frequency code is referred to as a second low-frequency signal. The error power of the low-frequency is regarded as a difference value between the first low-frequency signal and the second low-frequency signal. The high-frequency signal of the sound signal (original sound) is referred to as a first high-frequency signal, and the high-frequency signal decoded based on the high-frequency code is referred to as a second high-frequency signal. The error power of the high-frequency is regarded as a difference value between the first high-frequency signal and the second high-frequency signal.
When the error power of the low-frequency is higher than the error power of the high-frequency, the determination unit 130 determines that the high-frequency tone is suppressed. In the meantime, when the error power of the low-frequency is equal to or lower than the error power of the high-frequency, the determination unit 130 determines that the low-frequency tone is suppressed.
When it is determined that the tone detection result indicates the presence of a tone (“YES” in the operation S401), the determination unit 130 determines whether the error power of the low-frequency is higher than the error power of the high-frequency (operation S403). When it is determined that the error power of the low-frequency is higher than the error power of the high-frequency (“YES” in the operation S403), the determination unit 130 outputs a control signal indicating that the high-frequency correction is performed to the high-frequency correction unit 160 (operation S404).
When it is determined that the error power of the low-frequency is not higher than the error power of the high-frequency (“NO” in the operation S403), the determination unit 130 outputs a control signal indicating that the low-frequency correction is performed to the low-frequency correction unit 140 (operation S405).
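The selection in the operations S401 to S405 may be sketched in Python as follows. The description only states that the error power is regarded as a difference value between the original and decoded signals, so computing it as a sum of squared sample differences is an assumption; the function names and the return values "low", "high", and None (no tone detected, so no correction is requested) are likewise hypothetical.

```python
def error_power(original, decoded):
    """Assumed error measure: sum of squared differences between signals."""
    return sum((o - d) ** 2 for o, d in zip(original, decoded))


def select_correction(tone_detected, low_error_power, high_error_power):
    """Decide which band's tone is to be suppressed.

    Returns "high" when the low-frequency error power is higher (S404),
    "low" when it is equal to or lower (S405), and None when no tone is
    detected at the boundary (assumed behaviour for "NO" in S401).
    """
    if not tone_detected:
        return None
    if low_error_power > high_error_power:
        return "high"
    return "low"
```

Note that equal error powers select the low-frequency correction, which matches the "equal to or lower" condition in the description.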
As described above, by comparing the error power of the low-frequency with the error power of the high-frequency and feeding back whether the bandwidth in which the tone has actually been suppressed is appropriate, it is possible to appropriately select the bandwidth in which the tone is suppressed and thereby improve the sound quality.
Prior to describing a sixth embodiment, the problem of the audio encoding apparatus 100 described in the first embodiment will be described. When the decoding apparatus 20 decodes the encoded stream generated by the audio encoding apparatus 100, the quality of the sound signal after decoding may deteriorate depending on the setting of the inverse filter mode of the decoding apparatus 20, as described in
For example, when the audio encoding apparatus 100 detects a tone 903 near the boundary 902, the low-frequency signal is corrected by suppressing the tone 903 included in the low-frequency, thereby generating a low-frequency code in which the low-frequency signal is encoded. The audio encoding apparatus 100 generates an encoded stream by multiplexing the low-frequency code and the high-frequency code obtained by encoding the high-frequency information, and outputs the generated encoded stream to the decoding apparatus 20.
The decoding apparatus 20 generates a frequency spectrum 910 by decoding the encoded stream received from the audio encoding apparatus 100. Here, a frequency spectrum 920 may be generated depending on the processing of the decoding apparatus 20. For the frequency spectra 910 and 920, the horizontal axis is an axis corresponding to the frequency and the vertical axis is an axis corresponding to the power (value).
The frequency spectrum 910 is an appropriately decoded frequency spectrum and includes a tone 912 near a boundary 911. In the meantime, the frequency spectrum 920 does not include the tone near a boundary 921, and the quality of the sound signal deteriorates.
Next, descriptions will be made of the reason why the tone is not generated near the boundary 921 of the frequency spectrum 920. For example, the decoding apparatus 20 that uses an SBR technology has a function of turning ON/OFF the inverse filter mode.
When the inverse filter mode is “OFF,” the decoding apparatus 20 replicates the low-frequency of the frequency spectrum to the high-frequency to generate a sound signal. In this way, when the decoding apparatus 20 performs a processing of replicating the frequency spectrum of the low-frequency to the high-frequency, the frequency spectrum 910 illustrated in
In the meantime, when the inverse filter mode is “ON,” the decoding apparatus 20 generates a sound signal by decorrelating the low-frequency of the frequency spectrum and then replicating it to the high-frequency. Thus, when the decoding apparatus 20 decorrelates the low-frequency signal and then replicates it to the high-frequency, no tone is generated in the high-frequency, and the frequency spectrum 920 illustrated in
The decoding apparatus 20 generates the frequency spectrum 931 by decorrelating the low-frequency of the frequency spectrum 930. The decoding apparatus 20 generates the frequency spectrum 932 by selecting a bandwidth 931a of the frequency spectrum 931 and replicating the frequency spectrum of the selected bandwidth 931a to the high-frequency. The decoding apparatus 20 decodes the final frequency spectrum by performing an envelope adjustment on the frequency spectrum 932. As described in
In order to solve the problem described with reference to
The time-frequency conversion unit 601 is a processing unit that converts the sound signal into a time-frequency signal. The time-frequency conversion unit 601 outputs the time-frequency signal to the high-frequency information extraction unit 602, the determination unit 604, and the low-frequency extraction unit 605.
For example, the time-frequency conversion unit 601 converts a sound signal s[n] into a frequency signal S[k][n] using a quadrature mirror filter (QMF) filter bank defined by an equation (3). In the equation (3), n is a variable representing time, and k is a variable representing a frequency.
The time-frequency conversion unit 601 generates a time-frequency signal L[k][n] by associating each time with a frequency signal S of each frequency.
Referring back to
Further, the high-frequency information extraction unit 602 estimates whether the inverse filter mode set in the decoding apparatus 700 is ON or OFF based on the time-frequency signal. The high-frequency information extraction unit 602 outputs information of the estimated inverse filter mode to the low-frequency correction unit 606.
The high-frequency information extraction unit 602 calculates an average value of the tone components of the time-frequency signal. The average value of the tone components is expressed as a “bandwidth tone component.” The high-frequency information extraction unit 602 calculates the average power in a frame using the bandwidth tone component. The frame corresponds to the data obtained by dividing the time-frequency signal by a predetermined time. The high-frequency information extraction unit 602 smoothes the bandwidth tone component of the current frame using the bandwidth tone component of the previous frame.
The high-frequency information extraction unit 602 determines whether the inverse filter mode is ON or OFF based on the smoothed bandwidth tone component and the average power. For example, the high-frequency information extraction unit 602 determines the inverse filter level by performing a threshold value comparison as described with reference to
As illustrated in
When it is determined that the bandwidth tone component is equal to or larger than the first threshold value (“YES” in the operation S31), the high-frequency information extraction unit 602 proceeds to the operation S33. When it is determined that the bandwidth tone component is less than the second threshold value (“NO” in the operation S33), the high-frequency information extraction unit 602 determines that the inverse filter level is 1 (operation S34) and proceeds to the operation S38.
When it is determined that the bandwidth tone component is equal to or greater than the second threshold value (“YES” in the operation S33), the high-frequency information extraction unit 602 proceeds to the operation S35. When it is determined that the bandwidth tone component is less than the third threshold value (“NO” in the operation S35), the high-frequency information extraction unit 602 determines that the inverse filter level is 2 (operation S36) and proceeds to the operation S38.
When it is determined that the bandwidth tone component is equal to or greater than the third threshold value (“YES” in the operation S35), the high-frequency information extraction unit 602 determines that the inverse filter level is 3 (operation S37) and proceeds to the operation S38.
The high-frequency information extraction unit 602 determines whether the average power is less than the fourth threshold value (operation S38). When it is determined that the average power is less than the fourth threshold value (“YES” in the operation S38), the high-frequency information extraction unit 602 updates the inverse filter level to 0 (operation S39), and ends the processing of determining the inverse filter level. In the meantime, when it is determined that the average power is equal to or greater than the fourth threshold value (“NO” in the operation S38), the high-frequency information extraction unit 602 ends the processing of determining the inverse filter level.
In order to avoid applying the inverse filter to signals that are mostly silent, the inverse filter level is set to “0” when the average power is very small. For this reason, the fourth threshold value is set to a very small value.
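The threshold comparison of the operations S31 to S39 may be sketched as follows. The threshold values are passed in as parameters because the description does not give them, and it is assumed that the inverse filter level is 0 when the bandwidth tone component is less than the first threshold value ("NO" in the operation S31); the function and parameter names are hypothetical.

```python
def inverse_filter_level(band_tone, average_power, th1, th2, th3, th4):
    """Determine the inverse filter level (0 to 3) by threshold comparison.

    band_tone: smoothed bandwidth tone component of the current frame.
    average_power: average power in the frame.
    th1 < th2 < th3: thresholds on the bandwidth tone component.
    th4: very small threshold on the average power.
    """
    if band_tone < th1:
        level = 0  # assumed result for "NO" in the operation S31
    elif band_tone < th2:
        level = 1  # operation S34
    elif band_tone < th3:
        level = 2  # operation S36
    else:
        level = 3  # operation S37
    if average_power < th4:
        level = 0  # operation S39: mostly silent frame
    return level
```

The final check on the average power overrides whatever level the tone comparison produced, which is why a frame with a strong tone component but negligible power still ends at level 0.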
The high-frequency information extraction unit 602 executes the processing illustrated in
Referring back to
The determination unit 604 is a processing unit that determines whether the tone is included in the boundary between the low-frequency and the high-frequency of the sound signal based on the time-frequency signal. When it is determined that the tone is included in the boundary, the determination unit 604 outputs the control signal to the low-frequency correction unit 606. The processing by which the determination unit 604 determines whether the tone is included in the boundary between the low-frequency and the high-frequency of the sound signal is the same as the processing of the determination unit 130.
The low-frequency extraction unit 605 is a processing unit that extracts low-frequency information of a time-frequency signal. The low-frequency extraction unit 605 outputs the extracted low-frequency information to the low-frequency correction unit 606. The upper limit frequency of the low-frequency is set in advance by an administrator.
The low-frequency correction unit 606 is a processing unit that performs a low-frequency correction based on the information of the inverse filter mode and the control signal. Specifically, the low-frequency correction unit 606 performs the low-frequency correction when the inverse filter mode is “OFF” and the control signal is received (when the tone is included). The low-frequency correction unit 606 performs the low-frequency correction for the low-frequency of the time-frequency signal. For example, the low-frequency correction unit 606 performs the low-frequency correction by suppressing the tone component included in the low-frequency of the time-frequency signal. The low-frequency correction unit 606 outputs the time-frequency signal subjected to the low-frequency correction to the frequency-time conversion unit 607.
In the meantime, the low-frequency correction unit 606 does not perform the low-frequency correction when the inverse filter mode is “ON” or when the control signal is not received (when the tone is not included), and outputs the low-frequency information of the time-frequency signal to the frequency-time conversion unit 607.
In the meantime, when it is determined that the inverse filter mode is OFF (“NO” in the operation S50), the low-frequency correction unit 606 determines whether the control signal is received (operation S52). When it is determined that the control signal is not received (“NO” in the operation S52), the low-frequency correction unit 606 proceeds to the operation S51.
When it is determined that the control signal is received (“YES” in the operation S52), the low-frequency correction unit 606 suppresses the tone component included in the low-frequency of the time-frequency signal (operation S53). The low-frequency correction unit 606 outputs the low-frequency information of the time-frequency signal, for which the tone is suppressed, to the frequency-time conversion unit 607 (operation S54).
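The correction determination performed by the low-frequency correction unit 606 may be sketched as follows. The low-frequency information is represented here as a plain list of values, and the tone suppression itself is passed in as a callable because its details are described elsewhere; both choices, and the names used, are assumptions for illustration.

```python
def correct_low_band(low_band, inverse_filter_on, control_received, suppress_tone):
    """Apply the low-frequency correction only when it is called for.

    The correction is performed when the estimated inverse filter mode is
    OFF and the control signal is received (a tone is at the boundary).
    Otherwise the low-frequency information is output unchanged.
    suppress_tone: hypothetical callable that returns the corrected band.
    """
    if inverse_filter_on or not control_received:
        return low_band  # output without correction
    return suppress_tone(low_band)  # suppress the tone, then output
```

For example, passing `lambda band: [0.5 * v for v in band]` as `suppress_tone` halves every value only in the case where the inverse filter mode is OFF and the control signal is received.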
The description of
For example, the frequency-time conversion unit 607 converts a time-frequency signal S′[k][n] into a low-frequency signal Slow(n) according to the filter bank defined by an equation (4). In the equation (4), Klow=32 and Nlow=128. Here, the time-frequency signal S′[k][n] corresponds to the time-frequency signal for which the low-frequency correction is performed by the low-frequency correction unit 606, or the time-frequency signal for which the low-frequency correction is not performed.
The low-frequency encoding unit 608 is a processing unit that generates a low-frequency code by encoding a low-frequency signal into a bit string. For example, the low-frequency encoding unit 608 performs an encoding based on the AAC. The low-frequency encoding unit 608 outputs the low-frequency code to the multiplexing unit 609.
The multiplexing unit 609 is a processing unit that generates an encoded stream by multiplexing the low-frequency code and the high-frequency code. The multiplexing unit 609 transmits the encoded stream to the decoding apparatus 700 via the network 50.
For example, the multiplexing unit 609 outputs the encoded stream in an MPEG-4 ADTS (audio data transport stream) format.
For example, the ADTS frame 952 includes an ADTS header 960 and a RAW data block 961. A low-frequency code 970 and a FILL element 971 are stored in the RAW data block 961. The high-frequency code 972 is also stored in the FILL element 971. The data structure of the ADTS frames 951, 953, and 954 is the same as the data structure of the ADTS frame 952.
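The nesting of the data described above may be pictured with the following Python sketch. It models only the containment relationship stated in the description (an ADTS header and a RAW data block in each frame, and a low-frequency code and a FILL element holding the high-frequency code in the RAW data block); the class names and the use of byte strings are assumptions, and the actual bit-level ADTS syntax is not reproduced.

```python
from dataclasses import dataclass


@dataclass
class FillElement:
    high_frequency_code: bytes  # high-frequency code stored in the FILL element


@dataclass
class RawDataBlock:
    low_frequency_code: bytes   # low-frequency code
    fill_element: FillElement   # FILL element carrying the high-frequency code


@dataclass
class AdtsFrame:
    header: bytes               # ADTS header (contents not modelled here)
    raw_data_block: RawDataBlock


def build_frame(header, low_code, high_code):
    """Assemble one frame from an assumed header and the two codes."""
    return AdtsFrame(header, RawDataBlock(low_code, FillElement(high_code)))
```

An encoded stream would then be a sequence of such frames, one per ADTS frame such as 951 to 954.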
Next, the decoding apparatus 700 illustrated in
The code separation unit 701 is a processing unit that receives the encoded stream from the audio encoding apparatus 600 and separates the low-frequency code and the high-frequency code included in the encoded stream. The code separation unit 701 outputs the low-frequency code to the low-frequency decoding unit 702. The code separation unit 701 outputs the high-frequency code to the high-frequency inverse quantization unit 704.
The low-frequency decoding unit 702 is a processing unit that generates a low-frequency signal by decoding the low-frequency code. The low-frequency decoding unit 702 outputs the low-frequency signal to the analysis QMF unit 703.
The analysis QMF unit 703 is a processing unit that converts the low-frequency signal into a time-frequency signal using the QMF filter bank defined by the equation (3). This time-frequency signal is information corresponding to the frequency spectrum of the low-frequency of each time. In the following description, the time-frequency signal obtained by converting the low-frequency signal is referred to as a “low-frequency signal.”
The high-frequency inverse quantization unit 704 is a processing unit that extracts high-frequency information by decoding the high-frequency code. The high-frequency inverse quantization unit 704 outputs the extracted high-frequency information to the high-frequency generation unit 705. The high-frequency information includes an envelope power, a tone frequency, and a frequency resolution.
The high-frequency generation unit 705 is a processing unit that generates a high-frequency signal based on the low-frequency signal. The high-frequency signal generated by the high-frequency generation unit 705 is information corresponding to the frequency spectrum of the high-frequency representing a relationship between the time and the frequency. The high-frequency generation unit 705 outputs the high-frequency signal and the high-frequency information to the envelope adjusting unit 706.
Hereinafter, descriptions will be made of the processing of the high-frequency generation unit 705 when the inverse filter mode is OFF and the processing of the high-frequency generation unit 705 when the inverse filter mode is ON. The ON/OFF of the inverse filter mode is set in the high-frequency generation unit 705 in advance.
Descriptions will be made of the processing of the high-frequency generation unit 705 when the inverse filter mode is “OFF.” The high-frequency generation unit 705 generates a high-frequency signal by replicating the low-frequency signal to the high-frequency side as it is.
Descriptions will be made of the processing of the high-frequency generation unit 705 when the inverse filter mode is “ON.” When the inverse filter mode is “ON,” the high-frequency generation unit 705 generates a high-frequency signal by performing an inverse filter (performing a decorrelation) on the low-frequency signal and replicating the low-frequency signal on which the inverse filter is performed to the high-frequency side. The decorrelation performed by the high-frequency generation unit 705 on the low-frequency signal is an example of correction for the low-frequency signal.
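The two cases of the high-frequency generation unit 705 may be sketched as follows. The decorrelation is supplied as a hypothetical callable because the description does not define the inverse filter itself, and the replication to the high-frequency side is simplified to a plain copy of the low-frequency values; both are assumptions for illustration.

```python
def generate_high_band(low_band, inverse_filter_on, decorrelate):
    """Generate high-frequency values from the low-frequency signal.

    When the inverse filter mode is ON, the low-frequency signal is first
    decorrelated and the result is replicated. When it is OFF, the
    low-frequency signal is replicated as it is.
    decorrelate: hypothetical callable standing in for the inverse filter.
    """
    source = decorrelate(low_band) if inverse_filter_on else low_band
    return list(source)  # simplified replication: copy to the high band
```

The envelope adjustment and the addition of tone components based on the tone frequency are performed afterwards by the envelope adjusting unit 706 and are not part of this sketch.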
The envelope adjusting unit 706 is a processing unit that adjusts the high-frequency signal based on the frequency resolution and the envelope power included in the high-frequency information. The envelope adjusting unit 706 also gives a tone component to the high-frequency signal based on the tone frequency. The envelope adjusting unit 706 outputs the adjusted high-frequency signal to the synthesizing unit 707.
The synthesizing unit 707 is a processing unit that decodes the sound signal by synthesizing the low-frequency signal output from the analysis QMF unit 703 and the adjusted high-frequency signal output from the envelope adjusting unit 706. The synthesizing unit 707 outputs the decoded sound signal.
Next, an example of the processing procedure of the audio encoding apparatus 600 according to the sixth embodiment will be described.
The high-frequency information extraction unit 602 of the audio encoding apparatus 600 extracts high-frequency information from a sound signal (time-frequency signal) (operation S503). The high-frequency encoding unit 603 of the audio encoding apparatus 600 encodes the high-frequency information and generates a high-frequency code (operation S504). The high-frequency information extraction unit 602 estimates the ON/OFF of the inverse filter mode (operation S505).
The low-frequency extraction unit 605 of the audio encoding apparatus 600 extracts a low-frequency signal from a sound signal (time-frequency signal) (operation S506). The low-frequency correction unit 606 performs a correction determination processing (operation S507). The processing procedure of the correction determination processing of the operation S507 corresponds to the processing procedure described with reference to
The frequency-time conversion unit 607 of the audio encoding apparatus 600 performs a frequency-time conversion with respect to the low-frequency signal (operation S508). The low-frequency encoding unit 608 encodes the low-frequency signal and generates a low-frequency code (operation S509).
The multiplexing unit 609 of the audio encoding apparatus 600 generates an encoded stream by multiplexing the low-frequency code and the high-frequency code (operation S510). The multiplexing unit 609 transmits the encoded stream to the decoding apparatus 700 (operation S511).
Next, an example of the processing procedure of the decoding apparatus 700 according to the sixth embodiment will be described.
The low-frequency decoding unit 702 of the decoding apparatus 700 generates a low-frequency signal by decoding the low-frequency code (operation S602). The analysis QMF unit 703 of the decoding apparatus 700 generates a low-frequency signal using the QMF filter bank (operation S603).
The high-frequency inverse quantization unit 704 of the decoding apparatus 700 generates high-frequency information by performing a high-frequency inverse quantization on the high-frequency code (operation S604). The high-frequency generation unit 705 of the decoding apparatus 700 determines whether the inverse filter mode is ON (operation S605).
When it is determined that the inverse filter mode is OFF (“NO” in the operation S605), the high-frequency generation unit 705 proceeds to the operation S607. In the meantime, when it is determined that the inverse filter mode is ON (“YES” in the operation S605), the high-frequency generation unit 705 performs an inverse filter processing on the low-frequency signal (operation S606).
The high-frequency generation unit 705 generates a high-frequency signal by replicating the low-frequency signal (operation S607). The envelope adjusting unit 706 of the decoding apparatus 700 adjusts the envelope of the high-frequency signal based on the high-frequency information (operation S608).
The synthesizing unit 707 of the decoding apparatus 700 decodes the sound signal by synthesizing the low-frequency signal and the high-frequency signal (operation S609). The synthesizing unit 707 outputs the sound signal (operation S610).
Next, the effect of the audio encoding apparatus 600 according to the sixth embodiment will be described. The audio encoding apparatus 600 controls whether the low-frequency signal is corrected according to the ON/OFF of the inverse filter mode. For example, when the inverse filter mode is “OFF,” the audio encoding apparatus 600 suppresses the tone by correcting the low-frequency signal. In the meantime, when the inverse filter mode is “ON,” the audio encoding apparatus 600 does not correct the low-frequency signal, so that the tone of the low-frequency signal is not suppressed. In this way, the suppression of the tone is controlled according to the ON/OFF of the inverse filter mode, and the problem of quality deterioration of the sound signal when the decoding apparatus 700 performs a decoding is resolved.
When the inverse filter mode is “OFF,” the audio encoding apparatus 600 suppresses the tone by performing the low-frequency signal correction, thereby suppressing the vibration caused by generation of a plurality of tones near the boundary between the low-frequency and the high-frequency and resolving the problem of quality deterioration of the sound signal.
In addition, when the inverse filter mode is “ON,” the audio encoding apparatus 600 does not perform the low-frequency signal correction, thereby resolving the problem of quality deterioration of the sound signal which is caused when no tone is generated near the boundary between the low-frequency and the high-frequency.
The audio encoding apparatus 600 estimates whether the inverse filter mode is ON or OFF based on the average value of the tone components included in the sound signal and the average power of the sound signal. Thus, whether the inverse filter is executed on the decoding apparatus 700 side may be automatically estimated in accordance with the characteristics of the sound signal.
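The encoder-side behavior described above can be sketched as follows. This is a hedged illustration only: the ratio test, its direction, the threshold value, the attenuation factor, and all function and parameter names are assumptions introduced here, not the rule used by the audio encoding apparatus 600. What is taken from the description is that the estimate uses the average value of the tone components and the average power of the sound signal, and that the low-frequency correction is performed only when the inverse filter mode is OFF.

```python
from typing import List


def estimate_inverse_filter_on(tone_powers: List[float],
                               frame_powers: List[float],
                               threshold: float = 0.5) -> bool:
    """Estimate the inverse filter mode from two averages.

    Assumed rule (hypothetical): report ON when the average tone component
    is small relative to the average power of the sound signal.
    """
    if not tone_powers or not frame_powers:
        return False
    avg_tone = sum(tone_powers) / len(tone_powers)
    avg_power = sum(frame_powers) / len(frame_powers)
    if avg_power <= 0:
        return False
    return (avg_tone / avg_power) < threshold


def correct_low_band(low: List[float], tone_index: int,
                     attenuation: float = 0.5) -> List[float]:
    """Hypothetical correction: attenuate the band that holds the tone."""
    corrected = list(low)
    corrected[tone_index] *= attenuation
    return corrected


def prepare_low_band(low: List[float], tone_index: int,
                     inverse_filter_on: bool) -> List[float]:
    # From the description: suppress the tone only when the mode is OFF.
    if inverse_filter_on:
        return list(low)
    return correct_low_band(low, tone_index)
```

For instance, `prepare_low_band([1.0, 8.0, 1.0], 1, False)` attenuates the middle band, while the same call with `True` returns the low band unchanged.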
The decoding apparatus 700 according to the sixth embodiment corrects the frequency spectrum of the low-frequency signal (performs an inverse filter on the low-frequency) according to the ON/OFF of the inverse filter mode and decodes the high-frequency signal using the corrected frequency spectrum of the low-frequency signal. As described above, the tone component of the low-frequency signal is not corrected when the inverse filter mode is ON. Thus, even when the inverse filter is performed, the audio encoding apparatus 600 may resolve the problem of sound quality deterioration since the tone component remains near the boundary of the decoded sound signal.
Next, descriptions will be made of an example of the hardware configuration of a computer that implements the same functions as those of the audio encoding apparatus 100 (200, 300, 301, 400, or 600) illustrated in the above-described embodiment.
As illustrated in
The hard disk device 507 includes a determination program 507a, an encoding program 507b, and a multiplexing program 507c. The CPU 501 reads the determination program 507a, the encoding program 507b, and the multiplexing program 507c to develop these programs in the RAM 506.
The determination program 507a functions as a determination processing 506a. The encoding program 507b functions as an encoding processing 506b. The multiplexing program 507c functions as a multiplexing processing 506c.
The determination processing 506a corresponds to the processing of the determination units 130, 210, and 604. The encoding processing 506b corresponds to the processing of a low-frequency signal extraction unit 110, a high-frequency information extraction unit 120, a low-frequency correction unit 140, an input signal correction unit 220, the low-frequency encoding units 150 and 320, the high-frequency correction units 160 and 410, a high-frequency encoding unit 170, and the encoding unit 600a. The multiplexing processing 506c corresponds to the processing of the multiplexing units 180 and 609.
Next, descriptions will be made of an example of the hardware configuration of a computer that implements the same function as the decoding apparatus 700 illustrated in the above-described embodiment.
As illustrated in
The hard disk device 557 includes a separation program 557a, a low-frequency decoding program 557b, a high-frequency generation program 557c, and a synthesis program 557d. The CPU 551 reads the separation program 557a, the low-frequency decoding program 557b, the high-frequency generation program 557c, and the synthesis program 557d to develop these programs in the RAM 556.
The separation program 557a functions as a separation processing 556a. The low-frequency decoding program 557b functions as a low-frequency decoding processing 556b. The high-frequency generation program 557c functions as a high-frequency generation processing 556c. The synthesis program 557d functions as a synthesis processing 556d.
The separation processing 556a corresponds to the processing of the code separation unit 701. The low-frequency decoding processing 556b corresponds to the processing of the low-frequency decoding unit 702. The high-frequency generation processing 556c corresponds to the processing of the high-frequency generation unit 705. The synthesis processing 556d corresponds to the processing of the synthesizing unit 707.
Further, each of the programs 507a to 507c and 557a to 557d need not necessarily be stored in the hard disk devices 507 and 557 from the beginning. For example, each program may be stored in a “portable physical medium” such as a flexible disk (FD), a CD-ROM, a DVD disk, a magneto-optical disk, or an IC card inserted in the computer 500 or 550. Then, the computers 500 and 550 may be configured to read and execute the programs 507a to 507c and 557a to 557d, respectively.
All examples and conditional language recited herein are intended for pedagogical purposes to aid the reader in understanding the invention and the concepts contributed by the inventor to furthering the art, and are to be construed as being without limitation to such specifically recited examples and conditions, nor does the organization of such examples in the specification relate to an illustrating of the superiority and inferiority of the invention. Although the embodiments of the present invention have been described in detail, it should be understood that the various changes, substitutions, and alterations could be made hereto without departing from the spirit and scope of the invention.
Number | Date | Country | Kind |
---|---|---|---|
2017-147119 | Jul 2017 | JP | national |
2017-199673 | Oct 2017 | JP | national |
Number | Date | Country | |
---|---|---|---|
20190035413 A1 | Jan 2019 | US |