The present disclosure relates to an audio-signal processing apparatus and method, and program. In particular, the present disclosure relates to an audio-signal processing apparatus and method, and program that is suitable for use in the case of expanding a dynamic range of an audio signal.
When an audio signal is transmitted through a network, such as the Internet, a transmittable peak value of the audio signal (hereinafter referred to as a maximum transmission level) is limited by the capacity of a transmission line, a standard, etc. If an audio signal is attenuated such that its peak value does not exceed the maximum transmission level, then the wider the dynamic range of the audio signal, the lower the average value of the signal level (hereinafter referred to as an average level) becomes, and a sound-volume feeling is lost.
Thus, in order to increase a sound-volume feeling, there are cases where an audio signal is amplified to increase an average level once, then, components of the audio signal exceeding a maximum transmission level are cut or attenuated by a limiter circuit, etc., and the audio signal is transmitted. A description will be given of a specific example of this processing with reference to FIG. 1A to
Individual graphs in
The audio signal in
On the other hand, to date, an expander has become widespread as a technique for expanding a dynamic range of an audio signal (for example, refer to Japanese Unexamined Patent Application Publication No. 2001-230647). The expander changes an amplification gain in accordance with an audio signal level so as to expand a dynamic range of the audio signal. Also, an expander enabling a user to adjust an expansion characteristic is provided.
Using an expander having this input/output characteristic, it is possible to make a small sound still smaller, and to make a loud sound further louder. As a result, it is possible to expand a dynamic range, and to obtain a sound that is nicely varied.
Here, consider the case where the dynamic range of the audio signal in
Normally, an average level of an audio signal is decreased to a predetermined value before the audio signal is input into the expander in order for a peak value of the audio signal not to become too high by expanding the dynamic range. For example, the audio signal in
The present disclosure has been made in view of such circumstances. It is desirable to properly recover a dynamic range of an audio signal.
According to an embodiment of the present disclosure, there is provided an audio-signal processing apparatus including: a center-localization-degree detection section detecting a center-localization degree indicating a degree of concentration on a center of localization distribution of an audio signal; an expansion-section detection device detecting an expansion section expanding a dynamic range of the audio signal on the basis of the center-localization degree; and an expansion section expanding the dynamic range of the audio signal in the expansion section.
In the above-described audio-signal processing apparatus, the expansion-section detection device may detect a section having a center-transition degree indicating a degree of transition of the audio signal on the center of localization distribution not less than a predetermined threshold value as the expansion section.
The center-transition degree may be a derivative of the center-localization degree.
The expansion-section detection device may detect a section having the center-localization degree not less than a predetermined threshold value as the expansion section.
The above-described audio-signal processing apparatus may further include: an input-signal-level detection section detecting an input signal level indicating a level of the audio signal; and an expansion-signal-level setting section setting an expansion signal level indicating an expansion level of the dynamic range of the audio signal and having a predetermined minimum value in a section different from the expansion section and changing within a range not higher than a predetermined maximum value in the expansion section, wherein the expansion section expands the dynamic range of the audio signal in a section having the expansion signal level higher than the input signal level.
The above-described audio-signal processing apparatus may further include: a comparison section comparing the input signal level and the expansion signal level, and outputting the higher of the two levels; and an expansion-gain calculation section calculating an expansion gain on the basis of the output value of the comparison section and the input signal level, wherein the expansion section amplifies the audio signal on the basis of the expansion gain so as to expand the dynamic range of the audio signal.
The above-described audio-signal processing apparatus may further include a correction section correcting the level of the audio signal so as to keep the signal at a constant level, wherein the center-localization-degree detection section detects a center-localization degree of the corrected audio signal, and the expansion section expands the dynamic range of the corrected audio signal.
According to another embodiment of the present disclosure, there is provided a method of processing an audio signal by an audio-signal processing apparatus expanding a dynamic range of the audio signal, the method including: detecting a center-localization degree indicating a degree of concentration on a center of localization distribution of the audio signal; detecting an expansion section expanding the dynamic range of the audio signal on the basis of the center-localization degree; and expanding the dynamic range of the audio signal in the expansion section.
According to another embodiment of the present disclosure, there is provided a program for causing a computer to perform processing including: detecting a center-localization degree indicating a degree of concentration on a center of localization distribution of an audio signal; detecting an expansion section expanding a dynamic range of the audio signal on the basis of the center-localization degree; and expanding the dynamic range of the audio signal in the expansion section.
In an embodiment of the present disclosure, a center-localization degree indicating a degree of concentration on a center of localization distribution of an audio signal is detected, an expansion section expanding a dynamic range of the audio signal is detected on the basis of the center-localization degree, and the dynamic range of the audio signal is expanded in the expansion section.
By an embodiment of the present disclosure, it is possible to properly recover a dynamic range of an audio signal.
In the following, descriptions will be given of modes for carrying out the present disclosure (hereinafter called embodiments). In this regard, the descriptions will be given in the following order.
1. First embodiment (basic embodiment)
2. Second embodiment (example of suppressing excessive expansion of dynamic range)
3. Variations
An audio-signal processing apparatus 101 in
The AGC 111 performs level correction so as to keep the level of the input signal constant. The AGC 111 supplies the level-corrected input signal (hereinafter referred to as a corrected input signal) to the expansion-section detection device 112 and the dynamic range expander 115.
The expansion-section detection device 112 detects an expansion section in which a dynamic range of the input signal is expanded on the basis of the corrected input signal. The expansion-section detection device 112 supplies an expansion-section detection signal indicating the detected expansion section to the expansion-signal-level setting section 113.
The expansion-signal-level setting section 113 sets an expansion signal level indicating a level to which the dynamic range of the input signal is expanded on the basis of the expansion-section detection signal. The expansion-signal-level setting section 113 supplies an expansion-signal-level signal indicating the set expansion signal level to the expansion-gain calculator 114.
The expansion-gain calculator 114 calculates an expansion gain for expanding the dynamic range of the input signal on the basis of the expansion-signal-level signal. The expansion-gain calculator 114 supplies an expansion-gain signal indicating the calculated expansion gain to the dynamic range expander 115.
The dynamic range expander 115 amplifies the corrected input signal on the basis of the expansion-gain signal so as to expand the dynamic range of the corrected input signal. The dynamic range expander 115 outputs the audio signal (hereinafter referred to as an output signal) obtained as a result to the subsequent stage.
The center-localization degree detector 131 detects a center localization degree indicating a degree of concentration on a center of a localization distribution of the corrected input signal supplied from the AGC 111. The center-localization degree detector 131 supplies a center-localization-degree detection signal indicating the detected center localization degree to the center-transition degree detector 132.
The center-transition degree detector 132 detects a degree of transition on a center of the localization distribution of the corrected input signal on the basis of the center-localization-degree detection signal. The center-transition degree detector 132 supplies a center-transition-degree detection signal indicating the detected center-transition degree to the threshold determinator 133.
The threshold determinator 133 detects an expansion section by comparing the center-transition-degree detection signal with a predetermined threshold value. The threshold determinator 133 supplies an expansion-section detection signal indicating the detected expansion section to the expansion-signal-level setting section 113.
Next, a description will be given of dynamic-range expansion processing executed by the audio-signal processing apparatus 101 with reference to a flowchart in
In this regard, individual graphs in
Also, in the following, a description will be given of processing of the case where an input signal shown in
In step S1, the AGC 111 performs level correction. Specifically, the AGC 111 performs level correction on the input signal such that an average level of the individual channels becomes a predetermined reference level. Thereby, for example, the input signal in
In this regard, a method of the AGC 111 performing level correction is not limited to a specific method, and any method may be employed.
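Because any level-correction method may be used, the following is only a minimal sketch. It assumes a single static gain per channel that brings the mean absolute level of that channel to a reference level. The function name `agc_correct` and the default `reference_level=0.1` are hypothetical and do not come from the present disclosure.

```python
def agc_correct(channel, reference_level=0.1):
    """Scale one channel so that its mean absolute level equals reference_level.

    Assumption: one static gain for the whole channel. A practical AGC
    would usually track the level over time instead.
    """
    if not channel:
        return []
    average_level = sum(abs(x) for x in channel) / len(channel)
    if average_level == 0.0:
        return list(channel)  # silent input: nothing to correct
    gain = reference_level / average_level
    return [x * gain for x in channel]
```

In this sketch the function would be called once for the left channel and once for the right channel, so that each channel reaches the same reference level.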
In step S2, the expansion-section detection device 112 executes expansion-section detection processing. Here, a detailed description will be given of the expansion-section detection processing with reference to a flowchart in
In this regard, individual graphs in
In step S21, the center-localization degree detector 131 detects a center localization degree. Specifically, the center-localization degree detector 131 calculates a localization distribution of the corrected input signal on the basis of the corrected input signals of both right and left channels. Further, the center-localization degree detector 131 detects a center localization degree indicating how much a localization distribution of the corrected input signal is concentrated on a center on the basis of the calculated localization distribution. And the center-localization degree detector 131 supplies a center-localization-degree detection signal indicating the detected center localization degree to the center-transition degree detector 132.
In this regard, in order to detect a localization distribution and a center localization degree of an audio signal, any method, for example, a method disclosed in Japanese Unexamined Patent Application Publication No. 2008-28693, etc., may be employed.
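As one illustration of what a center localization degree could look like, the sketch below uses a simple mid/side energy ratio per frame. This is an assumption made for illustration only; it is not the method of the cited publication, and the function name `center_localization_degree` and the parameter `frame_size` are hypothetical.

```python
def center_localization_degree(left, right, frame_size=4):
    """Return one value in [0, 1] per frame; 1.0 means left == right.

    Assumption: degree = mid energy / (mid energy + side energy), where
    mid = (L + R) / 2 and side = (L - R) / 2.
    """
    degrees = []
    for start in range(0, min(len(left), len(right)), frame_size):
        l = left[start:start + frame_size]
        r = right[start:start + frame_size]
        mid = sum(((a + b) / 2) ** 2 for a, b in zip(l, r))
        side = sum(((a - b) / 2) ** 2 for a, b in zip(l, r))
        total = mid + side
        # A silent frame carries no localization information; report 0.0.
        degrees.append(mid / total if total > 0.0 else 0.0)
    return degrees
```

With this measure, identical left and right channels give 1.0, opposite-phase channels give 0.0, and a signal present in only one channel gives 0.5.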
In step S22, the center-transition degree detector 132 detects a center-transition degree. Specifically, the center-transition degree detector 132 differentiates the center-localization-degree detection signal so as to generate a center-transition-degree detection signal indicating how much the localization distribution of the corrected input signal has changed to a center. And the center-transition degree detector 132 supplies the generated center-transition-degree detection signal to the threshold determinator 133.
In step S23, the threshold determinator 133 detects an expansion section. Specifically, the threshold determinator 133 compares the center-transition-degree detection signal with a predetermined threshold value. And the threshold determinator 133 generates an expansion-section detection signal which becomes an H level (high level) during a section having the center-transition-degree detection signal not less than a threshold value, and becomes an L level (low level) during a section having the center-transition-degree detection signal less than the threshold value. A section in which the center-transition-degree detection signal becomes the threshold value or more, and the expansion-section detection signal becomes the H level is determined to be an expansion section. And the threshold determinator 133 supplies the generated expansion-section detection signal to the expansion-signal-level setting section 113.
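Steps S22 and S23 can be sketched as follows, assuming that the center-localization degree is available as a sequence of sampled values and that differentiation is approximated by a first-order difference. The function name `detect_expansion_sections` and the default threshold are hypothetical.

```python
def detect_expansion_sections(center_degree, threshold=0.2):
    """Return a list of booleans (True = H level, False = L level).

    Assumption: the derivative is approximated by the difference between
    consecutive values; the first value has a difference of 0.
    """
    flags = []
    previous = center_degree[0] if center_degree else 0.0
    for value in center_degree:
        transition = value - previous  # center-transition degree
        flags.append(transition >= threshold)  # "not less than" the threshold
        previous = value
    return flags
```

Only a rapid increase of the center-localization degree sets the flag; a high but steady degree, or a decrease, leaves it at the L level.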
In general, playback sound of a stereo audio signal, such as music, is spread in the right and left directions in a room, while the sound of rhythm instruments, vocals, etc., is concentrated around the center and localized there. Accordingly, the localization distribution of an audio signal is normally spread in the right and left directions, but it rapidly concentrates on the center at the moment when a strong attack appears, such as the beginning phrase of a vocal or an attack sound of a rhythm instrument.
Also, even if amplitude of an audio signal is corrected by limit processing, etc., a localization distribution of an audio signal is not so much influenced by that correction, and is kept substantially unchanged. That is to say, a localization distribution of an audio signal seldom changes before and after amplitude of the audio signal is corrected.
Accordingly, by detecting a section having a center-transition degree not less than a threshold value, it is possible to detect a section in which a localization distribution of an audio signal rapidly changes, that is to say, to detect a section including a moment of the appearance of a strong attack as an expansion section.
After that, the expansion-section detection processing terminates.
Referring back to
And the expansion-signal-level setting section 113 supplies the generated expansion-signal level signal to the expansion-gain calculator 114.
In step S4, the expansion-gain calculator 114 calculates an expansion gain. Specifically, the expansion-gain calculator 114 converts a value of the expansion-signal level signal such that the expansion-signal peak level becomes a predetermined maximum value (for example, 2.0) and the expansion-signal minimum level becomes 1.0. Thereby, the expansion-gain signal is generated.
And the expansion-gain calculator 114 supplies the generated expansion-gain signal to the dynamic range expander 115.
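The conversion in step S4 can be illustrated with a linear mapping. This is a sketch under two assumptions: that the minimum and peak values of the expansion signal level are known in advance, and that the conversion between them is linear. The function name `expansion_gain` and its parameters are hypothetical.

```python
def expansion_gain(levels, level_min, level_peak, gain_max=2.0):
    """Map expansion signal levels so level_min -> 1.0 and level_peak -> gain_max.

    Assumption: the mapping between the two end points is linear.
    """
    if level_peak <= level_min:
        raise ValueError("level_peak must be greater than level_min")
    scale = (gain_max - 1.0) / (level_peak - level_min)
    return [1.0 + (v - level_min) * scale for v in levels]
```

A gain of 1.0 outside the expansion sections means that the signal there passes through unchanged.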
In step S5, the dynamic range expander 115 expands the dynamic range. Specifically, the dynamic range expander 115 amplifies the corrected input signal using the expansion gain indicated by the expansion-gain signal so as to expand the dynamic range.
Thereby, the dynamic range of the corrected input signal is expanded in an expansion section having the expansion gain exceeding 1.0. And as described above, the expansion section includes a moment at which a strong attack appears, and thus the strong attack appears in the original audio signal, making it possible to expand the dynamic range at the moment when the signal level becomes high. Accordingly, for example, it is possible to amplify the original audio signal in
And the dynamic range expander 115 outputs the audio signal (output signal) with the expanded dynamic range to the subsequent stage.
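The amplification in step S5 amounts to a multiplication by the expansion gain at each point in time. The sketch below assumes that the gain is given per sample and that the same gain is applied to both channels; the function name `expand_dynamic_range` is hypothetical.

```python
def expand_dynamic_range(left, right, gains):
    """Multiply each stereo sample by the expansion gain for that point in time.

    Assumption: one gain value per sample, shared by both channels.
    """
    out_left = [x * g for x, g in zip(left, gains)]
    out_right = [x * g for x, g in zip(right, gains)]
    return out_left, out_right
```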
After that, the dynamic-range expansion processing is terminated.
Incidentally, in the audio-signal processing apparatus 101, an audio signal is amplified unconditionally in an expansion section having a center-transition degree not less than a threshold value, and thus excessive expansion of a dynamic range is sometimes performed on an input signal. A description will be given of a specific example of this case with reference to
In this regard, the horizontal axes of individual graphs in
A second embodiment of the present disclosure allows suppression of such excessive expansion of a dynamic range.
As compared with the audio-signal processing apparatus 101 in
The signal-level detector 211 detects a level of the corrected input signal supplied from the AGC 111, and supplies an input-signal level signal indicating a detection result to the comparator 212 and the expansion-gain calculator 213.
The comparator 212 compares the input-signal level signal with the expansion-signal level signal supplied from the expansion-signal-level setting section 113, and supplies a comparator-output signal indicating a comparison result to the expansion-gain calculator 213.
The expansion-gain calculator 213 calculates an expansion gain on the basis of the input-signal level signal and the comparator-output signal. The expansion-gain calculator 213 supplies the expansion-gain signal indicating the calculated expansion gain to the dynamic range expander 115.
Next, a description will be given of dynamic-range expansion processing executed by the audio-signal processing apparatus 201 with reference to a flowchart in
In this regard, individual graphs in
Also, in the following, a description will be given of processing of the case where an input signal shown in
In step S51, level correction on the input signal is performed in the same processing as the processing in step S1 in
On the other hand,
In step S52, the signal-level detector 211 detects the input signal level. For example, the signal-level detector 211 detects an arithmetic mean of the corrected input signals of the right and left channels as an input signal level. The signal-level detector 211 supplies an input-signal level signal indicating the detected input signal level to the comparator 212 and the expansion-gain calculator 213.
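The level detection in step S52 can be sketched as below. The use of absolute sample values before averaging is an assumption made here so that the result behaves as a level; the function name `input_signal_level` is hypothetical.

```python
def input_signal_level(left, right):
    """Return the arithmetic mean of the two channel magnitudes per sample.

    Assumption: the magnitude (absolute value) of each sample is used.
    """
    return [(abs(l) + abs(r)) / 2 for l, r in zip(left, right)]
```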
On the other hand,
In step S53, in the same manner as the processing in step S2 in
As described above, the localization distribution of the audio signal is seldom changed by the limiter processing, etc. Accordingly, substantially the same sections are detected for the corrected input signals in
In step S54, the expansion signal level is set in the same manner as the processing in step S3 in
In step S55, the comparator 212 compares the input signal level with the expansion signal level. Specifically, the comparator 212 compares the input signal level indicated by the input-signal level signal with the expansion signal level indicated by the expansion-signal level signal, and selects and outputs the higher of the two. Thereby, a comparator-output signal indicating the higher level between the input signal level and the expansion signal level at each point in time is generated, and is supplied from the comparator 212 to the expansion-gain calculator 213.
On the other hand,
In step S56, the expansion-gain calculator 213 calculates an expansion gain. Specifically, the expansion-gain calculator 213 calculates the ratio of the comparator-output signal value to the input-signal level signal value (input signal level) as the expansion gain. And the expansion-gain calculator 213 generates an expansion-gain signal indicating the calculated expansion gain, and supplies it to the dynamic range expander 115.
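Steps S55 and S56 can be sketched together as follows. The function names are hypothetical, and the handling of a zero input level (a gain of 1.0) is an assumption added here to avoid division by zero; it is not described in the present disclosure.

```python
def comparator_output(input_level, expansion_level):
    """Select the higher of the two levels at each point in time."""
    return [max(a, b) for a, b in zip(input_level, expansion_level)]


def expansion_gain_from_levels(input_level, comparator_out, eps=1e-12):
    """Expansion gain = comparator output / input signal level."""
    gains = []
    for level, selected in zip(input_level, comparator_out):
        if level <= eps:
            gains.append(1.0)  # assumption: leave (near-)silence unamplified
        else:
            gains.append(selected / level)
    return gains
```

Wherever the input signal level is already equal to or higher than the expansion signal level, the comparator outputs the input signal level itself and the gain is 1.0. The signal is therefore amplified only where the expansion signal level exceeds the input signal level, which is how excessive expansion is suppressed.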
On the other hand,
In step S57, the corrected input signal is amplified, and the dynamic range is expanded by the same processing as that in step S5 in
On the other hand,
After that, the dynamic-range expansion processing is terminated.
In the following, a description will be given of variations of the embodiments of the present disclosure.
Variation 1
In the above description, expansion sections are detected on the basis of the corrected input signal after having been subjected to level correction. However, as described above, a localization distribution of an audio signal is hardly influenced by level correction, and thus expansion sections may be detected on the basis of the input signal before having been subjected to level correction.
Variation 2
Also, in the above description, expansion sections are detected on the basis of a center-transition degree. However, expansion sections may be detected on the basis of a center localization degree. For example, a section having a center localization degree of a predetermined threshold value or higher may be detected as an expansion section. Thereby, it is possible to expand a dynamic range of a section in which a localization distribution of an input signal is concentrated on a center, for example, a section having a large sound volume of vocal and rhythm instruments. Also, expansion sections may be detected using both the center localization degree and the center-transition degree.
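Detection on the basis of the center localization degree alone can be sketched as a direct threshold comparison. The function name `detect_by_center_degree` and the default threshold are hypothetical.

```python
def detect_by_center_degree(center_degree, threshold=0.7):
    """Flag every point whose center localization degree is the threshold or higher."""
    return [value >= threshold for value in center_degree]
```

Unlike detection by the center-transition degree, this flags the whole section during which the localization distribution stays concentrated on the center, not only the moment of the transition.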
Variation 3
Also, a peak value of the expansion-signal level signal and the slope of the expansion-signal level signal at increase time or decrease time may be varied on the basis of, for example, a length of an expansion section, and a level of the corrected input signal. Also, in
Variation 4
Further, if the average levels are arranged to be equal before input, and then the input signal is input, it is possible to eliminate the AGC 111, and to omit the level correction processing.
Variation 5
Also, when dynamic-range expansion processing is performed on a multi-channel input signal, for example, the audio signals of the three front-side channels, namely the right, left, and center channels, may be down-mixed, and expansion sections may be detected on the basis of the generated audio signal. Then, on the basis of the detected expansion sections, the dynamic-range expansion may be performed only on the audio signals of the three front-side channels.
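The multi-channel case can be sketched as below. The down-mix coefficient, the channel names ("L", "R", "C", "Ls", "Rs"), and both function names are assumptions made for illustration; the present disclosure does not specify a down-mix method.

```python
def downmix_front(left, right, center, center_gain=0.7071):
    """Fold the center channel into left and right to obtain a stereo pair.

    Assumption: the center channel is added to both sides with center_gain.
    """
    mixed_left = [l + center_gain * c for l, c in zip(left, center)]
    mixed_right = [r + center_gain * c for r, c in zip(right, center)]
    return mixed_left, mixed_right


def expand_front_only(channels, gains, front=("L", "R", "C")):
    """Apply the expansion gains to the front channels only.

    channels maps a channel name to its samples; channels not listed in
    front are copied through unchanged.
    """
    return {
        name: [x * g for x, g in zip(samples, gains)] if name in front else list(samples)
        for name, samples in channels.items()
    }
```

The stereo pair returned by `downmix_front` would be used only for detecting expansion sections, while `expand_front_only` applies the resulting gains to the original front channels.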
In this regard, it is possible to apply the present disclosure, for example, to an apparatus performing playback or recording of an audio signal, or an apparatus performing correction of an audio signal.
Also, in the present specification, the audio signal includes a sound signal including only a human voice and a cry of an animal, etc.
The above-described series of processing can be executed by hardware or can be executed by software. When the series of processing is executed by software, programs of the software may be installed in a computer. Here, the computer includes a computer that is built into dedicated hardware and, for example, a general-purpose personal computer capable of executing various functions by installing various programs.
In the computer, a CPU (Central Processing Unit) 301, a ROM (Read Only Memory) 302, and a RAM (Random Access Memory) 303 are mutually connected through a bus 304.
An input/output interface 305 is further connected to the bus 304. An input section 306, an output section 307, a storage section 308, a communication section 309, and a drive 310 are connected to the input/output interface 305.
The input section 306 includes a keyboard, a mouse, a microphone, etc. The output section 307 includes a display, a speaker, etc. The storage section 308 includes a hard disk, a nonvolatile memory, etc. The communication section 309 includes a network interface, etc. The drive 310 drives a removable medium 311, such as a magnetic disk, an optical disc, a magneto-optical disc, or a semiconductor memory, etc.
In the computer having the configuration described above, the CPU 301 loads the program stored, for example, in the storage section 308 into the RAM 303 through the input/output interface 305 and the bus 304 and executes the program, whereby the above-described series of processing is performed.
The program to be executed by the computer (CPU 301) can be provided by being recorded on a removable medium 311 as a package medium, etc., for example. Also, the program can be provided through a wired or wireless transmission medium, such as a local area network, the Internet, digital satellite broadcasting, etc.
In the computer, the programs can be installed in the storage section 308 through the input/output interface 305 by attaching the removable medium 311 to the drive 310. Also, the program can be received by the communication section 309 through a wired or wireless transmission medium and can be installed in the storage section 308. In addition, the program may be installed in the ROM 302 or the storage section 308 in advance.
In this regard, the programs to be executed by the computer may be programs that are processed in time series in accordance with the sequence described in this specification. Alternatively, the programs may be executed in parallel or at necessary timing, such as at the time of being called.
Also, an embodiment of the present disclosure is not limited to the above-described embodiments. It is possible to make various changes without departing from the gist of the present disclosure.
The present disclosure contains subject matter related to that disclosed in Japanese Priority Patent Application JP 2010-280164 filed in the Japan Patent Office on Dec. 16, 2010, the entire contents of which are hereby incorporated by reference.
Number | Date | Country | Kind |
---|---|---|---|
P2010-280164 | Dec 2010 | JP | national |