1. Field of the Invention
The present invention relates to an interpolation apparatus for interpolating an error portion of audio data such as PCM data.
2. Description of the Related Background Art
Recently, in order to enjoy music, audio data representing a music piece is downloaded onto a computer via the Internet, and the music piece is reproduced in accordance with the audio data. Errors such as data loss or corruption may occur in the downloaded audio data depending on the transmission conditions of the Internet. To interpolate such error portions, an audio data interpolation apparatus is employed (see Japanese Patent Publication 3041928, Japanese Unexamined Patent Application Publication 2000-214875, Japanese Unexamined Patent Application Publication 2002-41088, Japanese Unexamined Patent Application Publication H9-161417, and Japanese Unexamined Patent Application Publication 2003-99096, for example).
As shown in the drawing, a conventional audio data interpolation apparatus comprises an error position detecting unit 11, a PCM generating unit 12, a buffer 13, an interpolation processing unit 14, a delay unit 15, and an output switching unit 16.
The error position detecting unit 11 detects a frame including an error in the input data. When MP3 format audio data, for example, is used as the input data, an error check word consisting of a two-byte CRC (cyclic redundancy check) is provided immediately after the frame header of each frame, and when the stored check word does not match a CRC value calculated on the basis of the main data in the frame, the frame is determined to be an error frame. When the error position detecting unit 11 detects a frame including an error in the input data, an error detection signal is generated and transmitted to the PCM generating unit 12.
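As an illustration of this check, a minimal sketch in C is given below. It assumes a standard bitwise CRC-16 with the polynomial 0x8005 and initial value 0xFFFF, and it simplifies the byte range actually protected in an MP3 frame (parts of the header and the side information) to a single buffer; the function names are hypothetical.

```c
#include <stdint.h>
#include <stddef.h>

/* Bitwise CRC-16 (polynomial 0x8005, initial value 0xFFFF). The byte range
 * actually covered by the MP3 frame check is simplified to one buffer. */
static uint16_t crc16(const uint8_t *buf, size_t len)
{
    uint16_t crc = 0xFFFF;
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)((uint16_t)buf[i] << 8);
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 0x8000) ? (uint16_t)((crc << 1) ^ 0x8005)
                                 : (uint16_t)(crc << 1);
    }
    return crc;
}

/* Returns nonzero when the stored check word does not match the value
 * computed from the protected bytes, i.e. the frame is an error frame. */
int frame_has_error(uint16_t stored_crc, const uint8_t *protected_bytes, size_t len)
{
    return crc16(protected_bytes, len) != stored_crc;
}
```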
The PCM generating unit 12 is a decoder which decodes the input data, generates PCM data, and outputs the generated PCM data to the buffer 13. When the error detection signal from the error position detecting unit 11 indicates a frame including an error, the PCM generating unit 12 also outputs a switching signal indicating that frame (its frame number) to the output switching unit 16. The buffer 13 holds the PCM data supplied by the PCM generating unit 12 in block units corresponding to the frames of the input data, and outputs the held PCM data to the delay unit 15 at a predetermined timing.
The interpolation processing unit 14 receives the PCM data of the blocks preceding and following the error block from the buffer 13, creates interpolated PCM data corresponding to the error block using a recursive filter, and outputs the interpolated PCM data to the output switching unit 16.
The delay unit 15 delays the PCM data from the buffer 13 by the amount of time required for the interpolation processing unit 14 to create the interpolated PCM data, and then outputs the delayed PCM data to the output switching unit 16.
The output switching unit 16 normally receives and outputs the PCM data supplied by the delay unit 15, but receives and outputs the interpolated PCM data supplied by the interpolation processing unit 14 for the frame indicated by the switching signal.
With the above configuration, when the error position detecting unit 11 detects a frame including an error in the input data, an error detection signal is generated. The error detection signal is then output to the output switching unit 16 from the PCM generating unit 12 as a switching signal indicating the frame which includes the error. The PCM data that is generated by the PCM generating unit 12 passes through the delay unit 15, and is typically output by the output switching unit 16. At the time of the block which corresponds to the frame indicated by the switching signal, the output switching unit 16 outputs the interpolated PCM data supplied by the interpolation processing unit 14.
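The block-wise switching described above can be pictured with the following sketch; the structure and names (pcm_block, select_output) are hypothetical, and one block is assumed to correspond to one decoded frame.

```c
/* Hypothetical block type: one decoded frame's worth of PCM samples. */
typedef struct {
    int          frame_no;   /* frame (block) number of this PCM block */
    const short *samples;    /* decoded 16-bit PCM samples */
    int          n;          /* number of samples in the block */
} pcm_block;

/* Output selection of the output switching unit: the delayed PCM block is
 * passed through normally, and the interpolated block is substituted for
 * the frame number carried by the switching signal. */
const short *select_output(const pcm_block *delayed,
                           const pcm_block *interpolated,
                           int switching_frame_no)
{
    if (delayed->frame_no == switching_frame_no)
        return interpolated->samples;   /* replace the error block */
    return delayed->samples;            /* normal path via the delay unit */
}
```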
In the conventional audio data interpolation apparatus, when the PCM data generated by the PCM generating unit 12 are switched to the interpolated PCM data created by the interpolation processing unit 14, the reproduced sound of the interpolated portion may sound unnatural to the listener, depending on the content.
An object of the present invention is to provide an audio data interpolation apparatus which is capable of reducing the unnatural feeling caused by the reproduced sound of an interpolated portion.
An audio data interpolation apparatus according to the present invention is an apparatus for interpolating an error portion of audio data, comprising: an error position detecting unit which detects an error position in said audio data; an audio feature amount detecting unit which detects a feature amount of said audio data; an interpolated data creating unit which creates interpolated data corresponding to said error position of said audio data using a filter having a filter characteristic that corresponds to said feature amount of said audio data, in accordance with at least data pieces before said error position of said audio data; and a switching unit which replaces the data portion at said error position of said audio data with said interpolated data.
An audio data interpolation method according to the present invention is a method for interpolating an error portion of audio data, and comprises the steps of: detecting an error position in the audio data; detecting a feature amount of the audio data; creating interpolated data corresponding to the error position of the audio data using a filter having a filter characteristic that corresponds to the feature amount of the audio data, in accordance with at least data pieces before the error position of the audio data; and replacing the data portion at the error position of the audio data with the interpolated data.
An embodiment of the present invention will be described in detail below with reference to the drawings.
As shown in the drawing, the audio data interpolation apparatus according to this embodiment comprises an error position detecting unit 21, a PCM generating unit 22, a buffer 23, an interpolation processing unit 24, a delay unit 25, an output switching unit 26, an audio feature amount detecting unit 27, and an interpolation parameter generating unit 28. The error position detecting unit 21, PCM generating unit 22, buffer 23, interpolation processing unit 24, delay unit 25, and output switching unit 26 correspond to the units 11 to 16 of the conventional apparatus described above.
In response to an interpolation output instruction from the PCM generating unit 22, the audio feature amount detecting unit 27 detects an audio feature amount in accordance with the PCM data held in the buffer 23. The audio feature amount is the maximum value and minimum value of the amplitude level of the audio signal. The maximum value and minimum value are detected as absolute values, but may instead be the maximum value and minimum value of the positive level alone.
The interpolation parameter generating unit 28 generates interpolation parameters in accordance with the maximum value and minimum value, or in other words the audio feature amount, detected by the audio feature amount detecting unit 27. The interpolation parameters are multiplication coefficients k1, k2, . . . , kj, g1, g2, . . . , gj of the interpolation processing unit 24. Each of the multiplication coefficients k1, k2, . . . , kj takes a value of no less than 0 and less than or equal to 1, and each of the multiplication coefficients g1, g2, . . . , gj takes a value of no less than 0 and less than or equal to 1.
As shown in the drawing, the interpolation processing unit 24 is a recursive filter constituted by IIR filters 291 to 29j, in which the multiplication coefficients k1, k2, . . . , kj, g1, g2, . . . , gj and delay parameters Z−n1, Z−n2, . . . , Z−nj are set.
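The decaying behaviour controlled by k and g can be illustrated with a single recursive stage. The stage below, a feedback path delayed by n samples with feedback gain k and output gain g, is an assumed form for illustration only and is not presented as the exact constitution of the IIR filters 291 to 29j.

```c
/* One assumed recursive (IIR) stage: w[t] = x[t] + k * w[t - n],
 * y[t] = g * w[t]. During an error block x[t] is fed in as 0, so earlier
 * samples recirculate through the delay line, attenuated by k each pass. */
typedef struct {
    double *delay;   /* circular delay line of length n (Z^-n) */
    int     n;       /* delay length in samples */
    int     pos;     /* current position in the delay line */
    double  k;       /* feedback coefficient, 0 <= k <= 1 */
    double  g;       /* output gain, 0 <= g <= 1 */
} iir_stage;

double iir_stage_process(iir_stage *s, double x)
{
    double fb = s->delay[s->pos];   /* sample delayed by n */
    double w  = x + s->k * fb;      /* feedback path */
    s->delay[s->pos] = w;           /* store for the next pass */
    s->pos = (s->pos + 1) % s->n;
    return s->g * w;                /* output gain */
}
```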
It is assumed that the audio feature amount detecting unit 27 and interpolation parameter generating unit 28 are both operated by a single control operation performed by a CPU not shown in the drawing.
Next, the operations of the audio feature amount detecting unit 27 and interpolation parameter generating unit 28 will be explained in detail.
As shown in the flowchart, a variable i is first set to 0 (step S1), and then the n data pieces data[0] to data[n−1] of the i-th block are read from the buffer 23 (step S2).
The maximum value and minimum value of the read data pieces data[0] to data[n−1] are detected and saved as a maximum value max_blk(i) and a minimum value min_blk(i) (step S3). A maximum value max_blk and a minimum value min_blk are then detected from maximum values max_blk(0) to max_blk(m−1) and minimum values min_blk(0) to min_blk(m−1) of the past m blocks, including the current maximum value max_blk(i) and minimum value min_blk(i) (step S4). For example, m equals 50.
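Steps S2 to S4 might be sketched as follows, assuming 16-bit PCM samples, absolute amplitude values as described above, and caller-managed arrays max_hist and min_hist holding the per-block results; the function names are illustrative.

```c
#include <stdlib.h>

/* Step S3: detect the maximum and minimum absolute amplitude of one block
 * of n samples, data[0] to data[n-1]. */
void detect_block_extremes(const short *data, int n, int *max_blk_i, int *min_blk_i)
{
    int max_v = 0, min_v = abs(data[0]);
    for (int s = 0; s < n; s++) {
        int a = abs(data[s]);
        if (a > max_v) max_v = a;
        if (a < min_v) min_v = a;
    }
    *max_blk_i = max_v;
    *min_blk_i = min_v;
}

/* Step S4: detect max_blk and min_blk over the past m blocks (e.g. m = 50),
 * including the current block, from the per-block histories. */
void detect_window_extremes(const int *max_hist, const int *min_hist, int m,
                            int *max_blk, int *min_blk)
{
    *max_blk = max_hist[0];
    *min_blk = min_hist[0];
    for (int b = 1; b < m; b++) {
        if (max_hist[b] > *max_blk) *max_blk = max_hist[b];
        if (min_hist[b] < *min_blk) *min_blk = min_hist[b];
    }
}
```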
When the maximum value max_blk and minimum value min_blk are obtained, a determination is made as to whether or not they satisfy predetermined conditions (step S5). The predetermined conditions are min_blk>max_val*a1 and min_blk>max_blk*a2, where max_val is the maximum value that the data pieces data[0] to data[n−1] can take. Hence, in the case of 16-bit data, max_val equals 32767, for example. a1 is a first coefficient which satisfies 0<a1<1, and equals approximately 0.1, for example. a2 is a second coefficient which satisfies 0<a2<1, and equals approximately 0.3, for example. max_val*a1 thus serves as the threshold level against which min_blk is compared in the first condition.
When the predetermined conditions are satisfied, the interpolation parameters k1, k2, . . . , kj, g1, g2, . . . , gj are set such that the effect of the interpolation increases (step S6). If, on the other hand, the predetermined conditions are not satisfied, the interpolation parameters k1, k2, . . . , kj, g1, g2, . . . , gj are set such that the effect of the interpolation decreases (step S7). The steps S6 and S7 serve as filter characteristic setting means. More specifically, when the predetermined conditions are satisfied, the audio signal is continuous sound such as music, in which the sound continues at a level detectable by the listener. In this case the values of k1, k2, . . . , kj, g1, g2, . . . , gj are set high in the step S6 so that the interpolation processing unit 24 has a filter characteristic whereby the signal level indicated by the output data of each of the IIR filters 291 to 29j decreases gradually. When the predetermined conditions are not satisfied, on the other hand, the audio signal is intermittent sound, such as the vocalized sound of an announcer on a news program, which includes blocks whose low level is detectable by the listener among the past m blocks. In this case the values of the interpolation parameters are set low in the step S7 so that the interpolation processing unit 24 has a filter characteristic whereby the signal level indicated by the output data of each of the IIR filters 291 to 29j decreases rapidly. Only a part of the interpolation parameters k1, k2, . . . , kj, g1, g2, . . . , gj may be altered, rather than changing all of the values.
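Steps S5 to S7 can be sketched as follows. The concrete values written into k and g are assumptions chosen only to contrast a slow decay (continuous sound) with a fast decay (intermittent sound); the embodiment itself only requires that the parameters be set higher in the former case and lower in the latter.

```c
#define MAX_VAL 32767   /* full scale of 16-bit PCM data (max_val) */

/* Step S5: the predetermined conditions; steps S6/S7: parameter setting. */
void set_interpolation_parameters(int max_blk, int min_blk,
                                  double *k, double *g, int j)
{
    const double a1 = 0.1;   /* first coefficient, 0 < a1 < 1 */
    const double a2 = 0.3;   /* second coefficient, 0 < a2 < 1 */
    int continuous = (min_blk > MAX_VAL * a1) && (min_blk > max_blk * a2);

    for (int s = 0; s < j; s++) {
        if (continuous) {    /* continuous sound such as music: decay slowly */
            k[s] = 0.95;
            g[s] = 0.9;
        } else {             /* intermittent sound: decay rapidly */
            k[s] = 0.3;
            g[s] = 0.5;
        }
    }
}
```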
After executing the step S6 or S7, 1 is added to the variable i (step S8), and a determination is made as to whether or not i is equal to or greater than m (step S9). If i<m, the process returns to the step S2 and the operation described above from the step S2 to the step S9 is repeated. On the other hand, if i≧m, the process ends.
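Putting the pieces together, a driver corresponding to steps S1, S8 and S9 could look like the sketch below, reusing the helper functions from the preceding sketches. One simplification is assumed: the window of step S4 uses only the blocks processed so far rather than always the past m blocks.

```c
void feature_and_parameter_update(const short *const *blocks, int n, int m,
                                  int *max_hist, int *min_hist,
                                  double *k, double *g, int j)
{
    for (int i = 0; i < m; i++) {                                        /* S1, S8, S9 */
        detect_block_extremes(blocks[i], n, &max_hist[i], &min_hist[i]); /* S2, S3 */
        int max_blk, min_blk;
        detect_window_extremes(max_hist, min_hist, i + 1,
                               &max_blk, &min_blk);                      /* S4 */
        set_interpolation_parameters(max_blk, min_blk, k, g, j);         /* S5 to S7 */
    }
}
```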
The steps S2 to S4 correspond to an operation of the audio feature amount detecting unit 27, and the steps S5 to S7 correspond to an operation of the interpolation parameter generating unit 28.
As a result of these operations of the audio feature amount detecting unit 27 and interpolation parameter generating unit 28, the filter characteristics of the IIR filters 291 to 29j in the interpolation processing unit 24 are set, and in the frame (block) indicated by the switching signal, the interpolated PCM data obtained by these filter characteristics are output by the output switching unit 26 in place of the PCM data supplied by the delay unit 25. The PCM data output by the output switching unit 26 are reproduced by a reproduction apparatus not shown in the drawing, and then output as reproduced sound by electro-acoustic transducing means such as speakers.
As shown in the drawing, when the audio signal indicates continuous sound such as music, it is desirable to apply comparatively slow fade-out from the level of the PCM data before the error position so that the reproduced sound generated by the interpolated PCM data blends naturally with the surrounding music.
When the audio signal indicates the voice of a newscaster, it is desirable to make the reproduced sound generated by the interpolated PCM data less noticeable by applying comparatively fast fade-out from the level of the PCM data before the error position.
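As a rough numerical illustration using the single-stage sketch given earlier, the recirculated level after p feedback passes is approximately k^p of the level before the error position; the coefficient values below are assumptions chosen only to contrast the two cases.

```c
#include <stdio.h>
#include <math.h>

int main(void)
{
    const int passes = 20;   /* number of feedback passes through the delay line */
    printf("slow fade (k = 0.95): %.3f of the original level\n", pow(0.95, passes));
    printf("fast fade (k = 0.30): %.2e of the original level\n", pow(0.30, passes));
    return 0;
}
```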
Further, as shown in
The operations of the audio feature amount detecting unit 27 and interpolation parameter generating unit 28 described above may be executed only when an error is detected by the error position detecting unit 21, or may be repeated every m blocks regardless of error detection.
Furthermore, in the embodiment described above the audio feature amount is detected from the PCM data by the audio feature amount detecting unit 27; however, in the case of the audio data of a broadcast program, the audio feature amount may be detected from program information such as an EPG (electronic program guide) instead of from the PCM data. Further, instead of detecting the maximum value and minimum value of the audio signal level from the PCM data, the frequency components of the audio signal may be detected as the audio feature amount. For example, an audio signal having a large amount of high frequency components is determined to be music, and an audio signal constituted by the human voice band alone is determined to be narration.
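One simple way to gauge the "large amount of high frequency components" mentioned above is the ratio of first-difference energy to total energy in a block; the threshold and the decision rule in the sketch below are assumptions for illustration and are not specified by the embodiment.

```c
/* Returns nonzero when the block looks like music under the assumed rule:
 * a high ratio of first-difference (high-frequency) energy to total energy. */
int looks_like_music(const short *data, int n)
{
    double diff_energy = 0.0, total_energy = 0.0;
    for (int s = 1; s < n; s++) {
        double d = (double)data[s] - (double)data[s - 1];
        diff_energy  += d * d;
        total_energy += (double)data[s] * (double)data[s];
    }
    if (total_energy == 0.0)
        return 0;                                  /* silence: not music */
    return (diff_energy / total_energy) > 0.5;     /* assumed threshold */
}
```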
Furthermore, in the embodiment described above only the data pieces before the error position are used by the interpolation processing unit 24 to create the interpolated PCM data, but the interpolated PCM data may be created using the data after the error position as well as the data before the error position. Also, in the embodiment described above the interpolation parameters k1, k2, . . . , kj, g1, g2, . . . , gj are varied, but the delay parameters Z−n1, Z−n2, . . . , Z−nj may also be varied. Further, the recursive filter is not limited to an IIR filter having the constitution described in the above embodiment.
In the present invention, the filter is not limited to a recursive filter, and a non-recursive filter such as an FIR (finite impulse response) filter may be used.
The error position detecting unit 21 detects a frame which includes an error in the input data, but the method thereof is not limited to a method using the CRC of the error position detecting unit 11. Further, the input data are not limited to compressed data, and may be PCM data. If the input data are PCM data, the PCM generating unit 22 is not required.
The present invention may be applied widely in the field of audio signal reproducing and recording apparatuses, to apparatuses having a function for detecting audio errors. In particular, the present invention may be applied to fields of use such as mobile broadcast reception and network music delivery, in which a high error frequency can be expected.
The present invention described above comprises error position detecting means for detecting an error position in audio data, audio feature amount detecting means for detecting the feature amount of the audio data, interpolated data creating means for creating interpolated data corresponding to the error position in the audio data using a filter having a filter characteristic that corresponds to the feature amount of the audio data, in accordance with at least data pieces before the error position of the audio data, and means for replacing the data portion in the error position of the audio data with the interpolated data, and therefore the unnatural feeling experienced by a listener in relation to the reproduced sound of the interpolated portion can be reduced.
This application is based on Japanese Patent Application No. 2004-333948 which is hereby incorporated by reference.
Foreign application priority data: Japanese Patent Application No. 2004-333948, filed November 2004 (JP, national).