Claims
- 1. An apparatus for a voice transcoder that produces a destination code bitstream in a destination codec format from a source code bitstream in a source codec format, the apparatus comprising:
an unpacking module operative to unpack the source codec bitstream and decode the information into at least one parameter of a common codec for which a common codec parameter space is defined;
a linear prediction parameters generation module operative to generate destination codec linear prediction parameters by mapping from source codec linear prediction parameters or by linear prediction analysis;
a perceptual weighting filter module operative to use weighting factors that have been optimized for transcoding between a specific source codec and destination codec pair;
an excitation parameter generation module for determining at least one common codec excitation parameter in the destination codec format, said parameter generation module operative to provide direct mapping processes and searching processes for each said common codec excitation parameter;
a packing module operative to pack the destination codec common codec parameters to the bitstream; and
a control module for selecting a transcoding strategy and to provide additional control information.
- 2. The apparatus of claim 1, wherein said linear prediction parameters generation module comprises:
a linear prediction parameters mapping and conversion module for interpolating the linear prediction parameters upon determination of a difference between source codec frame size and destination codec frame size, and for mapping the linear prediction parameters to the destination codec format; and a linear prediction analysis module for generating linear prediction parameters from a reconstructed speech signal.
- 3. The apparatus of claim 1, wherein optimized weighting factors of said perceptual weighting filter module are pre-computed prior to transcoding and stored as part of the apparatus.
- 4. The apparatus of claim 1, wherein said excitation parameter generation module comprises:
first modules for direct mapping of the source codec excitation parameters format to the destination codec excitation parameters format; second modules for searching for said source codec excitation parameters and said destination codec excitation parameters; and pass-through modules for third excitation parameters, said third excitation parameters being used if the types of said source codec and said destination codec and respective bit-rates are the same.
- 5. The apparatus of claim 4, wherein said first modules for direct mapping of excitation parameters comprise an adaptive codebook pitch lag mapping module, an adaptive codebook pitch gain mapping module, a fixed codebook gain mapping module, and a fixed codebook index mapping module.
- 6. The apparatus of claim 4, wherein said second modules for searching for excitation parameters comprise an adaptive codebook pitch lag searching module, an adaptive codebook pitch gain searching module, a fixed codebook gain searching module, a fixed codebook index searching module, and an excitation reconstruction module.
- 7. The apparatus of claim 4, wherein said pass-through modules for excitation parameters comprise an adaptive codebook pitch lag searching module, an adaptive codebook pitch gain searching module, a fixed codebook gain searching module, a fixed codebook index searching module and an excitation reconstruction module.
- 8. The apparatus of claim 1, wherein said control module is operative to employ a transcoding strategy comprising a set of rules to determine a specific process of transcoding.
- 9. The apparatus of claim 1, wherein said linear prediction parameters generation module is controlled by said control module.
- 10. The apparatus of claim 1, wherein said excitation parameter generation module is controlled by said control module.
- 11. The apparatus of claim 1, wherein reconstructed speech of the source codec is not pre-processed.
- 12. The apparatus of claim 1 having no noise suppression functions.
- 13. The apparatus of claim 1 having no post-filtering and no gain adjustment.
- 14. A method for producing a destination code bitstream in a destination codec format from a source code bitstream in a source codec format in order to perform voice transcoding between common codec parameter-based voice codecs comprising:
determining and storing weighting factors for a perceptual weighting filter, said weighting factors being optimized for a specific source codec and destination codec pair;
configuring transcoding strategies for each preselected transcoding pair;
unpacking said source codec bitstream to produce source codec common codec parameters;
reconstructing a speech signal using source codec common codec parameters;
mapping one or more parameters in parameter space of the common codec parameters according to a selected transcoding strategy;
perceptually weighting voice signals using said perceptual weighting filter according to the selected transcoding strategy;
searching for one or more excitation parameters according to the selected transcoding strategy; and
packing the destination codec common codec parameters to the destination codec bitstream.
- 15. The method of claim 14, wherein said common codec parameters are defined by a linear code, further including the interim step of:
performing linear prediction analysis according to the selected transcoding strategy to determine linear prediction coefficients for further processing.
- 16. The method of claim 14, wherein said excitation parameters mapping comprises determining quantized values of at least one of adaptive codebook pitch lag, adaptive codebook pitch gain, fixed-codebook index and fixed-codebook gain by interpolating the source codec parameters upon determination of at least one of a difference in frame size, subframe size, and mappable characteristics between the source codec and the destination codec; and
directly converting the excitation parameters to the destination codec format.
- 17. The method of claim 14, wherein said excitation parameters searching step comprises determining quantized values of at least one of adaptive codebook pitch lag, adaptive codebook pitch gain, fixed-codebook index, and fixed-codebook gain by minimizing the error between a reconstructed signal and a target signal.
- 18. The method of claim 14, wherein said transcoding strategies configuring step comprises selecting a number of respective mapping and searching options to determine signal processing flow.
- 19. The method of claim 14, wherein the transcoding strategy specifies a process whereby some parameters are first obtained from said common codec parameter mapping and remaining parameters are obtained through a searching procedure.
- 20. The method of claim 14, wherein the transcoding strategy specifies a process whereby all common codec parameters from the source codec are mapped to the destination codec without searching.
- 21. The method of claim 14, wherein reconstructing a speech signal involves no post-processing operations.
- 22. The method of claim 14, wherein no noise suppression or speech pre-processing is performed prior to speech perceptual weighting.
- 23. The method of claim 14, wherein said transcoding strategies comprise:
direct mapping of a code-excited linear prediction parameter upon determination of presence of a similar code-excited linear prediction parameter compression process between the source codec and destination codec of the transcoding pair;
performing speech reconstruction and speech perceptual weighting if searching is required to determine code-excited linear prediction parameters for the destination codec;
performing linear prediction analysis if there are substantial differences in linear prediction parameter compression processes between the source codec and the destination codec in a transcoding pair, and if the steps of linear prediction parameter interpolation, mapping, and conversion do not produce a target output voice quality in the transcoding;
searching the adaptive codebook, if LP analysis processing is required;
searching the adaptive codebook, 1) if the adaptive codebook parameter compression process has substantial differences between source codec and destination codec in a transcoding pair, and 2) the adaptive codebook parameter space mapping method does not produce the target output voice quality in the transcoding;
searching the fixed codebook, if adaptive codebook searching is required;
searching the fixed codebook, if the fixed codebook parameter compression process has substantial differences between source codec and destination codec in a transcoding pair, and if the fixed codebook parameter space mapping method does not produce the target output voice quality in the transcoding.
- 24. The method of claim 14, wherein said weighting factors obtaining step comprises transcoding a set of voice samples using different weighting factor values, performing voice quality tests on the transcoded voice signals, and selecting specific weighting factors for a specific source codec and destination codec pair in order to produce a target voice quality.
- 25. The method of claim 14, wherein said weighting factors obtaining step comprises finding best weighting factors for each possible mode and bit rate combination of the source codec and the destination codec.
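By way of illustration only, the rule set recited in claim 23 may be read as a cascade: a need for linear prediction analysis forces an adaptive codebook search, which in turn forces a fixed codebook search, and speech reconstruction with perceptual weighting is performed only when some search is required. The following Python sketch shows one possible reading of that cascade. The names `PairProfile`, `Strategy`, and `select_strategy`, and the reduction of "substantial differences" and "target output voice quality" to boolean flags, are assumptions made for illustration and do not appear in the claims.

```python
from dataclasses import dataclass


@dataclass
class PairProfile:
    """Hypothetical summary of one source/destination codec pair."""
    lp_differs: bool       # LP parameter compression differs substantially
    lp_mapping_ok: bool    # LP interpolation, mapping, conversion meet target quality
    acb_differs: bool      # adaptive codebook compression differs substantially
    acb_mapping_ok: bool   # adaptive codebook mapping meets target quality
    fcb_differs: bool      # fixed codebook compression differs substantially
    fcb_mapping_ok: bool   # fixed codebook mapping meets target quality


@dataclass
class Strategy:
    """Hypothetical per-pair strategy: which stages run instead of direct mapping."""
    lp_analysis: bool
    acb_search: bool
    fcb_search: bool

    @property
    def needs_reconstruction(self) -> bool:
        # Speech reconstruction and perceptual weighting are performed
        # only if some search is required.
        return self.acb_search or self.fcb_search


def select_strategy(p: PairProfile) -> Strategy:
    # LP analysis: substantial differences AND mapping misses the quality target.
    lp_analysis = p.lp_differs and not p.lp_mapping_ok
    # Adaptive codebook search: forced by LP analysis, or by its own rule.
    acb_search = lp_analysis or (p.acb_differs and not p.acb_mapping_ok)
    # Fixed codebook search: forced by adaptive search, or by its own rule.
    fcb_search = acb_search or (p.fcb_differs and not p.fcb_mapping_ok)
    return Strategy(lp_analysis, acb_search, fcb_search)


# Example: only the fixed codebook compression differs for this pair.
profile = PairProfile(False, True, False, True, True, False)
print(select_strategy(profile))
```

Under this reading, a pair with similar compression processes throughout is transcoded entirely by direct mapping, which corresponds to the case recited in claim 20, while a pair needing only some searches corresponds to claim 19.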
CROSS-REFERENCES TO RELATED APPLICATIONS
[0001] This application claims priority to U.S. Provisional Patent Application Serial No. 60/439,420 (Attorney Docket Number 021318-001900US) titled “High Quality Audio Transcoding” filed Jan. 9, 2003, which is incorporated by reference herein for all purposes.
Provisional Applications (1)
| Number | Date | Country |
| 60439420 | Jan 2003 | US |