Claims
- 1. A method for operating a media subsystem when playing audio data for reducing synchronization delay, comprising:
when a frame comprising audio data is sent to a decoder, measuring the synchronization delay; determining by how much the synchronization delay should be adjusted; and adjusting the synchronization delay in a content-aware manner by adding or removing one or more audio samples in a selected current frame or in a selected subsequent frame so as not to significantly degrade the quality of the played back audio data.
- 2. A method as in claim 1, where the step of determining determines the number of audio samples in steps of size one sample.
- 3. A method as in claim 1, where when the synchronization delay is adjusted by more than one audio sample, the adjustment is made by all of the determined audio samples in one adjustment.
- 4. A method as in claim 1, where when the synchronization delay is adjusted by more than one audio sample, the adjustment is made by less than all of the determined audio samples by a plurality of adjustments.
- 5. A method as in claim 1, where the step of adjusting selects, if possible, an unvoiced frame.
- 6. A method as in claim 1, where the step of adjusting discriminates against a transient frame.
- 7. A method as in claim 1, where the step of determining comprises measuring an average amount of time that a frame resides in a jitter buffer; and adjusting the synchronization delay so that the average duration approaches a desired jitter buffer residency duration.
- 8. A method as in claim 1, where at least one frame of audio data has a playback duration in the range of about 20 milliseconds to about 60 milliseconds.
- 9. Apparatus for reproducing a speech signal, comprising buffer circuitry for storing data from a packet that contains data representing a speech signal prior to the data being sent to a decoder, further comprising control circuitry operable when a frame comprising audio data is sent to the decoder, for measuring a synchronization delay, for determining by how much the synchronization delay should be adjusted and for adjusting the synchronization delay in a content-aware manner by adding or removing one or more audio samples in a selected current frame or in a selected subsequent frame so as not to significantly degrade the quality of the played back audio data.
- 10. Apparatus as in claim 9, where said control circuitry determines the number of audio samples in steps of size one sample.
- 11. Apparatus as in claim 9, where when the synchronization delay is adjusted by more than one audio sample, the adjustment is made by all of the determined audio samples in one adjustment.
- 12. Apparatus as in claim 9, where when the synchronization delay is adjusted by more than one audio sample, the adjustment is made by less than all of the determined audio samples by a plurality of adjustments.
- 13. Apparatus as in claim 9, where said control circuitry selects for making the adjustment, if possible, an unvoiced frame.
- 14. Apparatus as in claim 9, where said control circuitry discriminates against a transient frame for making the adjustment.
- 15. Apparatus as in claim 9, where said control circuitry, when determining by how much the synchronization delay should be adjusted, operates to measure an average amount of time that a frame resides in said buffer, and adjusts the synchronization delay so that the average duration approaches a desired buffer residency duration.
- 16. Apparatus as in claim 9, where at least one frame of audio data has a playback duration in the range of about 20 milliseconds to about 60 milliseconds.
- 17. Apparatus as in claim 9, where said circuitry is contained within a wireless communications device, and where the packet is received from a radio channel.
- 18. Apparatus as in claim 9, where said circuitry is contained within a device that processes and plays back packetized speech data.
- 19. Apparatus as in claim 9, where said circuitry comprises part of a mobile telephone or a personal communicator.
- 20. Apparatus as in claim 9, where said circuitry comprises part of a cellular radiotelephone.
- 21. Apparatus as in claim 9, where said circuitry comprises part of a PC-based telephony system.
- 22. Apparatus as in claim 9, where said circuitry comprises part of an IP telephony gateway.
- 23. Apparatus as in claim 9, where said circuitry comprises part of an IP-to-circuit switched media transcoder.
- 24. A method for operating a communication device while synthesizing speech from speech data, the method operating to reduce synchronization delay and comprising:
for a received frame comprising encoded speech data to be sent to a speech decoder, measuring the synchronization delay; determining by how much the synchronization delay should be adjusted; and adjusting the synchronization delay by adding or removing one or more speech samples in a selected frame so as not to significantly degrade the quality of the reproduced speech, where the frame is selected based on at least one speech decoder-related parameter so as to select, if possible, an unvoiced frame over a voiced frame, while discriminating against selecting a transient frame.
- 25. A method as in claim 24, where the step of determining determines the number of samples in steps of size one sample.
- 26. A method as in claim 24, where when the synchronization delay is adjusted by more than one sample, the adjustment is made by all of the determined samples in one adjustment.
- 27. A method as in claim 24, where when the synchronization delay is adjusted by more than one sample, the adjustment is made by less than all of the determined samples by a plurality of adjustments.
- 28. A method as in claim 24, where the step of determining comprises measuring an average amount of time that a frame resides in a jitter buffer; and adjusting the synchronization delay so that the average duration approaches a desired jitter buffer residency duration.
- 29. A method as in claim 24, where the at least one speech decoder-related parameter is comprised of a pitch period.
- 30. A method as in claim 24, where the at least one speech decoder-related parameter is comprised of a pitch gain.
- 31. A method as in claim 24, where the at least one speech decoder-related parameter is comprised of a zero crossing rate within a received frame.
- 32. A method as in claim 24, where the at least one speech decoder-related parameter is comprised of an energy distribution between adaptive and fixed codebook contributions.
- 33. A method as in claim 24, where the at least one speech decoder-related parameter is comprised of a measure of energy of a synthesized speech signal.
- 34. A method as in claim 24, where the at least one speech decoder-related parameter is comprised of a value of a linear prediction error.
- 35. A method as in claim 24, where the at least one speech decoder-related parameter is comprised of a value of a ratio between an excitation signal at a synthesis filter input and the energy of a synthesized speech signal.
- 36. A method as in claim 24, where said speech decoder comprises a GSM speech decoder.
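By way of a non-limiting illustration, the steps recited in claims 1, 3, 4, 7 and 24 might be sketched as follows. The sketch assumes that each frame has already been classified as unvoiced, voiced or transient from decoder-related parameters; the names `Frame`, `FrameClass`, `samples_to_adjust`, `split_adjustment`, `pick_frame` and `adjust_frame` are hypothetical and do not appear in the claims. Skipping transient frames entirely and repeating the last sample when lengthening a frame are simplifying assumptions, not the claimed technique.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class FrameClass(Enum):
    UNVOICED = "unvoiced"
    VOICED = "voiced"
    TRANSIENT = "transient"


@dataclass
class Frame:
    samples: List[int]
    frame_class: FrameClass  # assumed to come from decoder-related parameters


def samples_to_adjust(residency_ms: List[float], target_ms: float,
                      sample_rate_hz: int) -> int:
    """Signed sample count that moves the average jitter buffer residency
    toward the target. Negative means remove samples (delay too long)."""
    average_ms = sum(residency_ms) / len(residency_ms)
    return round((target_ms - average_ms) * sample_rate_hz / 1000)


def split_adjustment(delta: int, max_per_step: int) -> List[int]:
    """Break one adjustment into several smaller ones (as in claim 4).
    Passing max_per_step >= abs(delta) gives a single step (as in claim 3)."""
    sign = -1 if delta < 0 else 1
    remaining = abs(delta)
    steps = []
    while remaining > 0:
        step = min(remaining, max_per_step)
        steps.append(sign * step)
        remaining -= step
    return steps


def pick_frame(frames: List[Frame]) -> Optional[int]:
    """Prefer an unvoiced frame, then a voiced one. Transient frames are
    skipped here (an assumption of this sketch)."""
    for wanted in (FrameClass.UNVOICED, FrameClass.VOICED):
        for index, frame in enumerate(frames):
            if frame.frame_class is wanted:
                return index
    return None


def adjust_frame(frame: Frame, delta: int) -> Frame:
    """Remove (delta < 0) or add (delta > 0) samples at the end of a frame.
    Added samples repeat the last sample, a placeholder for real synthesis."""
    if delta < 0:
        keep = max(len(frame.samples) + delta, 0)
        new_samples = frame.samples[:keep]
    else:
        pad = frame.samples[-1] if frame.samples else 0
        new_samples = frame.samples + [pad] * delta
    return Frame(new_samples, frame.frame_class)
```

For example, with measured residencies of 30, 34 and 32 ms, a desired residency of 30 ms and an assumed 8 kHz sampling rate, `samples_to_adjust` returns -16, that is, sixteen samples to remove. Those could be removed at once from one selected frame or spread over several frames with `split_adjustment`.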
CROSS-REFERENCE TO A RELATED APPLICATION
[0001] This patent application is a continuation-in-part of copending and commonly assigned U.S. patent application Ser. No. 09/946,066, filed Sep. 4, 2001, entitled “Method and Apparatus for Reducing Synchronization Delay in Packet-Based Voice Terminals”, by Jari Selin, the content of which is incorporated by reference herein in its entirety.
Continuation in Parts (1)
| | Number | Date | Country |
|---|---|---|---|
| Parent | 09946066 | Sep 2001 | US |
| Child | 10189068 | Jul 2002 | US |