When sampling an audio signal for digital transmission, a sample rate is determined by a clock, typically embodied as a quartz crystal oscillator, the output frequency of which can differ from a desired nominal rate for multiple reasons. In telecommunications systems where network access devices have independent clocks, data rate mismatches between two clock rates will inevitably occur. Such differences cause artifacts in an audio signal when it is reconstructed from digital samples. Those artifacts can be manifested as clicks, pops and/or momentary silence, all of which are annoying.
A prior art “brute force” method of simply adding zero-valued or repeated samples to a digital signal, or removing samples from it, does not solve the problems created by dissimilar clocks. Adding or removing samples instead introduces a discontinuity in the audio signal and generates audible artifacts (clicks or pops) that degrade the end user experience. Inserting a sample that is the average of the surrounding samples still does not eliminate the audible artifacts completely.
Another prior art method, which predicts samples based on historical data, can be too computationally expensive for embedded applications.
A simple, computationally efficient method of matching different digital sample transmission rates would be an improvement over the prior art.
The radio 102 includes a Bluetooth transceiver 104 having a radio frequency transceiver 106 that receives Bluetooth signals from, and transmits Bluetooth signals to, Bluetooth-capable devices to which the transceiver 106 is “paired.” The operation of the transceiver 106, including its conversion of audio signals to pulse code modulation (PCM) samples, is controlled or timed by a timing signal or clock provided to the transceiver 106 by a conventional quartz crystal 108.
As used herein, the term “real time” refers to the actual time during which something takes place.
The Bluetooth transceiver 104 provides PCM samples 114 to and receives PCM samples 114 from a cellular transceiver 108 in the radio 102, in real time. The cellular transceiver 108 includes a central processing unit (CPU) or computer 120. The CPU 120 receives its own timing signal from its own quartz crystal 122, which is also part of the radio 102. The PCM samples 114 provided to and received from the transceiver 108 in real time, are also provided to and received from the CPU 120 in real time.
The PCM samples 114 that the central processing unit 120 receives in real time from the Bluetooth transceiver 104 are sent or forwarded by the CPU 120 to a coder/decoder (CODEC) 126 in real time, at a rate which is determined by the CPU's quartz crystal 122, not the quartz crystal 108 of the Bluetooth transceiver 104. The PCM samples 114 that the central processing unit 120 sends to the Bluetooth transceiver 104 are received by the CPU 120 from the coder/decoder (CODEC) 126 in real time, at the rate determined by the CPU's quartz crystal 122 because the CPU 120 also provides a clock signal 124 to the CODEC. An output signal 127 from the CODEC 126, which can include audio, can be provided to a loudspeaker 130.
Those of ordinary skill in the art know that the actual frequency and stability of nominally-identical quartz crystals are rarely identical. The actual frequencies output from two crystals having the same nominal frequency will almost always be different. Their frequencies will also differ or shift differently if the two crystals are subjected to different environmental conditions.
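By way of a hedged illustration only, the effect of a small frequency difference between two nominally-identical crystals can be estimated with simple arithmetic. The sketch below assumes an 8 kHz nominal sample rate and a 50 parts-per-million relative offset; neither figure is taken from this description, and the function name is hypothetical.

```python
def seconds_per_sample_slip(nominal_rate_hz: float, ppm_offset: float) -> float:
    """Time for two clocks that differ by ppm_offset to drift apart by one sample.

    Both arguments are illustrative assumptions, not values from the description.
    """
    if ppm_offset == 0:
        return float("inf")  # identical clocks never slip
    # Difference between the two actual sample rates, in samples per second.
    rate_difference_hz = nominal_rate_hz * abs(ppm_offset) * 1e-6
    return 1.0 / rate_difference_hz


# Assumed example: 8 kHz nominal rate, 50 ppm relative offset.
slip_seconds = seconds_per_sample_slip(8000.0, 50.0)
print(f"one sample of drift roughly every {slip_seconds:.2f} s")
```

Under those assumed figures, one device accumulates a surplus or deficit of one sample every few seconds, which is why some form of rate compensation is needed even for well-matched crystals.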
In
Regardless of the reason why the two crystals 108 and 122 might have different output frequencies, the stream of PCM samples 114 provided to the CPU 120 from the Bluetooth transceiver 104, which receives timing signals from its own crystal 108, and vice versa, will almost always have a sample rate that is slightly different from the sample rate of the PCM samples 112 output from the CPU 120, because the frequency of the crystal 122 for the CPU 120 will be slightly different from the frequency of the Bluetooth transceiver's crystal 108.
Differences between the PCM signal sample rates 114, 112 will inevitably produce artifacts, i.e., clicks, pops and similar annoying sounds, in audio that is re-created from the PCM samples. In the system shown in
Those of ordinary skill in the art will recognize that signals sent from a distant cell phone 144 to the radio 102 in a vehicle will also have their own transmission rates. When the two crystals 108 and 122 in
Put simply, the method and apparatus disclosed herein enable digital samples of audio signals to be exchanged between audio devices that process those audio signal samples at different rates. Stated another way, the method and apparatus herein control the reception and transmission of digital data representing audio signals exchanged between audio devices that process such data at different rates. Paraphrased, the method and apparatus disclosed herein cause one or both audio devices to either shorten or lengthen frames of audio samples that pass between and through them in order to compensate for data rate mismatches.
As used herein, the term “window” refers to a set of coefficients with which corresponding samples in a data record are multiplied so as to more accurately estimate certain properties of the signal from which the samples were obtained. Generally the coefficient values increase smoothly.
A “window function” is a mathematical function that is zero-valued outside of a chosen interval. By way of example, a function that is a single value inside the interval and zero elsewhere is called a rectangular window, which also describes the shape of its graphical representation. A “triangular” window function will have values that increase gradually across an interval and which are zero outside the interval.
When multiple PCM samples comprising a frame of samples are multiplied by a triangular window function having an interval time that is equal to the time length of the frame, and which has an initial value of zero at the beginning of the interval and a final value of one at the end of the interval, the product of the window function and the frame of PCM samples will be an adjusted or “windowed” frame of PCM samples, the values of which increase gradually from zero. The value of the first sample of the windowed frame will be zero; the value of the last sample of the PCM frame, which is multiplied by one, will be unchanged.
At the beginning 207 of the frame of samples 205, the window function 204 has a starting value of zero (0.0). At the terminus or opposite end 209 of the frame of samples 205, the window function 204 has an ending value of 1.0.
For every PCM sample between the beginning of the frame 207 and the end of the frame 209, the window function 204 has a corresponding value, which increases continuously between the beginning and end of the frame 205, i.e., gradually increasing from zero to one, across the time duration or “width” of the frame 205.
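The windowing just described can be sketched as follows. This is a minimal illustration that assumes a linear ramp and a five-sample frame for readability; the function names `rising_window` and `apply_window` are hypothetical and do not appear in the description.

```python
def rising_window(n: int) -> list[float]:
    """Linear ramp: 0.0 at the first sample, 1.0 at the last sample."""
    if n < 2:
        raise ValueError("a ramp needs at least two samples")
    return [i / (n - 1) for i in range(n)]


def apply_window(frame: list[float], window: list[float]) -> list[float]:
    """Multiply each PCM sample by the window value at the same position."""
    if len(frame) != len(window):
        raise ValueError("frame and window must have the same length")
    return [sample * weight for sample, weight in zip(frame, window)]


# Assumed toy frame: five samples of constant amplitude 100.
frame = [100.0] * 5
windowed = apply_window(frame, rising_window(len(frame)))
print(windowed)  # [0.0, 25.0, 50.0, 75.0, 100.0]
```

The first windowed sample is zero and the last is unchanged, consistent with the starting value of 0.0 and the ending value of 1.0 described above.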
As
A comparison of the graph 210 shown in
A frame rate can be effectively reduced, and mismatched data rates of two different communications devices compensated for, by controlling at least one of two communicating devices, e.g., the Bluetooth transceiver 104 and the CPU 120 or, the CPU 120 and the Bluetooth transceiver 104, in order to remove a sample from windowed frames 210, 212, before the windowed frames 210, 212 are added to each other. Similarly, a frame rate can be effectively increased, and mismatched data rates compensated for, by controlling one of the devices to add a sample to two windowed frames, before the windowed frames are added to each other.
In a preferred embodiment, a frame rate is reduced by removing the first sample from the windowed frame generated by multiplying a copy of a frame of samples 205 by a gradually increasing window function 204 and removing the last sample from the windowed frame created by multiplying a copy of the same frame of samples 205 by a gradually decreasing window function 208. The value of the increasing window function 204 ranges from zero (0.0) to one (1.0). The value of the decreasing window function 208 ranges from one (1.0) to zero (0.0). The increasing and decreasing window functions are thus inverses, i.e., mirror images, of each other.
With regard to
In
As shown in
At step 408, a determination is made whether the first and second signal sample rates are different from each other. If the rates are the same, there is no need to make adjustments to the signal sample rates.
If, at step 408, the two signal sample rates are determined to be different, the method 400 proceeds to step 410, where a frame of samples from one of the signals, e.g., a frame from the Bluetooth transceiver 104, is copied, producing two duplicate frames of samples from the same signal. At step 412, one of the copies of the frame created at step 410 is multiplied by a gradually increasing window function. Each sample of the frame is multiplied by the numeric value of the gradually increasing window function at the “location” in the frame of the sample to be multiplied. By way of example, the value of the window function for the first sample of the frame is zero. The value of the window function for the last sample of the frame is one. The first and last samples of the frame are thus multiplied by zero and one respectively. The window function can be linear, non-linear, or sigmoid, but preferably has values that vary continuously, or at least nearly continuously, between 0.0 and 1.0.
At step 414, the second copy of the frame of the audio signal is multiplied by a mirror image or inverse of the gradually increasing window function. The second copy is thus multiplied by a gradually decreasing window function. Its initial value is 1.0; its final value is 0.0.
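Steps 410 through 414 can be illustrated with the short sketch below, which also checks why adding the two windowed copies re-creates the original frame: at every position the increasing and decreasing window values sum to one. The linear ramp, the eight-sample frame and the function names are assumptions made for illustration only.

```python
def rising_window(n: int) -> list[float]:
    """Linear ramp from 0.0 to 1.0 (assumed shape for this sketch)."""
    return [i / (n - 1) for i in range(n)]


def falling_window(n: int) -> list[float]:
    """Mirror image of the rising window: 1.0 down to 0.0."""
    return [1.0 - w for w in rising_window(n)]


# Assumed toy frame of eight PCM sample values.
frame = [3.0, -1.5, 4.0, 0.5, -2.0, 7.0, 1.0, -6.0]

# Step 410: two duplicate copies of the same frame.
copy_a, copy_b = list(frame), list(frame)

# Step 412: first copy multiplied by the gradually increasing window.
windowed_up = [s * w for s, w in zip(copy_a, rising_window(len(frame)))]

# Step 414: second copy multiplied by the gradually decreasing window.
windowed_down = [s * w for s, w in zip(copy_b, falling_window(len(frame)))]

# Adding the two windowed copies, without removing or adding any sample,
# reproduces the original frame to within floating-point rounding.
rebuilt = [a + b for a, b in zip(windowed_up, windowed_down)]
max_error = max(abs(r - s) for r, s in zip(rebuilt, frame))
print(max_error < 1e-12)
```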
At step 416, in
As stated above, a frame rate can be effectively reduced by eliminating one of the samples in a frame of samples. At step 418, the first sample of the first windowed frame, i.e., the copy multiplied by the gradually increasing window function, is deleted. For a frame that was originally 80 samples, after the execution of step 418, that frame will have only 79 samples. At step 420, the last sample of the second windowed frame, i.e., the copy multiplied by the gradually decreasing window function, is also deleted. That second windowed frame will thus also have 79 samples.
At step 422, the two “adjusted” frames are added to each other. As set forth above, the arithmetic addition of two windowed frames, one of which was windowed by an inverse function of the other, results in a re-creation of essentially the original frame, i.e., an approximate replication of the original frame. After step 422, however, the number of PCM samples in the original frame will have been reduced by one sample, leaving seventy-nine samples, samples 2-80. At step 424, the frame, reduced by one sample, is transmitted to a radio transceiver, loudspeaker or other communications device configured to create or reconstruct audible sound from PCM samples, an example of which is depicted in
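The frame-shortening path of steps 410 through 422 can be sketched as follows. This is an illustrative reading of the steps under stated assumptions (a linear window and an 80-sample frame); the function name `shorten_frame` is hypothetical and the sketch is not presented as the only possible implementation.

```python
def shorten_frame(frame: list[float]) -> list[float]:
    """Return a frame one sample shorter, following steps 410-422 as described."""
    n = len(frame)
    if n < 2:
        raise ValueError("frame must contain at least two samples")
    rising = [i / (n - 1) for i in range(n)]   # 0.0 -> 1.0 (assumed linear)
    falling = [1.0 - w for w in rising]        # 1.0 -> 0.0, the mirror image

    # Steps 410-414: window two copies of the same frame.
    windowed_up = [s * w for s, w in zip(frame, rising)]
    windowed_down = [s * w for s, w in zip(frame, falling)]

    # Step 418: delete the first sample of the rising-windowed copy.
    adjusted_up = windowed_up[1:]
    # Step 420: delete the last sample of the falling-windowed copy.
    adjusted_down = windowed_down[:-1]

    # Step 422: add the two adjusted frames sample by sample.
    return [a + b for a, b in zip(adjusted_up, adjusted_down)]


# Assumed example: an 80-sample frame becomes a 79-sample frame.
shortened = shorten_frame([float(i) for i in range(80)])
print(len(shortened))  # 79
```

In this sketch each output sample is a weighted blend of two adjacent input samples, with the weight shifting from the earlier sample toward the later one across the frame, so no single sample is abruptly dropped.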
Referring again to step 416, if the first frame rate is not greater than the second frame rate, the first frame rate is necessarily less than the second frame rate due to the determination made at step 408 that the two frame rates are different. The first frame rate thus needs to be increased and can be increased by adding a sample to the frame.
At step 426, a new, first sample is added to the windowed frame created by multiplying a frame of samples by a gradually increasing window function. The first windowed frame will thereafter have eighty-one (81) samples instead of the original eighty (80) samples.
At step 428 a new last sample is added to the windowed frame created by multiplying the second copy of the frame of samples by a gradually decreasing window function. The second windowed frame will thus have 81 samples.
The two new samples are preferably the same value and preferably zero. When the two windowed frames are added together at step 430, the resultant frame will have 81 samples instead of 80.
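The frame-lengthening path of steps 426 through 430 can be sketched in the same manner. The sketch assumes a linear window, an 80-sample frame and zero-valued padding samples, the last of which is the stated preference above; the function name `lengthen_frame` is hypothetical.

```python
def lengthen_frame(frame: list[float], pad: float = 0.0) -> list[float]:
    """Return a frame one sample longer, following steps 426-430 as described."""
    n = len(frame)
    if n < 2:
        raise ValueError("frame must contain at least two samples")
    rising = [i / (n - 1) for i in range(n)]   # 0.0 -> 1.0 (assumed linear)
    falling = [1.0 - w for w in rising]        # 1.0 -> 0.0, the mirror image

    windowed_up = [s * w for s, w in zip(frame, rising)]
    windowed_down = [s * w for s, w in zip(frame, falling)]

    # Step 426: add a new first sample to the rising-windowed frame.
    adjusted_up = [pad] + windowed_up
    # Step 428: add a new last sample to the falling-windowed frame.
    adjusted_down = windowed_down + [pad]

    # Step 430: add the two adjusted frames sample by sample.
    return [a + b for a, b in zip(adjusted_up, adjusted_down)]


# Assumed example: an 80-sample frame becomes an 81-sample frame.
source = [float(i + 1) for i in range(80)]
lengthened = lengthen_frame(source)
print(len(lengthened))  # 81
```

With zero-valued padding, the first and last output samples equal the first and last input samples, and the samples in between are blends of adjacent input samples.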
In
If the signal sample rates are determined to be different from each other, a signal sample duplicator 508 receives a frame of samples from one of the signals and duplicates the frame into two identical copies, copy A 507 and copy B 509, as shown. Otherwise, the sample rates are identical and clock rate compensation is not required.
A window function generator 514, implemented perhaps as an operational amplifier configured to act as an integrator, creates a gradually increasing window function 518. Examples of usable window functions are linear functions that ramp continuously from the value of 0.0 to 1.0 over a frame period, non-linear functions that increase gradually from 0.0 to 1.0 over the same frame duration or a sigmoid-type function which increases gradually from 0.0 to 1.0 over the frame duration. Alternate embodiments of the window function generator 514 create window functions that ramp continuously from a non-zero value to a value slightly greater and/or slightly less than 1.0.
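The three families of window function mentioned above can be sketched in software as follows. The specific shapes chosen here, a raised cosine for the non-linear case and a rescaled logistic curve for the sigmoid case, as well as the `steepness` parameter and all function names, are assumptions made for illustration and are not specified in this description.

```python
import math


def linear_window(n: int) -> list[float]:
    """Straight ramp from 0.0 to 1.0 over the frame."""
    return [i / (n - 1) for i in range(n)]


def raised_cosine_window(n: int) -> list[float]:
    """One possible non-linear ramp: half a cosine cycle, 0.0 to 1.0."""
    return [0.5 * (1.0 - math.cos(math.pi * i / (n - 1))) for i in range(n)]


def sigmoid_window(n: int, steepness: float = 6.0) -> list[float]:
    """Logistic curve rescaled so its end points are exactly 0.0 and 1.0."""
    raw = [1.0 / (1.0 + math.exp(-steepness * (2.0 * i / (n - 1) - 1.0)))
           for i in range(n)]
    low, high = raw[0], raw[-1]
    return [(r - low) / (high - low) for r in raw]


# Assumed frame length of 80 samples.
for make_window in (linear_window, raised_cosine_window, sigmoid_window):
    window = make_window(80)
    print(make_window.__name__, window[0], window[-1])
```

Whichever shape is used, the mirror-image window for the second copy can be obtained by subtracting each value from one.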
The output 518 of the window function generator 514 is provided to a first multiplier 520. A window function inverter 516 also receives the output 518 of the window function generator 514 and provides an inverse of the window function to a second multiplier 521. Each multiplier can be readily implemented using one or more prior art shift registers or adders.
As shown in
Depending upon which sample rate was determined to be faster, the signal sample rate determiner 502 instructs the adders and subtractors 528 and 530 to either add a first sample to, or subtract a first sample from, the first windowed frame 524. Similarly, the signal sample rate determiner 502 controls the second adder/subtractor to add a last sample to, or subtract a last sample from, the second windowed frame 526. The outputs from the adders and subtractors 528, 530 are “adjusted windowed frames” 529 and 531.
An adder 530 receives the adjusted windowed frames 529, 531, adds them together and provides an increased or decreased frame rate signal 532, the rate of which is essentially the same or identical to one of the first and second frame rates provided to the signal frame rate determiner 502.
The instructions stored in the memory, when executed by the CPU 602 perform the steps described above and depicted in
Referring again to
In various embodiments, audio signals with a first frame rate can be obtained from an audio signal carried on a USB communications link as well as a voice-over Internet Protocol link (VOIP). Both those media are well known to those of ordinary skill in the telecommunications art. Since they are well known, depictions of them per se are therefore omitted in the interest of brevity.
The foregoing description is for purposes of illustration. The true scope of the invention is set forth in the following claims.