The subject matter regarded as the invention is particularly pointed out and distinctly claimed in the concluding portion of the specification. The invention, however, both as to organization and method of operation, together with objects, features, and advantages thereof, may best be understood by reference to the following detailed description when read with the accompanying drawings in which:
In the following detailed description, numerous specific details are set forth in order to provide a thorough understanding of the invention. However, it will be understood by those skilled in the art that the present invention may be practiced without these specific details. In other instances, well-known methods, procedures, components and circuits have not been described in detail so as not to obscure the present invention.
Unless specifically stated otherwise, as apparent from the following discussions, it is appreciated that throughout the specification discussions utilizing terms such as “processing”, “computing”, “calculating”, “determining”, or the like, refer to the action and/or processes of a computer or computing system, or similar electronic computing device, that manipulate and/or transform data represented as physical, such as electronic, quantities within the computing system's registers and/or memories into other data similarly represented as physical quantities within the computing system's memories, registers or other such information storage, transmission or display devices.
Embodiments of the present invention may include apparatuses for performing the operations herein. This apparatus may be specially constructed for the desired purposes, or it may comprise a general purpose computer selectively activated or reconfigured by a computer program stored in the computer. Such a computer program may be stored in a computer readable storage medium, such as, but not limited to, any type of disk including floppy disks, optical disks, CD-ROMs, magneto-optical disks, read-only memories (ROMs), random access memories (RAMs), electrically programmable read-only memories (EPROMs), electrically erasable and programmable read only memories (EEPROMs), magnetic or optical cards, or any other type of media suitable for storing electronic instructions, and capable of being coupled to a computer system bus.
The processes and displays presented herein are not inherently related to any particular computer or other apparatus. Various general purpose systems may be used with programs in accordance with the teachings herein, or it may prove convenient to construct a more specialized apparatus to perform the desired method. The desired structure for a variety of these systems will appear from the description below. In addition, embodiments of the present invention are not described with reference to any particular programming language. It will be appreciated that a variety of programming languages may be used to implement the teachings of the inventions as described herein.
There is provided, in accordance with some embodiments of the present invention, a system, method and circuit for clock recovery and synchronization in wireless media streaming. More specifically, the adverse effect of jitter on the recovery of a wirelessly transmitted MPEG2 Transport Stream (TS) signal at a receiver is addressed through the implementation of algorithms (mathematical analysis and formulas) for accurate clock frequency and phase error estimation, based on real-time statistical evaluation of data at the receiver. The clock frequency and clock phase error estimation are achieved with an Envelope Set Building Algorithm based on real time samples of jitter values. Additional timing signals at the transmitter are also introduced to aid in signal synchronization.
According to some embodiments of the present invention, the media transmitter/transceiver may be adapted to transmit content bearing data from a media source to a media receiver functionally associated with a presentation device. The content bearing data may be a compressed media file stored on a source device's non-volatile memory, DVD, VHS, or other storage medium, or a live broadcast signal transmitted via cable or satellite. For purposes of this application, any of the above mentioned content bearing data, or any other data types which may be transmitted, received, and presented in accordance with any aspect of the present invention, may be referred to as: (1) content bearing data, (2) content bearing data stream, (3) media stream, or (4) any other term which would be understood by one of ordinary skill in the art at the time the present application is filed.
The present invention performs short distance wireless transmission using MPEG video compression and WLAN transmission technologies. However, WLAN was designed for data transfer and not for the transmission of video. Packet error jitter introduced by the WLAN can range from 10 msec to nearly 100 msec. The MPEG decoder requires jitter to be kept in the range of 1 to 30 μsec, which is several orders of magnitude less. Such high jitter levels cause the MPEG decoder to lose signal lock, and disrupt a displayed image. In addition, WLAN does not meet the quality of service (QoS) demanded by video applications. New high definition (HD) displays demand a high quality video signal. System delay techniques to compensate for transmission issues, such as excessive buffering, are not acceptable to the viewer. Consumers expect agile channel response during channel/program searches, with delays of under 1 sec and preferably less than 0.5 sec between displayed channels. The present novel invention compensates for the WLAN's lack of video performance, and in essence provides the receiver MPEG decoder with the equivalent signal quality of a wired connection to the MPEG encoder at the transmitter.
The present invention achieves the low level of jitter despite the use of the WLAN connection. WLAN implementations may have multiple transmit queues, of packets waiting to be transmitted, with different priorities for the different queues, and transmitter jitter is due to these variable delays. The transmit jitter can be mitigated by implementing packets that are just time stamps (Please see
Turning now to
Clock Synchronization System Architecture
Clock Control Algorithm Description
The clock synchronization algorithm of the present invention is a single-stage algorithm with fast response in case of sudden source clock frequency changes and smooth operation during periods of frequency stability. The phase corrections are performed with controlled, limited frequency shift values, according to the requirements. The algorithm allows for no loss of phase synchronization, even when correcting large frequency shifts. The algorithm also provides non-linear gain correction that efficiently filters the variance in the phase readings caused by residual jitter.
The Activate/Deactivate interface 608 is used by the upper level applications to control the operation of the clock synchronization subsystem. The general lines of operation are the following:
When the RX Unit Receive Jitter Buffer is empty (Sleep mode, Session not established, Communication failure), the clock subsystem should be deactivated. When the media buffering is completed, and before the compressed video (MPEG) packets start to be pushed via the TS interface, the clock subsystem shall be activated.
In extreme abnormal situations the RX Jitter Buffer may increase too much and the system delay control mechanism may perform one or more packet SKIPs. This may affect the phase error in the clock control mechanisms, and the clock subsystem shall be deactivated and re-activated again.
The operation control entity also performs identical operations on the downstream Error Estimation Block to properly initialize its functionality.
NOTE: When the clock control subsystem is not active, the initialization and refreshing of the VCXO Control Value is within the responsibility of the upper level applications. Once the Clock Control Subsystem has been activated, no other application should modify the Control Value, since this will result in inconsistent system operation.
The Error Control Block 702 uses the Clock Control Message Sequence Number field 802 in order to ensure the message sequence continuity. It performs the following operations:
After an ACTIVATION operation performed by the upper SW entities (following link startup or link recovery), the Error Control Block 702 enters a continuity verification phase, where it checks that a significant series of consecutive messages are received without sequence violations. In a preferred embodiment of the present invention at least sixty consecutive messages are checked to determine if there are sequence violations. During this phase, the clock control messages are dropped and not passed down the processing chain.
Once this initial stage is passed, the Error Control Block 702 starts to pass received messages to the Phase Detector block 704, while continuing to verify the message sequence continuity.
If a message is duplicated (the previous sequence number is repeated) the redundant message is dropped.
For any other sequence number violation condition, the message passing to the following processing levels is discontinued until it is assured that messages are again arriving in proper sequence—one way to do this is to check that at least 2 additional messages are received in correct sequence.
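The continuity rules above can be sketched as a small state machine. This is only an illustrative reading of the text, not the implementation of the Error Control Block 702: the class name, the 16-bit wrap-around of the sequence number, and the choice to drop the messages that are still being verified after a violation are all assumptions.

```python
class SequenceChecker:
    """Illustrative sequence-continuity filter (names and modulus are assumed)."""

    def __init__(self, startup_run=60, resync_run=2, modulus=1 << 16):
        self.startup_run = startup_run  # messages verified (and dropped) after activation
        self.resync_run = resync_run    # in-order messages verified after a violation
        self.modulus = modulus          # assumed wrap-around of the sequence number field
        self.activate()

    def activate(self):
        """Reset the state, as after link startup or link recovery."""
        self.last_seq = None
        self.run = 0
        self.required = self.startup_run

    def accept(self, seq):
        """Return True if the message may be passed to the Phase Detector."""
        if self.last_seq is not None and seq == self.last_seq:
            return False                # duplicated message: drop it
        if self.last_seq is None:
            self.run = 1                # first message after activation
        elif seq == (self.last_seq + 1) % self.modulus:
            self.run += 1               # message arrived in sequence
        else:
            self.run = 0                # sequence violation: restart verification
        self.last_seq = seq
        passed = self.run > self.required
        if passed:
            # Once the initial stage is passed, later violations only need
            # the short re-verification run.
            self.required = self.resync_run
        return passed
```

With the default arguments, the first sixty in-sequence messages after activation are verified and dropped, and the sixty-first is the first one passed down the chain.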
Clock control messages contain clock difference values stored as wrap-around unsigned values. For the ease of further handling, they have to go through some pre-processing.
The Phase Detection 704 function transforms the received unsigned value into a signed phase error value. Since the clock difference values are always in a range less than half of the full available scale, the first received clock difference value is used as reference offset and all values transmitted down the chain are calculated as signed values relative to this offset:
referenceOffset=clockDifference(0)
and
clockError(n)=clockDifference(n)−referenceOffset
Scaling 706 is carried out in order to have the clock error value in desired units of time. In a preferred embodiment of the present invention the clock error value is scaled to microseconds.
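A minimal sketch of the phase detection and scaling steps follows. The 32-bit counter width and the assumption that clock differences are counted in 27 MHz ticks are illustrative choices not stated in the text; the helper names are hypothetical.

```python
COUNTER_BITS = 32     # assumed width of the wrap-around clock difference field
TICKS_PER_US = 27.0   # assumed: clock differences are counted in 27 MHz ticks


def to_signed(value, bits=COUNTER_BITS):
    """Interpret a wrap-around unsigned value as a signed value."""
    value &= (1 << bits) - 1
    if value >= 1 << (bits - 1):
        return value - (1 << bits)
    return value


class PhaseDetector:
    def __init__(self):
        self.reference_offset = None

    def clock_error_us(self, clock_difference):
        """Return the signed clock error in microseconds."""
        if self.reference_offset is None:
            # referenceOffset = clockDifference(0)
            self.reference_offset = clock_difference
        # clockError(n) = clockDifference(n) - referenceOffset, taken modulo
        # the counter width so that wrap-around is handled correctly.
        ticks = to_signed(clock_difference - self.reference_offset)
        return ticks / TICKS_PER_US
```

The signed interpretation is valid only while the differences stay within half of the full scale, which is the condition stated in the text.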
Because of the inherent packet jitter in the WLAN networks, the per-sample timestamp information cannot provide accurate estimations of the phase and frequency difference between the TX and RX clocks. This functionality is provided by the Error Estimation Block 604 that uses a large batch of samples to obtain an accurate estimate.
The Clock synchronization mechanism is supposed to keep the clock phase error close to zero. In order to do so we need to declare the initial phase error as an offset reference value and calculate clock error values relative to this offset. This is referred to as Phase Error Normalization 902.
However, the first phase error value received from the pre-processing block may be affected by jitter and can be actually quite far away from the current minimum phase error values. In order to deal with this issue the following algorithm has been implemented. The first estimation cycle is used in order to calculate the phaseReferenceOffset, and is not used for correcting the VCXO control. The details of the algorithm are as follows:
phaseReferenceOffset=Filtered Phase Error(of first batch of samples)
phaseError(k)=phaseError(k)−phaseReferenceOffset
The Envelope Set Building Algorithm 904 is at the core of the error detection mechanism. This block receives phase error samples and keeps only those samples that satisfy certain properties—defined below—and are called the Sample Envelope (please see
Simulations showed that building multiple Sample Envelopes and performing weighted averages of the frequency estimates resulted in more precise results compared to building only a Sample Envelope of order zero. Additionally, building multiple Sample Envelopes enabled the calculation of frequency error statistics, which yielded a reliability measure for the frequency error, and gave a criterion for determining which frequency estimates are invalid and should be ignored.
Sample Envelope Definition
For each clock error sample received we shall define a sample point si as the pair (si.x, si.y), where x is the time at which the sample has been received, and y is the sample value (the phase error value). We shall use the notation Si=(Si.x, Si.y) for the sample points on the Sample Envelope defined below.
A Sample Envelope is an ordered subset of sample points E={S0 . . . Sn} that have the following properties.
SE1. Si.x<Si+1.x for any i=0 . . . n−2, where n is the number of Envelope points [i.e. an ordered set]
SE2. Having the slope of an Envelope segment Si, Si+1 defined as
slope(Si, Si+1)=(Si+1.y−Si.y)/(Si+1.x−Si.x),
the following relation shall hold:
slope(Si+1, Si+2)>slope(Si, Si+1) for any i=0 . . . n−3.
[i.e. the sample envelope is a concave shape pointing upwards]
SE3. For any sample point s not part of the Sample Envelope subset, there is an envelope point Si such that:
Si.x<s.x<Si+1.x, and
slope (Si, s)≧slope (Si, Si+1)
[i.e. all other points are contained within the Sample Envelope concavity—The curve is defined so that all samples are on or above the curve.]
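The three properties can be checked mechanically. The sketch below is a literal transcription of SE1 to SE3 for points given as (x, y) tuples; the function names are hypothetical and the sample data is made up for illustration.

```python
def slope(p, q):
    """Slope of the segment from point p to point q, each an (x, y) tuple."""
    return (q[1] - p[1]) / (q[0] - p[0])


def is_sample_envelope(envelope, samples):
    """Return True if `envelope` satisfies SE1, SE2 and SE3 for `samples`."""
    pairs = list(zip(envelope, envelope[1:]))
    # SE1: envelope points are strictly ordered in time.
    if any(a[0] >= b[0] for a, b in pairs):
        return False
    # SE2: segment slopes strictly increase from left to right.
    slopes = [slope(a, b) for a, b in pairs]
    if any(s1 >= s2 for s1, s2 in zip(slopes, slopes[1:])):
        return False
    # SE3: every other sample lies on or above the segment that spans it.
    members = set(envelope)
    for s in samples:
        if s in members:
            continue
        if not any(a[0] < s[0] < b[0] and slope(a, s) >= slope(a, b)
                   for a, b in pairs):
            return False
    return True


samples = [(0, 0), (1, 2), (2, 1), (3, 3), (4, 0)]
print(is_sample_envelope([(0, 0), (4, 0)], samples))
```

For this made-up batch, the two end points form a valid envelope because every other sample lies above the segment joining them.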
Envelope Set
The very first envelope built using all the samples is called the envelope of order zero E(0). If all E(0) points are put aside, a new envelope may be built, E(1), and so on until all points are exhausted.
An Envelope Set of order N is the set of envelopes {E(0), E(1), . . . E(N−1)}. For practical frequency and phase error estimation purposes, only the first envelope orders are useful since they reflect the statistics of large number of samples.
The current algorithm of the preferred embodiment of the present invention uses an order 4 envelope set made of {E(0), E(1), E(2), E(3)}. However, it should be noted that N (the number of envelopes) can vary from one to any user defined number.
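As a reference for the definition above, an Envelope Set can be computed offline by repeatedly extracting the lower convex chain of the remaining samples. This batch version is a substitute technique (a standard monotone-chain scan) shown only for illustration; it assumes distinct sample points with distinct times, and the function names are hypothetical.

```python
def cross(o, a, b):
    """Cross product of vectors o->a and o->b; positive for a left turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def lower_envelope(points):
    """Lower convex chain of (x, y) points, i.e. an envelope with increasing slopes."""
    env = []
    for p in sorted(points):
        # Remove the last point while it lies on or above the chord to p.
        while len(env) >= 2 and cross(env[-2], env[-1], p) <= 0:
            env.pop()
        env.append(p)
    return env


def build_envelope_set(points, order=4):
    """Return [E(0), ..., E(order-1)] by putting aside each envelope in turn."""
    remaining = list(points)
    envelopes = []
    for _ in range(order):
        env = lower_envelope(remaining)
        envelopes.append(env)
        on_env = set(env)
        remaining = [p for p in remaining if p not in on_env]
    return envelopes


print(build_envelope_set([(0, 0), (1, 2), (2, 1), (3, 3), (4, 0)], order=3))
```

Envelopes of higher order come out empty once all points are exhausted.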
Building the Envelope Set
The building of the Envelope Set is a process that starts from adding the new samples to the outmost envelope, E(0). At the level of E(0), the algorithm may decide that certain points are no longer on E(0) and pass them to E(1) for processing. E(1) may keep them and perhaps pass its own previous points to E(2), and so on.
Since the algorithm is iterative, let's suppose that currently we have already built an Envelope E(k)={S0 . . . Sn} and a new sample s is processed:
Step 1. Find the sample horizontal location.
The sample may be outside the envelope limits, or be within the time period covered by an envelope segment (let's call this segment Si, Si+1).
Step 2. Add the sample to the current segment, if applicable.
If the sample is outside the E(k) time span, it is always added to E(k); otherwise it is added only if the associated phase error value is located under the corresponding envelope segment. If the sample is not added to E(k), pass it to E(k+1) for processing and terminate.
Step 3. Eliminate additional points from envelope.
The inserted Sample Sj is taken as the reference and the 2 envelope segments to the left are checked for the envelope rule SE2. If SE2 does not hold, point Sj−1 is eliminated from E(k) and passed to E(k+1) for processing. The step is repeated until SE2 holds.
The same procedure is applied to the right side of Sj by eliminating Sj+1 points if necessary and passing them to E(k+1) processing.
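Steps 1 to 3 can be sketched as a recursive insertion routine. This is an illustrative reading under stated assumptions: envelopes are kept as lists of (x, y) tuples sorted by time, all sample times are distinct, and points pushed beyond the last envelope order are discarded. The function names are hypothetical.

```python
import bisect


def slope(p, q):
    return (q[1] - p[1]) / (q[0] - p[0])


def insert_sample(envelopes, k, s):
    """Insert sample s into envelopes[k], cascading rejected points to k+1."""
    if k >= len(envelopes):
        return  # assumed: points beyond the last envelope order are dropped
    env = envelopes[k]
    # Step 1: find the horizontal location of the sample.
    j = bisect.bisect_left([p[0] for p in env], s[0])
    if 0 < j < len(env):
        # Step 2: inside the time span, keep s only if it is under the segment.
        left, right = env[j - 1], env[j]
        y_on_segment = left[1] + slope(left, right) * (s[0] - left[0])
        if s[1] >= y_on_segment:
            insert_sample(envelopes, k + 1, s)
            return
    env.insert(j, s)
    # Step 3: restore rule SE2 to the left of the inserted sample...
    while j >= 2 and slope(env[j - 2], env[j - 1]) >= slope(env[j - 1], env[j]):
        evicted = env.pop(j - 1)
        j -= 1
        insert_sample(envelopes, k + 1, evicted)
    # ...and to the right of it.
    while j + 2 < len(env) and slope(env[j], env[j + 1]) >= slope(env[j + 1], env[j + 2]):
        evicted = env.pop(j + 1)
        insert_sample(envelopes, k + 1, evicted)


envelopes = [[], [], []]
for sample in [(0, 0), (1, 2), (2, 1), (3, 3), (4, 0)]:
    insert_sample(envelopes, 0, sample)
print(envelopes)
```

New samples always enter at order zero; each envelope only ever hands points to the next order, so the recursion depth is bounded by the number of envelopes.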
Termination Condition
To support the cases when the clock jitter distribution is more scattered because of wireless noise, each sample batch used to build the Envelope Set takes 12 sec. The sample frequency used in the algorithm of the preferred embodiment of the present invention is 60 Hz (resulting in batches of 720 samples). Other embodiments or implementations of the present algorithm may use a different sampling rate as well as different sample intervals to build the Envelope Set. In addition, the samples are not required to be uniformly spaced. A varying or random sampling rate may be used.
Both the frequency error and phase error estimation algorithms use the fact that, for large sample batches, the slope of the long envelope segments approximates fairly well the difference in clock speeds, namely the frequency error. Such a large envelope segment is pictured in
Frequency Error Estimation
The Frequency error freqErr(n) based on Envelope of order n is defined as follows:
Find the envelope segment which has the largest time span, i.e. find the value i that maximizes Si.x−Si−1.x.
Calculate the slope of segment i:
freqErr(n)=(Si.y−Si−1.y)/(Si.x−Si−1.x)
Scale this value to ppm units
In order to increase the accuracy of the estimation, the frequency error estimations obtained from the envelopes E(0) . . . E(n−1) are used in a weighted formula:
freqErr=W0*freqErr(0)+W1*freqErr(1)+W2*freqErr(2)+ . . . +Wn−1*freqErr(n−1)
The weighting of the envelopes is based on observed results and can be varied according to a particular system performance. A simplified version of the algorithm may consist of just a single envelope with no weighting applied. In an example of a preferred embodiment of the present invention employing an order-four envelope set, estimations obtained from the first four envelopes E(0) . . . E(3) are used in the following weighted formula:
freqErr=0.30*freqErr(0)+0.37*freqErr(1)+0.22*freqErr(2)+0.11*freqErr(3)
where freqErr(k) corresponds to the frequency error estimation based on the E(k) longest segment in time. The frequency error is calculated in 27 MHz ppm units.
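The frequency estimation can be sketched as follows. The sketch assumes that sample times are in seconds and phase errors in microseconds, so that a slope in microseconds per second is numerically a value in ppm; the text does not state the time unit, and the function names are hypothetical. Each envelope must hold at least two points.

```python
WEIGHTS = (0.30, 0.37, 0.22, 0.11)  # weights of the order-four example


def longest_segment(envelope):
    """Return the pair of adjacent envelope points with the largest time span."""
    return max(zip(envelope, envelope[1:]), key=lambda seg: seg[1][0] - seg[0][0])


def freq_err_of_envelope(envelope):
    """freqErr(n): slope of the longest segment (ppm under the assumed units)."""
    a, b = longest_segment(envelope)
    return (b[1] - a[1]) / (b[0] - a[0])


def weighted_freq_err(envelopes, weights=WEIGHTS):
    """freqErr = W0*freqErr(0) + W1*freqErr(1) + ..."""
    return sum(w * freq_err_of_envelope(e) for w, e in zip(weights, envelopes))


# Made-up envelope: the longest segment runs from t=1 s to t=11 s.
envelope = [(0.0, 0.0), (1.0, -3.0), (11.0, 7.0), (12.0, 20.0)]
print(freq_err_of_envelope(envelope))
```

In this made-up example the longest segment rises 10 microseconds over 10 seconds, which corresponds to a frequency error of 1 ppm.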
Frequency Error Statistics
It may happen that the frequency error estimations obtained from different envelopes do not match; in such situations the estimation of the current batch is dropped and not used for corrections.
The level of fitness of the individual estimations is computed using the weighted average deviation from the freqErr calculated above:
avrgDeviation=W0*ABS(freqErr−freqErr(0))+W1*ABS(freqErr−freqErr(1))+W2*ABS(freqErr−freqErr(2))+ . . . +Wn−1*ABS(freqErr−freqErr(n−1))
where ABS( )=absolute value( ).
If the weighted average deviation is larger than a deviation threshold value, the frequency error evaluation is considered not valid and ignored. In an example of a preferred embodiment of the present invention, a deviation threshold value of 5 ppm is used.
The weighting of the average deviation is based on observed results and can be varied according to a particular system performance. In an example of a preferred embodiment of the present invention employing an order-four envelope set, frequency error estimations obtained from the first four envelopes E(0) . . . E(3) are used in the following weighted formula:
avrgDeviation=0.30*ABS(freqErr−freqErr(0))+0.37*ABS(freqErr−freqErr(1))+0.22*ABS(freqErr−freqErr(2))+0.11*ABS(freqErr−freqErr(3))
where ABS( )=absolute value( ).
Other weighting values and even alternative weighting functions including mean square error or maximum absolute error may be used.
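The validity check can be illustrated with a short sketch that takes the per-envelope estimates in ppm and applies the example weights and the 5 ppm threshold. The function name and the return format are hypothetical.

```python
def evaluate_estimates(estimates, weights=(0.30, 0.37, 0.22, 0.11), threshold_ppm=5.0):
    """Return (freqErr, avrgDeviation, valid) for per-envelope estimates in ppm."""
    freq_err = sum(w * e for w, e in zip(weights, estimates))
    avrg_deviation = sum(w * abs(freq_err - e) for w, e in zip(weights, estimates))
    # The batch is ignored when the weighted deviation exceeds the threshold.
    return freq_err, avrg_deviation, avrg_deviation <= threshold_ppm


# Consistent estimates are accepted; one outlying envelope invalidates the batch.
print(evaluate_estimates([2.0, 2.1, 1.9, 2.0]))
print(evaluate_estimates([0.0, 20.0, 0.0, 0.0]))
```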
Phase Error Estimation
The phase error (clock difference) is estimated using a similar procedure.
First, the phase error corresponding to individual envelopes is computed taking as reference the longest envelope segment and extrapolating from it the phase error corresponding to the last envelope sample. If the envelope E(k) segment used for frequency estimation is Si, Si+1, and the last sample s, then
phaseErr(k)=Si+1.y+slope(Si, Si+1)*(s.x−Si+1.x)
The phase error is then calculated using:
phaseErr=W0*phaseErr(0)+W1*phaseErr(1)+W2*phaseErr(2)+ . . . +Wn−1*phaseErr(n−1)
The weighting of the phase error is based on observed results and can be varied according to a particular system performance. In an example of a preferred embodiment of the present invention employing an order-four envelope set, phase error estimations obtained from the first four envelopes E(0) . . . E(3) are used in the following weighted formula:
phaseErr=0.30*phaseErr(0)+0.37*phaseErr(1)+0.22*phaseErr(2)+0.11*phaseErr(3)
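The extrapolation step can be sketched in the same way as the frequency estimate. The function names are hypothetical, `last_x` stands for the time s.x of the last sample, and each envelope is assumed to hold at least two points.

```python
WEIGHTS = (0.30, 0.37, 0.22, 0.11)  # weights of the order-four example


def phase_err_of_envelope(envelope, last_x):
    """Extrapolate the longest envelope segment to the time of the last sample."""
    a, b = max(zip(envelope, envelope[1:]), key=lambda seg: seg[1][0] - seg[0][0])
    seg_slope = (b[1] - a[1]) / (b[0] - a[0])
    # phaseErr(k) = S(i+1).y + slope(Si, Si+1) * (s.x - S(i+1).x)
    return b[1] + seg_slope * (last_x - b[0])


def weighted_phase_err(envelopes, last_x, weights=WEIGHTS):
    return sum(w * phase_err_of_envelope(e, last_x) for w, e in zip(weights, envelopes))


# Made-up envelope: the longest segment ends at (11, 7) with slope 1.
envelope = [(0.0, 0.0), (1.0, -3.0), (11.0, 7.0), (12.0, 20.0)]
print(phase_err_of_envelope(envelope, last_x=12.0))
```

The extrapolated value follows the long segment rather than the last envelope point itself, so a jitter-affected final sample does not dominate the estimate.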
Representation Units
The units of the output values of the Error Estimation Block 604 are as follows.
The phase error is represented in microseconds units.
The frequency error units are hundreds of 27 MHz ppb units (i.e. 0.1 ppm units).
Other units may be chosen depending on system implementation requirements.
The Control Value Correction Computing Block 1200 carries out the following computation:
The Loop Control Algorithm computes a new Control Value (CVAL) correction value based on the phase and frequency errors, as follows:
CVALn=CVALn−1+δCVAL(freqErrn, phaseErrn),
where CVALn is the next Control Value and CVALn−1 is the previous Control Value.
To preserve accuracy in the preferred embodiment, the Control Values are calculated using units of an order of magnitude (×10) larger than the actual value. The correction function δCVAL is using the VCXO device characteristic curve (please see
δCVAL=−ppmToCVAL(ppm1(freqErrn)+ppm2(phaseErrn))
The ppmToCVAL is device dependent and has to be fit to the particular VCXO component. For the example of
In ideal conditions (no jitter), the function ppm1 would replicate identically the frequency error (i.e. ppm1(x)=x). In this case the Control Value correction formula would be:
δCVAL=−ppmToCVAL(freqErrn+ppm2(phaseErrn))
However, because of the remaining jitter influencing the freqErr values, the ppm1 function is non-linear to attenuate jitter.
for x<=−6.4 ppm1(x)=x+4
for −6.4<x<−3.2 ppm1(x)=x/2+0.8
for −3.2<=x<=3.2 ppm1(x)=x/4
for 3.2<x<6.4 ppm1(x)=x/2−0.8
for x>=6.4 ppm1(x)=x−4
The formulas have as input the frequency error in 27 MHz ppm units, and have as output a ppm correction value. As in the case of the phase correction, the Control Value units are obtained using the specific VCXO characteristic curve slope.
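The piecewise definition of ppm1 translates directly into code. The sketch below transcribes the five ranges as listed; only the Python function form is an addition.

```python
def ppm1(x):
    """Non-linear frequency correction; x is the frequency error in ppm."""
    if x <= -6.4:
        return x + 4
    if x < -3.2:
        return x / 2 + 0.8
    if x <= 3.2:
        return x / 4       # small errors are attenuated the most
    if x < 6.4:
        return x / 2 - 0.8
    return x - 4           # large errors are followed with a fixed offset


for x in (-10.0, -4.0, 0.0, 2.0, 4.0, 10.0):
    print(x, ppm1(x))
```

The ranges join continuously at ±3.2 and ±6.4, so the correction has no jumps as the frequency error grows.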
for x<=−250 ppm2(x)=−4
for −250<x<−60 ppm2(x)=(x+50)/50
for −60<=x<=60 ppm2 (x)=x/100
for 60<x<250 ppm2(x)=(x−50)/50
for x>=250 ppm2(x)=4
The phase error units are microseconds and the output units are computed in 27 MHz clock ppm units.
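The ppm2 ranges can be transcribed in the same way. The sketch keeps the constants exactly as listed; note that, as listed, the branches agree at ±250 but give different values on either side of ±60 (0.6 versus 0.2 in magnitude), and the code does not attempt to reconcile them.

```python
def ppm2(x):
    """Phase correction in ppm; x is the phase error in microseconds."""
    if x <= -250:
        return -4.0
    if x < -60:
        return (x + 50) / 50
    if x <= 60:
        return x / 100
    if x < 250:
        return (x - 50) / 50
    return 4.0             # correction saturates at 4 ppm


for x in (-300, -100, 30, 100, 300):
    print(x, ppm2(x))
```

The saturation at ±4 ppm keeps the phase correction within a limited frequency shift regardless of how large the phase error becomes.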
Since calculations may sometimes exceed the actual scale, the Control Value has to be corrected to minimum and maximum values. The limits are set by the Control Value Limiter 1202 of
If (CVALn<MIN_CVAL) CVALn=MIN_CVAL
If (CVALn>MAX_CVAL) CVALn=MAX_CVAL
The MIN_CVAL and MAX_CVAL values have to take into account the current scaling (see below).
As mentioned before, the CVAL values are calculated using an order of magnitude larger than the actual values. To perform actual commands, the CVAL values are scaled down with a factor of 10: This is carried out by the CVAL Scaler Block 1204 of
CVALControlValuen=CVALn/10
Please note that the original CVALn values are left intact by scaling for use in the next control calculation step.
CVALControlValue is used as a control voltage of the VCXO.
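One control step, combining the correction, the limiter and the scaler, can be sketched as below. The linear `cval_per_ppm` factor stands in for the device-dependent ppmToCVAL fit and is an assumption, as are the function name and the numeric values in the example; `correction_ppm` is the sum ppm1(freqErrn)+ppm2(phaseErrn).

```python
def update_control_value(prev_cval, correction_ppm, cval_per_ppm, min_cval, max_cval):
    """Return (CVALn, CVALControlValuen) for one loop iteration.

    prev_cval, min_cval and max_cval are expressed at the x10 working scale.
    """
    # deltaCVAL = -ppmToCVAL(ppm1 + ppm2), with ppmToCVAL assumed linear.
    delta = -cval_per_ppm * correction_ppm
    cval = prev_cval + delta
    # Control Value Limiter: clamp to the allowed range.
    cval = max(min_cval, min(max_cval, cval))
    # CVAL Scaler: the command is one tenth of the working value, while the
    # unscaled CVALn is returned for use in the next calculation step.
    return cval, cval / 10


print(update_control_value(5000, 2.0, 25.0, 0, 10230))
```

Keeping the working value at ten times the command resolution lets small corrections accumulate across iterations instead of being lost to rounding.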
If Pulse Width Modulation is employed in the correction process then a pwmControlValue is used to drive a Pulse Width Modulation modulator that is connected to the control voltage of the VCXO.
Alternatively, the CVALControlValue may feed a D/A (digital-to-analog) converter that is connected to the control voltage input of the VCXO; changing the control voltage of the VCXO changes the output frequency of the VCXO.
In yet another alternative implementation, the wireless receiver block 206, instead of using a VCXO, may have a fixed frequency clock source that is divided down to the required frequency. The division ratio is varied over a small range, and thus the required frequency can also vary over a small range. In this case, the CVALControlValue will be scaled appropriately and will be used to determine the division ratio, and in this manner will control the output frequency.
Video Streaming Description
The Rx TS Engine 506 derives timestamp values from the clock output generated by the VCXO 504, or from the fixed frequency clock source of the alternative embodiment. The application specific delay of a particular multimedia signal implementation will dictate to the Rx TS Engine 506 what the required difference is between the derived timestamp value and the timestamps that are part of the received Timestamped TS Packets 412. Subsequently, the Rx TS Engine 506 processes the received Timestamped TS Packet 412 that is first in the Rx Jitter Buffer 508. As explained earlier, the received Timestamped TS Packet 412 is composed of a Timestamp and of a TS packet. The Rx TS Engine 506 examines the Timestamp of the received Timestamped TS Packet 412. At the instant the difference between the Timestamp of the received Timestamped TS Packet 412 and the Rx TS Engine 506 derived timestamp equals the required difference, the Rx TS Engine 506 sends the TS packet of the received Timestamped TS Packet 412 to the video compression decoder 314, and clears the first entry in the Rx Jitter Buffer 508.
In an alternative embodiment, according to the delay required by the multimedia signal application, the Rx TS Engine 506 will determine a range of required difference values between the Rx TS Engine 506 derived timestamp and the timestamps that are part of the received Timestamped TS Packets 412. Subsequently, the Rx TS Engine 506 processes the received Timestamped TS Packet 412 that is first in the Rx Jitter Buffer 508. The RX TS Engine 506 examines the Timestamp of the received Timestamped TS Packet 412. At the instant the difference between the Timestamp of the received Timestamped TS Packet 412 and of the Rx TS Engine 506 derived timestamp falls into the range of required difference, the RX TS Engine 506 sends the TS packet of the received Timestamped TS Packet 412 to the video compression decoder 314, and clears the first entry in the Rx Jitter Buffer 508.
In both of the aforementioned video streaming embodiments, the Rx TS Engine 506 keeps repeating the process with respect to the received Timestamped TS Packet 412 that is first in the Rx Jitter Buffer 508. In this manner the Rx TS Engine 506 manages to supply the TS packets to the video compression decoder 314 with minimal jitter.
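The release logic can be sketched as a polling routine over the jitter buffer. This is an illustration only: the function name is hypothetical, the difference is taken as the locally derived timestamp minus the packet Timestamp, timestamp wrap-around is ignored, and a "greater than or equal" test stands in for the exact-instant comparison because the sketch is polled at discrete times.

```python
from collections import deque


def release_due_packets(jitter_buffer, local_timestamp, required_difference):
    """Pop and return the TS packets at the head of the buffer that are due.

    jitter_buffer is a deque of (timestamp, ts_packet) pairs in arrival order.
    """
    released = []
    while jitter_buffer:
        timestamp, ts_packet = jitter_buffer[0]
        if local_timestamp - timestamp < required_difference:
            break  # the first packet is not due yet; later ones wait behind it
        released.append(ts_packet)
        jitter_buffer.popleft()  # clear the first entry in the buffer
    return released


buffer = deque([(100, "pkt-a"), (110, "pkt-b"), (130, "pkt-c")])
print(release_due_packets(buffer, 149, 50))
print(release_due_packets(buffer, 160, 50))
```

Because only the head of the buffer is examined, packets leave in arrival order and each one is held until the locally recovered clock says its delay has elapsed.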
While certain features of the invention have been illustrated and described herein, many modifications, substitutions, changes, and equivalents will now occur to those skilled in the art. It is, therefore, to be understood that the appended claims are intended to cover all such modifications and changes as fall within the true spirit of the invention.