The present invention relates generally to timing recovery. More particularly, but not by limitation, the present invention relates to timing recovery using iterative coding schemes.
The process of synchronizing a sampler with a received analog signal is known as timing recovery. It is a crucial component in a recording system channel detector, such as a magnetic recording channel detector. The quality of synchronization has a tremendous impact on the overall performance of the channel detector. At current areal recording densities, existing timing recovery architectures perform well. However, at the higher areal densities which will be used in the future, signal energy will be lower and noise in the system will increase. Thus, the signal-to-noise ratio (SNR) will decrease.
The advent of iterative error-correction codes allows the system to operate at low SNRs with acceptable performance due to their large coding gains. This means that timing recovery must also function at low SNRs. A conventional receiver performs timing recovery and error-correction decoding separately. Specifically, conventional timing recovery ignores the presence of error-correction codes; therefore, it fails to function properly at low SNRs, and timing errors increase.
Theoretically, joint maximum-likelihood (ML) estimation of timing offsets and message bits, which jointly performs timing recovery, equalization and decoding, is a preferred method of synchronization; however, its complexity is prohibitive. Fortunately, a solution to this problem with complexity comparable to that of a conventional receiver has been proposed, which is realized by embedding the timing recovery step inside the turbo equalizer so that the two perform their tasks jointly. From this point on, that iterative timing recovery (ITR) scheme is denoted as “NonPSP-ITR”, where “PSP” stands for per survivor processing. However, NonPSP-ITR requires a large number of turbo iterations to provide acceptable performance when the channel experiences severe timing jitter noise.
Embodiments of the present invention provide solutions to these and other problems, and offer other advantages over the prior art.
A method of the present invention includes the steps of receiving a signal indicative of data bits, and performing per survivor processing-iterative timing recovery (PSP-ITR) on the received signal to generate probabilities of the data bits. To perform PSP-ITR on the received signal, the signal can be processed using a per survivor processing-soft decision algorithm (PSP-SDA) which jointly performs timing recovery and equalization in accordance with embodiments of the present invention. The soft decision algorithm (SDA) can be, for example, a Bahl, Cocke, Jelinek, and Raviv (BCJR) algorithm modified in accordance with the concepts of the present invention such that it is configured to implement per survivor processing (PSP) to jointly perform timing recovery and equalization. In other embodiments, the SDA is a Soft Output Viterbi Algorithm (SOVA) configured to implement PSP to jointly perform timing recovery and equalization. Still other SDAs can be used with PSP to jointly perform timing recovery and equalization in other embodiments of the present invention.
In some embodiments of the invention, the step of processing the received signal using a PSP-SDA algorithm includes the step of calculating a plurality of branch metrics, with each branch metric corresponding to a transition branch between states in a trellis. A survivor path between the states is then identified as a function of the calculated branch metrics.
In some embodiments, each state has an associated sampling phase offset used to sample the received signal. The sampling phase offsets differ between various states. In these embodiments, the step of calculating the plurality of branch metrics further includes calculating each branch metric as a function of the sampling phase offset at a starting state of the corresponding branch. The branch metrics can be calculated during both forward and backward recursions.
Other features and benefits that characterize embodiments of the present invention will be apparent upon reading the following detailed description and review of the associated drawings.
As noted previously, when used in a recording or other channel, NonPSP-ITR requires a large number of turbo iterations to provide an acceptable performance when the channel experiences severe timing jitter noise. This problem can be solved by utilizing a PSP technique. Per survivor processing, or PSP, is a technique for jointly estimating a data sequence and unknown parameters, such as channel coefficients, carrier phase, and so forth. PSP has been employed in many applications including channel identification, adaptive maximum likelihood (ML) sequence detectors, and phase/carrier recovery. PSP can be applied to the development of PSP-based timing recovery implemented based on a Viterbi algorithm, which performs timing recovery and data detection jointly. Results have shown that it performs better than the conventional receiver that separates these two tasks, especially when the timing jitter is severe.
Similarly, timing recovery and equalization can be performed jointly by using a PSP technique, which can yield better performance than performing them separately. To do so, a PSP-soft decision algorithm (SDA) is provided in which the timing recovery is embedded inside the soft decision equalizer, and is used to provide a PSP-based iterative timing recovery scheme, which is referred to herein as “PSP-ITR”. PSP-ITR iteratively exchanges soft information between the PSP-SDA and an error-correction decoder. The SDA can be a Bahl, Cocke, Jelinek, and Raviv (BCJR) algorithm resulting in the PSP-SDA being a PSP-BCJR, a Soft Output Viterbi Algorithm (SOVA) resulting in the PSP-SDA being a PSP-SOVA, or other types of soft decision algorithms.
The following discussion provides a description of a new timing recovery architecture, in accordance with the present invention, which is more robust to cycle slips in the system. Without loss of generality or limitation of scope, the PR-IV channel model (a partial response channel with three target values) is used to illustrate the steps in the new algorithm. However, it is worth noting that the algorithm can be applied not only to target responses other than PR-IV, but also to equalized channels. Before going into the details of the timing recovery algorithm of the present invention, the channel architecture is first introduced, and a brief explanation of a conventional approach is provided.
Channel Model
Consider the coded partial response (PR) channel model 200 shown in
The readback signal, s(t), can therefore be written as
where τk is the k-th timing offset, defined as the difference between the actual and the expected arrival time of the k-th pulse, and n(t) is additive white Gaussian noise (AWGN) with two-sided power spectral density N0/2. Timing offset circuit 230 models τk as a random walk model according to Equation 2,
$$\tau_{k+1} = \tau_k + \sigma_w \omega_k \qquad \text{(Equation 2)}$$
where σw determines the severity of the timing jitter and ωk is an independent identically distributed (i.i.d.) zero-mean unit-variance Gaussian random variable. The random walk model is chosen because it can represent a variety of channels by changing only one parameter. Perfect acquisition, i.e., τ0=0, is also assumed.
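The random walk of Equation 2 can be sketched in a few lines. The following is a minimal illustration only, assuming NumPy is available; the function name `random_walk_offsets` and its arguments are hypothetical and do not appear in the disclosure.

```python
import numpy as np

def random_walk_offsets(num_symbols, sigma_w, rng=None):
    """Generate tau_0 .. tau_{n-1} following Equation 2, with tau_0 = 0."""
    rng = np.random.default_rng() if rng is None else rng
    # omega_k: i.i.d. zero-mean, unit-variance Gaussian steps, scaled by sigma_w
    steps = sigma_w * rng.standard_normal(num_symbols - 1)
    # tau_0 = 0 (perfect acquisition); tau_{k+1} = tau_k + sigma_w * omega_k
    return np.concatenate(([0.0], np.cumsum(steps)))

# Example: 1% jitter per symbol (sigma_w / T = 0.01, with T = 1)
tau = random_walk_offsets(4095, 0.01, np.random.default_rng(0))
```

Because only `sigma_w` changes between channel conditions, the same routine covers mild and severe jitter cases.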
Conventional Receiver
At the front-end receiver 250 of
where $n'_k$ is a zero-mean Gaussian random variable with variance $\sigma_n^2 = N_0/(2T)$.
Conventional timing recovery typically takes the form of a phase-locked loop (PLL) 262 where, with perfect acquisition and no frequency offset component in the system, the sampling phase offset is updated by a first-order PLL, i.e.,
$$\hat{\tau}_{k+1} = \hat{\tau}_k + \xi\,\hat{\varepsilon}_k \qquad \text{(Equation 4)}$$
where $\xi$ is a PLL gain parameter determining the loop bandwidth and the convergence rate, and $\hat{\varepsilon}_k$ is an estimate of the timing error $\varepsilon_k = \tau_k - \hat{\tau}_k$. This estimate is generated by a well-known Mueller and Müller (M&M) timing error detector (TED) according to:
where the constant 3T/16 ensures the S-curve slope of Equation 5 is one at the origin, and $\tilde{r}_k = E[r_k \mid y_k]$ is the k-th soft estimate of the channel output $r_k \in \{0, \pm 2\}$, which is given by:
The soft estimate is considered in this disclosure because it provides better performance than the hard estimate, which is obtained by a memory-less three-level quantization of yk.
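The conventional loop described above can be illustrated with a short sketch. Equations 5 and 6 are not reproduced in this text, so the forms below are assumptions: the timing error uses the common M&M expression (3T/16)(y_k r̃_{k−1} − y_{k−1} r̃_k), and the soft estimate is computed as a conditional mean over the three levels {0, ±2} with assumed prior probabilities 1/2, 1/4 and 1/4. All function names are hypothetical.

```python
import math

def soft_estimate(y, noise_var):
    """E[r | y] for r in {0, +2, -2}; the priors 1/2, 1/4, 1/4 are an assumption."""
    levels = (0.0, 2.0, -2.0)
    priors = (0.5, 0.25, 0.25)
    # Work with log-weights and subtract the maximum for numerical stability.
    logs = [math.log(p) - (y - r) ** 2 / (2.0 * noise_var)
            for r, p in zip(levels, priors)]
    m = max(logs)
    weights = [math.exp(v - m) for v in logs]
    return sum(r * w for r, w in zip(levels, weights)) / sum(weights)

def mm_timing_error(y_k, y_prev, r_k, r_prev, T=1.0):
    """M&M timing error estimate with the 3T/16 scaling (assumed form of Equation 5)."""
    return (3.0 * T / 16.0) * (y_k * r_prev - y_prev * r_k)

def pll_update(tau_hat, err, gain):
    """First-order PLL update of Equation 4."""
    return tau_hat + gain * err
```

A hard estimate would replace `soft_estimate` with a three-level quantizer; the soft form degrades more gracefully when a sample falls between levels.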
In the conventional receiver, conventional timing recovery is followed by a turbo equalizer 265, which iteratively exchanges information between a soft-in soft-out (SISO) equalizer 270 for the precoded PR-IV channel and an error-correction SISO Forward Error-Correction (FEC) decoder 275, both based on BCJR. The iterative exchange of information between SISO equalizer 270 and SISO FEC decoder 275 uses a de-interleaver 280 and an interleaver 285, as well as summation circuitry 290 and 295, in a conventional manner.
where q(t) is a sinc function. This set of samples is then fed to the turbo equalizer 270. In summary, timing recovery benefits from better decisions, and the turbo equalizer benefits from better samples. The process is illustrated in
Iterative Timing Recovery of the Present Invention
To obtain a new iterative timing recovery scheme based on PSP, a description is first provided of the application of PSP to develop PSP-BCJR, which jointly performs timing recovery and equalization. With PSP-BCJR, a PSP-based iterative timing recovery scheme denoted as PSP-ITR is proposed, which performs timing recovery, equalization and decoding jointly, as shown in
PSP-BCJR Algorithms
PSP-BCJR is realized by embedding the timing recovery inside the BCJR equalizer. The PSP-based timing recovery performs a timing update operation at each state based on the history data obtained from the survivor path. Unfortunately, there is no notion of a survivor path in the context of BCJR. In order to perform the timing update operation inside BCJR, the concept of a virtual survivor path (or, simply, the survivor path) inside the BCJR is introduced. This survivor path can be easily obtained once the best state transition leading to each state is determined.
PSP-BCJR has different sampling phase offsets associated with each state. Thus, the branch metrics at each stage of the trellis are calculated based on the sampling phase offset of the starting state. Since BCJR involves two recursions, namely forward and backward recursions, it is useful to perform the timing update operation in both directions. The timing update operation during backward recursion serves to refine the sampler outputs $\{y_k\}$, thus resulting in an improved set of $\{\gamma_k\}$, which will be used to compute the log likelihood ratios (LLRs) of $\{a_k\}$. For simplicity, some embodiments of the invention are restricted to the M&M TED algorithm when performing the timing update.
Forward Recursion
Consider the trellis structure in
Consider the state transition at time k. There are two state transitions arriving at $\Psi_{k+1}=c$, i.e., (b,c) and (d,c). First, y(t) is sampled using the forward sampling phase offsets $\hat{\tau}_k(b)$ and $\hat{\tau}_k(d)$ to obtain $y_k(b,c)$ and $y_k(d,c)$, respectively. Next, $\gamma_k(b,c)$ and $\gamma_k(d,c)$ are computed in order to update $\alpha_{k+1}(c)$. The transition metric during forward recursion can be calculated using the relationship illustrated in Equation 6-2:
Then, the starting state is chosen that corresponds to the best state transition leading to Ψk+1=c by:
where $y_{l<k}(p)$ is a collection of all previous sampler outputs associated with the survivor path leading to $\Psi_k=p$, and $u \in Q$.
Suppose (b,c) is the best state transition leading to $\Psi_{k+1}=c$ (i.e., $\hat{p}=b$). The algorithm then stores the starting state and the sampler output associated with (b,c) according to $S_{k+1}(c)=\{\Psi_k=b\}$ and $\pi_{k+1}(c)=y_k(b,c)$, respectively. Then, the next forward sampling phase offset is updated by
$$\hat{\tau}_{k+1}(c) = \hat{\tau}_k(b) + \xi\,\hat{\varepsilon}_k(b,c) \qquad \text{(Equation 8)}$$
where $\hat{\varepsilon}_k(b,c)$ is the k-th estimated timing error associated with (b,c), which is computed using the information from $S_k(b)$ and $\pi_k(b)$, i.e.,
This $\hat{\tau}_{k+1}(c)$ will be used to sample y(t) at time k+1 for the state transitions emanating from $\Psi_{k+1}=c$. This process is repeated from time k=0 to k=L+v−1.
There are many possibilities to exploit the forward sampling phase offsets in the timing update operation during backward recursion. The first example is to ignore the forward sampling phase offsets altogether. Another example is to let each state in the trellis store its own forward sampling phase offset. Then, the algorithm can average the backward sampling phase offset at each state using the forward sampling phase offset associated with that state. Nonetheless, for the description of an exemplary embodiment, the goal is to extract the best set of forward sampling phase offsets, denoted as $\{\hat{\tau}^{fw}_k\}$, which is obtained by tracing back the survivor path that maximizes $\alpha_{L+v}$. Hence, $\{\hat{\tau}^{fw}_k\}$ is used to average the backward sampling phase offset according to a certain criterion, as shall be seen later.
Note that the reasons for averaging the backward sampling phase offsets with the forward ones are: 1) to improve a set of {γk}, and 2) to avoid a cycle slip that might occur when the backward sampling phase offsets start deviating from the forward ones.
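The forward recursion described above can be sketched as a single trellis stage. This is an illustration under stated assumptions, not the disclosed implementation: the branch metric is taken to be a Gaussian likelihood with equiprobable inputs, the best predecessor of each state is taken to be the one maximizing α_k(p)·γ_k(p,q), and the timing error uses the ideal branch outputs rather than soft estimates. The function name, its arguments, and the dictionary-based trellis representation are all hypothetical.

```python
import math

def psp_forward_step(k, alpha, tau_hat, prev_sample, prev_output,
                     branches, sample, noise_var, gain, T=1.0):
    """One forward stage with a per-state (per-survivor) timing update.

    alpha, tau_hat, prev_sample, prev_output: dicts keyed by state at time k.
    branches: dict mapping (p, q) -> ideal noiseless output r of that branch.
    sample: callable t -> y(t), the continuous-time readback signal.
    """
    new_alpha, new_tau, new_sample, new_output = {}, {}, {}, {}
    survivor, best_score = {}, {}
    for (p, q), r in branches.items():
        # Sample using the phase offset stored at the branch's starting state.
        y = sample(k * T + tau_hat[p])
        # Assumed Gaussian branch metric (constant factors omitted).
        gamma = math.exp(-(y - r) ** 2 / (2.0 * noise_var))
        score = alpha[p] * gamma
        new_alpha[q] = new_alpha.get(q, 0.0) + score
        if score > best_score.get(q, -1.0):
            # (p, q) is the best transition into q so far: keep it as survivor.
            best_score[q] = score
            survivor[q] = p
            new_sample[q] = y
            new_output[q] = r
            # M&M error from the survivor history, then the Equation 8 update.
            err = (3.0 * T / 16.0) * (y * prev_output[p] - prev_sample[p] * r)
            new_tau[q] = tau_hat[p] + gain * err
    total = sum(new_alpha.values())
    new_alpha = {q: a / total for q, a in new_alpha.items()}
    return new_alpha, new_tau, new_sample, new_output, survivor
```

Calling this function for k = 0 to L+v−1 and carrying the returned dictionaries forward gives one PLL per survivor path, which is the behavior the text describes.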
Backward Recursion
Now consider backward recursion, where the time index runs from k=L+v−1 to k=0. In order to explain how the timing update operation is performed during backward recursion, the virtual transition (or, simply, the backward transition) is introduced, represented by the gray arrows shown in FIG. 5, which illustrates how PSP-BCJR operates during backward recursion. Define $\hat{\tau}^b_k(q)$ as the k-th backward sampling phase offset at $\Psi_{k+1}=q$, which is employed to sample y(t) at time k during backward recursion, e.g., $y_k(p,q)=y(kT+\hat{\tau}^b_k(q))$. Consider the backward transition at time k. There are two backward transitions arriving at $\Psi_k=b$, which correspond to (b,c) and (b,d). First, the algorithm samples y(t) using the backward sampling phase offsets $\hat{\tau}^b_k(c)$ and $\hat{\tau}^b_k(d)$ to obtain $y_k(b,c)$ and $y_k(b,d)$, respectively. Next, $\gamma_k(b,c)$ and $\gamma_k(b,d)$ are computed in order to update $\beta_k(b)$. The transition metric during backward recursion is:
Then, the starting state is chosen that corresponds to the best backward transition leading to Ψk=b by
where the third equality is obtained by ignoring all terms irrelevant to maximization, and $y_{l>k}(q)$ is a collection of all future sampler outputs associated with the survivor path that emanates from $\Psi_{k+1}=q$.
Suppose (b,c) corresponds to the best backward transition leading to $\Psi_k=b$ (i.e., $\hat{q}=c$). The algorithm stores the starting state and the sampler output associated with (b,c) according to $S^b_k(b)=c$ and $\pi^b_k(b)=y_k(b,c)$, respectively. Then, the next backward sampling phase offset is updated by
$$\hat{\tau}^b_{k-1}(b) = \hat{\tau}^b_k(c) + \xi\,\hat{\varepsilon}^b_k(b,c) \qquad \text{(Equation 11)}$$
where $\hat{\varepsilon}^b_k(b,c)$ is the k-th backward estimated timing error associated with (b,c), which is computed using the information from $S^b_{k+1}(c)$ and $\pi^b_{k+1}(c)$, i.e.,
To avoid a cycle slip when $\hat{\tau}^b_{k-1}(b)$ starts deviating from $\hat{\tau}^{fw}_{k-1}$, the backward sampling phase offset is averaged according to
where $\Delta$ is the threshold that allows $\hat{\tau}^b_{k-1}(b)$ to deviate from $\hat{\tau}^{fw}_{k-1}$. In this document, we set $\Delta=0.1T$ to keep $\{\hat{\tau}^b_k\}$ close to $\{\hat{\tau}^{fw}_k\}$ so that the parameters $\{\alpha_k\}$ and $\{\beta_k\}$ will be optimized. This $\hat{\tau}^b_{k-1}(b)$ will be used to sample y(t) at time k−1 for the backward transitions emanating from $\Psi_k=b$.
This process is performed from time k=L+v−1 to k=0. Note that when performing the backward timing update operation, it is important to assure that the S-curve slope of Equation 12 during backward recursion is positive at the origin.
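The threshold-based averaging of the backward offset can be illustrated as follows. The averaging equation itself is not reproduced in this text, so the rule below is only one plausible reading and is an assumption: the backward offset is left unchanged while it stays within Δ of the forward offset, and is replaced by the mean of the two once the gap exceeds Δ. The function name is hypothetical.

```python
def average_backward_offset(tau_b, tau_fw, delta):
    """Pull a backward offset toward the forward one when they drift apart.

    Assumed rule for illustration only: average the two offsets when
    |tau_b - tau_fw| exceeds delta, otherwise keep tau_b unchanged.
    """
    if abs(tau_b - tau_fw) > delta:
        return 0.5 * (tau_b + tau_fw)
    return tau_b

# With delta = 0.1 (in units of T), a backward offset of 0.5 against a
# forward offset of 0.1 is pulled back toward the forward estimate.
corrected = average_backward_offset(0.5, 0.1, 0.1)
```

Whatever the exact rule, the intent stated in the text is the same: limit how far the backward offsets may wander from the forward ones so that a cycle slip cannot develop during backward recursion.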
Summary of a PSP-BCJR Algorithm Embodiment
1) Initialize forward recursion register values $\alpha_0=[1\ 0\ \ldots\ 0]$
2) Forward recursion:
4) Output $\hat{\tau}^{fw}$ from the survivor path that maximizes $\alpha_{L+v}$
5) Initialize backward recursion register values $\beta_{L+v}=\alpha_{L+v}$
6) Backward recursion:
Compute λk according to
End
Beyond the conventional BCJR, PSP-BCJR necessitates new storage requirements for:
It is apparent that each survivor path has its own PLL to update the sampling phase offset. Therefore, for a PR-IV channel, PSP-BCJR requires eight PLLs, i.e., one PLL for each survivor path during both forward and backward recursions.
Simulation Results
This section compares the performance of PSP-ITR with the conventional receiver and NonPSP-ITR in the precoded PR-IV channel shown in
and then punctured to a block length of 4095 bits by retaining only every eighth parity bit. The punctured sequence passes through an s-random interleaver with s=16 to obtain an interleaved sequence $a_k$. Note that the PLL gain parameter, $\xi$, for each timing recovery scheme was optimized by minimizing the RMS timing error $\sigma_\varepsilon=\sqrt{E[(\tau_k-\hat{\tau}_k)^2]}$ at a per-bit SNR, $E_b/N_0$, of 5 dB. The PLL gain parameters for different system conditions are shown in Table 1 of
Next, let us consider the system with a phase offset σw/T=1%, which represents a high probability of the occurrence of cycle slips.
PSP-ITR outperforms NonPSP-ITR when the phase offset σw/T is large because the front-end PLL used in NonPSP-ITR does not work as well as the PSP-based timing recovery. Additionally, PSP-ITR can automatically correct a cycle slip (without a cycle slip detection and correction technique as used in NonPSP-ITR) much more efficiently than NonPSP-ITR. In other words, PSP-ITR achieves faster convergence than NonPSP-ITR, which can be confirmed by plotting the sector-error rate (SER) versus the number of iterations in
It is also worth plotting the estimated timing offset obtained from NonPSP-ITR and PSP-ITR for two different sample packets, at SNR=5 dB and phase offset σw/T=1%, as shown in
In order to verify that PSP-ITR outperforms NonPSP-ITR, especially when σw/T is high, the BER performance of different timing recovery schemes (with 10 iterations) as a function of σw/T at SNR=5 dB is plotted in
Simulation Results with an Equalized Channel
Until now, this disclosure has assumed the ideal channel model in
The transition response for a longitudinal recording channel (usually known as a Lorentzian pulse) is given by
where K is a scaling constant and PW50 indicates the width of the Lorentzian pulse at half of its peak value. Similarly, the transition response for a perpendicular recording channel is given by
where erf(·) is an error function which is defined by
and PW50 determines the width of the derivative of g(t) at half its maximum. The ratio ND=PW50/T is the normalized recording density, which defines how many data bits can be packed within the resolution unit PW50, and the dibit response (the pulse resulting from two transitions corresponding to one bit) is defined as h(t)=g(t)−g(t−T).
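The transition responses referred to above are not reproduced in this text. The sketch below uses the forms commonly given for these channels, stated here as assumptions: a Lorentzian pulse K/(1+(2t/PW50)²) for longitudinal recording, and erf(2√(ln 2)·t/PW50) for perpendicular recording. Both are consistent with the PW50 definitions in the text (half of the peak value, and half of the maximum of the derivative, respectively). The function names are hypothetical.

```python
import math

def lorentzian_transition(t, pw50, K=1.0):
    """Longitudinal transition response (assumed Lorentzian form)."""
    return K / (1.0 + (2.0 * t / pw50) ** 2)

def perpendicular_transition(t, pw50):
    """Perpendicular transition response (assumed error-function form)."""
    return math.erf(2.0 * math.sqrt(math.log(2.0)) * t / pw50)

def dibit_response(g, t, T):
    """h(t) = g(t) - g(t - T), as defined in the text."""
    return g(t) - g(t - T)

# Normalized density ND = PW50 / T; for example ND = 2.5 with T = 1.
pw50, T = 2.5, 1.0
h_peak = dibit_response(lambda t: perpendicular_transition(t, pw50), 0.5 * T, T)
```

Passing the transition response as a callable keeps `dibit_response` independent of the recording mode, so the same code serves both channel types.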
After convolving the transition sequence dk with the transition response g(t), electronic noise is added in the system through the SNR value definition given as
where Ei is the energy of the impulse response of the recording channel, and σ² is the power of the electronic noise. For convenience, the impulse response of the recording channel is normalized so that Ei becomes unity.
The plots in
Looking at those
Reduction in Implementation Complexity
If one considers the proposed architecture in
The complexity of the architecture in those FIGS. can be reduced by applying the idea of interpolated timing recovery, and converting to the receiver architecture 450 shown in
Previous studies have shown that interpolated timing recovery, once configured correctly, results in essentially the same system performance as a timing loop employing hybrid A/D blocks. Thus, the architecture in
In accordance with embodiments of the present invention, PSP is applied to develop PSP-BCJR (or other PSP-SDA) for performing timing recovery and equalization jointly. With PSP-BCJR, a PSP-based iterative timing recovery scheme was provided, denoted as PSP-ITR, for coded PR channels. The proposed scheme iteratively exchanges soft information between PSP-BCJR and an error-correction decoder.
Simulation results have shown that PSP-ITR outperforms NonPSP-ITR, especially when σw/T is large. This is primarily because PSP-ITR can automatically correct a cycle slip much more efficiently than NonPSP-ITR. In other words, PSP-ITR requires far fewer turbo iterations to correct a cycle slip than NonPSP-ITR. In addition, it has been observed that PSP-ITR performs similarly to the system with a trained PLL at the 50-th iteration for σw/T up to 1%.
In accordance with embodiments of the present invention, a method of reducing the implementation complexity of the new PSP-based iterative timing recovery scheme is provided. The idea of interpolated timing recovery is employed to eliminate the hybrid A/D blocks within every branch of the PSP-BCJR architecture. Instead, those blocks are replaced with interpolation filters, which are simpler to implement than A/D blocks.
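The role of an interpolation filter can be sketched as follows: a single set of T-spaced samples is taken once, and each branch obtains its own resampled value digitally instead of resampling the analog signal. The sketch assumes NumPy and a truncated sinc interpolator; the truncation length and the function name are assumptions, and a practical design would likely use a shorter, optimized filter.

```python
import numpy as np

def sinc_interpolate(samples, k, tau, half_width=8):
    """Estimate y(kT + tau) from T-spaced samples, with tau in units of T."""
    n_lo = max(0, k - half_width)
    n_hi = min(len(samples), k + half_width + 1)
    n = np.arange(n_lo, n_hi)
    # np.sinc(x) = sin(pi x) / (pi x); weights are centered on k + tau.
    weights = np.sinc(k + tau - n)
    return float(np.dot(weights, samples[n_lo:n_hi]))

# Each survivor path can request its own phase from the same sample buffer.
buffer = np.cos(0.2 * np.pi * np.arange(200))
y_b = sinc_interpolate(buffer, 100, 0.05, half_width=60)
y_d = sinc_interpolate(buffer, 100, -0.03, half_width=60)
```

Since every branch reads from the same buffer, the number of A/D conversions no longer grows with the number of survivor paths.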
The PSP method can also be applied to Soft Output Viterbi Algorithm (SOVA) type soft output detectors, and those soft outputs can be used within the channel iteration.
Purpose
This Appendix further investigates the performance gain of the Per-Survivor Processing Iterative Timing Recovery (PSP-ITR) architecture provided above against the most recently proposed iterative timing recovery. In this Appendix, the most recently proposed iterative timing recovery method is again referred to as NonPSP-ITR. Current and future magnetic recording products are taken as the base systems at low Signal-to-Noise-Ratio (SNR) regions to quantify the improvement in performance. It is worth noting that NonPSP-ITR is not the algorithm that is implemented in current products. When needed, to quantify the performance of the timing recovery architecture implemented in current read-channel architectures, plots are labeled as “conventional receiver” to compare with PSP-ITR. The organization of this Appendix is as follows: after a brief introduction, recent investigations on quantifying timing errors in today's recording architectures are presented. Then, the spindle speed variation is taken as a case study, and a comparison is made between the different timing recovery architectures.
Introduction
Referring back to
Looking at
Thus, the question which this Appendix addresses, i.e., “What is the performance gain of the Per-Survivor Processing Iterative Timing Recovery (PSP-ITR) architecture proposed above?” highly depends on the amount of timing errors in the system. In other words, where do we operate on σw/T axis of those plots? Is there any frequency offset in the system? If there is, what is the realistic amount of frequency offset, and how does it affect the system performance? In order to find answers to these questions, a number of resources were utilized.
Timing Errors in Magnetic Recording Architectures
The information presented here can be itemized as:
Among the items above, some will translate into phase jitter in the system, some will be the source of frequency offset, and some will result in sudden phase offsets. Here, we will take the spindle speed variation as a case study because its effect is well quantified.
A Case Study—Effect of Spindle Speed Variation
First assume that all the spindle speed variation will be transferred into phase jitter. In other words, the value of a σw/T in
However, for future magnetic recording architectures with higher areal densities the T value will decrease, which can result in higher σw/T values. For example, currently 80 Gbyte per platter products are available. The platter diameter is 3.5 inches, with a hole of diameter 1.8 inches in the middle. This means that the area available to write data is π(1.75² − 0.9²), or 7.08 square inches. Each side of the platter is written to, thus the area becomes approximately 14.16 square inches. Eighty Gbytes, or 80×8 Gbits, of information is stored on that area, which translates into around 45 Gbits per square inch of areal density. For future products of, say, 500 Gbits per square inch, eleven times more areal density will be required. Similarly, for 1 Tbit per square inch, twenty-two times more areal density will be required. Assuming the Bit-Aspect-Ratio (BAR) and the rotation speed of the future product to be the same as today's, one ends up with 3.32 and 4.69 times reductions in bit period (T) for the 500 Gbits and 1 Tbit per square inch designs, respectively. Thus, the result is 3.32 times and 4.69 times more σw/T in the system. Then the spindle speed effect will be 0.66 and 0.938 percent of the bit period. Looking now at the plots in
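The arithmetic in this case study can be checked with a short calculation. The sketch below assumes a present-day spindle-induced jitter of 0.2% of the bit period, a value inferred from the 0.66 and 0.938 percent figures quoted above; the variable and function names are hypothetical.

```python
import math

# Writable area of one platter side: annulus between the hole and the rim.
outer_radius_in = 3.5 / 2
inner_radius_in = 1.8 / 2
area_one_side = math.pi * (outer_radius_in ** 2 - inner_radius_in ** 2)
area_both_sides = 2 * area_one_side

# 80 Gbytes = 640 Gbits spread over both sides, in Gbits per square inch.
density_now = 80 * 8 / area_both_sides

def bit_period_reduction(density_ratio):
    """With fixed bit-aspect ratio and rotation speed, T shrinks as sqrt(ratio)."""
    return math.sqrt(density_ratio)

baseline_jitter_pct = 0.2  # assumed present-day spindle effect, percent of T
jitter_500g = baseline_jitter_pct * bit_period_reduction(11)
jitter_1t = baseline_jitter_pct * bit_period_reduction(22)
```

The square root arises because a fixed bit-aspect ratio scales the bit length and the track width together, so an N-fold density increase shortens the bit length, and hence T at constant rotation speed, by a factor of √N.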
The spindle speed variation is a slow process compared to the sampling time of the channel. Any such variation will be almost constant within a sector of data. Thus, rather than phase offset, most of it will be translated into frequency offset in the system. Among the frequency-offset sources, the spindle speed has the dominant effect. Thus, next consider all the spindle speed variation as a frequency offset in the system. For the sake of analysis, a frequency offset of 0.3% was assumed to be the nominal value (0.2% coming from spindle speed and 0.1% from other sources). The plots mentioned up until now do not include the frequency offset effect. New simulations with this specific offset value were run, and the results are shown in
It has been discovered that spindle speed variation is a key parameter. Assuming the spindle speed variation contributes only to phase jitter, the benefit of the proposed algorithm is demonstrated for future high areal density products. On the other hand, assuming that the spindle speed variation contributes solely to frequency offset, the benefit of PSP-ITR can be seen even for current recording architectures.
In addition to spindle speed variations, there are also other disturbances in the system which affect timing errors. Some of those disturbances are heads sliding off-track, fly-height modulation, and air-bearing resonance. As PSP-ITR is more robust than the conventional algorithms implemented on the chip and the ones proposed in the literature, it is submitted that PSP-ITR will also behave better in the presence of those other disturbances. In conclusion, the PSP-ITR architecture can be used to increase the performance and improve the robustness of both today's and future recording products.
It is to be understood that even though numerous characteristics and advantages of various embodiments of the invention have been set forth in the foregoing description, together with details of the structure and function of various embodiments of the invention, this disclosure is illustrative only, and changes may be made in detail, especially in matters of structure and arrangement of parts within the principles of the present invention to the full extent indicated by the broad general meaning of the terms in which the appended claims are expressed. For example, the particular elements may vary depending on the particular application for the recording system while maintaining substantially the same functionality without departing from the scope and spirit of the present invention.