Embodiments are related generally to electronic circuits, and more particularly to a Viterbi detector and technique for recovering information from a read signal wherein a modified Viterbi approach achieves max-log-map equivalence through a more efficient implementation that reduces power consumption and reduces physical size of the implementation.
The traditional soft output Viterbi algorithm (SOVA) receives soft decisions/inputs or soft information values for each bit of information being communicated, where a soft information value includes information on both the value of a bit of data and the reliability of that bit of data. From these soft information values the SOVA computes log-likelihood ratios (LLRs) for each bit as the minimum difference between the log of the probability of the path leading to a 0 or 1 decision for a given bit and the log of the probability of the path leading to the opposite decision for the bit. The log of the probability of the path is represented by a path metric value, which is the sum of a state metric and a branch metric at a time k. The difference between the path metrics is considered only if the best path and its alternate lead to a different decision. In this case the log-likelihood ratio LLR is computed in the same way as for the max-log-map algorithm, which is another decoding algorithm as will be appreciated by those skilled in the art. The path metric difference is minimized to thereby maximize the probability (path metric) of the path leading to a decision that a bit is a 1 versus an alternate path leading to a decision that a bit is a 0. As will be appreciated by those skilled in the art, the SOVA does not perform optimally when the alternate path leads to the same decision for the bit as the best path. In this situation the traditional SOVA fails to consider the path metric difference in updating reliability information.
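The max-log-map style LLR described above can be illustrated with a short sketch: for each bit position, take the difference between the best (minimum) path metric among paths deciding 0 and the best among paths deciding 1. The four candidate paths and their metrics below are hypothetical values chosen only for illustration; the path metric plays the role of a negative log probability, so a smaller metric means a more likely path.

```python
# Max-log-map-style LLRs from an exhaustive list of candidate paths.
# Each entry pairs a hypothetical bit sequence with its path metric
# (smaller metric = more likely path); the values are illustrative only.
paths = [
    ([0, 1, 0], 1.2),
    ([1, 1, 0], 2.0),
    ([0, 0, 1], 3.1),
    ([1, 0, 1], 2.5),
]

def max_log_map_llrs(paths):
    """LLR per bit: minimum metric among paths deciding 0 minus
    minimum metric among paths deciding 1 (positive favors a 1)."""
    n = len(paths[0][0])
    llrs = []
    for k in range(n):
        best0 = min(m for bits, m in paths if bits[k] == 0)
        best1 = min(m for bits, m in paths if bits[k] == 1)
        llrs.append(best0 - best1)
    return llrs

llrs = max_log_map_llrs(paths)
decisions = [1 if l > 0 else 0 for l in llrs]  # [0, 1, 0]
```

The sign of each LLR matches the decision of the overall best path ([0, 1, 0] with metric 1.2), and its magnitude is the minimum path metric difference between the two competing decisions.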
In one embodiment, a modified soft output Viterbi algorithm (SOVA) detector receives a sequence of soft information values. The detector determines a best path and an alternate path for each of these soft information values and further determines, when the best and alternate paths lead to the same value for a given soft information value, whether there is a third path departing from the alternate path that leads to an opposite decision with respect to the best path for a given soft information value. The modified SOVA detector then considers this third path when updating the reliability of the best path. Embodiments are directed to a modified SOVA detector that achieves max-log-map equivalence effectively through the Fossorier approach but with an efficient implementation that reduces power consumption and physical size of the implementation, as will be described in more detail below. More specifically, in one embodiment the modified SOVA detector includes modified reliability metric units for the first N stages of the detector, where N is the memory depth of a given path, and includes conventional reliability metric units for the remaining stages of the detector.
One approach to address this sub-optimal performance situation is that proposed by Fossorier et al. in Fossorier, Marc P.C., et al., “On The Equivalence Between SOVA and Max-Log-MAP Decodings”, IEEE Communications Letters, vol. 2, No. 5, pp. 137-139, May 1998. This approach claims that if there is a third path departing from the alternate path that leads to an opposite decision with respect to the best path, this path should be considered in updating the reliability of the best path. Embodiments are directed to a modified Viterbi approach that achieves max-log-map equivalence effectively through the Fossorier approach but with an efficient implementation that reduces power consumption and physical size of the implementation, as will be described in more detail below.
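The difference between the traditional SOVA reliability update and the Fossorier-style modification can be sketched as two update rules applied at each path merge. The function below is an illustrative sketch, not the document's exact hardware: `delta` is the path metric difference at the merge, and `rel_alt` is the running reliability carried by the alternate path.

```python
def update_reliability(rel_best, rel_alt, delta, same_decision, use_fossorier=True):
    """One reliability-update step at a path merge.

    The traditional SOVA updates the best path's reliability only when
    the best and alternate paths disagree on the bit decision.  The
    Fossorier-style modification also updates on agreement, using the
    alternate path's own reliability plus the metric difference, which
    accounts for a third path departing from the alternate path.
    """
    if not same_decision:
        # Both rules clip the reliability to the metric difference.
        return min(rel_best, delta)
    if use_fossorier:
        # Modified rule: tighten via the alternate path's reliability.
        return min(rel_best, rel_alt + delta)
    # Traditional SOVA: agreement leaves the reliability unchanged.
    return rel_best
```

For example, with `rel_best=5.0`, `rel_alt=2.0`, and `delta=1.5`, agreement tightens the reliability to 3.5 under the modified rule while the traditional rule leaves it at 5.0, which is exactly the sub-optimality discussed above.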
In the present description, certain details are set forth in conjunction with the described embodiments to provide a sufficient understanding. One skilled in the art will appreciate, however, that the embodiments may be practiced without these particular details. Furthermore, one skilled in the art will appreciate that the example embodiments described below do not limit the scope of the present disclosure, and will also understand that various modifications, equivalents, and combinations of the disclosed embodiments and components of such embodiments are within the scope of the present disclosure. Embodiments including fewer than all the components of any of the respective described embodiments may also be within the scope although not expressly described in detail below. Finally, the operation of well-known components and/or processes has not been shown or described in detail below to avoid unnecessarily obscuring the present disclosure.
An overview of conventional read channels, Viterbi detectors, and data recovery techniques follows to assist understanding of embodiments described thereafter.
Typically, the greater the data-storage density of the disk 12, the greater the noise the read head 16 picks up while reading the stored data, and thus the lower the SNR of the read signal. The disk 12 typically has a number of concentric data tracks (not shown in
Unfortunately, the Viterbi detector 20 often requires the read signal from the head 16 to have a minimum SNR, and thus often limits the data-storage density of the disk 12. Typically, the accuracy of the detector 20 decreases as the SNR of the read signal decreases. As the accuracy of the detector 20 decreases, the number and severity of read errors, and thus the time needed to correct these errors, increases. Specifically, during operation of the read channel 14, if the error processing circuit (not shown) initially detects a read error, then it tries to correct the error using conventional error-correction techniques. If the processing circuit cannot correct the error using these techniques, then it instructs the read channel 14 to re-read the data from the disk 12. The time needed by the processing circuit for error detection and error correction and the time needed by the read channel 14 for data re-read increase as the number and severity of the read errors increase. As the error-processing and data re-read times increase, the effective data-read speed of the channel 14, and thus of the disk drive 10, decreases. Therefore, to maintain an acceptable effective data-read speed, the read channel 14 is rated for a minimum read-signal SNR. Unfortunately, if one decreases the SNR of the read signal below this minimum, then the accuracy of the read channel 14 degrades such that at best, the effective data-read speed of the disk drive 10 falls below its maximum rated speed, and at worst, the disk drive 10 cannot accurately read the stored data.
Referring to
For example purposes, the operation of the Viterbi detector 20 is discussed in conjunction with an Extended Partial Response 2 (EPR2) data-recovery protocol, it being understood that the concepts discussed here generally apply to other Viterbi detectors and other data-recovery protocols.
Assuming a noiseless read signal and binary stored data, the read circuit 18, which in this example is designed to implement the EPR2 protocol, generates ideal digitized read-signal samples B having three possible relative values: −1, 0, and 1. These values represent respective voltage levels of the read signal, and are typically generated with a 6-bit analog-to-digital (A/D) converter. For example, according to one 6-bit convention, −1=111111, 0=000000, and 1=011111. The value of the ideal sample B at the current sample time k, i.e., Bk, is related to the bit values of the stored data sequence according to the following equation:
Bk=Ak−Ak−1   (1)
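Equation (1) can be applied directly to generate the ideal sample sequence from a bit sequence. The sketch below assumes a starting bit Ak−1=0 and a bit sequence consistent with the B samples used in equation (4); both are assumptions for illustration, since the tables themselves are not reproduced here.

```python
def ideal_samples(bits, prev_bit=0):
    """EPR2 ideal samples per equation (1): Bk = Ak - Ak-1."""
    samples = []
    for bit in bits:
        samples.append(bit - prev_bit)
        prev_bit = bit
    return samples

# Example bit sequence, assumed consistent with the B samples of equation (4):
A = [0, 1, 1, 0, 1, 0, 0]
B = ideal_samples(A, prev_bit=0)  # [0, 1, 0, -1, 1, -1, 0]
```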
Ak is the current bit of the stored data sequence, i.e., the bit that corresponds to the portion of the read signal sampled at the current sample time k. Likewise, Ak−1 is the immediately previous bit of the stored data sequence, i.e., the bit that corresponds to the portion of the read signal sampled at the immediately previous sample time k−1. Table I includes a sample portion of a sequence of bit values A and the corresponding sequence of ideal samples B for sample times k-k+6.
Referring to Table I, Bk+1=Ak+1−Ak=1, Bk+2=Ak+2−Ak+1=0, and so on. Therefore, by keeping track of the immediately previous bit A, one can easily calculate the value of the current bit A from the value of the immediately previous bit A and the current sample B. For example, by rearranging equation (1), we get the following:
Ak=Bk+Ak−1   (2)
Equation (2) is useful because Bk and Ak−1 are known and Ak is not. That is, we can calculate the unknown value of bit Ak from the values of the current sample Bk and the previously calculated, and thus known, bit Ak−1. It is true that for the very first sample Bk there is no previously calculated value for Ak−1. But the values of Ak and Ak−1 can be determined from the first Bk that equals 1 or −1, because for 1 and −1 there is only one respective solution to equation (1). Therefore, a data sequence can begin with a start value of 010101 . . . to provide accurate initial values for Bk, Ak, and Ak−1.
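The recovery of equation (2) can be sketched the same way: each bit is reconstructed from the current ideal sample and the previously recovered bit, given a known starting value (which the 010101 . . . start pattern provides). The sample values below are assumed for illustration, consistent with equation (4).

```python
def recover_bits(samples, prev_bit=0):
    """Recover bits per equation (2): Ak = Bk + Ak-1."""
    bits = []
    for b in samples:
        bit = b + prev_bit   # equation (2)
        bits.append(bit)
        prev_bit = bit
    return bits

B = [0, 1, 0, -1, 1, -1, 0]      # ideal samples from the example
A = recover_bits(B, prev_bit=0)  # [0, 1, 1, 0, 1, 0, 0]
```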
Unfortunately, the read signal is virtually never noiseless, and thus the read circuit 18 generates non-ideal, i.e., noisy, digitized samples Z, which differ from the ideal samples B by respective noise components. Table II includes an example sequence of noisy samples Z that respectively corresponds to the ideal samples B and the bits A of Table I.
For example, the difference between Zk and Bk equals a noise component of 0.1, and so on.
According to one technique, a maximum-likelihood detector (not shown) recovers the bits A of the stored data sequence by determining and then using the sequence of ideal samples B that is “closest” to the sequence of noisy samples Z. The closest sequence of samples B is defined as being the shortest Euclidean distance λ from the sequence of samples Z. Thus, for each possible sequence of samples B, the detector 20 calculates the respective distance λ according to the following equation:
λ=(Zk−Bk)2+(Zk+1−Bk+1)2+ . . . +(Zk+i−Bk+i)2   (3)
For example, for the B and Z sequences of Table II, one gets:
λ=(0.1−0)2+(0.8−1)2+(−0.2−0)2+(−1.1−(−1))2+(1.2−1)2+(−0.9−(−1))2+(0.1−0)2=0.16   (4)
Referring again to Tables I and II, there are seven samples B in each possible sequence of B samples. Because the bits A each have two possible values (0 and 1) and because the sequence of B samples is constrained by equations (1) and (2), there are 2^7 possible sequences of B samples (the sequence of B samples in Tables I and II is merely one of these possible sequences). Using equation (3), a maximum-likelihood detector should calculate 2^7 λ values, one for each possible sequence of B samples. The sequence of B samples that generates the smallest λ value is the closest to the generated sequence of Z samples. Once the maximum-likelihood detector identifies the closest sequence of B samples, it uses these B samples in conjunction with equation (2) to recover the bits A of the stored data sequence.
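The exhaustive search just described can be sketched directly: enumerate every valid B sequence via equation (1), score each against the noisy samples with the squared distance of equation (3), and keep the minimum. The Z values below are the noisy samples used in equation (4), and the starting bit Ak−1=0 is assumed.

```python
from itertools import product

Z = [0.1, 0.8, -0.2, -1.1, 1.2, -0.9, 0.1]  # noisy samples from equation (4)

def distance(Z, B):
    """Squared Euclidean distance of equation (3)."""
    return sum((z - b) ** 2 for z, b in zip(Z, B))

best = None
for bits in product((0, 1), repeat=len(Z)):  # all 2^7 candidate bit sequences
    prev = 0                                 # assumed known starting bit Ak-1
    B = []
    for bit in bits:
        B.append(bit - prev)                 # equation (1)
        prev = bit
    lam = distance(Z, B)
    if best is None or lam < best[0]:
        best = (lam, list(bits))

best_lambda, best_bits = best  # 0.16, [0, 1, 1, 0, 1, 0, 0]
```

The winning sequence reproduces the λ=0.16 of equation (4), and the exponential growth of the loop bound (2^7 here, 2^1000 for a 1000-bit sequence) is exactly the complexity problem discussed next.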
Unfortunately, because most sequences of Z samples, and thus the corresponding sequences of B samples, include hundreds or thousands of samples, this maximum-likelihood technique is typically too computationally complex and time consuming to be implemented in a practical manner. For example, for a relatively short data sequence having one thousand data bits A, i=999 in equation (3) such that the Z sequence includes 1000 Z samples and there are 2^1000 possible B sequences that each include 1000 B samples. Therefore, using equation (3), the maximum-likelihood detector would have to calculate 2^1000 values for λ, each of these calculations involving 1000 Z samples and 1000 B samples! Consequently, the circuit complexity and time required to perform these calculations would likely make the circuitry for a maximum-likelihood detector too big, too expensive, or too slow for use in a conventional disk drive.
Therefore, referring to
Referring to
As illustrated by the trellis 30, at any particular Z sample time k-k+n, the two most recent bits Ak and Ak−1 of the binary data sequence have one of four possible states S: S0=00, S1=01, S2=10, and S3=11. Therefore, the trellis 30 includes one column of state circles 32 for each respective sample time k-k+n. Within each circle 32, the right-most bit 34 represents a possible value for the most recent bit A of the data sequence at the respective sample time, and the left-most bit 36 represents a possible value for the second most recent bit A. For example, in the circle 32b, the bit 34b represents a possible value (logic 1) for the most recent bit A of the data sequence at sample time k, i.e., Ak, and the bit 36b represents a possible value (logic 0) for the second most recent bit Ak−1. Each circle 32 includes possible values for the most recent and second most recent bits Ak and Ak−1, respectively, because according to equation (1), B depends on the values of the most recent bit Ak and the second most recent bit Ak−1. Therefore, the Viterbi detector 20 can calculate the respective B sample for each circle 32 from the possible data values Ak and Ak−1 within the circle.
Also as illustrated by the trellis 30, only a finite number of potential state transitions exist between the states S at one sample time k-k+n and the states S at the next respective sample time k+1-k+n+1. “Branches” 38 and 40 represent these possible state transitions. Specifically, each branch 38 points to a state having logic 0 as the value of the most recent data bit A, and each branch 40 points to a state having logic 1 as the value of the most recent data bit A. For example, if at sample time k the state is S0 (circle 32a) and the possible value of the next data bit Ak+1 is logic 0, then the only choice for the next state S at k+1 is S0 (circle 32e). Thus, the branch 38a represents this possible state transition. Likewise, if at sample time k the state is S0 (circle 32a) and the possible value of the next data bit Ak+1 is logic 1, then the only choice for the next state S at k+1 is S1 (circle 32f). Thus, the branch 40a represents this possible state transition. Furthermore, the value 42 represents the value of the next data bit Ak+1 pointed to by the respective branch 38 or 40, and the value 44 represents the value of B that the next data bit Ak+1 and equation (1) give. For example, the value 42c (logic 0) represents that the branch 38b points to logic 0 as the possible value of the next data bit Ak+1, and the value 44c (−1) represents that for the branch 38b, equation (1) gives Bk+1=0(Ak+1)−1(Ak)=−1.
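The branch structure just described can be tabulated programmatically. The minimal sketch below builds the branches of the four-state EPR2 trellis, where each state encodes the two most recent bits (S0=00 through S3=11, matching the circles 32) and each branch carries the ideal sample value given by equation (1).

```python
# Build the branches of the four-state EPR2 trellis.
# A state s encodes (second most recent bit, most recent bit) in two bits,
# so S0=00, S1=01, S2=10, S3=11 as in the document.
branches = []
for s in range(4):
    most_recent = s & 1              # Ak, the right-most bit of the state
    for next_bit in (0, 1):          # candidate value of Ak+1
        next_state = (most_recent << 1) | next_bit
        b = next_bit - most_recent   # equation (1): Bk+1 = Ak+1 - Ak
        branches.append((s, next_state, next_bit, b))

# e.g. from S0 with next bit 1 the only choice is S1, with B = 1;
# from S1 with next bit 0 the only choice is S2, with B = -1.
```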
In addition, the trellis 30 illustrates that for the sequence of bits A, the state transitions “fully connect” the states S at each sampling time to the states S at each respective immediately following sample time. In terms of the trellis 30, fully connected means that at each sampling time k-k+n, each state S0-S3 has two respective branches 38 and 40 entering and two respective branches 38 and 40 leaving. Therefore, the trellis 30 is often called a fully connected trellis.
Furthermore, the trellis 30 illustrates that the pattern of state transitions between adjacent sample times is time invariant, i.e., it never changes. In terms of the trellis 30, time invariant means that the pattern of branches 38 and 40 between states at consecutive sample times is the same regardless of the sampling times. That is, the branch pattern is independent of the sampling time. Therefore, the trellis 30 is often called a time-invariant trellis.
Still referring to
Xy=(Zy−By)2   (5)
And each path length λ is represented by the following equation:
λ=Xk+Xk+1+ . . . +Xk+n   (6)
Thus, during each sampling period between the respective sample times k-k+n, the Viterbi detector 20 updates the respective length λ of each path by adding the respective branch length X thereto. The path lengths λ are actually the same values as given by equation (3) for the sequences of B samples represented by the paths through the trellis 30. But major differences between the closest-distance and dynamic-programming techniques are 1) dynamic programming updates each path length λ once during each sample period instead of waiting until after the read circuit 18 has generated all of the samples Z, and 2) dynamic programming calculates and updates the path lengths λ for only the surviving paths through the trellis 30 (one to each state S as discussed below), and thus calculates significantly fewer λ values than the closest-distance technique. These differences, which are explained in more detail below, significantly reduce the processing complexity and time for data recovery as compared with the maximum-likelihood technique.
To minimize the number of trellis paths and path lengths λ that it monitors, the Viterbi detector 20 monitors only the “surviving” paths through the trellis 30 and updates and saves only the path lengths λ of these surviving paths. The surviving path to a possible state S at a particular sample time is the path having the shortest length λ. For example, each of the states S0-S3 of the trellis 30 typically has one respective surviving path at each sample time k-k+n. Therefore, the number of surviving paths, and thus the computational complexity per sample period, depends only on the number of possible states S and not on the length of the data sequence. Conversely, with the maximum-likelihood technique described above, the computational complexity per sample period depends heavily on the length of the data sequence. Thus, the computational complexity of the dynamic-programming technique increases linearly as the length of the data sequence increases, whereas the computational complexity of the closest-distance technique increases exponentially as the length of the data sequence increases. For example, referring to the 1000-bit data sequence discussed above in conjunction with
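The surviving-path bookkeeping described above can be sketched end to end. For compactness this sketch uses the two-state form of the EPR2 trellis (the state is just the previous bit, which is all that equation (1) requires; as noted later in this document, practical EPR2 detectors typically use two states). The noisy samples are those of equation (4), and the detector is assumed to start pinned in state 0.

```python
def viterbi_epr2(Z, start_bit=0):
    """Two-state EPR2 Viterbi detector: the state is the previous bit.
    Returns the recovered bit sequence and the winning path length."""
    INF = float("inf")
    metrics = [INF, INF]
    metrics[start_bit] = 0.0   # detector pinned to the known start state
    preds = []                 # per sample time, per state: surviving predecessor
    for z in Z:
        new_metrics = [INF, INF]
        new_preds = [0, 0]
        for prev in (0, 1):    # state at the previous sample time
            for bit in (0, 1): # candidate current bit, i.e. the next state
                x = (z - (bit - prev)) ** 2   # branch length per equation (5)
                if metrics[prev] + x < new_metrics[bit]:
                    new_metrics[bit] = metrics[prev] + x
                    new_preds[bit] = prev
        metrics = new_metrics
        preds.append(new_preds)
    # Trace back along the surviving path with the shortest overall length.
    state = 0 if metrics[0] <= metrics[1] else 1
    best_length = metrics[state]
    bits = []
    for step in reversed(preds):
        bits.append(state)
        state = step[state]
    bits.reverse()
    return bits, best_length

Z = [0.1, 0.8, -0.2, -1.1, 1.2, -0.9, 0.1]   # noisy samples from equation (4)
bits, lam = viterbi_epr2(Z)                  # [0, 1, 1, 0, 1, 0, 0], length 0.16
```

Note that only two path lengths are updated per sample period regardless of sequence length, whereas the exhaustive search scores every candidate sequence; both approaches agree on the winning sequence and on the λ=0.16 of equation (4).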
Referring to
Referring to
Because the branch lengths Xk between the states at sample times k−1 and k are the first branch lengths calculated, λk=Xk for all branches. The path lengths λk from Table III label the respective branches in
Next, the recovery circuit 24 identifies the shortest path to each state at sample time k, i.e., the surviving paths. Referring to state S0 at sample time k, both incoming paths have lengths λk=0.01. Therefore, both paths technically survive. But for ease of calculation, the recovery circuit 24 arbitrarily eliminates the path originating from the highest state (S2 here) at time k−1, i.e., the path along branch 38c. Alternatively, the recovery circuit 24 could eliminate the path along branch 38a instead. But as discussed below, the detector 20 recovers the proper data sequence regardless of the path that the circuit 24 eliminates. Similarly, referring to states S1-S3 at time k, both of their respective incoming paths have equal lengths λk, and thus the circuit 24 arbitrarily eliminates the path originating from the respective highest state. For clarity, the surviving paths are shown in solid line, and the eliminated paths are shown in dashed line.
Referring to
Referring to
The path lengths λk+1 from Table IV label the respective branches in
Next, the recovery circuit 24 identifies the shortest path to each state at time k+1, i.e., the surviving paths, which are shown in solid line in
Referring to
Referring to
Next, the recovery circuit 24 identifies the surviving paths to each state S at time k+2 in a manner similar to that discussed above in conjunction with
Referring to
Referring to
Next, the recovery circuit 24 identifies the surviving paths (solid lines) to each state S at time k+3. One can see that each of the states S0 and S1 technically has two surviving paths because the path lengths λk+3 for these respective pairs of paths are equal (both λk+3=1.9 for S0 and both λk+3=5.1 for S1). Therefore, as discussed above in conjunction with
Referring to
Referring to
Next, the recovery circuit 24 identifies the surviving paths to each state S at time k+4. One can see that at time k+1 the surviving paths converge at S1, and that at time k+2 the surviving paths converge at S3. Thus, in addition to bit Ak, the recovery circuit 24 has recovered Ak+1=1 and Ak+2=1, which, referring to Table II, are the correct values for the Ak+1 and Ak+2 bits of the data sequence A.
Referring to
Referring to
Next, the recovery circuit 24 identifies the surviving paths to each state S at time k+5. One can see that at time k+3, the surviving paths converge at S2. Thus, in addition to bits Ak, Ak+1, and Ak+2, the recovery circuit 24 has recovered Ak+3=0, which, referring to Table II, is the correct value for the bit Ak+3 of the data sequence A.
Referring to
Referring to
Next, the recovery circuit 24 identifies the surviving paths to each state S at time k+6. One can see that at time k+4, the surviving paths converge at S1. Thus, in addition to bits Ak-Ak+3, the recovery circuit 24 has recovered Ak+4=1, which, referring to Table II, is the correct value for the bit Ak+4 of the data sequence A.
Referring to
Referring again to
The Viterbi detector 20 continues to recover the remaining bits of the data sequence A in the same manner as described above in conjunction with
Although the trellis 30 is shown having four states S0-S3 to clearly illustrate the dynamic-programming technique, the EPR2 Viterbi detector 20 typically implements a trellis having two states, S0=0 and S1=1, to minimize the complexity of its circuitry.
The disk-drive system 100 also includes write and read interface adapters 128 and 130 for respectively interfacing the write and read controllers 108 and 114 to a system bus 132, which is specific to the system used. Typical system busses include ISA, PCI, S-Bus, Nu-Bus, etc. The system 100 also typically has other devices, such as a random access memory (RAM) 134 and a central processing unit (CPU) 136 coupled to the bus 132.
The traditional SOVA computes the LLR as the minimum difference between the log of the probability of the path leading to a 0 or 1 decision and the log of the probability of the path leading to the opposite decision. Note that the log of the probability of the path is represented by a path metric value, which is the sum of the state metric and the branch metric at time ‘k’.
The difference between the path metrics is considered only if the best path and its alternate lead to a different decision. In this case the LLR is computed in the same way as in the max-log-map system. By minimizing the path metric difference one maximizes the probability (path metric) of the path leading to decision 1 vs. the path leading to decision 0.
Given a path memory depth ‘p’, RMU depth ‘r’, number of Viterbi states ‘v’, the resources required for a SOVA detector and a modified SOVA (mSOVA) detector generating max-log-map equivalence LLR are shown in the table below:
In another approach, a modified RMU is used for a first number of stages (<p) and the traditional RMU for the remaining stages. The motivation is to reduce the implementation cost of the modified RMU, which scales with the number of Viterbi states. This approach generates a savings in area and power without significantly compromising performance, as is evident in the LLR plots.
Typically the initial and terminating states of a Viterbi detector are known, and it is important to initialize these states so that the paths are pinned during trace-back. However, the traditional method of implementing this function creates a critical path in the accumulate-compare-select (ACS) logic. This was the motivation to develop an alternative method to perform this function without exacerbating this path. Apriori-based State Metric Initialization solves this problem. This method assumes the following: a) state metrics can start from any random value; b) assuming an n-tap DDNP FIR, there are at least n equalized samples prior to the user data; and c) at least the ideal p (p greater than or equal to 4) bits prior to the user data are known to fabricate the apriori. This is equivalent to the last p bits in the Syncmark pattern.
The following sequence is then employed to initialize the state metric to a known state prior to user data:
Apriori-based State Metric Termination assumes that one can fabricate the high-confidence apriori for the pad bits. After the last user data the branch metric computation uses the fabricated apriori alone to prune the trellis. A minimum of 4 pad bits is required for this operation.
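The pinning effect of apriori-based initialization and termination can be illustrated with a simple sketch. This is not the document's exact sequence (which is not reproduced above); it only shows the standard idea of biasing state metrics and apriori-only branch metrics so that trace-back is forced through the known states. The constant `BIG`, the helper names, and the LLR sign convention (non-negative favors bit 0) are all assumptions for illustration.

```python
BIG = 1 << 15  # illustrative large metric, well above any realistic path metric

def init_state_metrics(num_states, known_state):
    """Pin the trellis start: the known state gets metric 0 and every
    other state gets a prohibitively large metric, so each surviving
    path is forced to originate from the known state."""
    return [0 if s == known_state else BIG for s in range(num_states)]

def apriori_branch_metric(apriori_llr, bit):
    """Termination-style branch metric built from the fabricated apriori
    alone: the bit favored by the apriori costs 0 and the other bit
    costs the apriori magnitude, pruning the trellis toward the pads."""
    favored = 0 if apriori_llr >= 0 else 1
    return 0 if bit == favored else abs(apriori_llr)

metrics = init_state_metrics(8, known_state=3)
# metrics[3] == 0 while every other entry equals BIG
```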
Traditionally, branch metrics that factor apriori information into their computation tend to be signed. However, if an unsigned branch metric can be used instead, the width of the path metric can be reduced by a single bit, which helps improve ACS performance. The following method of factoring in apriori achieves this result and provides an unsigned branch metric and therefore an unsigned path metric.
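The document's specific method is not reproduced above, but the benefit of unsigned metrics can be illustrated with one standard normalization trick (an assumption here, not necessarily the document's method): subtracting the per-stage minimum from every branch metric of that stage leaves all compare-select decisions unchanged while keeping the metrics non-negative.

```python
def normalize_stage(branch_metrics):
    """Shift all branch metrics of one trellis stage by the stage minimum.
    Every same-length path through the stage absorbs the same constant,
    so compare-select decisions and path metric differences are
    unchanged, but the resulting metrics are unsigned (>= 0)."""
    offset = min(branch_metrics.values())
    return {branch: bm - offset for branch, bm in branch_metrics.items()}

# Signed metrics, e.g. after folding apriori terms into the computation:
stage = {"b0": -1.5, "b1": 0.25, "b2": 2.0}
unsigned = normalize_stage(stage)  # {'b0': 0.0, 'b1': 1.75, 'b2': 3.5}
```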
The problem that is solved relates to the timing of the DDNP parameters that drive the branch metric generation for the SOVA. This problem is quite severe when the gap between two fragments is comparable to the depth of the DDNP pipeline, which is indeed the case for high data rates. Different portions of the DDNP parameters are used at different points in the DDNP pipeline. The challenge therefore is to perform an update without disrupting the pipeline while keeping the parameters consistent. This problem did not exist until continuous calibration of the DDNP parameters became desirable.
The mentioned challenge is overcome by performing a rolling update of the DDNP parameters, with the portions updated in a pipelined lock-step manner. This presents a consistent set of parameters for a branch metric computation. In addition, the update allows for independent update of the parameters related to a single condition as and when new parameters are generated.
The ACS (Add-Compare-Select) and REA (Register Exchange Architecture) blocks implement a standard Viterbi detection that computes the hard decisions through the REA. The SOVA computes the LLR by tracking the minimum path metric (branch metric+state metric) difference computed in the presence of an alternate path leading to a different decision.
This is accomplished by aggregating the path metric difference through the RMU (Reliability Measuring Unit) network that is driven by the equivalency checks performed by the REAEQ block. The REAEQ (REA+equivalence check) replays the REA aggregation once the hard decisions are computed. The hard decisions are used to select the best state (used to track the best path) for every stage of the REAEQ.
The RMU aggregates the path metric difference computed by the ACS and selected by the hard decision out of the FIFO. The PMFIFO and DECS FIFO are used to hold the path metric difference and the ACS decisions until the hard decisions of the REA are ready and the best path is known. Since the path metric difference is an absolute value, the final LLR is obtained by combining the hard decision with the RMU LLR to form a signed LLR value.
The Fossorier modification for an ‘n’ state SOVA with soft-memory depth ‘L’ requires (n−1)*L additional RMU resources. For our iterative system, ‘n’=8 and ‘L’=28, so this amounts to 7×28=196 additional RMU units. The question therefore is how to reduce this cost without significantly affecting the performance. A hybrid solution is therefore proposed with L1 stages of the Fossorier update and (L−L1) stages of the traditional RMU. This results in a resource increase of only (n−1)*L1 RMU units to implement the modification. A proper choice of L1 is therefore important, and to this end we ran simulations with msimAM in fixed point corresponding to the worst-case SNR, where the BER at the output of the 1st SOVA is 1.6e-2 with an iterative decoder. We used a total of 3 instances of SOVA to understand whether the choice of L1 was instance specific. The results are captured in
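The resource trade-off described above is simple arithmetic; the sketch below reproduces it for the stated parameters (n=8 states, L=28 stages), with L1=10 taken as an example value consistent with the simulation discussion.

```python
def extra_rmu_units(n_states, fossorier_stages):
    """Additional RMU units required when the Fossorier-style update is
    applied to the first `fossorier_stages` stages of an n-state SOVA."""
    return (n_states - 1) * fossorier_stages

full_cost = extra_rmu_units(8, 28)    # full Fossorier over all L = 28 stages
hybrid_cost = extra_rmu_units(8, 10)  # hybrid with L1 = 10 stages (example value)
savings = full_cost - hybrid_cost     # RMU units saved by the hybrid
```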
The two figures in the 2nd row show the conditional histogram for the case with the Fossorier modification for L1=[8:2:18]. The two figures in the 3rd row show the error in the conditional histogram for the case with the Fossorier modification for L1=[8:2:18] vs. the case when L1=20. The 4th row includes two figures that show the error ratio/LLR in the conditional histogram for the case with the Fossorier modification for L1=[8:2:18] vs. the case when L1=20. The operating conditions are: Simulator: msimAM, fixed-point mode, 8-State mSOVA+Iterative Decoder (8-State); Operating Point: snr: 11.2, ubd: 2.1, jit: 0.9, nsecs=1000, same seed for d=8:2:18; Version: arisso_020508
For the first SOVA at d=8, the deltaRatio is about 0.03 for the worst performing LLR. However, for d=10, the deltaRatio is under 0.01 for the worst case, and this corresponds to an LLR of magnitude 18. For the second SOVA, again at d=10, the deltaRatio is under 0.05 for the worst case LLR. For the third SOVA, likewise at d=10, the deltaRatio is under 0.025 for the worst case LLR.
From the foregoing it will be appreciated that, although specific embodiments have been described herein for purposes of illustration, various modifications may be made without deviating from the spirit and scope of the disclosure.
The present application is a Divisional of copending U.S. patent application Ser. No. 12/924707 filed Oct. 1, 2010, which application claims the benefit of U.S. Provisional Patent Application No. 61/247,899, filed Oct. 1, 2009, now expired; all of the foregoing applications are incorporated herein by reference in their entireties.
Provisional Application:

Number | Date | Country
---|---|---
61247899 | Oct 2009 | US

Related U.S. Application Data:

Relation | Number | Date | Country
---|---|---|---
Parent | 14192674 | Feb 2014 | US
Child | 15234122 | | US
Parent | 12924707 | Oct 2010 | US
Child | 14192674 | | US