The following description relates to phase acquisition in spread spectrum systems.
Spread spectrum (SS) systems, such as ultra-wideband (UWB) systems, transmit information spread over a large bandwidth. In a UWB system, the source signal is spread over a bandwidth many times larger than its original bandwidth. Pseudo-random or pseudo noise (PN) sequences are periodic sequences with long periods that allow the transmitted signal to have a relatively low signal to noise ratio (SNR). In a direct sequence UWB (DS/UWB) system, the transmitted signal is a train of pulses with polarities determined by the product of a PN binary sequence and the incoming binary source data sequence. For a UWB receiver, the first step of demodulation is to de-spread the signal. In a DS/UWB system, this is achieved by multiplying the incoming samples by a local replica of the PN sequence. To synchronize the local replica, the receiver can determine the PN code phase embedded in the transmitted signal by analyzing the data collected from a short observation window. The duration of the observation window is short compared to the PN code period. Determining the PN code phase is called PN acquisition.
In one example, a signal analysis method includes running an iterative message passing algorithm (iMPA) on a standard graphical model augmented with multiple redundant models for fast PN acquisition. In another example, the hardware architecture for implementing this signal analysis is described.
In one aspect, a method for signal analysis includes digitizing a signal modulated by a pseudo noise (PN) sequence, dividing the digitized signal into a plurality of sample blocks, and estimating a PN phase embedded in a sample block of the plurality of sample blocks using an iterative message passing algorithm (iMPA) executed on a redundant graphical model.
This, and other aspects, can include one or more of the following features. The estimated PN phase can be made available to a user. The redundant graphical model can be generated by combining a primary model and at least one auxiliary model. The primary model and the auxiliary model can be based on a same generator polynomial. The iMPA can use a forward backward algorithm. The redundant graphical model can be a cyclic graphical model. The signal can be received from a source in a spread spectrum system. The digitized signal can be stored. Each sample block of the plurality of sample blocks can be stored. The estimated PN phase can be extrapolated over the plurality of sample blocks. The extrapolated sequence can be statistically compared with the digitized signal. The statistical comparison can be correlation. The estimated PN phase can be considered satisfactory if a correlation value is greater than a threshold.
In another aspect, a system includes an analog to digital converter configured to digitize a signal, a channel metric access unit to divide the signal into a plurality of sample blocks, and hardware architecture configured to estimate a PN phase embedded in a sample block of the plurality of sample blocks using an iterative message passing algorithm (iMPA) executed on a redundant graphical model.
This, and other aspects, can include one or more of the following features. The system can further include a receiver configured to receive the signal from a source in a spread spectrum system. The system can further include a first storage unit configured to store the digitized signal. The system can further include a second storage unit configured to store each sample block of the plurality of sample blocks. The system can further include an extrapolation unit configured to extrapolate the estimated PN phase over the plurality of sample blocks. The system can further include a verification unit configured to statistically compare the extrapolated sequence with the digitized signal. The verification unit can further be configured to perform correlation on the extrapolated sequence and the digitized signal.
The system and techniques described can present one or more of the following advantages. The iMPA can enable PN acquisition at low SNR. The iMPA can offer the speed of parallel search and acquisition performance similar to that of serial search at short block lengths. The iMPA can be implemented in hardware to acquire PN sequences with long periods. The hardware implementing the iMPA can be simpler than hardware implementing parallel search while being faster than hardware implementing serial search. The use of multiple redundant models can provide faster convergence and operation at lower SNR without increases in the hardware complexity. The multiple models can be aggregated into a single model without significant increase in hardware complexity to reduce memory usage. The logic design based on the described architecture can easily fit into a small field programmable gate array (FPGA).
The details of one or more implementations are set forth in the accompanying drawings, the description, and the claims below. Other features and advantages will be apparent from the description and drawings, and from the claims.
Like reference symbols in the various drawings indicate like elements.
$$x_k = g_1 x_{k-1} \oplus g_2 x_{k-2} \oplus \cdots \oplus g_r x_{k-r} \qquad (1)$$
where $g_0 = g_r = 1$, $g_k \in \{0,1\}$ for $0 < k < r$, and ⊕ is modulo-2 addition. The generator polynomial is $g(D) = D^r + g_{r-1}D^{r-1} + g_{r-2}D^{r-2} + \cdots + D^0$, where D is the unit delay operator. Given r, the set of $g_k$ values that generate an m-Sequence can be determined. The m-Sequences can be used as spreading sequences in spread spectrum systems due to their excellent auto-correlation and cross-correlation properties.
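The recursion of Equation 1 can be illustrated with a minimal Python sketch. The function name `lfsr_sequence`, the `taps` argument (the delays j for which $g_j = 1$), and the small example polynomial $D^4 + D + 1$ are assumptions chosen for illustration and do not come from the description.

```python
def lfsr_sequence(taps, seed, length):
    """Generate `length` bits of x_k = XOR of x_{k-j} over the delays j in `taps`.

    taps: delays j >= 1 with g_j = 1 (for g(D) = D^22 + D + 1 use [1, 22]).
    seed: the r initial bits [x_0, ..., x_{r-1}], not all zero.
    """
    r = max(taps)
    if len(seed) != r or not any(seed):
        raise ValueError("seed must hold r bits and must not be all zero")
    x = list(seed)
    for k in range(r, length):
        bit = 0
        for j in taps:
            bit ^= x[k - j]  # modulo-2 addition of the tapped past bits
        x.append(bit)
    return x[:length]


# Illustrative small case: x_k = x_{k-1} XOR x_{k-4}, which has period 2^4 - 1 = 15.
seq = lfsr_sequence(taps=[1, 4], seed=[1, 0, 0, 0], length=30)
```

In this small case the 30 generated bits consist of the same 15-bit period twice, and each period contains 8 ones, as expected for an m-Sequence with r = 4.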
For a DS/UWB system, a model for acquisition characterization can be represented by Equation 2.
$$z_k = \sqrt{E_c}\,(-1)^{x_k} + n_k \qquad (2)$$
In Equation 2, $z_k$, $0 \le k \le M-1$, is the noisy sample received by the acquisition module; $x_k$, $0 \le k \le M-1$, is the spreading m-Sequence; $E_c$ is the transmitted energy per pulse; and $n_k$ is additive white Gaussian noise (AWGN) with variance $N_0/2$. The m-Sequence $x_k$ can be generated by an r-stage LFSR, with $r \ll M \ll 2^r - 1$. The acquisition module can estimate $x_k$ based on $z_k$, $0 \le k \le M-1$, for a given frame epoch estimate and decide whether the frame epoch estimate is correct. In the present description, the estimate of $x_k$, denoted by $x'_k$, is obtained by running an iterative message passing algorithm. Once r consecutive values of $x'_k$ are obtained, the rest of the sequence is determined by extrapolating the estimate using Equation 1, which ensures that $x'_k$ is consistent with Equation 1. Subsequently, $z_k$ is correlated with $x'_k$, $0 \le k \le M-1$, to check whether the correlation threshold is reached.
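The extrapolation and correlation steps can be sketched in Python as follows. This is only an illustration under stated assumptions: the values of `Ec` and `N0`, the random seed, and the helper names `extrapolate` and `correlate` are hypothetical, and the r decided bits are taken directly from the true sequence in place of an iMPA output.

```python
import math
import random


def extrapolate(first_bits, taps, length):
    """Extend r decided bits to `length` bits using the LFSR recursion."""
    x = list(first_bits)
    for k in range(len(x), length):
        bit = 0
        for j in taps:
            bit ^= x[k - j]
        x.append(bit)
    return x


def correlate(z, x_hat):
    """Correlate received samples with the antipodal candidate (-1)^x'_k."""
    return sum(zk * (1 - 2 * xk) for zk, xk in zip(z, x_hat))


# Hypothetical parameters, for illustration only.
taps, r, M = [1, 22], 22, 1024
Ec, N0 = 1.0, 0.5
rng = random.Random(0)

seed = [rng.randint(0, 1) for _ in range(r)]
seed[0] = 1  # guarantee a nonzero LFSR state
x = extrapolate(seed, taps, M)

# Received samples: z_k = sqrt(Ec) * (-1)^x_k + n_k, noise variance N0 / 2.
sigma = math.sqrt(N0 / 2)
z = [math.sqrt(Ec) * (1 - 2 * xk) + rng.gauss(0, sigma) for xk in x]

# Correct decisions give a correlation near M * sqrt(Ec).
right = correlate(z, extrapolate(x[:r], taps, M))

# A single wrong decision among the r bits yields a different sequence.
bad = [1 - x[0]] + x[1:r]
wrong = correlate(z, extrapolate(bad, taps, M))
```

A threshold placed between the two correlation values would accept the correct candidate and reject the incorrect one. How the threshold is chosen is not specified here.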
The PN acquisition is formulated as a decoding problem and an iMPA is applied. Cyclic graphical models can be chosen for low complexity decoding. In one example, the generator polynomial chosen can be $g(D)=D^{22}+D^1+D^0$. For a binary variable X, the message passed (i.e., soft information) in a cyclic graph is an approximation of the negative log likelihood ratio represented by:
$$M[X] = -\log\frac{P(X=1)}{P(X=0)}$$
In this example, in each iteration, the algorithm can successively update messages, and decisions can be made by comparing a decision message $M_{dec}$ to 0, where $M_{dec}$ is an approximation of:
$$-\log\frac{P(x_k=1 \mid z_0,\ldots,z_{M-1})}{P(x_k=0 \mid z_0,\ldots,z_{M-1})}$$
If $M_{dec} \ge 0$, $x'_k = 0$; otherwise, $x'_k = 1$. The absolute value of $M_{dec}$ can be interpreted as the confidence of the decision. If the algorithm converges, $M_{dec}$ can stabilize after a certain number of iterations, indicating some level of confidence in the decisions.
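As a worked illustration, the message contributed by a single received sample can be derived from the AWGN model of Equation 2. The sign convention used here (the negative log of the ratio of the likelihood for $x_k=1$ to that for $x_k=0$) is an assumption chosen to agree with the decision rule above, and the symbol $M_{ch}[k]$ for this channel message is used only for illustration.

```latex
% Likelihood under Equation 2, with noise variance N_0/2:
%   p(z_k | x_k) \propto \exp\!\big(-(z_k - \sqrt{E_c}(-1)^{x_k})^2 / N_0\big)
\begin{aligned}
M_{ch}[k] &= -\log\frac{p(z_k \mid x_k = 1)}{p(z_k \mid x_k = 0)} \\
          &= \frac{(z_k + \sqrt{E_c})^2 - (z_k - \sqrt{E_c})^2}{N_0} \\
          &= \frac{4\sqrt{E_c}}{N_0}\, z_k
\end{aligned}
```

Under this assumption, a positive sample $z_k$ gives a positive message and therefore favors $x_k = 0$, which matches the transmitted polarity $(-1)^{0} = +1$.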
$$x_k \oplus x_{k-1} \oplus x_{k-22} = 0 \qquad (3)$$
$$x_{k-1} \oplus x_{k-2} \oplus x_{k-23} = 0 \qquad (4)$$
$$x_{k-22} \oplus x_{k-23} \oplus x_{k-44} = 0 \qquad (5)$$
Adding Equations 3, 4, and 5 modulo 2, it can be seen that $x_k \oplus x_{k-2} \oplus x_{k-44} = 0$. Therefore, $g(D)=D^{44}+D^2+D^0$ also generates the same sequence. The analysis can be extended to show that all generator polynomials represented by Equation 6 generate the same sequence.
$$g(D) = D^{22 \cdot 2^n} + D^{2^n} + D^0 \qquad (6)$$
The graphical model based on Equation 6 is referred to as the nth order auxiliary model, and the one based on $g(D)=D^{22}+D^1+D^0$ as the primary model. The model that combines the primary model and the 1st, 2nd, . . . , (n−1)th order auxiliary models is referred to as the nth order model. The decoding graph for an nth order model is formed by constraining the outputs of the primary model and of each of the ith order auxiliary models, $1 \le i \le n-1$, to be equal.
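The redundancy of the auxiliary models can be checked numerically. The sketch below generates a sequence with the primary recursion and tests the parity check $x_k \oplus x_{k-2^n} \oplus x_{k-22\cdot 2^n} = 0$ for several values of n. The function names, the seed, and the sequence length are hypothetical choices for illustration.

```python
def primary_sequence(length, seed=None):
    """Bits of x_k = x_{k-1} XOR x_{k-22}, i.e. g(D) = D^22 + D + 1."""
    x = list(seed) if seed is not None else [1] + [0] * 21
    for k in range(22, length):
        x.append(x[k - 1] ^ x[k - 22])
    return x[:length]


def satisfies_auxiliary(x, n):
    """Check x_k XOR x_{k-2^n} XOR x_{k-22*2^n} == 0 wherever all indices exist."""
    d = 1 << n  # 2^n
    return all((x[k] ^ x[k - d] ^ x[k - 22 * d]) == 0
               for k in range(22 * d, len(x)))


x = primary_sequence(2000)
# n = 0 is the primary model itself; n = 1, 2, 3 are auxiliary models.
checks = [satisfies_auxiliary(x, n) for n in range(4)]
```

Every entry of `checks` is true because, over GF(2), squaring a polynomial squares each of its terms, so $g(D)^{2^n} = D^{22\cdot 2^n} + D^{2^n} + 1$ is a multiple of $g(D)$.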
The baseline algorithm is summarized below.
The complexity of the algorithm is O(M), because both the decoding operation and the correlation operation are O(M).
When an nth order model is used for the iMPA, the pulses are decoded by n different models during each iteration. The hardware module that performs iMPA for each auxiliary model is an iterative decoder. The basic building block in the algorithm is an iterative decoder that decodes the sequence generated by g(D)=D22+D1+D0. In some implementations, the hardware architecture can be based on the Tanner graph for the generator polynomial depicted in
If MI[i] and MO[i] are the input and output messages with ports defined in
The 2-state g(D)=D+1 recursive FSM SISO can be implemented by the forward backward algorithm (FBA). Assuming that Fk and Bk are the forward and backward state metric and MI[xk], MO[xk], MI[ak], and MO[ak] are the input and output ports defined in
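$$F_0 = 0 \qquad (7)$$
$$B_M = 0 \qquad (8)$$
$$F_{k+1} = \min(MI[a_k], F_k) - \min(0, F_k + MI[a_k]) + MI[x_k] \qquad (9)$$
$$B_k = \min(MI[a_k], MI[x_k] + B_{k+1}) - \min(0, B_{k+1} + MI[x_k] + MI[a_k]) \qquad (10)$$
$$MO[a_k] = \min(B_{k+1} + MI[x_k], F_k) - \min(0, F_k + B_{k+1} + MI[x_k]) \qquad (11)$$
$$MO[x_k] = F_{k+1} + B_{k+1} - MI[x_k] \qquad (12)$$
$$M_{dec} = MI[x_k] + MO[x_k] \qquad (13)$$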
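Equations 7-13 can be sketched in Python as a single forward-backward pass. This is a minimal software illustration under stated assumptions: the names `box` and `fba_2state` are hypothetical, the backward recursion is written as $B_k$ computed from $B_{k+1}$, all state metrics are stored (no segmenting), and the input messages are made-up integers.

```python
def box(a, b):
    """Min-sum combination of two messages through an XOR constraint."""
    return min(a, b) - min(0, a + b)


def fba_2state(mi_x, mi_a):
    """One forward-backward pass for x_k = x_{k-1} XOR a_k (Equations 7-13)."""
    M = len(mi_x)
    F = [0] * (M + 1)  # Equation 7: F_0 = 0
    B = [0] * (M + 1)  # Equation 8: B_M = 0
    for k in range(M):  # Equation 9, forward recursion
        F[k + 1] = box(mi_a[k], F[k]) + mi_x[k]
    for k in range(M - 1, -1, -1):  # Equation 10, backward recursion
        B[k] = box(mi_a[k], mi_x[k] + B[k + 1])
    mo_a = [box(B[k + 1] + mi_x[k], F[k]) for k in range(M)]   # Equation 11
    mo_x = [F[k + 1] + B[k + 1] - mi_x[k] for k in range(M)]   # Equation 12
    m_dec = [mi_x[k] + mo_x[k] for k in range(M)]              # Equation 13
    return mo_a, mo_x, m_dec


# Made-up inputs: a_k is strongly believed to be 0, so all x_k should agree.
mi_a = [10, 10, 10, 10]
mi_x = [3, -1, 2, 4]
mo_a, mo_x, m_dec = fba_2state(mi_x, mi_a)
```

With these inputs every decision message equals 8, the sum of the four input messages for x, so the single weakly negative input is overruled and all four bits are decided as 0.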
Based on Equations 7-13, it can be seen that the FSM SISO requires two types of memory. The first type is for storing the 2M messages passed between the g(D)=D+1 SISO and the broadcaster SISO. Their values are updated based on the results from the previous iteration. The second type is for storing the FSM state metrics Fk and Bk, which are recalculated during every iteration. In other words, the FSM state metric memory can be reused once operations in the current iteration are finished. Therefore, Bk need not be stored if MO[·] are updated immediately once both Fk and Bk+1 become available. Further, the state metric memory can be reduced by updating the state metrics segment by segment to reuse the memory within the current iteration. For example, if the segment size is M/8, the total memory requirement becomes M/8 state metrics +2M messages which is less than the 3M messages requirement based on
$$x_{2^n k + i}, \quad 0 \le i \le 2^n - 1$$
In the above expression, each sub-sequence can be generated by g(D)=D22+D1+D0. The corresponding decoder can be constructed using multiple g(D)=D22+D1+D0 decoders similar to
The 4-state FSM SISO is also based on the forward backward algorithm. The state is defined as Sk={xk−1, xk} and the corresponding decoder is shown in
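$$F_0 = 0 \qquad (14)$$
$$B_M = 0 \qquad (15)$$
$$F_{k+1}[0] = \min(F_k[0],\; F_k[2] + \mathrm{LI\_1}_k) \qquad (16)$$
$$F_{k+1}[1] = \min(F_k[0] + RI_k + \mathrm{LI\_0}_k + \mathrm{LI\_1}_k,\; F_k[2] + RI_k + \mathrm{LI\_0}_k) \qquad (17)$$
$$F_{k+1}[2] = \min(F_k[1] + \mathrm{LI\_0}_k,\; F_k[3] + \mathrm{LI\_0}_k + \mathrm{LI\_1}_k) \qquad (18)$$
$$F_{k+1}[3] = \min(F_k[1] + RI_k + \mathrm{LI\_1}_k,\; F_k[3] + RI_k) \qquad (19)$$
$$B_{k-1}[0] = \min(B_k[0],\; B_k[1] + RI_k + \mathrm{LI\_0}_k + \mathrm{LI\_1}_k) \qquad (20)$$
$$B_{k-1}[1] = \min(B_k[2] + \mathrm{LI\_0}_k,\; B_k[3] + RI_k + \mathrm{LI\_1}_k) \qquad (21)$$
$$B_{k-1}[2] = \min(B_k[0] + \mathrm{LI\_1}_k,\; B_k[1] + RI_k + \mathrm{LI\_0}_k) \qquad (22)$$
$$B_{k-1}[3] = \min(B_k[2] + \mathrm{LI\_0}_k + \mathrm{LI\_1}_k,\; B_k[3] + RI_k) \qquad (23)$$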
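$$RI_k = \mathrm{LO\_0}_{k+22} + \mathrm{LO\_1}_{k+44} + M_{ch}[k] \qquad (27)$$
$$\mathrm{LI\_0}_{k+22} = RO_k + \mathrm{LO\_1}_{k+44} + M_{ch}[k] \qquad (28)$$
$$\mathrm{LI\_1}_{k+44} = RO_k + \mathrm{LO\_0}_{k+22} + M_{ch}[k] \qquad (29)$$
$$M_{dec} = RO_k + \mathrm{LO\_0}_{k+22} + \mathrm{LO\_1}_{k+44} + M_{ch}[k] \qquad (30)$$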
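The message update of Equations 27-30 can be sketched in Python. In each equation, the message sent on a port is the sum of the messages arriving on all the other ports plus the channel message. The function name `broadcaster_update`, the list-based storage, and the numeric values are assumptions for illustration only.

```python
def broadcaster_update(ro, lo0, lo1, mch, k, d0=22, d1=44):
    """Compute the messages of Equations 27-30 for pulse index k.

    ro, lo0, lo1: output messages of the FSM SISO, indexed by pulse.
    mch: channel messages, indexed by pulse.
    d0, d1: index offsets of the two left ports (22 and 44 in the text).
    """
    ri = lo0[k + d0] + lo1[k + d1] + mch[k]              # Equation 27
    li0 = ro[k] + lo1[k + d1] + mch[k]                   # Equation 28
    li1 = ro[k] + lo0[k + d0] + mch[k]                   # Equation 29
    m_dec = ro[k] + lo0[k + d0] + lo1[k + d1] + mch[k]   # Equation 30
    return ri, li0, li1, m_dec


# Made-up messages for a single pulse, k = 3.
n = 48
ro, lo0, lo1, mch = [0] * n, [0] * n, [0] * n, [0] * n
ro[3], lo0[3 + 22], lo1[3 + 44], mch[3] = -2, 5, 1, 4
ri, li0, li1, m_dec = broadcaster_update(ro, lo0, lo1, mch, k=3)
```

Each returned message equals the decision message minus the message that arrived on the same port, so in hardware the three outputs could be formed from one shared sum and three subtractions. That sharing is a design observation and is not stated in the description.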
In other implementations, multiple auxiliary models can be combined to form a single FSM. For example, a 3rd order model can be implemented using a 16-state FSM.
The internal FSM state metric memory can be reduced by dividing the observation window into multiple segments and running the forward backward algorithm (FBA) segment by segment. In some implementations, the observation window (M=1024) is divided into 8 segments. There is one forward unit and one backward unit, each running 15 iterations over indices 0 to 1024. During each iteration, the forward unit updates the state metric sequentially from pulse 0 to pulse 1023. The backward unit computes the state metric in the following order: 127→0, 255→128, . . . , 1023→896. With this order, the backward metric $B_{128}[i]$, $0 \le i \le 3$, is not yet available when computing 127→0, $B_{256}[i]$ is not yet available when computing 255→128, and so on. The problem can be solved by running the backward unit for an additional "warm-up" period. The backward state metric at the segment boundary can be well approximated by starting a backward state recursion just several constraint lengths away. Excluding the warm-up (i.e., setting $B_{128}[i]=0$) can incur a loss of 0.25 dB in $E_c/N_0$. Including an additional backward unit can enable a design using the warm-up approach to run at full speed, because one unit can warm up while the other is doing the update. The additional unit can be saved if, instead of the warm-up approach, the $B_{128}[i]$ values from the previous iteration are copied. This is feasible because the warm-up period is only required if an FBA-SISO in isolation needs to be approximated. For an iterative system, starting the backward recursions from the values of earlier iterations is equivalent to a change in the activation schedule for the iMPA on the cyclic graph, and as such does not significantly affect the performance.
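The visiting order of the backward unit can be illustrated with a short Python sketch. The function name `backward_order` and its arguments are hypothetical. The sketch only lists pulse indices and does not model the state metric computation itself.

```python
def backward_order(M=1024, segments=8):
    """Return pulse indices in the order the backward unit visits them.

    Segments are taken in increasing order, and the pulses inside each
    segment are visited from the highest index down to the lowest.
    """
    seg = M // segments
    order = []
    for s in range(segments):
        order.extend(range((s + 1) * seg - 1, s * seg - 1, -1))
    return order


order = backward_order()

# Boundary metric index needed at the start of each segment's backward pass.
boundaries = [(s + 1) * (1024 // 8) for s in range(8)]
```

The first seven entries of `boundaries` (128, 256, and so on up to 896) are the metrics that have not yet been computed in the current iteration when they are needed. The last entry, 1024, is the known terminal condition.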
In the present description, the bit widths are determined by simulations in two steps. In the first step, $\mathrm{LI\_0}_k$ and $\mathrm{LI\_1}_k$ are fixed at 16 bits. Further, it is determined that 4 bits of analog to digital converter (ADC) output are sufficient.
The bit width for the state metric need only be large enough to hold the differences among $F_k[i]$, $0 \le i \le 3$. Subtracting $F_k[0]$ from $F_k[i]$, $0 \le i \le 3$, it can be shown that the differences (i.e., the normalized $F_k[i]$) are bounded between −128 and 127 for 5-bit messages. As a result, each normalized metric can be represented by 8 bits. Similarly, the normalized $B_k[i]$, $0 \le i \le 3$, can be represented by 8 bits for $0 \le k \le 1024$. Since the normalized $F_k[0]$ and $B_k[0]$ are always 0, they need not be stored. The normalization approach can be applied to all binary variables to reduce memory usage by half, and it requires only one subtraction. For example, $\mathrm{LI\_0}_k$ is a shorthand for $\mathrm{LI\_0}_k[1]$, where $\mathrm{LI\_0}_k[0]=0$ for all k. For the quaternary state metric variables, 9 bits can be used to represent the state metric instead of 8 bits.
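The normalization can be illustrated with a small Python sketch. The helper names `normalize` and `fits_signed` and the metric values are hypothetical. The example only demonstrates that subtracting the state-0 metric preserves the differences among states and leaves values that fit in a signed 8-bit range for this particular input.

```python
def normalize(metrics):
    """Subtract the state-0 metric so that only differences are kept."""
    ref = metrics[0]
    return [m - ref for m in metrics]


def fits_signed(value, bits):
    """True if `value` is representable as a two's complement number of `bits` bits."""
    return -(1 << (bits - 1)) <= value <= (1 << (bits - 1)) - 1


# Made-up state metrics for one index k; the common offset of about 1000 is irrelevant.
F = [1000, 1013, 990, 1100]
normalized = normalize(F)

# The best state (minimum metric) is unchanged by the normalization.
best_before = F.index(min(F))
best_after = normalized.index(min(normalized))
```

Because the first normalized entry is always 0, only the remaining three values would need to be stored for a 4-state metric.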
The architecture can be implemented using hardware description languages (HDL). In some implementations, the architecture can be implemented using Verilog HDL. The code can be synthesized using Synplicity, then mapped by Xilinx Foundation to a Xilinx Virtex 2 device (XC2V250-6). The number of bits implemented in block RAM is 28160, the number of 4-input look-up tables (LUTs) used is 1621, and the number of slices used is 1039. The design can run at 73 MHz. The baseline can decode Freqclk/15 pulses per second, where Freqclk is the clock frequency. Assuming a 60 MHz clock, the architecture can generate a PN code phase decision every [15/(60 MHz)]·1024=256 μs. The decode process is repeated for each frame epoch estimate until the correct frame epoch is found. Assuming the frame time Tf=250 ns (i.e., pulse rate=4 Mpulses/s) and pulse width Tp=1.6 ns, the approximate average acquisition time of the prototype system is Tacq=(256 μs)·(Tf/Tp)·(0.5)=20 ms with Pacq=0.95 at (Ec/N0)=−8.9 dB. In this calculation, it is assumed that half of the frame epoch values are searched on average.
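The timing figures can be reproduced with a few lines of Python. The variable names are hypothetical, and the values are the ones stated above (15 iterations, a 60 MHz clock, M=1024, Tf=250 ns, Tp=1.6 ns).

```python
iterations = 15   # iMPA iterations per observation window
f_clk = 60e6      # assumed clock frequency in Hz
M = 1024          # pulses per observation window
T_f = 250e-9      # frame time in seconds
T_p = 1.6e-9      # pulse width in seconds

# Time to produce one PN code phase decision: (15 / 60 MHz) * 1024.
t_decision = iterations / f_clk * M

# Number of candidate frame epochs, and the average acquisition time
# when half of them are searched on average.
epochs = T_f / T_p
t_acq = t_decision * epochs * 0.5

pulse_rate = 1 / T_f
```

Evaluating these expressions gives about 256 μs per decision, 156.25 candidate frame epochs, an average acquisition time of about 20 ms, and a pulse rate of 4 Mpulses/s.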
A number of implementations have been described. Nevertheless, it will be understood that various modifications may be made without departing from the spirit and scope of the disclosure. For example, a parallel FBA architecture (i.e., instantiating multiple forward and backward units to process multiple data segments in parallel) can be used to further lower Tacq. The increase in logic can be expected to be approximately linear when the speed-up factor does not exceed 8, because the observation window is already divided into 8 segments in the iterative decoder and each segment can be processed in parallel. For lower speed applications, single port memory can be used and the update can be run sequentially. Such a design can reduce the number of adders and the routing resources. The logic gate count can be expected to scale linearly as the target pulse rate varies from 500 kpulses/s to 32 Mpulses/s. The design can be extended to operate at even lower SNR. Auxiliary model decoders can be added, along with memories for saving the messages from the additional decoders. Since a 6th order model is approximately three times more complex than a 2nd order model, the operating Ec/N0 can be lowered to −13 dB by tripling the gate count or, alternatively, by tripling both the acquisition time and the message memory. Accordingly, other embodiments are within the scope of the following claims.
This application claims priority to U.S. Provisional Application 60/793,380, filed Apr. 20, 2006. The disclosure of the prior application is considered part of (and is incorporated by reference in) the disclosure of this application.
The U.S. Government may have certain rights in this invention pursuant to Army Res. Office Grant No.: DAAD19-01-1-0477 and NSF Grant No.: CCF-0428940.
Number | Date | Country
---|---|---
60793380 | Apr 2006 | US