The instant application claims priority to Italian Patent Application No. TO2011A000808, filed Sep. 12, 2011, which application is incorporated herein by reference in its entirety.
An embodiment relates to wireless communication technology, and, more specifically, to systems for estimating the channel impulse response in Orthogonal Frequency Division Multiplexing (OFDM) systems experiencing fading channels. In another embodiment, the channel length also can be estimated jointly with the channel impulse response.
Throughout this description, various publications are cited as representative of related art. For the sake of simplicity, these documents will be referred to by reference numbers enclosed in square brackets, e.g., [x]. A complete list of these publications, ordered according to the reference numbers, is reproduced in the section entitled “List of references” at the end of the description. These publications are incorporated herein by reference.
In digital transmission systems, one technique to transmit source bits is to group them into complex symbols representing the amplitude and phase of the signal modulating a frequency carrier. Quadrature Amplitude Modulation (QAM) and Phase Shift Keying (PSK) are exemplary modulation schemes.
Generally, the QAM (or PSK) complex symbols are associated with m binary bits, and the way the bits are associated with the $S = 2^m$ complex symbols is called “mapping”, while the set of symbols is called a “constellation”.
For example, Quadrature Phase Shift Keying (QPSK) refers to four complex symbols, each of which can be represented by one of the two-bit groups 00, 01, 10, and 11. In this context, Gray mapping is a well-known exemplary technique wherein two adjacent complex symbols represent groups of bits differing by only one bit.
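The Gray-mapping property can be checked numerically. The following is a minimal sketch in Python; the bit-to-symbol table `GRAY_QPSK` and the helper names are illustrative assumptions and are not taken from any standard or from the cited references. It verifies that every pair of nearest-neighbor symbols carries labels that differ in exactly one bit.

```python
import math

# Hypothetical Gray-mapped QPSK table (an illustrative assumption, not taken
# from any standard): each 2-bit label maps to a unit-energy I/Q symbol.
GRAY_QPSK = {
    (0, 0): complex(1, 1) / math.sqrt(2),
    (0, 1): complex(-1, 1) / math.sqrt(2),
    (1, 1): complex(-1, -1) / math.sqrt(2),
    (1, 0): complex(1, -1) / math.sqrt(2),
}


def hamming_distance(a, b):
    """Number of bit positions in which two labels differ."""
    return sum(x != y for x, y in zip(a, b))


def nearest_neighbor_pairs(table):
    """Return all label pairs whose symbols lie at the minimum distance."""
    labels = list(table)
    pairs = [(a, b) for i, a in enumerate(labels) for b in labels[i + 1:]]
    d_min = min(abs(table[a] - table[b]) for a, b in pairs)
    return [(a, b) for a, b in pairs if abs(table[a] - table[b]) - d_min < 1e-12]


adjacent = nearest_neighbor_pairs(GRAY_QPSK)
is_gray = all(hamming_distance(a, b) == 1 for a, b in adjacent)
```

With a natural-binary labeling instead, at least one nearest-neighbor pair would differ in two bits, so the same check would fail.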
Complex symbols can be graphically represented in the complex plane where the two axes represent the in-phase (I) and quadrature-phase (Q) components of the complex symbol. For example,
Digital data (bits or symbols) are transmitted through physical channels that normally corrupt them because of additive noise. Moreover, in wireless systems, the fading communication channel imposes distortion (i.e., phase and amplitude changes). For these reasons, the received data typically do not coincide with the transmitted ones, and a technique, such as an equalization technique, is required to estimate the transmitted data. Normally, the channel coefficients are estimated prior to such equalization and assumed known by the equalizer. The robustness of a transmission link depends on the ability of the receiver to reliably detect the transmitted bits (i.e., transmitted 1s as 1s and transmitted 0s as 0s).
Signal reflections and diffractions can result in multiple copies of the transmitted signal at the receiver, i.e., multi-path effects. Typically, each of these multi-path components will be characterized by a different phase and magnitude associated with the channel.
For example, the discrete time-domain Channel Impulse Response (CIR), and its associated Power Delay Profile (PDP), represent each multi-path contribution as a time-domain tap. Each tap is typically represented as a complex value whose magnitude represents the level of intensity, and whose angle represents the phase rotation, of its contribution to the overall received signal. Moreover, the delay spread of the channel is the delay between the arrival of the first and last multi-path contributions in the PDP. A single value that accounts for every multi-path contribution is often used instead, namely the root-mean-square (rms) delay spread, which measures the delay dispersion around its mean value. A signal is more likely to be distorted by channels characterized by a higher rms delay spread.
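The quantities above can be illustrated with a short Python sketch. It assumes the usual power-weighted definitions of mean delay and rms delay spread; the tap values, the 50 ns tap spacing, and the function names are hypothetical and chosen only for illustration.

```python
import math


def power_delay_profile(taps):
    """Power of each complex CIR tap (squared magnitude)."""
    return [abs(h) ** 2 for h in taps]


def delay_statistics(delays, powers):
    """Return (mean delay, rms delay spread) of a power delay profile."""
    total = sum(powers)
    mean = sum(p * t for t, p in zip(delays, powers)) / total
    variance = sum(p * (t - mean) ** 2 for t, p in zip(delays, powers)) / total
    return mean, math.sqrt(variance)


# Illustrative three-tap CIR; tap values and the 50 ns spacing are assumptions.
taps = [1.0 + 0.0j, 0.0 + 0.5j, 0.25 + 0.0j]
delays_ns = [0.0, 50.0, 100.0]
mean_ns, rms_ns = delay_statistics(delays_ns, power_delay_profile(taps))

# Delay between the first and last multi-path contributions.
delay_spread_ns = delays_ns[-1] - delays_ns[0]
```

Because the later taps are weaker, the rms delay spread here is considerably smaller than the 100 ns first-to-last delay spread.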
Moreover, in the digital domain, an alternative practical parameter to characterize the channel in the time domain is the channel length (CL), i.e., the number of relevant channel taps.
Time-domain multi-path effects have a dual representation in the frequency domain, where they determine the level of “frequency selectivity” of the channel. This is measured through the coherence bandwidth, which is inversely proportional to the delay spread and represents a frequency band over which the amplitude of the channel frequency response is almost constant.
A popular wireless modulation technique is orthogonal frequency division multiplexing (OFDM). An OFDM system divides the overall information stream to be transmitted into many lower-data-rate streams, each one modulating a different “sub-carrier” of the main frequency carrier. Equivalently, the overall bandwidth is divided into many sub-bands respectively centered on the sub-carriers. This operation makes data communication over a wireless multi-path fading channel more robust, and simplifies frequency equalization operations. OFDM systems are well known to those skilled in the art. Examples of popular OFDM-based wireless communication systems include, but are not limited to, the Wireless Local Area Network (WLAN) standardized by IEEE as “802.11a” [1] and others like “WiMax” for fixed wireless access, “LTE” (long-term evolution) for next-generation cellular communications, etc.
State-of-the-art channel-estimation (CE) methods for OFDM systems may be classified in several ways, such as (see [2], [3]):
An embodiment of the present description deals with a reduced-rank time-domain least-squares channel estimation (RR TD LS CE) method, able to perform high-performance channel estimation (CE), with the same computational complexity as the state-of-the-art, but with lower memory requirements. Besides, various embodiments of this description deal with joint low-complexity RR TD LS CE and channel-length estimation (CLE).
One or more embodiments include a method, a corresponding apparatus (a channel estimator and a related receiver), as well as a corresponding related computer-program product, loadable in the memory of at least one computer and including software-code portions for performing the steps of the method when the product is run on a computer. As used herein, reference to such a computer-program product is intended to be equivalent to reference to a computer-readable medium containing instructions for controlling a computer system to coordinate the performance of a method. Reference to “at least one computer” is intended to highlight the possibility for an embodiment to be implemented in a distributed/modular fashion.
As mentioned in the foregoing, the purpose of the present description is to provide embodiments of techniques for estimating the channel in OFDM systems. Moreover, if needed, it is possible to perform jointly channel-impulse-response (CIR) estimation and channel-length estimation (CLE).
Various embodiments of the present description deal with arrangements able to perform a high-performance RR TD LS CE (or “channel smoothing”) at the same complexity as the state of the art, but with a lower memory burden.
Moreover, various embodiments deal with arrangements able to perform joint CE and CL estimation at the same time, thus possibly adding relatively little extra complexity compared to CE alone, yet providing accurate length estimates, which can be used in turn to improve the CE output accuracy.
For example, key features of various embodiments disclosed herein include:
The recursive growth of a model with AIC identification is a topic already investigated in the literature, and applied, e.g., to Auto-Regressive Moving-Average (ARMA) filters for time series modeling [10]. Conversely, the present description deals also with embodiments able to efficiently update the AIC metric by means of the by-products of the same LD algorithm also used to compute the RR TD LS CE.
Various embodiments deal with a novel solution to perform RR TD LS CE employing the Levinson-Durbin (LD) algorithm [9]. However, it is understood that other recursive algorithms might be used instead of LD in other embodiments.
Various embodiments deal with CLE through a novel efficient implementation of the AIC criterion.
Various embodiments deal with joint CE and CLE based on AIC, wherein the LD algorithm allows a low-complexity computation of the AIC metrics associated with the CIR estimates for all intermediate channel lengths $L_{ch} = 1, \ldots, L_{cp}+1$.
One or more embodiments are now described with reference to the annexed drawings, which are provided purely by way of non-limiting example and in which:
FIGS. 2a and 2b illustrate respectively an OFDM transmitter and a related OFDM receiver according to an embodiment; and
b illustrate various aspects of the channel estimator in accordance with an embodiment.
In the following description, numerous specific details are given to provide a thorough understanding of embodiments. The embodiments can be practiced without one or more of the specific details, or with other methods, components, materials, etc. In other instances, well-known structures, materials, or operations are not shown or described in detail to avoid obscuring aspects of the embodiments.
Reference throughout this specification to “one embodiment” or “an embodiment” means that a particular feature, structure, or characteristic described in connection with the embodiment is included in at least one embodiment. Thus, the appearances of the phrases “in one embodiment” or “in an embodiment” in various places throughout this specification are not necessarily all referring to the same embodiment. Furthermore, the particular features, structures, or characteristics may be combined in any suitable manner in one or more embodiments.
The headings provided herein are for convenience only and do not interpret the scope or meaning of the embodiments.
FIGS. 2a and 2b show respectively possible embodiments of an OFDM transmitter and receiver. Specifically, with reference to [1],
In the example considered, typical transmitter baseband digital procedures are grouped as 200. For example, blocks 202 to 206 may refer to coding/mapping operations made on the PLCP-Header and PLCP service data unit (PSDU). Blocks from 208 to 212 instead may act upon the overall PLCP, including also the PLCP Preamble.
The block 200 may be conventional, and may include a Forward Error Correction (FEC) encoder 202, an interleaver 204, a bit-to-complex-constellation mapper 206, a framing and OFDM modulator (and Guard Interval (GI) insertion) block 208, a filter block 210, and digital-to-analog (D/A) converters 212 to convert an input bit stream IB for transmission over a transmitting antenna 20.
As a counterpart, block 400 represents typical baseband elements of a receiver coupled to, e.g., two receiving antennas. In particular, block 400 includes analog-to-digital (A/D) converters 402 and filter blocks 404 for each of the antennas 22 of the receiver.
A typical receiver may further include a conventional synchronization block 406 and a (GI removal and) OFDM demodulator block 408, which provides the received data vector r(i).
Moreover, block 400 may include, as distinguishable units, a channel equalizer 414, a bit demapper and log-likelihood ratios (LLR) calculator block 416, a deinterleaver 418, and a FEC decoder 420, which provides the final output bit stream OB.
Generally, any FEC code might be used in the FEC encoder 202 and FEC decoder 420, such as BCH (Bose-Chaudhuri-Hocquenghem), Reed-Solomon, convolutional, low-density parity-check, and turbo encoding/decoding schemes.
Moreover, the presence of the interleaver and deinterleaver blocks is optional in the sense that they may not be present in a general OFDM transmission-scheme architecture. Usually, the deinterleaver 418 implements the reciprocal permutation law of the interleaver 204.
Again, these embodiments are for illustration only, and other embodiments of the transmitting and receiving systems 200 and 400 may be used without departing from the scope of this disclosure.
In order to enable coherent communication between the transmitter and the receiver, a common practical strategy is to estimate the channel effects at the receiver side with a channel estimator 430. Typically, the receiver performs effective and low-complexity channel estimation (CE), since battery life and chip area are scarce resources in wireless devices.
For example, with respect to
As mentioned in the foregoing, TD algorithms may provide more accurate CE than FD ones though being more complex. Accordingly, it may be desirable to optimize the overall complexity to cope with realistic hardware (HW) capabilities.
For example, with respect to
a) a Frequency Domain Zero Forcing CE (FD ZF CE) may first be performed by a “tone-by-tone” estimator 432 in order to generate the channel estimates ĥ(f)(i);
b) the FD CE is then transformed to the TD, for example by generating channel estimates in the time domain ĥ(i) via an Inverse Discrete Fourier Transform (IDFT) at a block 436, and a Least Squares (LS) estimator is applied to the relevant taps of the TD CIR (whose number is the channel length, CL) at a block 438 in order to generate updated channel estimates in the time domain h̃(i).
Accordingly, the blocks 432 and 436 calculate, for each tap l, a respective channel impulse response in the time domain ĥl; these are grouped as a channel-impulse-response vector in the time domain ĥ.
Such TD processing allows reducing the mean square error (MSE) of the CE by filtering out the noise contributions over non-relevant channel taps, and is also referred to as reduced-rank (RR) TD LS CE, or, more informally, “channel smoothing”.
Accordingly, in the remainder of the present description both denominations will be used interchangeably, it being understood that:
As a last step of an example TD LS CE method, the updated channel estimates in the time domain h̃(i) at the output of the block 438 are finally transformed back to the FD at a Discrete Fourier Transform (DFT) block 440, as may be required by OFDM receivers for demodulation purposes, in order to generate updated channel estimates in the frequency domain h̃(f)(i). These channel estimates h̃(f)(i) are then delayed, e.g., by a buffer or memory 442, in order to provide the updated channel estimates h̃(f)(i−1) for the next cycle. As shown in
Channel-smoothing methods may require the knowledge of the CL (Lch) at the receiver. A possible workaround is to set Lch to a constant value. On the other hand, it is clear that for possibly shorter channels accurate channel-length estimation (CLE) would represent an added value for the overall system performance. Several CLE algorithms exist in the literature. They can operate either in FD or TD. FD estimators base their outputs on some specific property of the estimated channel frequency response (such as the zero-level crossing rate [4]); they are characterized by low complexity, but do not fully exploit the information about the CL carried by the received signal and, therefore, may require several OFDM symbols to provide reliable estimates. TD estimators usually rely on a given CIR energy threshold level [5]: only the channel-tap estimates exceeding the threshold are retained at the receiver for further processing. A main drawback of this class of algorithms lies in their sensitivity to the threshold level, which is arbitrary.
More accurate results are achieved by using the Akaike Information Criterion (AIC) [6], [7] and its derivations, such as the corrected AIC (AICc) [8]. The motivation for the AIC estimator arises from information theory, as a result of the problem of minimizing the information lost by the channel estimator. Despite its remarkable performance, so far AIC has not been suitable for practical receiver implementations for OFDM systems, as they have to deal with virtual sub-carriers (i.e., unused OFDM sub-carriers), and this fact brings a significant complexity burden, as clarified in detail in the remainder of the present document. Indeed, a straightforward implementation would most likely require computing channel smoothing (involving FD-TD-FD passages and square-magnitude computations) repeatedly.
In an embodiment considered, the CLE and CE smoother 438 receives as input ĥ(i), that is, the TD representation of the FD CE ĥ(f)(i) (where ĥ is an N×1 vector of OFDM symbol tones and i refers to the i-th OFDM symbol to be processed). These channel estimates may be computed differently depending on the considered CE class (PACE or DACE). For example, when the initial preamble is being processed, the i-th FD CE ĥ(f)(i) may be computed based on the pilot symbols PS, as specified, e.g., in [1].
According to the DACE class of CE techniques, starting from the first data (or information payload) OFDM symbol onwards, the i-th CE can be updated through the following steps:
Several OFDM CE methods are known in the literature [11], [12], [13]. Some of these methods improve performance by exploiting the knowledge of an estimate of the CIR length, and are referred to as “reduced rank” (RR) methods. However, the DD technique described herein can be applied to either reduced-rank or full-rank CE methods.
Generally, the following description will refer to an OFDM system characterized by the following parameters:
In various embodiments, the point-to-point multi-path wireless link is modeled using the well-known time-varying discrete wide-sense stationary-uncorrelated scattering (WSS-US) channel model, where the time-domain CIR has Tc spaced taps as:
where Lch is the number of significant taps at time instant kT.
Generally, in OFDM systems, blocks of N complex symbols are transmitted in parallel over N distinct sub-channels, where each complex symbol belongs to a generic (PSK or QAM) complex constellation.
In the remainder of the document:
In various embodiments, a cyclic prefix (or guard interval) of extension Lcp is pre-appended to each modulated OFDM symbol. The matrix associated with the (normalized) Inverse Discrete Fourier Transform (I-DFT) can be written in compact form as:
where the following property holds:
$$W^{H} W = W W^{H} = I_{N\times N} \qquad (3)$$
Thus, W is orthonormal and $W^{H}$ represents the (normalized) Discrete Fourier Transform (DFT).
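Property (3) can be verified numerically. The sketch below, in Python, assumes the common convention in which the entries of the normalized I-DFT matrix are exp(+j2πrc/N)/√N; this convention and the helper names are assumptions made for illustration.

```python
import cmath
import math


def idft_matrix(n):
    """Normalized I-DFT matrix, assuming W[r][c] = exp(+j*2*pi*r*c/n) / sqrt(n)."""
    return [[cmath.exp(2j * math.pi * r * c / n) / math.sqrt(n) for c in range(n)]
            for r in range(n)]


def conj_transpose(m):
    """Hermitian (conjugate) transpose of a matrix stored as a list of rows."""
    return [[m[r][c].conjugate() for r in range(len(m))] for c in range(len(m[0]))]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[r][k] * b[k][c] for k in range(len(b))) for c in range(len(b[0]))]
            for r in range(len(a))]


def is_identity(m, tol=1e-9):
    """True if the square matrix m equals the identity within the tolerance."""
    return all(abs(m[r][c] - (1.0 if r == c else 0.0)) < tol
               for r in range(len(m)) for c in range(len(m)))


W = idft_matrix(8)
WH = conj_transpose(W)  # the (normalized) DFT
unitary = is_identity(matmul(WH, W)) and is_identity(matmul(W, WH))
```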
Moreover, the diagonal matrix of transmitted data symbols is denoted as:
$$P = \mathrm{diag}\left([\,p_0 \;\; p_1 \;\; \ldots \;\; p_{N-1}\,]\right) \qquad (4)$$
Depending on the context, the sequence {pk} could be a training sequence belonging to a preamble of a packet transmission, or a sequence of estimated transmitted data symbols. Usually, only K out of N subcarriers are active. Some others, such as at the DC and close to the maximum frequency not affected by aliasing (i.e., the Nyquist frequency), may be unused. These unused carriers are called virtual carriers, and their pk may be set to 0.
For example, in case the CL is l≦N, the frequency-domain channel response (i.e., after a Fourier transform) may be modeled as:
Generally, the transmitted symbols P are distorted by the channel. For example, in various embodiments, after (e.g., perfect) time-frequency synchronization and removal of the cyclic prefix at the receiver, and if, moreover, no ISI is present because l≦Lcp+1, then the TD received signal may be written as:
where $n \sim \mathcal{CN}(0_{N\times 1}, N_0 I_{N\times N})$ is Additive White Gaussian Noise (AWGN) with a Complex Normal (CN) distribution, and $N_0$ is the power spectral density of the noise.
In various embodiments, the channel estimator exploits the received data $r_{N\times 1}$ to estimate the channel $h_{l\times 1}$ for the symbol index i. For example, in the case of [1], the channel estimator may represent a first long training field LTF1 with a first received vector $r_{N\times 1}(1)$ and a second long training field LTF2 with a second received vector $r_{N\times 1}(2)$.
As mentioned in the foregoing, an embodiment of the instant description is concerned with TD LS CE, which uses as input a simple tone-by-tone FD CE (ignoring possible sub-carrier correlation) and then refines it, yielding an improved TD output. It can be seen that even this two-phase approach actually coincides with the optimal maximum-likelihood (ML) estimator in the case of AWGN. In detail, once the received OFDM symbol has been demodulated at the block 408, one can obtain the initial FD CE at the block 414 in (at least) two ways.
In an embodiment, the optimal solution may be provided by the matched filter (MF) $P^{H}$:
$$\hat{h}^{(f)} = P^{H} W^{H} r \qquad (7)$$
The TD CIR ĥ after the I-DFT at block 436 reads:
where:
$$Q = W \left(P^{H} P\right) W^{H} \qquad (9)$$
is a Toeplitz and Hermitian matrix, and is often denoted as a channel covariance matrix.
Generally, Toeplitz matrices have constant elements along each diagonal, i.e.:
$$Q_{i,j} = Q_{i+k,j+k}, \quad \forall i, j, k,$$
while Hermitian matrices have complex-conjugate symmetry, i.e.:
$$Q_{i,j} = Q_{j,i}^{*}, \quad \forall i, j.$$
In one embodiment, the RR smoothed estimate $\tilde{h}_{l\times 1}$ of the actual channel $h_{l\times 1}$ is obtained by inverting the system of equation (6) in a least-squares (LS) sense. For example, for a given CL l, the LS problem may be solved by the Moore-Penrose pseudo-inverse:
and for compactness of notation the following definitions will be used in the remainder of this document:
Thus, the channel estimation problem may be written as
$$\tilde{h}_{l\times 1} = Q_{1:l,1:l}^{-1}\,\hat{h}_{1:l} \qquad (13)$$
where
$$\tilde{h}_{l\times 1} \sim \mathcal{CN}\left(h,\; N_0\, Q_{1:l,1:l}^{-1}\right).$$
A special case (yet very common, e.g., in PACE for systems as shown, e.g., in [1]) arises when all the non-null elements of P have the same norm, i.e., when pilot symbols pk come from a PSK constellation. In this case, assuming, without loss of generality, unit energy transmitted symbols:
$$P^{H} P = A = \mathrm{diag}\left([\,I(|p_0|>0) \;\; I(|p_1|>0) \;\; \ldots \;\; I(|p_{N-1}|>0)\,]\right) \qquad (14)$$
regardless of the specific value of any pk, where I(·) is the indicator function.
A direct consequence is that the channel covariance matrix Q becomes constant:
$$Q = W A W^{H} \qquad (15)$$
and need not be recomputed when the pilots change, thus avoiding many real-time computations.
A sub-optimal approach, yet attractive for its reduced complexity, is the adoption in equation (13) of the constant Q of equation (15) also when pilots pk belong to QAM constellation (such as 16-QAM). This can be done provided that the initial FD MF CE as shown in equation (7) is replaced by FD ZF CE, which basically zeroes the noise in correspondence to the virtual carriers and inverts the effect of the pilots transmitted over data carriers:
Furthermore, it can be shown that when the position of the active carriers is symmetric with respect to DC (as it happens for example using the training sequences specified in [1] and its derivatives) and their energy is uniform (such as for PSK symbols), Q is real, thus the algorithm complexity can be almost halved.
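The structural properties of the constant matrix Q can be checked with a small Python sketch. It builds Q = W A W^H directly from a 0/1 carrier mask, assuming I-DFT entries exp(+j2πrk/N)/√N; the 8-tone masks and the function names are hypothetical and serve only as an illustration. Under these assumptions, a mask that is symmetric with respect to DC yields a real Q, while a one-sided mask yields a complex Q that is still Toeplitz and Hermitian.

```python
import cmath
import math


def covariance_matrix(active):
    """Q = W A W^H for a 0/1 carrier mask, assuming W[r][k] = exp(+j*2*pi*r*k/N)/sqrt(N)."""
    n = len(active)
    return [[sum(a * cmath.exp(2j * math.pi * (r - c) * k / n)
                 for k, a in enumerate(active)) / n
             for c in range(n)] for r in range(n)]


def is_toeplitz(q, tol=1e-9):
    """True if every diagonal of q is constant."""
    n = len(q)
    return all(abs(q[r][c] - q[r + 1][c + 1]) < tol
               for r in range(n - 1) for c in range(n - 1))


def is_hermitian(q, tol=1e-9):
    """True if q equals its own conjugate transpose."""
    n = len(q)
    return all(abs(q[r][c] - q[c][r].conjugate()) < tol
               for r in range(n) for c in range(n))


def is_real(q, tol=1e-9):
    """True if all imaginary parts vanish within the tolerance."""
    return all(abs(v.imag) < tol for row in q for v in row)


# Hypothetical 8-tone masks: DC (k=0) and Nyquist (k=4) are virtual carriers.
symmetric_mask = [0, 1, 1, 1, 0, 1, 1, 1]
one_sided_mask = [0, 1, 1, 1, 0, 0, 0, 0]
Q_sym = covariance_matrix(symmetric_mask)
Q_one = covariance_matrix(one_sided_mask)
```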
Accordingly, in various embodiments, reference will be made primarily to the constant Q case (by definition or forced to be, e.g., through FD ZF CE). Nevertheless, it is understood that an embodiment is valid also for the general case of non-uniform energy constellation symbols (such as M-QAM) and in the case of complex, time-varying Toeplitz matrices, which are to be considered to guarantee the optimal TD LS solution (i.e., characterized by lower MSE) of equation (8). Possible differences among the solutions, along with their advantages and disadvantages, will also be pointed out in the following.
In various embodiments, more than one training OFDM symbol is available for CE, thus permitting to improve the final estimation, e.g., by averaging the partial estimates. For example, if the training-based estimates were two, as for [1]:
In fact, there is no need to compute RR TD LS with two different inputs and then average the outputs because, due to the linearity of the operators, TD LS CE can equivalently be applied once to the average of the initial training-based CEs:
For example, in various embodiments, the above average shown in equation (18) is replaced by an equivalent average of ĥ(f)(1) and ĥ(f)(2) in the frequency domain, once again thanks to linearity, thus permitting to perform just one I-DFT computation rather than two.
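The linearity argument can be sketched in the notation of equation (13). The identities below are an illustrative restatement, under the assumptions that the same sub-matrix $Q_{1:l,1:l}$ applies to both training symbols and that the time-domain input is obtained as $\hat{h}(i) = W\hat{h}^{(f)}(i)$; they are not a reproduction of equations (17) and (18).

```latex
\frac{1}{2}\left( Q_{1:l,1:l}^{-1}\,\hat{h}_{1:l}(1) + Q_{1:l,1:l}^{-1}\,\hat{h}_{1:l}(2) \right)
  = Q_{1:l,1:l}^{-1}\,\frac{\hat{h}_{1:l}(1) + \hat{h}_{1:l}(2)}{2},
\qquad
\frac{W\hat{h}^{(f)}(1) + W\hat{h}^{(f)}(2)}{2}
  = W\,\frac{\hat{h}^{(f)}(1) + \hat{h}^{(f)}(2)}{2}
```

The first identity shows that one LS solve on the averaged input suffices; the second shows that the averaging can be moved in front of the I-DFT, so that a single transform is needed.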
Generally speaking, AIC is a criterion employed to choose among a (finite) set of models; a preferred model may be the one that better describes a given sequence of observations. From an information-theoretical perspective, the best model minimizes the information loss. If applied to the CLE problem, AIC chooses among different smoothed channels, i.e., out of a set of CL possible values. This way, AIC can provide accurate CLE [6], [7]. Yet, it suffers from high complexity.
In case of CLE, AIC attempts to find a good trade-off between the variance of the output RR TD LS CE from the input FD CE, and the number of parameters required to achieve it.
For example, the CL estimate $\hat{l}$ may be the solution of the minimization problem:
For example, in case of systems [1], the AIC metric as a function of the CL l may be given by:
In the embodiments considered, the coefficient 0.03 is related to 2/N (where N=64 is the number of OFDM tones in [1]), h̃k(l)
In various embodiments, the AIC computation problem is faced by splitting it into two separate topics:
1) Computation of multiple terms $\sigma_l^2(i)$ as for equation (23), for every l=1, . . . , Lcp+1. For example, each term would involve:
a) one DFT of h̃(l) to move to FD;
b) M·N Euclidean distances with input complex values for each length l, where N and M are the number of carriers and training sequences, respectively; thus resulting in a rather prohibitive complexity.
2) Computation of the additive corrective term $\Psi_l$ or “AIC penalty”, as a function of the current CL l (equal to 0.03·l in equation (22)).
Main aspects to take into account for that purpose may be:
a) Computation over virtual carriers: equation (23) includes all the frequency domain tones, 0 to N−1, involved in metric computation. In this way, also the unused OFDM tones across the DC and the Nyquist frequency affect the expression in equation (22), leading to useless DFT and metric computations;
b) AIC metric reliability: the results reported in literature show that the AIC metric is biased, i.e., on average, the AIC criterion chooses a model with more parameters than necessary, thus missing the goal of minimizing the information loss. For example, the original criterion [6] is unbiased only asymptotically, for a large number of observations;
c) Logarithm computation: from a hardware implementation perspective, logarithm computation is challenging and often involves the implementation of a dedicated hardware architecture to cope with latency and memory usage.
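The trade-off that AIC strikes between fit and model size can be illustrated with a short Python sketch. It assumes that the metric has the generic form ln(σ_l²) + Ψ_l with a linear penalty Ψ_l = c·l and c = 2/N; this form is an assumption made here for illustration and is not a reproduction of equation (22). The residual variances are made-up numbers, and the function name is hypothetical.

```python
import math


def aic_channel_length(residual_variances, penalty_per_tap):
    """Pick the channel length minimizing ln(sigma_l^2) + penalty_per_tap * l.

    residual_variances[l - 1] is the (assumed given) residual variance obtained
    with a smoothed channel of length l; the metric form is an assumption.
    """
    metrics = [math.log(s2) + penalty_per_tap * l
               for l, s2 in enumerate(residual_variances, start=1)]
    best_length = min(range(len(metrics)), key=metrics.__getitem__) + 1
    return best_length, metrics


# Made-up residual variances: large gains up to l = 3, negligible gains after.
variances = [1.0, 0.5, 0.1, 0.099, 0.0985]
estimated_length, aic_values = aic_channel_length(variances, 2.0 / 64.0)
```

In this toy example the variance barely decreases beyond three taps, so the penalty term dominates and the minimum falls at l = 3.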
In the following is described an embodiment, in which the TD LS CE at block 438 is performed via a Levinson-Durbin algorithm.
In the embodiment considered, with reference to equation (13), the proposed TD LS CE 438 computes a Time Domain Least Squares channel impulse response (having a given maximum number L of taps). Specifically, the block 438 receives as input a channel covariance matrix Q (calculated, e.g., according to equation (9) or (15)) and receives (e.g., from the blocks 432 and 436) for each tap l=1, . . . , L a respective channel impulse response in the time domain ĥl, wherein the channel impulse responses in the time domain are grouped as a channel-impulse-response vector in the time domain ĥ. More specifically, the block 438 computes an updated channel-impulse-response vector in the time domain h̃ by computing for each tap l the solution of the following linear system of equations:
$$Q_{1:l,1:l}\,\tilde{h}_{l\times 1} = \hat{h}_{1:l} \qquad (24)$$
In order to introduce the Levinson-Durbin algorithm from a general perspective, reference is made first to possible ways of solving the following general system of equations:
$$Q_{1:l,1:l}\, y^{(l)} = x_{1:l} \qquad (25)$$
A possible solution includes computing:
$$y^{(l)} = Q_{1:l,1:l}^{-1}\, x_{1:l}.$$
However, it is well known that a matrix inversion of size l in general has a cost of $O(l^3)$ multiplications. Furthermore, if l=1, . . . , L, with L=Lcp+1, and all the corresponding matrix inversions are to be computed, the overall cost would even be $O(L^4)$, which may be unacceptable even for moderate L.
When the matrix Q does not change, i.e., the matrix Q has constant coefficients, and the above system is to be solved many times (i.e., for many input vectors x), a first improvement may be obtained by pre-computing all possible inverse sub-matrices $Q_{1:l,1:l}^{-1}$ and then computing L matrix-vector multiplications $Q_{1:l,1:l}^{-1} x_{1:l}$ in the run-time process, whose cost is $O(l^2)$ each, thus leading to an overall complexity of $O(L^3)$. Nevertheless, this cost may still be too high, and storing all the inverse-matrix coefficients may require a memory size too large for the application.
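The orders of magnitude quoted above follow from standard power-sum identities. As a rough sketch, assuming a cost of exactly $l^3$ multiplications per inversion and $l^2$ per matrix-vector product (constant factors ignored):

```latex
\sum_{l=1}^{L} l^{3} = \left(\frac{L(L+1)}{2}\right)^{2} = O(L^{4}),
\qquad
\sum_{l=1}^{L} l^{2} = \frac{L(L+1)(2L+1)}{6} = O(L^{3})
```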
However, in the case of channel estimation, the matrix $Q_{1:l,1:l}$ has a Toeplitz structure, i.e., all the elements along each diagonal are equal. Solving a system with such a matrix costs only $O(l^2)$ and may be performed via a Levinson-Durbin (LD) algorithm.
In fact, in an embodiment, it has been discovered that the LD algorithm has two remarkable advantages. First of all, the LD algorithm avoids computing the inverse matrix, and directly provides the solution $y^{(l)}$ of the system shown in equation (25). This is done recursively by solving the 1×1 subsystem first, then correcting the result of the previous step to find the solution of the 2×2 system, and so on. This way, the intermediate steps of the algorithm are the exact solutions of the smaller-sized sub-systems embedded within equation (25), with l≦L, and the overall complexity still remains $O(L^2)$, even if the smoothed channel is computed for every CL hypothesis l.
Furthermore, the required memory size may be reduced. In fact, storing the inverse matrices would require keeping
elements in the memory. The LD algorithm, on the contrary, needs to store only the first row of $Q_{1:l,1:l}$ and L other vectors of size l (as will be described in the following), thus leading to storing only
elements in memory; therefore, for L>2, using the LD algorithm may use less memory than a technique that includes computing an inverse matrix.
The LD algorithm can be split in two parts. Specifically, the first part corresponds to a core algorithm, to be repeated for every different vector x.
Specifically, in the instant case of channel estimation, the input vector x would be the current channel estimates in the time domain $\hat{h}_{l\times 1}$, i.e., the I-DFT of the FD CE, and the solution y would be the RR TD LS CE, i.e., the updated channel estimates in the time domain $\tilde{h}_{l\times 1}$, with l being the number of channel taps.
In an embodiment, the core algorithm relies on a set of support vectors b, depending only on Q. For example, under the hypothesis of a constant matrix Q, the computation of vectors b can be done only once, outside of the recursive LD algorithm. Furthermore, in case Q is real, the support vectors b are also real.
In an embodiment, the LD algorithm finds a solution of the system expressed by equation (25) assuming that the solution of the sub-system:
$$Q_{1:l-1,1:l-1}\, y^{(l-1)} = x_{1:l-1} \qquad (26)$$
has already been computed.
In an embodiment, the method augments y(l−1) to the size of the new problem, measures the introduced error, and applies a correction term to find y(l).
For example, in an embodiment, this is achieved by exploiting a set of additional vectors, namely “backward vectors” b(l), such that:
As will be described later, these vectors can also be computed efficiently.
In the embodiment considered, the augmented solution induces an error term in the last position of x:
where:
$$\varepsilon_y^{(l)} = Q_{l,1:l-1}\, y^{(l-1)} \qquad (29)$$
Thanks to the linear property, the above result can be easily adjusted to be a solution of the l×l system:
In an embodiment, the backward vectors b(l) are computed as follows. Specifically, as mentioned in the foregoing, under the hypothesis of a constant matrix Q, these vectors may be also computed only once per matrix Q, and saved in a memory.
In an embodiment, a set of “forward vectors” f(l) is also used, so that:
These vectors are used only to compute the backward vectors, and, e.g., can be discarded at the end of the pre-processing.
As before, the computation of backward and forward vectors at the l-th iteration augments the results of the (l−1)-th iteration.
For example, in an embodiment, due to linearity, and exploiting the fact that the matrix Q is a Toeplitz matrix:
Q1:l−1,1:l−1=Q2:l,2:l
therefore, it follows:
where:
εf(l)=Ql,1:l−1f(l−1) (34)
εb(l)=Q1,2:lb(l−1) (35)
Since the l−2 central equations in equation (33) are automatically satisfied, it suffices to find the proper coefficients α and β that force to 0 and 1 (in the case of b(l)) or to 1 and 0 (in the case of f(l)) the first row and the last row of the above system, i.e.:
In an embodiment, the computation of the forward vectors needs to be performed only if Q is not Hermitian (or not symmetric in the case that Q is a real matrix).
Conversely, in the case that Q is Hermitian (or is symmetric if Q is a real matrix), the following relationship between b(l) and f(l) holds:
b(l)=(R(l)f(l))* (38)
where R is the matrix which reverses the vector elements.
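The pre-processing described above can be sketched as follows. This is a minimal illustration, not the embodiment itself: it assumes the standard Levinson recursion for a Toeplitz matrix, in which the forward vector solves for the first unit vector and the backward vector for the last unit vector, and it uses the error terms of equations (34) and (35). The function names `hermitian_toeplitz` and `backward_forward_vectors` and the example matrix are hypothetical.

```python
def hermitian_toeplitz(first_col):
    """Build Q with Q[i][j] = t[i-j] and t[-k] = conj(t[k]) (illustrative helper)."""
    n = len(first_col)
    return [[first_col[i - j] if i >= j else first_col[j - i].conjugate()
             for j in range(n)] for i in range(n)]


def backward_forward_vectors(Q):
    """Return ([b(1), ..., b(L)], f(L)) for a Toeplitz matrix Q.

    Assumed convention: Q[:l,:l] f(l) = first unit vector,
    Q[:l,:l] b(l) = last unit vector.
    """
    L = len(Q)
    f = [1 / Q[0][0]]          # initial values b(1) = f(1) = 1/Q11
    b = [1 / Q[0][0]]
    backward = [b]
    for l in range(2, L + 1):
        # Error terms introduced by zero-padding the previous vectors.
        eps_f = sum(Q[l - 1][j] * f[j] for j in range(l - 1))   # eq. (34)
        eps_b = sum(Q[0][j + 1] * b[j] for j in range(l - 1))   # eq. (35)
        scale = 1 / (1 - eps_f * eps_b)
        f_ext = f + [0]        # forward vector expanded with a trailing zero
        b_ext = [0] + b        # backward vector expanded with a leading zero
        f = [scale * (fe - eps_f * be) for fe, be in zip(f_ext, b_ext)]
        b = [scale * (be - eps_b * fe) for fe, be in zip(f_ext, b_ext)]
        backward.append(b)
    return backward, f


# Arbitrary, diagonally dominant example values (not taken from the description).
Q = hermitian_toeplitz([4, 1 + 1j, 0.5 - 0.2j, 0.1j])
backward, f = backward_forward_vectors(Q)
```

Because the example matrix is Hermitian, the final backward vector should equal the conjugated, reversed forward vector, consistent with equation (38); for a non-Hermitian Toeplitz matrix both recursions are needed.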
The complexity of the LD solution described in the foregoing may be expressed in terms of a number of scalar real multiplications, wherein for simplicity, it will be assumed that each complex multiplication costs four real multiplications:
The last two items refer to the pre-processing (in an embodiment, necessary only in the case of varying Q), so it is worth counting them separately from the variable costs (real multiplications):
and the pre-processing costs (real multiplications):
If needed, all intermediate RR TD LS CE outputs with l=1, . . . , L (as in the case of AIC-based CLE, in order to compute equation (23)) can be obtained as by-products of the LD algorithm applied to the biggest-size problem, i.e., L. This allows keeping the overall complexity equal to the multiplication by Q1:L, 1:L−1. These considerations are useful to understand the fundamentals of the joint CE and CLE embodiment described in the remainder of this document.
The LD algorithm allows avoiding memory storage of all the L possible matrices Q, thanks to their Toeplitz structure; it is enough to store one row of the Q matrix (elements from 1 to L). Indeed, the only operation involving Q is equation (29). Thus, just the last row of Q suffices:
εy(l)=Ql,1:l−1y(l−1)=QL,(L−l+1):L−1y(l−1) (39)
or, equivalently, if Q is Hermitian, it is sufficient to retain the first row:
εy(l)=(Q1,2:l)*Ry(l−1) (40)
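As a small sketch of the storage argument, the two functions below compute the error term from a single stored row of Q: the last row as in equation (39), or, under the Hermitian assumption, the first row as in equation (40). The function names are hypothetical, and the 0-based slices are the Python counterparts of the 1-based index ranges in the text.

```python
def eps_from_last_row(last_row, y_prev):
    """Eq. (39): eps = Q[L, (L-l+1):(L-1)] . y(l-1), with l = len(y_prev) + 1."""
    L = len(last_row)
    l = len(y_prev) + 1
    segment = last_row[L - l:L - 1]   # 1-based columns L-l+1 .. L-1
    return sum(q * y for q, y in zip(segment, y_prev))


def eps_from_first_row(first_row, y_prev):
    """Eq. (40), Hermitian Q only: eps = conj(Q[1, 2:l]) . reversed(y(l-1))."""
    l = len(y_prev) + 1
    segment = first_row[1:l]          # 1-based columns 2 .. l
    return sum(q.conjugate() * y for q, y in zip(segment, reversed(y_prev)))
```

For a Hermitian Toeplitz Q, both functions should return the same value as the direct product of row l of Q (columns 1 to l−1) with y(l−1), so only L stored coefficients are needed for every step.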
For example,
Specifically, in the embodiment considered, the block 50 receives as input the current channel estimate ĥl. Specifically, this channel estimate ĥl is provided to a subtraction block 502 in order to calculate:
ĥl−εy(l)
Specifically, in the embodiment considered, the error εy(0) is set to 0 for the first cycle, and thereafter the errors εy(l), updated according to equation (39), are used. For example, in the embodiment considered, a switch 504 is used to select between the initial error εy(0)=0 and the subsequent errors εy(l).
In the embodiment considered, the result of the subtraction at block 502 is provided to a multiplier 506 in order to compute the updated channel estimates {tilde over (h)}(l) at the l-th step:
In the embodiment considered, the updated channel estimates {tilde over (h)}(l) at the l-th step are then stored in a memory 508 to provide the results {tilde over (h)}(l−1) for the next cycle, which may be used to compute at a block 510 the errors according to equations (29) and (39):
εy(l)=QL,(L−l+1):L−1{tilde over (h)}(l−1)
Accordingly, the embodiment shown in
Accordingly, in the embodiment considered, the update of the channel-impulse-response vector in the time domain {tilde over (h)} is performed by calculating recursively for each tap l a respective updated channel-impulse-response vector {tilde over (h)}(l) as a function of the updated channel-impulse-response vector {tilde over (h)}(l−1) at the (l−1)th tap, the channel impulse response at the l-th tap ĥl, an error term εy(l) determined as a function of the matrix Q and the updated channel impulse response vector {tilde over (h)}(l−1) at the (l−1)-th tap, and a backward vector b(l) at the l-th tap determined as a function of said matrix Q.
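The recursion summarized in the preceding paragraph can be sketched in software as follows. This is only an illustration under stated assumptions: Q is taken to be Hermitian Toeplitz, so that only its first row is stored and the forward vectors follow from equation (38); the function names `backward_vectors_hermitian` and `ld_update_all` are hypothetical and do not correspond to any block of the embodiment.

```python
def backward_vectors_hermitian(first_row):
    """Pre-processing: backward vectors b(1..L) from the first row of a
    Hermitian Toeplitz Q, using f(l) = conj(reverse(b(l))) from eq. (38)."""
    b = [1 / first_row[0]]
    backward = [b]
    for l in range(2, len(first_row) + 1):
        eps_b = sum(q * bi for q, bi in zip(first_row[1:l], b))    # eq. (35)
        f_ext = [bi.conjugate() for bi in reversed(b)] + [0]       # expanded forward vector
        b_ext = [0] + b                                            # expanded backward vector
        scale = 1 / (1 - abs(eps_b) ** 2)   # eps_f = conj(eps_b) when Q is Hermitian
        b = [scale * (be - eps_b * fe) for be, fe in zip(b_ext, f_ext)]
        backward.append(b)
    return backward


def ld_update_all(first_row, backward, h_hat):
    """Return [h~(1), ..., h~(L)]: one updated estimate per tap count l."""
    h_t, history = [], []
    for l in range(1, len(h_hat) + 1):
        # Error term of eq. (40); the empty sum gives 0 at the first step.
        eps_y = sum(q.conjugate() * y
                    for q, y in zip(first_row[1:l], reversed(h_t)))
        gain = h_hat[l - 1] - eps_y
        # Augment the previous solution with a zero, then correct it with b(l).
        h_t = [y + gain * bi for y, bi in zip(h_t + [0], backward[l - 1])]
        history.append(h_t)
    return history
```

Each entry of the returned list should solve the leading l×l sub-system for the first l input taps, so all intermediate estimates come out of a single pass over l=1, . . . , L.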
In the embodiment considered, the block 52 includes a block 522, which provides the initial value for the vectors b(1) and f(1). For example, in the embodiment considered, the vectors at the instant l=1 are provided via respective switches 550 and 552, and are set as follows:
b(1)=f(1)=1/Q1,1
The backward vector at the instant l is then stored in a memory 524 and expanded at a block 526 to provide the vector:
Similarly, the forward vector at the instant l is stored in a memory 528 and expanded at a block 530 in order to provide the vector:
The delayed vectors at the output of the memories 524 and 528, or the expanded versions thereof, may be used to compute, at blocks 532 and 534, the errors εb(l) and εf(l) as per equations (34) and (35), i.e.:
εb(l)=Q1,2:lb(l−1)
εf(l)=Ql,1:l−1f(l−1)=QL,(L−l+1):L−1f(l−1).
In the embodiment considered, these errors εb(l) and εf(l) are provided to a block 536 for computing:
In the embodiment considered, the expanded backward vector at the output of the block 526 is multiplied at a block 538 with the error εf(l) to calculate:
and the expanded forward vector at the output of the block 530 is multiplied at a block 544 with the error εb(l) to calculate:
The vector at the output of the multiplier 544 is then subtracted at a block 540 from the expanded backward vector to compute:
Similarly, the vector at the output of the multiplier 538 is subtracted at a block 546 from the expanded forward vector in order to compute:
Finally, the vectors at the outputs of the blocks 540 and 546 are multiplied at respective multipliers 542 and 548 with the term computed by the block 536 to compute the updated vectors b(l) and f(l) at the instant l:
Accordingly, the block 52 determines the backward vector b(l) at the l-th tap as a function of the backward vector b(l−1) at the (l−1)-th tap, a forward vector f(l−1) at the (l−1)-th tap, and error terms εb(l), εf(l), determined as a function of said backward vector b(l−1) at the (l−1)-th tap, the forward vector f(l−1) at the (l−1)-th tap, and the matrix Q.
In an embodiment, which generally may be used in any channel estimator, the “AIC penalty” is also computed via a new metric expression that does not require the computation of a logarithm; this new metric expression is a variant of the Corrected AIC [8], or AICc, an unbiased version of the AIC criterion. Therefore, most or all of the results presented in the following description are valid for either the AIC or the AICc reference criteria.
Specifically, an embodiment deals with avoiding the computation of virtual carriers.
Specifically, the AIC criterion can be generalized by expressing it through a Residual Sum of Squares (log-RSS), as in [7]. On that basis, it can be proved that the contribution due to the virtual carriers is a constant factor, which may be expressed by the novel mathematical formula:
As a consequence, equation (23) may be written as:
where the term pk>0 refers to just the K active tones. It follows that a new “log-RSS” expression for the AIC metric can be derived using equations (41) and (42) with reference to PACE using M OFDM training symbols:
The AIC penalty term (2/K)·l differs from the one in equation (22) in that it does not take into account the contribution of the virtual carriers. Specifically, in the case of [1], M=2 and σ02(1) and σ02(2) are the metrics computed over the LTF1 and LTF2 tones respectively, for l=0; K=52 is the number of used tones (data and pilots) and then Ψl=0.03851.
Moreover, in case of DACE, M=1 and equation (43) reduces to:
The AIC criterion is unbiased only asymptotically for a large number of observations. As mentioned in the foregoing, an alternative solution is to employ the AICc. The distinctive feature of AICc is to use a penalty term leading to unbiased CLE, also for a small number of observations. Similarly to AIC in equation (21), AICc is also used for the purpose of CLE in the minimization problem:
Accordingly, the channel length estimate {circumflex over (l)} is computed for a given maximum number of L taps by determining the channel length {circumflex over (l)}, which minimizes an Akaike information criterion metric determined as a function of the square-norm σl2 of the channel-impulse-response vector in the time domain ĥ.
Specifically, in the embodiment considered, the Akaike information criterion metric does not take into account unused Orthogonal Frequency Division Multiplexing sub-carriers by determining an additive penalty term Ψl depending on the number K of used OFDM sub-carriers and the channel length l, and using the metrics (43) or (44) for a Pilot Assisted Channel Estimation including M pilot symbols or for a Data Aided Channel Estimation respectively, where the penalty term Ψl may be calculated according to equation (43).
In an embodiment, AICc is used in combination with virtual-carriers computation avoidance, leading to the following mathematical formulation:
AICc(l)=ln σl2+Ψl (46)
where the penalty term Ψl may assume at least two alternative expressions:
where p=2 l and n=2K.
Specifically, equation (46), and equation (47) or equation (48), are valid for both PACE and DACE. Accordingly, the use of such values for Ψl allows avoiding the need to compute
but a single value σl2 can be used instead even in case of PACE with M≧2 OFDM training symbols.
Accordingly, in the embodiment considered, a compact Akaike information criterion metric is used which does not take into account unused Orthogonal Frequency Division Multiplexing sub-carriers by determining an additive penalty term Ψl depending on the number K of used OFDM sub-carriers and the channel length l, and using the metric (46) (for a Pilot Assisted Channel Estimation comprising M pilot symbols and/or for a Data Aided Channel Estimation), where the penalty term Ψl may be calculated according to equation (47) or (48).
In an embodiment, to achieve better results in case of PACE, equation (23) or (42) can be computed using as input {tilde over (h)}k(l)
The above average is usually also accomplished by standard channel estimators, thus no extra complexity is required.
AIC and AICc are special cases of a more comprehensive Generalized-AIC (GAIC) class of selection criteria. GAIC criteria differ in their penalty-term expressions.
Accordingly, even though the present description deals primarily with AIC, i.e., equation (43) (or (44)), and AICc, i.e., equation (46) with equation (47) or (48), criteria respectively, it is understood that other Generalized-AIC additive penalty terms may be used without limiting the generality of the present disclosure.
For example, an embodiment deals with a novel CLE arrangement, which allows efficient hardware implementation of the AIC or AICc metric computation.
For example, in an embodiment, the AIC or AICc equations shown in the foregoing are approximated with their exponential expressions, both in PACE and DACE contexts, thus avoiding logarithm calculation. In such cases, the CL estimate {circumflex over (l)} is the result of:
Accordingly, the Akaike information criterion metric is approximated with its exponential expression and the result {circumflex over (l)} is a close approximation to equation (21) or (45) respectively, as they may differ slightly because l takes a value out of a discrete-integer set.
In an embodiment, the general expression for AICexp(l) in case of PACE with M training symbols is given by:
while for DACE it reduces to:
AICexp(l)=σl2exp Ψl=σl2vl (53)
In an embodiment, the exponential versions of equations (43) and (44) can be considered:
Similarly, in an embodiment, the general expression for AICcexp(l) is given by:
AICcexp(l)=σl2exp Ψl=σl2vl (56)
with specializations given by the exponential versions of equations (47) and (48):
In an embodiment, in all cases the AICc/AIC additive penalty coefficients Ψl are replaced by the correction factors vl=exp Ψl.
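A brief sketch may help to see why the logarithm can be dropped. For a metric of the form ln σl2+Ψl, multiplying σl2 by vl=exp Ψl is a monotonic transformation, so in exact arithmetic both forms select the same l. The example below assumes the penalty Ψl=(2/K)·l mentioned above; the function names and the σl2 values are hypothetical and purely illustrative.

```python
import math


def cle_log(sigma_sq, K):
    """Channel length minimizing ln(sigma_l^2) + (2/K)*l, for l = 1..L."""
    metrics = [math.log(s) + 2.0 * l / K for l, s in enumerate(sigma_sq, start=1)]
    return 1 + min(range(len(metrics)), key=metrics.__getitem__)


def cle_exp(sigma_sq, nu):
    """Channel length minimizing sigma_l^2 * nu_l, with nu_l = exp(Psi_l) precomputed."""
    metrics = [s * v for s, v in zip(sigma_sq, nu)]
    return 1 + min(range(len(metrics)), key=metrics.__getitem__)


K = 52                                             # number of used tones
sigma_sq = [1.0, 0.40, 0.21, 0.205, 0.204, 0.2035]  # made-up values of sigma_l^2
nu = [math.exp(2.0 * l / K) for l in range(1, len(sigma_sq) + 1)]  # stored once

l_log = cle_log(sigma_sq, K)
l_exp = cle_exp(sigma_sq, nu)
```

With these illustrative inputs both functions select l=3. The factors `nu` depend only on K and l, so they can be tabulated in advance and the run-time cost reduces to one multiplication per candidate length.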
Joint CE and CLE Using LD and AIC/AICc Criterion
As mentioned in the foregoing, the disclosed CE and CLE may be used independently. However, an embodiment also concerns a method for joint CE and CLE, attaining high performance while minimizing the number of computations and the memory usage.
Specifically, an embodiment concerns an efficient method for the TD computation and recursive updating operation of the term
used, e.g., in equations (43), (44), (46), (52), (53), (54), (55), (56), (57), and (58).
Specifically, an embodiment reuses the outputs of the recursive LD algorithm applied to RR TD LS CE. In fact, time-domain computations are appealing as the computation of the following terms can be avoided compared to the baseline joint CE and CLE algorithm based on AIC:
a) the DFT of all the intermediate CIR estimates {tilde over (h)}(l) with l=1, . . . Lcp+1;
b) M·N Euclidean distance (ED) metrics with complex inputs, for each l with l=1, . . . , Lcp+1 (with M being the number of OFDM training sequences). For example, for AICc, M=1 for both PACE and DACE applied to OFDM systems [1], and for AIC criterion, M=1 for DACE and M=2 for PACE.
Instead, according to an embodiment, the ED metrics are computed as a whole only once (for l=0) and then updated recursively at each step (l=1, . . . , Lcp+1) of the recursive LD algorithm applied to RR TD LS CE. Such M·K ED metrics can be determined during an initialization or setup stage of the joint CE and CLE algorithm.
For example, in case of uniform-power pilots and sub-optimal FD ZF CE,
may be computed as follows:
or, in case of (optimal) TD LS CE,
Accordingly, the initial values may be:
For example the former is shown in
Conversely, for l>0, the initial value σ02 is updated step by step yielding σl2 for l=1, . . . , Lcp+1.
For example, in case of uniform-power pilots and sub-optimal FD ZF CE, a single σl2 may be computed as follows:
or, in case of (optimal) LS CE:
Accordingly, the initial values may be:
For example, the former is shown in
Specifically, as mentioned in the foregoing,
In the embodiments considered, the inputs of the block 446 are N×1 vectors, where N is the OFDM symbol length. Moreover, in the embodiments considered, the set-up blocks 446 shown in
∥x∥2=|x1|2+|x2|2+ . . . +|xn|2
For example, the embodiment shown in
Specifically, in the embodiment considered, the input vectors ĥ(1) and ĥ(2) are provided to respective multipliers 302 and 304 for calculating the terms ĥ(1)/2 and ĥ(2)/2. These terms are then summed at an adder 306 to calculate the channel estimate, e.g., according to equation (18), i.e.:
Moreover, in case of AIC or AICexp CLE criterion, the input vectors ĥ(1) and ĥ(2) are provided to respective blocks 308 and 310 for calculating the terms σ02(1)=∥ĥ(1)∥2 and σ02(2)=∥ĥ(2)∥2. These terms are provided to respective multipliers 312 and 314 and the results are summed at an adder 316 to calculate the parameter σ02 as follows:
The embodiment shown in
The embodiment shown in
The computation of σl2 may be performed recursively starting from σl−12 and using equation (61) (or (62)). Similarly, this holds for their average
for M>1, starting from
and using equation (59) (or (60)).
The core of the operation is the recursive update of the term ({tilde over (h)}(l))Hĥ1:l of equations (59)-(62), represented by the block 56a in
Specifically, this operation may be performed by exploiting at each step the outputs of the LD algorithm applied, e.g., by block 50, to RR TD LS CE and results in:
where ĥl denotes the l-th input TD tap of the CIR to be smoothed and {tilde over (h)}(l) is the l-tap RR TD LS CIR, computed up to the l-th step of the algorithm. Specifically, equation (63) derives from equation (31), where xl and y(l) have been replaced by ĥl and {tilde over (h)}(l), respectively. The same equation holds for both cases of uniform and non-uniform power pilots.
For example, the embodiment shown in
In the embodiment considered, the block 56a includes a multiplier 562, which receives, e.g., from the block 50 (
These terms may then be combined at a multiplier 562, which also performs a conjugate operation, in order to calculate the term:
(ĥl−ε{tilde over (h)}(l))*{tilde over (h)}l(l)
Specifically, at the first instant l=1, the output at the multiplier 562 may then be subtracted at a block 564 from the square-norm σ02
σ12=σ02−({tilde over (h)}(1))Hĥ1:1=σ02−(ĥ1−ε{tilde over (h)}(1))*{tilde over (h)}1(1)
This value is then stored in a memory 566 for the next cycle l. Specifically, in the embodiment considered, during the next cycle, the value stored in the memory is selected (e.g., via a switch 574) as input of the block 564, and accordingly:
Accordingly, generally the output of the block 564 calculates iteratively equation (61) or (62), which thus applies also to equation (59) or (60):
σl2=σ02−({tilde over (h)}(l))Hĥ1:l
Accordingly, the Akaike information criterion metric is determined by calculating for each tap the square-norm σl2 (which generally represents also the average square-norm in equations (59) or (60)) at the l-th tap recursively as a function of the square-norm σl−12 at the (l−1)-th tap, the updated channel impulse response at the l-th tap {tilde over (h)}l, the channel impulse response at the l-th tap ĥl and the error term εy(l) (also referred to as ε{tilde over (h)}(l)).
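The recursive update of σl2 can be sketched together with the LD recursion as follows. This is a minimal illustration assuming a Hermitian Toeplitz matrix Q stored through its first row; the function name `joint_ce_sigma` and the way σ02 is supplied are hypothetical. At each step, only the product (ĥl−εy(l))*·{tilde over (h)}l(l) is subtracted from the previous value.

```python
def joint_ce_sigma(first_row, h_hat, sigma0_sq):
    """Run the LD recursion and update sigma_l^2 at every step.

    Returns ([h~(1), ..., h~(L)], [sigma_1^2, ..., sigma_L^2]).
    """
    b, h_t, sigma_sq = [], [], sigma0_sq
    estimates, sigmas = [], []
    for l in range(1, len(h_hat) + 1):
        if l == 1:
            b = [1 / first_row[0]]
        else:
            # Backward-vector update for Hermitian Q (forward vector via eq. (38)).
            eps_b = sum(q * bi for q, bi in zip(first_row[1:l], b))
            f_ext = [bi.conjugate() for bi in reversed(b)] + [0]
            scale = 1 / (1 - abs(eps_b) ** 2)
            b = [scale * (be - eps_b * fe) for be, fe in zip([0] + b, f_ext)]
        # Error term from the single stored row, as in eq. (40).
        eps_y = sum(q.conjugate() * y
                    for q, y in zip(first_row[1:l], reversed(h_t)))
        gain = h_hat[l - 1] - eps_y
        h_t = [y + gain * bi for y, bi in zip(h_t + [0], b)]
        # One conjugate multiplication per step; the product is real for
        # Hermitian Q up to rounding, so only its real part is kept.
        sigma_sq -= (gain.conjugate() * h_t[-1]).real
        estimates.append(h_t)
        sigmas.append(sigma_sq)
    return estimates, sigmas
```

In this sketch the value obtained at step l should agree with the direct evaluation σ02−({tilde over (h)}(l))Hĥ1:l, while avoiding the full inner product at every step.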
For example, in an embodiment, CLE is performed through the AIC or AICc criterion and the criterion is calculated as:
AIC(l)=ln σl2+Ψl
For example, in the embodiment considered, the output of the block 564, i.e., σl2, is provided to a block 568 for calculating the logarithm of σl2, and the result is provided together with the AIC additive penalty terms “Ψl” to an adder 570 for calculating the AIC metric:
AIC(l)=ln σl2+Ψl
In the embodiment considered, the AIC additive penalty terms “Ψl” may be stored in memory.
Conversely,
Specifically, the embodiment considered corresponds substantially to the embodiment described with respect to
However, in this embodiment, the term σl2 is provided together with the AIC correction factors “vl” to a multiplier 572 for calculating the exponential AIC metric:
AICexp(l)=σl2vl
An embodiment concerns the joint determination of CE and CLE.
The minimization of either the AIC(l) or AICc(l) (AICexp(l) or AICcexp(l), as well) metric provides a selection criterion of the estimated CIR and associated CL estimate {circumflex over (l)} out of L=Lcp+1 l-tap partial CIR candidate estimates, associated to the corresponding CLs l=1, . . . , Lcp+1.
In the embodiment considered, the selector 58 includes a memory 582 for storing the partial minimum AIC(l) or AICc(l) (AICexp(l) or AICcexp(l), as well) metrics determined as the computation of the L algorithm steps progresses.
Specifically, in the embodiment considered, the term AICmin is initialized to the maximum value that can be represented with the available bits, e.g., AICmin=+∞ for l=0. For example, in the embodiment considered, this initialization is performed via a switch 584, which selects as an initial value for l=0 the maximum value (+∞).
The partial minimum AICmin is provided to a selector block 586, which selects the minimum value between the current partial minimum AICmin and the current AIC metric AIC(l) provided, e.g., by the block 56a (
The new partial minimum value is then stored again in the memory 582, e.g., by coupling the selector 586 to the memory 582 via the switch 584.
Accordingly, in the embodiment considered, the memory 582 contains, for each instant, the currently smallest value of AIC(l), and, consequently at the end of the processing, this memory contains the CLE {circumflex over (l)} as shown, e.g., in equation (45).
Moreover, the selector 586 is also used to drive a multiplexor 588.
Specifically, in the embodiment considered, the block 58 also includes a second memory 590 for storing the partial best channel estimate {tilde over (h)} at the instant l. In fact, in case a new minimum metric AICmin has been found via the selector 586, the current best channel estimate should also be updated and stored in the memory 590. Accordingly, the multiplexor 588 is configured for selecting the current best channel estimate {tilde over (h)} if the current metric AIC(l) is greater than the current minimum metric AICmin, and the current channel estimate at the l-th tap {tilde over (h)}(l) if the current metric AIC(l) is smaller than the current partial metric AICmin.
Accordingly, in the embodiment considered, the memory 590 contains, for each tap, the currently best channel estimate, and, consequently at the end of the processing, the memory 590 contains the best CE {tilde over (h)}, i.e., the block 58 may select as best updated channel-impulse-response vector {tilde over (h)} the updated channel-impulse-response vector {tilde over (h)}(l) at the l-tap, which minimizes said Akaike information criterion metric (provided, e.g., by the block 56).
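The selection logic can be sketched as a running minimum. The following is only a software illustration of the behavior described for the selector; the function name `select_best`, the tuple layout of the candidates, and the example values are assumptions made for the example.

```python
import math


def select_best(candidates):
    """Track the smallest metric and the matching estimate.

    `candidates` yields (l, metric, h_tilde) in increasing order of l.
    Returns (best_l, best_h, best_metric).
    """
    aic_min = math.inf            # initial value, playing the role of +infinity
    best_l, best_h = None, None
    for l, metric, h_tilde in candidates:
        if metric < aic_min:      # new partial minimum: update both stored values
            aic_min, best_l, best_h = metric, l, h_tilde
        # otherwise the stored partial minimum and estimate are kept
    return best_l, best_h, aic_min


# Made-up metrics and estimates, for illustration only.
example = [
    (1, 0.90, [0.8]),
    (2, 0.35, [0.7, 0.3]),
    (3, 0.41, [0.7, 0.28, 0.02]),
]
best_l, best_h, best_metric = select_best(example)
```

Only one metric and one estimate are held at any time, so the storage does not grow with the number of candidate lengths L.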
FIGS. 8a and 8b show embodiments of the complete channel estimator 438a and 438b, respectively. Specifically, the embodiment shown in
Specifically, both arrangements include an LD updater block 50 (as shown, e.g., in
The primary difference between the channel estimators 438a and 438b is that the coefficients QL,(L−l+1):L−1 and the backward vectors b(l) are fixed in
Accordingly, the AIC updater block 56 reuses values already calculated by the LD updater block 50, thus optimizing the complexity of the hardware implementation.
Moreover, the embodiments shown in
Of course, without prejudice to the principles of the present disclosure, the details of construction and the embodiments may vary widely with respect to what has been described and illustrated herein purely by way of example, without thereby departing from the scope of the present disclosure.
From the foregoing it will be appreciated that, although specific embodiments have been described herein for purposes of illustration, various modifications may be made without deviating from the spirit and scope of the disclosure. Furthermore, where an alternative is disclosed for a particular embodiment, this alternative may also apply to other embodiments even if not specifically stated.
The following references are incorporated by reference herein:
Number | Date | Country | Kind |
---|---|---|---|
TO2011A000808 | Sep 2011 | IT | national |