Method and apparatus for implementing sparse finite-impulse-response equalizers

Information

  • Patent Grant
  • 10382101
  • Patent Number
    10,382,101
  • Date Filed
    Tuesday, November 8, 2016
  • Date Issued
    Tuesday, August 13, 2019
Abstract
A method and apparatus may include receiving a transmission. The method can also include determining a sparsifying dictionary that sparsely approximates a data vector of the transmission. The determining the sparsifying dictionary includes performing a fast Fourier transform and/or an inverse fast Fourier transform. The method also includes configuring a filter based on the determined sparsifying dictionary.
Description
BACKGROUND
Field

Certain embodiments of the present invention relate to implementing sparse finite-impulse-response equalizers.


Description of the Related Art

In the technical field of signal processing, a finite impulse response (FIR) equalizer may be considered to be an FIR filter whose impulse response is of finite duration. An FIR filter can be considered to be a digital circuit that is used to filter an input signal. At each point in time, a set of samples of the input signal can be multiplied by coefficients and then summed. The number of samples of the input signal used in this sum can be referred to as the number of taps of the FIR filter.
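For illustration only, the tap-weighted-sum operation described above can be sketched in a few lines of Python/NumPy; the tap values and the input sequence below are arbitrary placeholders rather than parameters from this disclosure.

```python
import numpy as np

def fir_filter(x, taps):
    """Apply an FIR filter: each output sample is the sum of the most
    recent len(taps) input samples weighted by the tap coefficients."""
    N = len(taps)
    x_padded = np.concatenate([np.zeros(N - 1), x])          # zero initial state
    return np.array([np.dot(taps, x_padded[n:n + N][::-1])   # newest sample first
                     for n in range(len(x))])

# Example: a 4-tap filter applied to a short input sequence (placeholder values)
taps = np.array([0.5, 0.3, 0.15, 0.05])
x = np.random.randn(10)
y = fir_filter(x, taps)
# Equivalent one-liner using NumPy's built-in convolution:
assert np.allclose(y, np.convolve(x, taps)[:len(x)])
```

The number of complex multiply-and-add operations per output sample equals the number of nonzero taps, which is why reducing the number of active taps reduces implementation complexity.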


SUMMARY

One embodiment is directed to a method. The method may include receiving a transmission. The method may also include determining a sparsifying dictionary that sparsely approximates a data vector of the transmission, wherein the determining the sparsifying dictionary comprises performing a fast Fourier transform and/or an inverse fast Fourier transform. The method may further include configuring a filter based on the determined sparsifying dictionary.


According to an embodiment, the configuring the filter may include configuring a finite impulse response filter. According to another embodiment, the determining the sparsifying dictionary may include determining a sparsifying dictionary that has a smallest worst-case coherence metric. According to yet another embodiment, the performing the fast Fourier transform and/or the inverse fast Fourier transform is performed on a circulant matrix that estimates a Toeplitz matrix. According to a further embodiment, the performing the fast Fourier transform and/or the inverse fast Fourier transform on the circulant matrix replaces performing Cholesky and Eigen factorization.


Another embodiment is directed to an apparatus, which may include at least one processor, and at least one memory including computer program code. The at least one memory and the computer program code may be configured, with the at least one processor, to cause the apparatus at least to receive a transmission. The at least one memory and the computer program code may also be configured, with the at least one processor, to determine a sparsifying dictionary that sparsely approximates a data vector of the transmission, wherein the determining the sparsifying dictionary may include performing a fast Fourier transform and/or an inverse fast Fourier transform. The at least one memory and the computer program code may further be configured, with the at least one processor, to configure a filter based on the determined sparsifying dictionary.


According to an embodiment, configuring the filter may include configuring a finite impulse response filter. According to another embodiment, the determining the sparsifying dictionary may include determining a sparsifying dictionary that has a smallest worst-case coherence metric. According to yet another embodiment, the performing the fast Fourier transform and/or the inverse fast Fourier transform may be performed on a circulant matrix that estimates a Toeplitz matrix. According to a further embodiment, the performing the fast Fourier transform and/or the inverse fast Fourier transform on the circulant matrix replaces performing Cholesky and Eigen factorization.


Another embodiment is directed to a computer program product, embodied on a non-transitory computer readable medium. The computer program product may be configured to control a processor to perform the method described above.





BRIEF DESCRIPTION OF THE DRAWINGS

For proper understanding of the invention, reference should be made to the accompanying drawings, wherein:



FIG. 1 illustrates a flowchart of a method in accordance with certain embodiments of the invention.



FIG. 2 illustrates an apparatus in accordance with certain embodiments of the invention.



FIG. 3 illustrates an apparatus in accordance with certain embodiments of the invention.





DETAILED DESCRIPTION

With single-carrier transmissions over broadband channels, long finite impulse response (FIR) equalizers are typically implemented at high sampling rates to combat the channels' frequency selectivity. However, implementing such equalizers can be prohibitively expensive, as the design complexity of an FIR equalizer increases in proportion to the square of the number of nonzero taps in the filter. Sparse equalization, where only a few nonzero taps (and only a few corresponding nonzero coefficients) are employed, is a technique used to reduce complexity at the cost of a tolerable performance loss. However, reliably determining the locations of these nonzero coefficients is often very challenging.


Certain embodiments of the present invention are directed to a computationally-efficient method, implemented in software, that designs/configures sparse FIR single input single output (SISO) linear equalizers and sparse FIR multi-input multi-output (MIMO) linear equalizers (LEs). Certain embodiments may also be directed to decision feedback equalizers (DFEs) that perform sparse approximation of a vector (which contains the filter taps) using different sparsifying dictionaries. These sparsifying dictionaries (full row-rank matrices) may be used to sparsely approximate the filter tap vector.


With regard to performing sparse approximation of a vector using different dictionaries, certain methods of the present invention can determine a sparsifying dictionary that leads to the sparsest FIR filter, hence reducing the FIR filter's implementation complexity, subject to performance constraints.


In addition, certain embodiments may be directed to applications that perform channel shortening for both single-carrier and multi-carrier transceivers. As such, certain embodiments of the present invention may significantly reduce an equalizer implementation complexity. Another feature of certain embodiments is directed to reducing a computational complexity of a method for determining a sparse equalizer design. As described below, this reduced computational complexity can be realized by exploiting an asymptotic equivalence of Toeplitz and circulant matrices, where matrix factorizations involved in certain embodiments can be carried out efficiently using fast Fourier transform (FFT) and inverse FFT. There may be negligible performance loss as the number of filter taps increases.


Simulation results demonstrating certain embodiments show that, if a little performance loss is allowed, such an allowance can yield a significant reduction in the number of active filter taps, which, in turn, results in a substantial reduction in the complexity of implementing the FIR equalizers/filters. Consequently, with a reduction in the complexity of the FIR equalizers/filters, a corresponding power consumption can also be decreased because a smaller number of complex multiply-and-add operations are required.


Additionally, simulations demonstrating certain embodiments have shown that a sparsifying dictionary of certain embodiments will generally result in a sparsest FIR filter design. That sparsifying dictionary (full row-rank matrix) may have the smallest worst-case coherence metric. In certain embodiments, the smaller the worst-case coherence, the sparser the equalizer is. The sparsifying dictionary could also be a square, fat or tall matrix. A good sparsifying dictionary may be that which has a small coherence between its columns, i.e., small worst-case coherence. Furthermore, the simulations demonstrate the superiority of certain embodiments in terms of both performance and reduction of computational complexity.
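As a concrete, hedged illustration of the worst-case coherence metric mentioned above, the short Python/NumPy sketch below computes the largest normalized inner product between distinct columns of a candidate dictionary; the random matrices are placeholders, not dictionaries prescribed by this disclosure.

```python
import numpy as np

def worst_case_coherence(Phi):
    """Largest absolute normalized inner product between distinct columns of Phi."""
    G = Phi.conj().T @ Phi                        # Gram matrix of the columns
    norms = np.sqrt(np.real(np.diag(G)))
    G_normalized = np.abs(G) / np.outer(norms, norms)
    np.fill_diagonal(G_normalized, 0.0)           # ignore self-coherence
    return G_normalized.max()

# Compare two arbitrary candidate dictionaries and keep the more incoherent one
rng = np.random.default_rng(0)
candidates = [rng.standard_normal((64, 64)), rng.standard_normal((64, 80))]
coherences = [worst_case_coherence(Phi) for Phi in candidates]
best = candidates[int(np.argmin(coherences))]     # smaller coherence tends to yield sparser solutions
```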


Certain embodiments may be utilized with the following system model, for example. One system model may include a linear time-invariant MIMO inter-symbol interference (ISI) channel with ni inputs and no outputs, for example. The ISI channel can be the channel upon which the input signals are passing through. The output of the ISI channel (distorted by the noise) may be taken as an input to the FIR equalizer. This equalizer may be designed such that the effect of ISI channel is mitigated and/or removed. The received samples from all no channel outputs at a sample time k are grouped into a no×1 column vector yk as follows:











$$y_k = \sum_{l=0}^{v} H_l\, x_{k-l} + n_k, \qquad (1)$$
where Hl is the (no×ni) lth channel matrix coefficient, and xk−l is the size ni×1 input vector at time k−l. The parameter v is the maximum order of all of the no·ni Channel Impulse Responses (CIRs). Over a block of Nf output samples, the input-output relation in (1) above can be written compactly as

yk:k−Nf+1=Hxk:k−Nf−v+1+nk:k−Nf+1.  (2)


where yk:k−Nf+1, xk:k−Nf−v+1 and nk:k−Nf+1 are column vectors grouping the received, transmitted, and noise samples, respectively. Additionally, H is a block Toeplitz matrix whose first block row is formed by {Hl}l=0, . . . , v followed by zero matrices. It may be useful to define the output auto-correlation and the input-output cross-correlation matrices based on a block of length Nf. Using (2), the ni(Nf+v)×ni(Nf+v) input correlation and the noNf×noNf noise correlation matrices are, respectively, defined by

Rxx≜E[xk:k−Nf−v+1xk:k−Nf−v+1H]
and
Rnn≜E[nk:k−Nf+1nk:k−Nf+1H].

Both the input and noise processes are assumed to be white; hence, their auto-correlation matrices are assumed to be (multiples of) the identity matrix, i.e.,







$$R_{xx} = I_{n_i(N_f+v)} \qquad \text{and} \qquad R_{nn} = \frac{1}{\mathrm{SNR}}\, I_{n_o N_f}.$$

The key matrices used in this disclosure are summarized in Table I.









TABLE I
CHANNEL EQUALIZATION NOTATION AND KEY MATRICES USED IN THIS PAPER.

  Notation                      Meaning                                   Size
  H                             Channel matrix                            noNf × ni(Nf + v)
  Rxx                           Input auto-correlation matrix             ni(Nf + v) × ni(Nf + v)
  Rxy                           Input-output cross-correlation matrix     ni(Nf + v) × noNf
  Ryy                           Output auto-correlation matrix            noNf × noNf
  Rnn                           Noise auto-correlation matrix             noNf × noNf
  R⊥ ≜ Rxx − Rxy Ryy−1 Ryx                                                ni(Nf + v) × ni(Nf + v)
  W                             FFF matrix coefficients                   noNf × ni
  B                             FBF matrix coefficients                   ni(Nf + v) × ni

Moreover, the output-input cross-correlation and the output auto-correlation matrices are, respectively, defined as

Ryx≜E[yk:k−Nf+1xk:k−Nf−v+1H]=HRxx, and  (3)
Ryy≜E[yk:k−Nf+1yk:k−Nf+1H]=HRxxHH+Rnn.  (4)
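The block-Toeplitz channel matrix H and the correlation matrices in (2)-(4) can be assembled as in the following Python/NumPy sketch; the dimensions, channel taps, and SNR value are illustrative assumptions only, not parameters specified by this disclosure.

```python
import numpy as np

# Illustrative parameters (assumptions, not values from this disclosure)
n_i, n_o, v, N_f, SNR = 2, 2, 3, 8, 100.0
rng = np.random.default_rng(1)
H_taps = [rng.standard_normal((n_o, n_i)) for _ in range(v + 1)]  # {H_l}, l = 0..v

# Block-Toeplitz channel matrix H of size (n_o*N_f) x (n_i*(N_f + v))
H = np.zeros((n_o * N_f, n_i * (N_f + v)))
for r in range(N_f):
    for l in range(v + 1):
        H[r*n_o:(r+1)*n_o, (r+l)*n_i:(r+l+1)*n_i] = H_taps[l]

# White input and noise: R_xx = I and R_nn = (1/SNR) I, as assumed above
R_xx = np.eye(n_i * (N_f + v))
R_nn = (1.0 / SNR) * np.eye(n_o * N_f)

# Output-input cross-correlation (3) and output auto-correlation (4)
R_yx = H @ R_xx
R_yy = H @ R_xx @ H.T + R_nn
```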


Certain embodiments may determine different sparse FIR equalizer designs for different scenarios. As described below, certain embodiments may be directed to linear equalizers (LEs) for SISO and MIMO systems, respectively, and other embodiments may be directed to decision feedback equalizers (DFEs) for SISO and MIMO systems, respectively.


With regard to Sparse FIR Single-Input Single Output Linear Equalizers (SISO-LE), the received samples are passed through a FIR filter of length Nf for equalization. The resulting error at time k is given by

ek=xk−Δ−{circumflex over (x)}k−Δ=xk−Δ−wHyk:k−Nf+1,  (5)


where Δ is the decision delay, typically 0≤Δ≤Nf+v−1, and w denotes the equalizer taps vector whose dimension is Nf×1. It can be shown that the mean square error (MSE), ξ(w), is given by:











$$\xi(\mathbf{w}) = \underbrace{\varepsilon_x - \mathbf{r}_\Delta^H R_{yy}^{-1}\mathbf{r}_\Delta}_{\xi_m} + \underbrace{\left(\mathbf{w}-R_{yy}^{-1}\mathbf{r}_\Delta\right)^H R_{yy}\left(\mathbf{w}-R_{yy}^{-1}\mathbf{r}_\Delta\right)}_{\xi_e(\mathbf{w})}, \qquad (6)$$
where εx≜E[|xk−Δ|2], rΔ=RyxeΔ, and eΔ denotes an (Nf+v)-dimensional unit vector that is zero everywhere except in the (Δ+1)-th element, where it is one. Since ξm does not depend on w, the MSE ξ(w) is minimized by minimizing the term ξe(w). Hence, the optimum choice for w, i.e., the optimum setting of the equalizer taps, in the minimum mean square error (MMSE) sense, is the Wiener solution:

wopt=Ryy−1rΔ.


However, in general, wopt is not sparse, and its implementation complexity increases in proportion to (Nf)2, which can be computationally expensive. Moreover, any choice for w other than wopt increases ξe(w), which leads to a performance loss. This suggests that the excess error ξe(w) can be used as a design constraint to achieve a desirable performance-complexity tradeoff.
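A minimal Python/NumPy sketch of the Wiener solution and the excess-error term ξe(w) discussed above is given below; the statistics are random placeholders, and the crude thresholding step is only meant to show how zeroing taps trades excess MSE for sparsity, not the design procedure of this disclosure.

```python
import numpy as np

def wiener_equalizer(R_yy, r_delta):
    """Non-sparse MMSE (Wiener) tap vector w_opt = R_yy^{-1} r_delta."""
    return np.linalg.solve(R_yy, r_delta)

def excess_mse(w, R_yy, r_delta):
    """xi_e(w) = (w - w_opt)^H R_yy (w - w_opt); zero at the Wiener solution."""
    d = w - wiener_equalizer(R_yy, r_delta)
    return float(np.real(d.conj().T @ R_yy @ d))

# Placeholder statistics: any positive-definite R_yy and cross-correlation r_delta
rng = np.random.default_rng(2)
A = rng.standard_normal((16, 16))
R_yy = A @ A.T + 0.1 * np.eye(16)
r_delta = rng.standard_normal(16)

w_opt = wiener_equalizer(R_yy, r_delta)
w_sparse = np.where(np.abs(w_opt) > np.median(np.abs(w_opt)), w_opt, 0.0)  # crude sparsification
# Excess MSE grows as taps are zeroed; the constraint xi_e(w) <= delta_eq bounds this loss.
loss = excess_mse(w_sparse, R_yy, r_delta)
```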


Specifically, certain embodiments formulate the following problem for the design of a sparse FIR SISO-LE












$$\hat{\mathbf{w}}_s \triangleq \arg\min_{\mathbf{w}\in\mathbb{C}^{N_f}} \|\mathbf{w}\|_0 \quad \text{subject to} \quad \xi_e(\mathbf{w}) \le \delta_{eq}, \qquad (7)$$
where ∥w∥0 is the number of nonzero elements in its argument and δeq can be chosen as a function of the noise variance.


To solve (7), certain embodiments propose a general framework to sparsely design FIR LEs such that the performance loss does not exceed a prespecified limit.


With regard to sparse FIR multiple-input multiple output linear equalizers (MIMO-LE), certain embodiments may define the kth equalization error sample vector in the MIMO setting as

ek=[ek,1ek,2 . . . ek,ni]T,  (8)


where ek,i is the equalization error of the ith input stream. Similar to the sparse SISO-LE case, the kth error sample for the ith input stream is expressed as:

ek,i=xk−Δ,i−{circumflex over (x)}k−Δ,i=xk−Δ,i−wiHyk:k−Nf+1,  (9)

and the MSE, ξi(wi), for the ith input stream has the form












$$\xi_i(\mathbf{w}_i) = \xi_{m,i} + \underbrace{\left(\mathbf{w}_i - R_{yy}^{-1}\mathbf{r}_{\Delta,i}\right)^H R_{yy}\left(\mathbf{w}_i - R_{yy}^{-1}\mathbf{r}_{\Delta,i}\right)}_{\triangleq\,\xi_{ex,i}(\mathbf{w}_i)}, \qquad (10)$$
where ξm,i≜εx−rΔ,iHRyy−1rΔ,i, rΔ,i=RyxeΔ,i, and eΔ,i is the (niΔ+i)-th column of Ini(Nf+v). The optimum choice for wi is the complex non-sparse solution wopt,i=Ryy−1rΔ,i. Thus, certain embodiments can use ξex,i(wi) as a design parameter to control the performance-complexity tradeoff.


Certain embodiments may be directed to a proposed framework that computes a sparse solution ws,i such that ξex,i(wi)≤δeq,i. This condition bounds the amount of the noise in the sparse solution.


With regard to sparse FIR SISO-DFE, the decision feedback equalizer (DFE) of certain embodiments may include two filters: a feedforward filter (FFF) with Nf taps, and a feedback filter (FBF) with (Nb+1) nonzero taps. The FFF and FBF are denoted by wf=[wf,0 wf,1 . . . wf,Nf−1]T and wb=[01×Δ 1 wb,1 . . . wb,Nb 01×s]T, s≜Nf+v−Δ−Nb−1, respectively. Assuming that the decisions are correct, it can be shown that the MSE has the form











$$\xi_m(\mathbf{w}_b) = \mathbf{w}_b^H \underbrace{\left(R_{xx} - R_{xy} R_{yy}^{-1} R_{yx}\right)}_{R_{x/y}} \mathbf{w}_b. \qquad (11)$$

Since the (Δ+1)th location of wb is unity, minimizing ξm(wb) is a constrained problem of the form












$$\hat{\mathbf{w}}_b \triangleq \arg\min_{\mathbf{w}_b\in\mathbb{C}^{N_f+v}} \xi_m(\mathbf{w}_b) \quad \text{subject to} \quad \mathbf{w}_b^H \mathbf{e}_\Delta = 1, \qquad (12)$$
where eΔ denotes the (Nf+v)-dimensional unit vector that is zero everywhere except in the (Δ+1)th entry, where it is one. Moreover, in some scenarios a specific value of Nb is required, for example, due to complexity constraints, and a direct control on the number of the nonzero taps of the FBF is desirable.


To accomplish these goals, certain embodiments express Rx/y in (11) as Ax/yHAx/y, where Ax/y is the square-root matrix of Rx/y in the spectral-norm sense, which results from Cholesky or Eigen decompositions. Then, (11) can be written as














$$\begin{aligned}
\xi_m(\mathbf{w}_b) &= \mathbf{w}_b^H A_{x/y}^H A_{x/y}\mathbf{w}_b = \left\|A_{x/y}\mathbf{w}_b\right\|_2^2\\
&= \left\|\tilde{A}_{x/y}\tilde{\mathbf{w}}_b + \mathbf{a}_{\Delta+1}\right\|_2^2,
\end{aligned}\qquad (13)$$
where aΔ+1 is the (Δ+1)th column of Ax/y, Ãx/y is composed of all columns of Ax/y except aΔ+1 and {tilde over (w)}b is formed by all elements of wb except the (Δ+1)th entry with unit value. The locations and weights of these taps need to be estimated such that ξm(wb) is minimized. Towards this goal, certain embodiments formulate the following problem for the design of a sparse FIR FBF filter












$$\tilde{\mathbf{w}}_{b,s} \triangleq \arg\min_{\mathbf{w}_b\in\mathbb{C}^{N_f+v}} \|\mathbf{w}_b\|_0 \quad \text{subject to} \quad \xi_m(\mathbf{w}_b) \le \gamma_{eq}, \qquad (14)$$
where the threshold γeq is a selected parameter to control the performance loss from the non-sparse highly complex conventional FBF, which is designed based on the MMSE criterion. Once {tilde over (w)}b,s is calculated, certain embodiments insert the unit tap in the (Δ+1)th entry to construct the sparse FBF, wb,s, vector. Then, the optimum FFF taps (in the MMSE sense) are given by

wf,opt=Ryy−1β,  (15)

where β=Ryxwb,s. Since wf,opt is, again, generally not sparse, certain embodiments propose a sparse implementation for the FFF taps as follows. After computing the feedback filter (FBF) coefficients, wb,s, the MSE will be a function only of wf and has the form













$$\begin{aligned}
\xi(\mathbf{w}_f) &= \mathbf{w}_f^H R_{yy}\mathbf{w}_f - \mathbf{w}_f^H R_{yx}\mathbf{w}_{b,s} - \mathbf{w}_{b,s}^H R_{yx}^H \mathbf{w}_f + \mathbf{w}_{b,s}^H R_{xx}\mathbf{w}_{b,s}\\
&= \underbrace{\mathbf{w}_{b,s}^H R_{x/y}\mathbf{w}_{b,s}}_{\text{independent of } \mathbf{w}_f} + \underbrace{\left(\mathbf{w}_f - R_{yy}^{-1}\beta\right)^H R_{yy}\left(\mathbf{w}_f - R_{yy}^{-1}\beta\right)}_{\triangleq\,\xi_{ex}(\mathbf{w}_f)}.
\end{aligned}\qquad (16)$$

Thus, ξ(wf) is minimized by minimizing the term ξex(wf). In particular, certain embodiments formulate the following problem for the design of sparse FIR FFF, wf












$$\hat{\mathbf{w}}_{f,s} \triangleq \arg\min_{\mathbf{w}_f\in\mathbb{C}^{N_f}} \|\mathbf{w}_f\|_0 \quad \text{subject to} \quad \xi_{ex}(\mathbf{w}_f) \le \tilde{\gamma}_{eq}, \qquad (17)$$
where {tilde over (γ)}eq>0 is a design parameter that can be used to control the performance-complexity tradeoff.


With regard to Sparse FIR MIMO-DFE, the FIR MIMO-DFE includes a FFF matrix

WH≜[W0H W1H . . . WNf−1H],  (18)

with Nf matrix taps WiH, each of size no×ni, and an FBF matrix equal to

{tilde over (B)}H=[{tilde over (B)}0H {tilde over (B)}1H . . . {tilde over (B)}NbH],  (19)

where {tilde over (B)}H has (Nb+1) matrix taps {tilde over (B)}iH, each of size ni×ni. By defining the size ni×ni(Nf+v) matrix BH=[0ni×niΔ {tilde over (B)}H], where 0≤Δ≤Nf+v−1, it was shown that the MSE can be written as follows











$$\xi(B, W) = \underbrace{\operatorname{Trace}\{B^H R_\perp B\}}_{\triangleq\,\xi_{\min}(B)} + \underbrace{\operatorname{Trace}\{S^H R_{yy} S\}}_{\triangleq\,\xi_{ex}(W,B)}, \qquad (20)$$
where R⊥≜Rxx−RxyRyy−1Ryx and SH≜WH−BHRxyRyy−1. The second term of the MSE is equal to zero under the optimum FFF matrix filter coefficients, i.e., WH=BHRxyRyy−1, and the resulting MSE can then be expressed as follows (defining R⊥≜AHA)














$$\begin{aligned}
\xi_m(B) &= \operatorname{Trace}\{B^H A^H A B\} = \|AB\|_F^2 = \left\|\big[A\mathbf{b}^{(1)}\ \ A\mathbf{b}^{(2)}\ \ \cdots\ \ A\mathbf{b}^{(n_i)}\big]\right\|_F^2\\
&= \left\|A\mathbf{b}^{(1)}\right\|_2^2 + \left\|A\mathbf{b}^{(2)}\right\|_2^2 + \cdots + \left\|A\mathbf{b}^{(n_i)}\right\|_2^2,
\end{aligned}\qquad (21)$$

where b(i) is the ith column of B. Hence, to compute the FBF matrix filter taps B that minimize ξm (B), certain embodiments minimize ξm (B) under the identity tap constraint (ITC) where certain embodiments restrict the ith matrix coefficient of B to be equal to the identity matrix, i.e., B0=Ini. Towards this goal, certain embodiments rewrite ξm (B) as follows












$$\xi_m(B) = \sum_{i=1}^{n_i}\left\|A^{(i\backslash n_i\Delta+i)}\,\mathbf{b}^{(i\backslash n_i\Delta+i)} + \mathbf{a}_{n_i\Delta+i}\right\|_2^2, \qquad (22)$$

where A(i\niΔ+i) is formed by all columns of A except the (niΔ+i)th column, i.e., aniΔ+i, and b(i\niΔ+i) is formed by all elements of b(i) except the (niΔ+i)th entry with unit value. Then, certain embodiments formulate the following problem for the design of sparse FBF matrix filter taps B

{circumflex over (b)}(i\niΔ+i)≜argmin∥b(i\niΔ+i)∥0 subject to ∥A(i\niΔ+i)b(i\niΔ+i)+aniΔ+i∥22≤γeq,i.  (23)

Once {circumflex over (b)}(i\niΔ+i), ∀i∈ni, is calculated, certain embodiments insert the identity matrix B0 in the ith location to form the sparse FBF matrix coefficients, Bs. Then, the optimum FFF matrix taps (in the MMSE sense) are determined from (20) to be

Wopt=Ryy−1RyxBs=Ryy−1β.  (24)

Since Wopt is not sparse in general, certain embodiments propose a sparse implementation for the FFF matrix as follows. After computing Bs, the MSE will be a function only of W and can be expressed as (defining Ryy≜AyHAy)













$$\begin{aligned}
\xi(B_s, W) &= \xi_m(B_s) + \operatorname{Trace}\left\{\left(W^H - \bar{\beta}^H R_{yy}^{-1}\right) A_y^H A_y \left(W - R_{yy}^{-1}\bar{\beta}\right)\right\}\\
&= \xi_m(B_s) + \underbrace{\left\|A_y W - A_y^{-H}\bar{\beta}\right\|_F^2}_{\triangleq\,\xi_{ex}(W)}.
\end{aligned}\qquad (25)$$

By minimizing ξex(W), certain embodiments further minimize the MSE. This is achieved by a reformulation for ξex(W) to get a vector form of W, as in the case of (23), as follows












$$\xi_{ex}(\bar{\mathbf{w}}_f) = \Big\|\underbrace{\left(I_{n_i}\otimes A_y^H\right)}_{\bar{\Psi}}\underbrace{\operatorname{vec}(W)}_{\bar{\mathbf{w}}_f} - \underbrace{\operatorname{vec}\!\left(A_y^{-H}\bar{\beta}\right)}_{\bar{\alpha}_y}\Big\|_2^2, \qquad (26)$$
where vec is an operator that maps a matrix to a vector by stacking the columns of the matrix. Afterward, certain embodiments solve the following problem to compute the FFF matrix filter taps

wf≜argmin∥wf∥0 subject to ξex(wf)≤γeq,  (27)

where γeq>0 is used to control the performance-complexity tradeoff and it bounds the amount of the noise in the sparse solution vector.
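The vectorization step in (26) relies on the standard identity vec(AX) = (I ⊗ A) vec(X); the short Python/NumPy check below illustrates it with arbitrary placeholder matrices.

```python
import numpy as np

# vec(AX) = (I (kron) A) vec(X): stack columns and use a Kronecker product.
rng = np.random.default_rng(7)
A = rng.standard_normal((5, 4))
X = rng.standard_normal((4, 3))

lhs = (A @ X).flatten(order="F")                  # vec() stacks the columns
rhs = np.kron(np.eye(3), A) @ X.flatten(order="F")
assert np.allclose(lhs, rhs)
```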


With regard to implementing a proposed sparse approximation framework, unlike the previous approaches, certain embodiments provide a general framework for designing sparse FIR filters, for both single and multiple antenna systems, that can be considered as resolving a problem of sparse approximation using different dictionaries. Mathematically, this general framework poses the FIR filter design problem as follows












$$\hat{\mathbf{z}}_s \triangleq \arg\min_{\mathbf{z}} \|\mathbf{z}\|_0 \quad \text{subject to} \quad \left\|K\left(\Phi\mathbf{z} - \mathbf{d}\right)\right\|_2^2 \le \epsilon, \qquad (28)$$
According to (28), a sparse solution z is obtained such that the squared norm of the error K(Φz−d) is upper-bounded by ϵ, which can be set according to a predefined sparsity level (number of nonzero entries of z) or an upper bound on the noise.


In addition, Φ is the dictionary that will be used to sparsely approximate d, while K is a known matrix and d is a known data vector, both of which change depending upon the sparsifying dictionary Φ. Those variables are generally functions of the system parameters, e.g., cross- and auto-correlation functions. Furthermore, those parameters are obtained based on the problem formulation (as will be shown later); d and Φ are not necessarily equal to the received data and the ISI channel, respectively. Notice that {circumflex over (z)}s corresponds to one of the elements in {ws, ws,i, wb,s, wf,s, {circumflex over (b)}(ni), wf} and ϵ is the corresponding element in {δeq, δeq,i, γeq, {circumflex over (γ)}eq, γeq,i, {circumflex over (γ)}eq,i}.


For all design problems, certain embodiments perform the suitable transformation to reduce the problem to the one shown in (28). For example, certain embodiments complete the square in (6) to reduce (7) to the formulation given in (28). Hence, one can use any factorization for Ryy, e.g., in (6) and (10), Rx/y, e.g., in (11), and R⊥, e.g., in (20), to formulate a sparse approximation problem. Using the Cholesky or Eigen decomposition for Ryy, Rx/y, or R⊥, there will be different choices for K, Φ, and d. The matrices Ryy, Rx/y, and R⊥ may be obtained as shown in Table I. In addition, the sparsifying dictionary can be the square-root factor of any of the aforementioned matrices or any linear combinations/transformations of any one of those matrices.
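As one concrete assembly of (K, Φ, d), the Python/NumPy sketch below completes the square for the linear-equalizer excess error using a Cholesky factorization of Ryy (the choice that appears later as (31)); the correlation matrix and cross-correlation vector are random placeholders, not values from this disclosure.

```python
import numpy as np

def le_sparse_approximation_data(R_yy, r_delta):
    """Complete the square in the LE excess error:
       xi_e(w) = ||L_y^H w - L_y^{-1} r_delta||_2^2  with  R_yy = L_y L_y^H,
       which matches (28) with K = I, Phi = L_y^H, d = L_y^{-1} r_delta."""
    L_y = np.linalg.cholesky(R_yy)                # lower-triangular factor
    Phi = L_y.conj().T
    d = np.linalg.solve(L_y, r_delta)
    K = np.eye(R_yy.shape[0])
    return K, Phi, d

# Placeholder statistics only
rng = np.random.default_rng(3)
A = rng.standard_normal((12, 12))
R_yy = A @ A.T + 0.5 * np.eye(12)
r_delta = rng.standard_normal(12)
K, Phi, d = le_sparse_approximation_data(R_yy, r_delta)

# Sanity check: at the Wiener solution the residual (and hence xi_e) is zero
w_opt = np.linalg.solve(R_yy, r_delta)
assert np.allclose(Phi @ w_opt - d, 0.0, atol=1e-8)
```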


For instance, by defining the Cholesky factorization of Rx/y, in (11), as Rx/y≜Lx/yLx/yH, or in the equivalent form Rx/y≜Px/yΣx/yPx/yH=Ωx/yΩx/yH (where Lx/y is a lower-triangular matrix, Px/y is a lower-unit-triangular (unitriangular) matrix and Σx/y is a diagonal matrix), the problem in (28) can, respectively, take one of the forms shown below













$$\min_{\tilde{\mathbf{w}}_b\in\mathbb{C}^{N_f+v-1}} \|\tilde{\mathbf{w}}_b\|_0 \quad \text{s.t.} \quad \left\|\tilde{L}_{x/y}^H\tilde{\mathbf{w}}_b + \mathbf{l}_{\Delta+1}\right\|_2^2 \le \gamma_{eq}, \qquad (29)$$

$$\min_{\tilde{\mathbf{w}}_b\in\mathbb{C}^{N_f+v-1}} \|\tilde{\mathbf{w}}_b\|_0 \quad \text{s.t.} \quad \left\|\tilde{\Omega}_{x/y}^H\tilde{\mathbf{w}}_b + \mathbf{p}_{\Delta+1}\right\|_2^2 \le \gamma_{eq}. \qquad (30)$$
Recall that {tilde over (Ω)}x/yH is formed by all columns of Ωx/yH except the (Δ+1)th column, pΔ+1 is the (Δ+1)th column of Ωx/yH, and {tilde over (w)}b is formed by all entries of wb except the (Δ+1)th unity entry. Similarly, by writing the Cholesky factorization of Ryy in (10) as Ryy≜LyLyH or the Eigen decomposition of Ryy as Ryy≜UyDyUyH, certain embodiments can formulate the problem in (28) as follows:









TABLE II
EXAMPLES OF DIFFERENT SPARSIFYING DICTIONARIES THAT CAN BE USED TO DESIGN wf GIVEN IN (27).

  Factorization Type     K               Φ                     d
  Ryy = LyLyH            I               Ini ⊗ LyH             vec(Ly−1β̄)
  Ryy = LyLyH            Ly−1            Ini ⊗ Ryy             vec(β̄)
  Ryy = PyΛyPyH          I               Ini ⊗ Λy1/2PyH        vec(Λy−1/2Py−1β̄)
  Ryy = UyDyUyH          Dy−1/2UyH       Ini ⊗ Ryy             vec(β̄)
  Ryy = UyDyUyH          I               Ini ⊗ Dy1/2UyH        vec(Dy−1/2UyHβ̄)

















$$\min_{\mathbf{w}_i\in\mathbb{C}^{n_o N_f}} \|\mathbf{w}_i\|_0 \quad \text{s.t.} \quad \left\|L_y^H\mathbf{w}_i - L_y^{-1}\mathbf{r}_{\Delta,i}\right\|_2^2 \le \delta_{eq,i}, \qquad (31)$$

$$\min_{\mathbf{w}_i\in\mathbb{C}^{n_o N_f}} \|\mathbf{w}_i\|_0 \quad \text{s.t.} \quad \left\|D_y^{1/2}U_y^H\mathbf{w}_i - D_y^{-1/2}U_y^H\mathbf{r}_{\Delta,i}\right\|_2^2 \le \delta_{eq,i}, \ \text{and} \qquad (32)$$

$$\min_{\mathbf{w}_i\in\mathbb{C}^{n_o N_f}} \|\mathbf{w}_i\|_0 \quad \text{s.t.} \quad \left\|L_y^{-1}\left(R_{yy}\mathbf{w}_i - \mathbf{r}_{\Delta,i}\right)\right\|_2^2 \le \delta_{eq,i}. \qquad (33)$$
Note that the sparsifying dictionaries in (31), (32) and (33) are LyH, Dy1/2UyH and Ryy, respectively. Furthermore, the matrix K is an identity matrix in all cases except in (33), where it is equal to Ly−1. Additionally, some possible sparsifying dictionaries that can be used to design a sparse FFF matrix filter, given in (27), are shown in Table II.


It is worth pointing out that several other sparsifying dictionaries can be used to sparsely design FIR LEs, FBF and FFF matrix taps.


The problem of designing sparse FIR filters can be cast into one of sparse approximation of a vector by a fixed dictionary. The general form of this problem is given by (28). To solve this problem, certain embodiments use the Orthogonal Matching Pursuit (OMP) greedy algorithm [18], which estimates {circumflex over (z)}s by iteratively selecting a set S of the sparsifying dictionary columns (i.e., atoms ϕi's) of Φ that are most correlated with the data vector d and then solving a restricted least-squares problem using the selected atoms. The OMP stopping criterion can be either a predefined sparsity level (number of nonzero entries) of zs or an upper bound on the Projected Residual Error (PRE), i.e., the “K Residual Error.”
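A minimal Python/NumPy sketch of the OMP iteration described above is shown below; it solves the generic problem (28) for the case K = I, stops on either a sparsity budget or a residual bound, and is an illustrative sketch rather than the exact routine of this disclosure.

```python
import numpy as np

def omp(Phi, d, max_nonzeros, residual_bound=0.0):
    """Greedy OMP: pick the atom most correlated with the residual, then
    re-fit all selected atoms by least squares, until a stopping rule hits."""
    z = np.zeros(Phi.shape[1], dtype=Phi.dtype)
    support, coeffs = [], np.zeros(0, dtype=Phi.dtype)
    residual = d.copy()
    for _ in range(max_nonzeros):
        if np.linalg.norm(residual) ** 2 <= residual_bound:
            break
        correlations = np.abs(Phi.conj().T @ residual)
        correlations[support] = 0.0                          # do not reselect atoms
        support.append(int(np.argmax(correlations)))
        coeffs, *_ = np.linalg.lstsq(Phi[:, support], d, rcond=None)
        residual = d - Phi[:, support] @ coeffs
    z[support] = coeffs
    return z

# Toy usage: approximate a 3-sparse vector measured through a random dictionary
rng = np.random.default_rng(4)
Phi = rng.standard_normal((32, 32))
z_true = np.zeros(32); z_true[[3, 11, 20]] = [1.0, -2.0, 0.5]
z_hat = omp(Phi, Phi @ z_true, max_nonzeros=3)
```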


Unlike conventional compressive sensing techniques, where the measurement matrix is a fat matrix, the sparsifying dictionary in the framework of certain embodiments is either a tall matrix (fewer columns than rows) with full column rank, as in (29) and (30), or a square one with full rank, as in (31)-(33). However, OMP and similar methods can still be used for obtaining {circumflex over (z)}s if Ryy, Rx/y and R⊥ can be decomposed into ΨΨH and if the data vector d is compressible.


A next challenge is to determine the best sparsifying dictionary for use in the framework of certain embodiments. It is known that the sparsity of the OMP solution tends to be inversely proportional to the worst-case coherence










$$\mu(\Phi), \qquad \mu(\Phi) \triangleq \max_{i\neq j}\ \frac{\left|\langle\phi_i,\phi_j\rangle\right|}{\|\phi_i\|_2\,\|\phi_j\|_2}\quad [21],\,[22].$$
Notice that μ(Φ)∈[0, 1]. Certain embodiments investigate the coherence of the dictionaries involved in the setup.


Certain embodiments are directed to implementing a reduced-complexity design for the FIR filters discussed above, including LEs and DFEs, for both SISO and MIMO systems. The proposed designs involve Cholesky factorization and/or Eigen decomposition, whose computational costs could be large for channels with large delay spreads. For all proposed designs, the suitable transformation (i.e., Cholesky or eigen decomposition) is performed to reduce the design problem to the one given in (28). In (29)-(31), it can be noticed that either Cholesky or eigen decomposition is needed to have the problems formulated in the given forms. These sparsifying dictionaries may be used to sparsely design FIR LEs, FBF and FFF filters. In summary, the proposed design method for the sparse FIR filters may involve the following steps:

    • 1) An estimate for the channel between the input(s) and the output(s) of the actual transmission channel is obtained. Then, the matrices defined in Table I are computed.
    • 2) The required matrices involved in our design, i.e., Rx/y, R⊥ or Ryy, are factorized using reduced-complexity design discussed below.
    • 3) Based on a desired performance-complexity tradeoff, ϵ is computed. Afterward, the dictionary with the smallest coherence is selected for use in designing the sparse FIR filter (see the sketch following this list).
    • 4) The parameters Φ, d, and K are jointly used to estimate the locations and weights of the filter taps using the OMP algorithm.
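As an illustration of step 3, the Python/NumPy sketch below forms the three candidate sparsifying dictionaries that appear in (31)-(33) from a placeholder Ryy and keeps the one with the smallest worst-case coherence; it is a sketch under assumed statistics, not the selection routine of this disclosure.

```python
import numpy as np

def worst_case_coherence(Phi):
    G = np.abs(Phi.conj().T @ Phi)
    norms = np.sqrt(np.diag(G))
    G = G / np.outer(norms, norms)
    np.fill_diagonal(G, 0.0)
    return G.max()

# Placeholder output auto-correlation matrix (assumption, not from this disclosure)
rng = np.random.default_rng(5)
A = rng.standard_normal((24, 24))
R_yy = A @ A.T + 0.2 * np.eye(24)

L_y = np.linalg.cholesky(R_yy)                    # R_yy = L_y L_y^H
eigvals, U_y = np.linalg.eigh(R_yy)               # R_yy = U_y D_y U_y^H
candidates = {
    "L_y^H":           L_y.conj().T,
    "D_y^{1/2} U_y^H": np.diag(np.sqrt(eigvals)) @ U_y.conj().T,
    "R_yy":            R_yy,
}
best_name = min(candidates, key=lambda k: worst_case_coherence(candidates[k]))
Phi_best = candidates[best_name]                  # dictionary handed to the OMP step
```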


For a Toeplitz matrix, the most efficient algorithms for Cholesky factorization are Levinson or Schur algorithms, which involve O(M2) computations, where M is the matrix dimension.


In contrast, since a circulant matrix is asymptotically equivalent to a Toeplitz matrix for reasonably large dimensions, the Eigen decomposition of a circulant matrix can be computed efficiently using the fast Fourier transform (FFT) and its inverse with only O(M log2(M)) operations. Certain embodiments can use this asymptotic equivalence between Toeplitz and circulant matrices to carry out the computations needed for the Ryy, Rx/y and R⊥ factorizations efficiently using the FFT and inverse FFT. To further illustrate, Toeplitz and circulant matrices are asymptotically equivalent in the output block length, which is equal to the time span (not the number of nonzero taps) of the FFF. This asymptotic equivalence implies that the eigenvalues of the two matrices behave similarly. Furthermore, it also implies that factors, products, and inverses behave similarly.
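The FFT-based eigendecomposition of a circulant matrix invoked above can be verified numerically with the short Python/NumPy sketch below (the first column of the circulant matrix is an arbitrary placeholder):

```python
import numpy as np

# A circulant matrix is diagonalized by the DFT: C = (1/M) F^H diag(fft(c)) F,
# so its eigenvalues are the DFT of its first column c.
M = 8
c = np.arange(1.0, M + 1.0)                                # arbitrary first column
C = np.column_stack([np.roll(c, k) for k in range(M)])     # circulant matrix circ(c)

eigenvalues = np.fft.fft(c)                                # O(M log M) instead of O(M^3)
F = np.fft.fft(np.eye(M))                                  # DFT matrix F_M
C_reconstructed = (F.conj().T @ np.diag(eigenvalues) @ F) / M
assert np.allclose(C, C_reconstructed)

# The same identity gives fast products and inverses, e.g. C^{-1} b via FFTs:
b = np.random.randn(M)
x = np.fft.ifft(np.fft.fft(b) / eigenvalues)
assert np.allclose(C @ x, b)
```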


In addition, direct matrix inversion can be avoided when computing the coefficients of the filters. This approximation turns out to be quite accurate from simulations as will be shown later.


It is well known that a circulant matrix, C, has the discrete Fourier transform (DFT) basis vectors as its eigenvectors and the DFT of its first column as its eigenvalues. Thus, an M×M circulant matrix C can be decomposed as







$$C = \frac{1}{M}\, F_M^H\, \Lambda_c\, F_M,$$
where FM is the DFT matrix with entries fk,l=e−j2πkl/M, 0≤k, l≤M−1, and Λc is an M×M diagonal matrix whose diagonal elements are the M-point DFT of c={ci}i=0i=M−1, the first column of the circulant matrix. Further, from the orthogonality of the DFT basis functions,

FMHFM=FMFMH=MIM and FNHFN=MIN,

where FN is an M×N matrix, but FNFNH≠MIM and instead

FNFNH=N[IN . . . IN]T[IN . . . IN].



R̄yy, R̄yx, R̄x/y and R̄⊥ denote the circulant approximations to the matrices Ryy, Ryx, Rx/y and R⊥, respectively.


In addition, certain embodiments denote the noiseless channel output vector as {tilde over (y)}, i.e., {tilde over (y)}=Hx.


Certain embodiments first derive the circulant approximation for the block Toeplitz matrix Ryy when no≥2; the case of SISO systems follows as a special case of the block Toeplitz case by setting no=1.


The autocorrelation matrix Ryy is computed as











R
_

yy

=



E


[



y
~

k




y
~

k


]






R
_


yy
_




+



1
SNR




σ
n
2










I

N
f


.







(
37
)








To approximate the block Toeplitz R̄yy as a circulant matrix, certain embodiments assume that {{tilde over (y)}k} is cyclic. Hence, E[{tilde over (y)}k{tilde over (y)}kH] can be approximated as a time-averaged autocorrelation function as follows (defining L=noNf)














$$\begin{aligned}
\bar{R}_{\tilde{y}\tilde{y}} &= \frac{1}{N_f}\sum_{k=0}^{N_f-1}\tilde{y}_k\tilde{y}_k^H = \frac{1}{N_f}\, C_{\tilde{Y}} C_{\tilde{Y}}^H\\
&= \frac{1}{N_f}\left(\frac{1}{L}F_L^H\Lambda_{\tilde{Y}}F_{N_f}\right)\left(\frac{1}{L}F_{N_f}^H\Lambda_{\tilde{Y}}^H F_L\right)\\
&= \frac{1}{L^2}\, F_L^H\,\Lambda_{\tilde{Y}}\,\underbrace{\left[I_{N_f}\ \cdots\ I_{N_f}\right]^T\left[I_{N_f}\ \cdots\ I_{N_f}\right]}_{n_o\ \text{blocks}}\,\Lambda_{\tilde{Y}}^H\, F_L\\
&= \frac{1}{L^2}\, F_L^H \begin{bmatrix}\Lambda_{\tilde{Y}_1}\\ \vdots\\ \Lambda_{\tilde{Y}_{n_o}}\end{bmatrix}\begin{bmatrix}\Lambda_{\tilde{Y}_1}^H & \cdots & \Lambda_{\tilde{Y}_{n_o}}^H\end{bmatrix} F_L,
\end{aligned}\qquad (38)$$
where FL is a DFT matrix of size L×L, FNf is a DFT matrix of size L×Nf, the column vector {tilde over (Y)} is the L-point DFT of {tilde over (y)}1=[{tilde over (y)}Nf−1T {tilde over (y)}Nf−2T . . . {tilde over (y)}0T]T, {tilde over (Y)}i is the ith subvector of {tilde over (Y)}, i.e., {tilde over (Y)}=[{tilde over (Y)}1 {tilde over (Y)}2 . . . {tilde over (Y)}no]T, {tilde over (y)}i is the no×1 output vector, and C{tilde over (Y)}=circ({tilde over (y)}1), where circ denotes a circulant matrix whose first column is {tilde over (y)}1. Then,














$$\begin{aligned}
\bar{R}_{yy} &= \bar{R}_{\tilde{y}\tilde{y}} + n_o\sigma_n^2 I_{n_o N_f}\\
&= \frac{1}{L^2}\, F_L^H \underbrace{\begin{bmatrix}\Lambda_{\tilde{Y}_1}\\ \vdots\\ \Lambda_{\tilde{Y}_{n_o}}\end{bmatrix}}_{\Psi_{\tilde{Y}}}\underbrace{\begin{bmatrix}\Lambda_{\tilde{Y}_1}^H & \cdots & \Lambda_{\tilde{Y}_{n_o}}^H\end{bmatrix}}_{\Psi_{\tilde{Y}}^H} F_L + n_o\sigma_n^2 I_L\\
&= \frac{1}{L^2}\, F_L^H\left(\Psi_{\tilde{Y}}\Psi_{\tilde{Y}}^H + n_o L\,\sigma_n^2 I_L\right) F_L = \Sigma\Sigma^H.
\end{aligned}\qquad (39)$$






Using the matrix inversion lemma, the inverse of R̄yy is














$$\begin{aligned}
\bar{R}_{yy}^{-1} &= \left\{\frac{1}{L^2}\, F_L^H\left(\Psi_{\tilde{Y}}\Psi_{\tilde{Y}}^H + n_o L\,\sigma_n^2 I_L\right) F_L\right\}^{-1}\\
&= F_L^H\left(\Psi_{\tilde{Y}}\Psi_{\tilde{Y}}^H + n_o L\,\sigma_n^2 I_L\right)^{-1} F_L\\
&= \frac{1}{n_o L\,\sigma_n^2}\, F_L^H\left(I_L - \Psi_{\tilde{Y}}\Lambda_Q^{-1}\Psi_{\tilde{Y}}^H\right) F_L,
\end{aligned}\qquad (40)$$

where Q=Σi=1no∥|{tilde over (Y)}i∥|2+noLσn21.

Here, ∥|·∥|2 is defined as the element-wise norm square

∥|[a0 . . . aNf−1]H∥|2=[|a0|2 . . . |aNf−1|2]H.  (41)

Notice that Ψ{tilde over (Y)}HΨ{tilde over (Y)}=Σi=1no∥|{tilde over (Y)}i∥|2=NfΣi=1no∥|Hi∥|2. Without loss of generality, certain embodiments can express the noiseless channel output sequence {tilde over (y)}k in the discrete frequency domain as a column vector as follows

{tilde over (Y)}=H⊙PΔ⊙{tilde over (X)},  (42)

where ⊙ denotes element-wise multiplication,

{tilde over (X)}=[XT . . . XT]T, where X is the DFT of the data vector, and PΔ=[{tilde over (P)}ΔT . . . {tilde over (P)}ΔT]T.

{tilde over (P)}Δ=[1 e−j2πΔ/Nf . . . e−j2π(Nf−1)Δ/Nf]T, and H is the DFT of the CIRs, H=[H1T . . . HnoT]T.


To illustrate, for no=1, R̄yy in (39) reduces to

R̄yy=R̄{tilde over (y)}{tilde over (y)}+σn2INf=FNfHΛQ1FNf=QQH,  (43)

where Q1=Nf∥|H∥|2+σn2Nf1Nf, H is the Nf-point DFT of the CIR h, and PΔ={tilde over (P)}Δ.














$$\bar{R}_\perp = \frac{1}{L}\, F_N^H\left(I_N - \left[I_M\ \cdots\ I_M\right]^T \Lambda_{(\bar{y}\oslash y)}\left[I_M\ \cdots\ I_M\right]\right) F_N = \Theta\Theta^H, \qquad (44)$$

where ⊘ denotes element-wise division and N=ni(Nf+v).


Additionally, Rx/y can be expressed as














$$\begin{aligned}
\bar{R}_{x/y} &= R_{xx} - \bar{R}_{yx}^H \bar{R}_{yy}^{-1}\bar{R}_{yx}\\
&= R_{xx} - \bar{R}_{yx}^H \bar{R}_{yy}^{-1}\left\{\frac{1}{N^2}F_{N_f}^H\left(\Lambda_{\tilde{Y}}\Lambda_{\tilde{X}}^H\right)F_N\right\}\\
&= R_{xx} - \bar{R}_{yx}^H\times\left\{\frac{1}{N^2\sigma_n^2}F_{N_f}^H\left(\Lambda_{\tilde{Y}}\Lambda_{\tilde{X}}^H - \Lambda_{\tilde{Y}}\Lambda_{(\bar{\theta}\oslash\theta)}\Lambda_{\tilde{X}}^H\right)F_N\right\}\\
&= I_N - \bar{R}_{xy}\left\{\frac{1}{N\sigma_n^2}F_N^H\left(\Lambda_{\tilde{Y}}\Lambda_{\bar{\theta}}^{-1}\Lambda_{\tilde{X}}^H\right)F_N\right\}\\
&= I_N - \left\{\frac{1}{N^2}F_N^H\left(\Lambda_{\tilde{X}}\Lambda_{\tilde{Y}}^H\Lambda_{\tilde{Y}}\Lambda_{\bar{\theta}}^{-1}\Lambda_{\tilde{X}}^H\right)F_N\right\}\\
&= \frac{1}{N^2}F_N^H\left(N I_N - \Lambda_{\tilde{X}}\Lambda_{(\bar{\theta}\oslash\theta)}\Lambda_{\tilde{X}}^H\right)F_N\\
&= \frac{1}{N}F_N^H\left(I_N - \Lambda_{(\bar{\theta}\oslash\theta)}\right)F_N = \Gamma\Gamma^H,
\end{aligned}\qquad (45)$$
where N=Nf+v, FN is an N×N DFT matrix, FNf is an N×Nf DFT matrix, θ̄=θ+Nσn21N and θ=∥|{tilde over (Y)}∥|2. Note that {tilde over (Y)} is the N-point DFT of [{tilde over (y)}NfT{tilde over (y)}Nf−1T . . . {tilde over (y)}1T].


Using this low-complexity matrix factorization approach, certain embodiments are able to design the FIR filters in a reduced-complexity manner, where neither a Cholesky nor an Eigen factorization is needed. Furthermore, direct inversion of the matrices involved in the design of the filters is avoided.
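The following Python/NumPy sketch illustrates the idea for a SISO example: it compares a direct solve against a Toeplitz output auto-correlation matrix with the FFT-based solve that uses the asymptotically equivalent circulant matrix. The channel, block length, and noise level are placeholder assumptions, not parameters from this disclosure.

```python
import numpy as np

# Placeholder SISO setup
N = 256
h = np.array([1.0, 0.6, 0.3, 0.1])                 # placeholder CIR
r = np.convolve(h, h[::-1])                         # autocorrelation of h
lags = np.zeros(N); lags[:len(h)] = r[len(h)-1:]    # r(0), r(1), ..., zero-padded
sigma2 = 0.01
lags[0] += sigma2                                   # white-noise term on the diagonal

idx = np.abs(np.subtract.outer(np.arange(N), np.arange(N)))
R_toeplitz = lags[idx]                              # symmetric Toeplitz R_yy

c = lags.copy(); c[1:] += lags[1:][::-1]            # wrap-around column of the circulant approximation
eigenvalues = np.fft.fft(c)                         # eigenvalues via one FFT

b = np.random.default_rng(6).standard_normal(N)
x_direct = np.linalg.solve(R_toeplitz, b)           # direct solve (O(N^2) with Levinson, O(N^3) here)
x_fft = np.real(np.fft.ifft(np.fft.fft(b) / eigenvalues))   # O(N log N), no factorization or inversion
relative_gap = np.linalg.norm(x_direct - x_fft) / np.linalg.norm(x_direct)
# relative_gap typically shrinks as the block length N grows, reflecting the
# asymptotic equivalence of the Toeplitz matrix and its circulant approximation.
```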


Certain embodiments of the present invention can be used in the transceivers of single-carrier and multi-carrier wireless/wirelines communication systems. For example, certain embodiments of the present invention can be used in LTE, WiFi, DSL, and power line communications (PLC) modems.


Unlike earlier works, as described below, certain embodiments are directed to a method that transforms a problem of designing sparse finite-impulse response (FIR) linear equalizers, non-linear decision-feedback equalizers (DFEs), and channel-shortening equalizers into a problem of determining a sparsest-approximation of a vector in different dictionaries, for both single and multiple antenna systems.


In addition, several choices of the sparsifying dictionaries are compared by methods of certain embodiments, in terms of their worst-case coherence metric, which determines their sparsifying effectiveness. In addition, certain embodiments reduce the computational complexity of the sparse equalizer design process by exploiting an asymptotic equivalence of Toeplitz and circulant matrices. The superiority of certain embodiments, compared to conventional high-complexity methods, is demonstrated through numerical experiments.


The optimum FIR SISO/MIMO LEs and DFEs were investigated, where the design complexity of the equalizers is proportional to the product of the number of input and output streams. Sparse FIR equalizers and sparse channel shortening equalizers (CSEs) were proposed. However, designing such equalizers involved inversion of large matrices and Cholesky factorization, whose computational cost could be large for channels with large delay spreads (which is the case in broadband communications).


The use of higher sampling rates associated with broadband communications and more sophisticated signal processing schemes increase the complexity of equalizers considerably. Signal processing schemes may be based on multiple-antenna (MIMO) technology, for example. Hence, certain embodiments of the present invention are directed to an effective solution where the equalizers can be computed in software and implemented in either software or hardware at practical complexity levels.


Certain embodiments are directed to a general method that transforms the problem of designing sparse finite-impulse response (FIR) linear equalizers, non-linear decision-feedback equalizers (DFEs), and channel-shortening equalizers into the problem of sparsest-approximation of a vector in different dictionaries, for both single and multiple antenna systems.


Additionally, several choices of the sparsifying dictionaries are compared in terms of a worst-case coherence metric, which determines their sparsifying effectiveness. In addition, certain embodiments reduce the computational complexity of the sparse equalizer design process by exploiting the asymptotic equivalence of Toeplitz and circulant matrices.


The superiority of certain embodiments compared to conventional high-complexity methods has been demonstrated through numerical experiments.



FIG. 1 illustrates a flowchart of another method in accordance with certain embodiments of the invention. The method may include, at 110, receiving a transmission. The method can also include, at 120, determining a sparsifying dictionary that sparsely approximates a data vector of the transmission. The determining the sparsifying dictionary comprises performing a fast Fourier transform and/or an inverse fast Fourier transform. The method can also include, at 130, configuring a filter based on the determined sparsifying dictionary.



FIG. 2 illustrates an apparatus in accordance with certain embodiments of the invention. In one embodiment, the apparatus can be a receiver or transceiver, for example. The apparatus can be implemented in any combination of hardware and software, for example. Apparatus 10 can include a processor 22 for processing information and executing instructions or operations. Processor 22 can be any type of general or specific purpose processor. While a single processor 22 is shown in FIG. 2, multiple processors can be utilized according to other embodiments. Processor 22 can also include one or more of general-purpose computers, special purpose computers, microprocessors, digital signal processors (DSPs), field-programmable gate arrays (FPGAs), application-specific integrated circuits (ASICs), and processors based on a multi-core processor architecture, as examples.


Apparatus 10 can further include a memory 14, coupled to processor 22, for storing information and instructions that can be executed by processor 22. Memory 14 can be one or more memories and of any type suitable to the local application environment, and can be implemented using any suitable volatile or nonvolatile data storage technology such as a semiconductor-based memory device, a magnetic memory device and system, an optical memory device and system, fixed memory, and removable memory. For example, memory 14 may include any combination of random access memory (RAM), read only memory (ROM), static storage such as a magnetic or optical disk, or any other type of non-transitory machine or computer readable media. The instructions stored in memory 14 can include program instructions or computer program code that, when executed by processor 22, enable the apparatus 10 to perform tasks as described herein.


Apparatus 10 can also include one or more antennas (not shown) for transmitting and receiving signals and/or data to and from apparatus 10. Apparatus 10 can further include a transceiver 28 that modulates information on to a carrier waveform for transmission by the antenna(s) and demodulates information received via the antenna(s) for further processing by other elements of apparatus 10. In other embodiments, transceiver 28 can be capable of transmitting and receiving signals or data directly.


Processor 22 can perform functions associated with the operation of apparatus 10 including, without limitation, precoding of antenna gain/phase parameters, encoding and decoding of individual bits forming a communication message, formatting of information, and overall control of the apparatus 10, including processes related to management of communication resources.


In an embodiment, memory 14 can store software modules that provide functionality when executed by processor 22. The modules can include an operating system 15 that provides operating system functionality for apparatus 10. The memory can also store one or more functional modules 18, such as an application or program, to provide additional functionality for apparatus 10. The components of apparatus 10 can be implemented in hardware, or as any suitable combination of hardware and software.



FIG. 3 illustrates an apparatus in accordance with certain embodiments of the invention. Apparatus 300 can include a receiving unit 310 that receives a transmission. Apparatus 300 can also include a determining unit 320 that determines a sparsifying dictionary that sparsely approximates a data vector of the transmission. The determining the sparsifying dictionary comprises performing a fast Fourier transform and/or an inverse fast Fourier transform. Apparatus 300 can also include a configuring unit 330 that configures a filter based on the determined sparsifying dictionary.


The described features, advantages, and characteristics of the invention can be combined in any suitable manner in one or more embodiments. One skilled in the relevant art will recognize that the invention can be practiced without one or more of the specific features or advantages of a particular embodiment. In other instances, additional features and advantages can be recognized in certain embodiments that may not be present in all embodiments of the invention. One having ordinary skill in the art will readily understand that the invention as discussed above may be practiced with steps in a different order, and/or with hardware elements in configurations which are different than those which are disclosed. Therefore, although the invention has been described based upon these preferred embodiments, it would be apparent to those of skill in the art that certain modifications, variations, and alternative constructions would be apparent, while remaining within the spirit and scope of the invention.

Claims
  • 1. A method, comprising: receiving a transmission;determining a sparsifying dictionary (Φ) that sparsely approximates a data vector (d) of the transmission by performing a fast Fourier transform and/or an inverse fast Fourier transform to decompose a dictionary matrix and the data vector (d);projecting the data vector (d) into a lower dimensional vector space; andconfiguring a filter based on the determined sparsifying dictionary (Φ) by tuning a performance of the filter according to a predefined performance-complexity tradeoff level (ϵ),wherein the performing the fast Fourier transform and/or the inverse fast Fourier transform is performed on a circulant matrix that estimates a Toeplitz matrix,wherein the performing the fast Fourier transform and/or the inverse fast Fourier transform on the circulant matrix replaces performing Cholesky and Eigen factorization,wherein the configuring the filter comprises configuring a finite impulse response filter,wherein the method further comprises sparsing a time-domain impulse response of the finite impulse response filter,wherein the determining the sparsifying dictionary (Φ) comprises determining a sparsifying dictionary (Φ) that has a smallest worst-case coherence metric,wherein the worst-case coherence metric is a similarity measure between columns of a design matrix (K),wherein the smaller the worst-case coherence metric is, the less similar the columns are,wherein the method further comprises jointly using the sparsifying dictionary (Φ), the data vector (d), and the design matrix (K) by passing the sparsifying dictionary (Φ), the data vector (d), and the design matrix (K) into an orthogonal matching pursuit greedy algorithm to obtain an estimate of an unknown vector (z), andby minimizing ∥K(Φz−d)∥22,wherein values of the predefined performance-complexity tradeoff level (ϵ) determine a sparsity level and complexity level, a number of nonzero active taps, and a performance level of the filter, andwherein the sparsifying dictionary (Φ) has the lowest coherence out of a plurality of sparsifying dictionaries.
  • 2. A computer program product, embodied on a non-transitory computer readable medium, the computer program product configured to control a processor to perform a method according to claim 1.
  • 3. An apparatus, comprising: at least one processor; andat least one memory including computer program code,the at least one memory and the computer program code configured, with the at least one processor, to cause the apparatus at least toreceive a transmission;determine a sparsifying dictionary (Φ) that sparsely approximates a data vector (d) of the transmission by performing a fast Fourier transform and/or an inverse fast Fourier transform to decompose a dictionary matrix and the data vector (d);project the data vector (d) into a lower dimensional vector space; andconfigure a filter based on the determined sparsifying dictionary (Φ) by tuning a performance of the filter according to a predefined performance-complexity tradeoff level (ϵ),wherein the performing the fast Fourier transform and/or the inverse fast Fourier transform is performed on a circulant matrix that estimates a Toeplitz matrix,wherein the performing the fast Fourier transform and/or the inverse fast Fourier transform on the circulant matrix replaces performing Cholesky and Eigen factorization,wherein configuring the filter comprises configuring a finite impulse response filter,wherein the at least one memory and the computer program code are further configured, with the at least one processor, to cause the apparatus at least to sparse a time-domain impulse response of the finite impulse response filter,wherein the determining the sparsifying dictionary (Φ) comprises determining a sparsifying dictionary (Φ) that has a smallest worst-case coherence metric,wherein the worst-case coherence metric is a similarity measure between columns of a design matrix (K),wherein the smaller the worst-case coherence metric is, the less similar the columns are, andwherein the at least one memory and the computer program code configured, with the at least one processor, to cause the apparatus at least to jointly use the sparsifying dictionary (Φ), the data vector (d), and the design matrix (K)by passing the sparsifying dictionary (Φ), the data vector (d), and the design matrix (K) into an orthogonal matching pursuit greedy algorithm to obtain an estimate of an unknown vector (z), andby minimizing ∥K(Φz−d)∥22,wherein values of the predefined performance-complexity tradeoff level (ϵ) determine a sparsity level and complexity level, a number of nonzero active taps, and a performance level of the filter, andwherein the sparsifying dictionary (Φ) has the lowest coherence out of a plurality of sparsifying dictionaries.
US Referenced Citations (6)
Number Name Date Kind
20050031045 Mayor Feb 2005 A1
20060018398 Shamsunder Jan 2006 A1
20120140797 Malkin Jun 2012 A1
20130286903 Khojastepour Oct 2013 A1
20140108479 Rasmussen Apr 2014 A1
20160014393 Kadambi Jan 2016 A1
Related Publications (1)
Number Date Country
20180131548 A1 May 2018 US