In at least one aspect, the present invention is related to transmission in multiple-input-multiple-output systems.
Massive multiple-input-multiple-output (MIMO) systems, where the transmitter (TX) and/or receiver (RX) are equipped with a large array of antenna elements, are considered a key enabler of 5G cellular technologies due to the massive beamforming and/or spatial multiplexing gains they offer. This technology is especially attractive at millimeter (mm) wave and terahertz (THz) frequencies, where the massive antenna arrays can be built with small form factors, and where the resulting beamforming gain can help compensate for the large channel attenuation. Despite the numerous benefits, full complexity massive MIMO transceivers, where each antenna has a dedicated up/down-conversion chain, are hard to implement in practice. This is due to the cost and power requirements of the up/down-conversion chains—which include expensive and power hungry circuit components such as the analog-to-digital converters (ADCs) and digital-to-analog converters. A key solution to reduce the implementation costs of massive MIMO while retaining many of its benefits is Hybrid Beamforming, wherein a massive antenna array is connected to a smaller number of up/down-conversion chains via the use of analog hardware, such as phase-shifters and switches. While being comparatively cost and power efficient, the analog hardware can focus the transmit/receive power into the dominant channel directions, thus minimizing the performance loss in comparison to full complexity transceivers. In this paper, we focus on a special case of hybrid beamforming with only one up/down-conversion chain (for the in-phase and quadrature-phase signal components each), referred to as analog beamforming.
A major challenge for analog beamforming (and also hybrid beamforming in general) is the acquisition of channel state information (CSI) required for beamforming, referred henceforth as rCSI. Such rCSI may include, for example, average channel parameters or instantaneous parameters, and is commonly obtained by transmitting known signals (pilots) at the TX and performing channel estimation (CE) at the RX at least once per rCSI coherence time—which is the duration for which the rCSI remains approximately constant. Since one down-conversion chain has to be time multiplexed across the RX antennas for CE in analog beamforming, several pilot re-transmissions are required for rCSI acquisition. As an example, exhaustive CE approaches require O(MtxMrx) pilots per rCSI coherence time, where Mtx, Mrx are the number of TX and RX antennas, respectively and O(·) represents the scaling behavior in big-‘oh’ notation. Such a large pilot overhead may consume a significant portion of the time-frequency resources when the time for which rCSI remains constant is short, such as in vehicle-to-vehicle channels, in systems using narrow TX/RX beams, e.g., massive MIMO systems, or in channels with large carrier frequencies (high Doppler) and high blocking probabilities, e.g., at mm-wave, THz frequencies. The overhead also increases system latency and makes the initial access (IA) procedure cumbersome. As a solution, several fast CE approaches have been proposed in literature, which are discussed below assuming Mtx=1 for convenience. Side information aided CE approaches utilize spatial/temporal statistics of rCSI to reduce the pilot overhead. Compressed sensing based CE approaches exploit the sparse nature of the channel to reduce the number of pilots per coherence time up to O[L log(Mrx/L)], where L is the channel sparsity level. Iterative angular domain CE uses progressively narrower search beams at the RX to reduce the required pilot transmissions to O(log Mrx).
Approaches that utilize side information to improve iterative angular domain CE or perform angle domain tracking have also been considered. Sparse ruler based approaches exploit the possible Toeplitz structure of the spatial correlation matrix to reduce pilots to O(√Mrx). Since the required pilots still scale with Mrx in these approaches, they are only partially successful in reducing the CE overhead. Furthermore, since some of these approaches require side information and/or prior timing/frequency synchronization, they may not be applicable for the IA process. Some approaches also require a long rCSI coherence time that spans the pilot re-transmissions and/or are only applicable for certain antenna array configurations and channel models. Finally, to reduce the impact of the transient effects of analog hardware on CE, the multiple pilots may have to be temporally spaced apart, thus potentially increasing the overhead and latency.
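As a rough illustration of the scaling laws quoted above, the following sketch tabulates the pilot counts for Mtx=1. It is only an order-of-magnitude comparison: the constants hidden by the O(·) notation are set to 1, the logarithm base is taken as natural, and the function name `pilot_scaling` is hypothetical.

```python
import math


def pilot_scaling(m_rx: int, sparsity: int) -> dict:
    """Order-of-magnitude pilot counts per rCSI coherence time for M_tx = 1.

    Assumption: every constant hidden by the O(.) notation equals 1, so only
    the growth with m_rx is meaningful, not the absolute values.
    """
    return {
        "exhaustive": m_rx,                                            # O(M_rx)
        "compressed_sensing": sparsity * math.log(m_rx / sparsity),    # O(L log(M_rx / L))
        "sparse_ruler": math.sqrt(m_rx),                               # O(sqrt(M_rx))
        "iterative_angular": math.log(m_rx),                           # O(log M_rx)
    }


for m_rx in (16, 64, 256, 1024):
    counts = pilot_scaling(m_rx, sparsity=4)
    print(m_rx, {name: round(value, 1) for name, value in counts.items()})
```

Every entry still grows with Mrx, which is the limitation noted above for these digital CE approaches.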
The main reason for the overhead is that conventional CE approaches require processing in the digital domain, while the RX has only one down-conversion chain. Prior to the growth of digital hardware and digital processing capabilities, some legacy systems used an alternate RX beamforming approach in single path channels, that does not require digital CE. In this approach, an analog phase locked loop (PLL) is used to recover the received signal carrier at each RX antenna, and the recovered carrier is then used for down-converting the received signal at that antenna to baseband. Since the carrier and data suffer the same inter-antenna phase shift (in single path channels), the down-conversion leads to compensation of this phase shift, enabling coherent combining of the signals from each antenna (i.e., beamforming). As this approach does not involve digital processing or pilots, it shows potential in solving the high CE overhead encountered with digital CE. Since carrier recovery can also be interpreted as estimation of the channel phase at the carrier frequency using analog hardware, we shall refer to this class of techniques as analog channel estimation (ACE). The delay domain counterpart of this approach was also explored for single antenna ultra-wideband systems, referred to as transmit reference schemes. However, such legacy ACE systems were mainly proposed for space communication and hence only supported single path channels. Additionally, recovering the carrier at the RX via a PLL is difficult at the low signal-to-noise ratios (SNRs) and high frequencies encountered in mm-wave/THz systems, and leads to a high RX phase-noise, i.e., random fluctuation in the instantaneous frequency of the recovered carrier that degrades system performance.
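The phase compensation performed by such a legacy receiver can be sketched numerically. The example below is a simplified complex-baseband model, assuming a noiseless single path channel and an ideal carrier recovery whose output carries exactly the inter-antenna phase shift; the function names and the chosen phases are hypothetical.

```python
import cmath


def combine_with_recovered_carrier(phases, data_symbol):
    """Sum the per-antenna outputs after mixing with each antenna's recovered carrier."""
    total = 0j
    for phi in phases:
        shift = cmath.exp(1j * phi)
        received = shift * data_symbol      # data carries the inter-antenna phase shift
        recovered_carrier = shift           # assumed ideal carrier recovery: same shift
        # Down-conversion multiplies by the conjugate carrier and cancels phi.
        total += received * recovered_carrier.conjugate()
    return total


def combine_without_compensation(phases, data_symbol):
    """Sum the raw per-antenna signals, for comparison."""
    return sum(cmath.exp(1j * phi) * data_symbol for phi in phases)


phases = [1.1 * m for m in range(8)]        # arbitrary inter-antenna phase shifts
symbol = 1 + 1j
coherent = combine_with_recovered_carrier(phases, symbol)
incoherent = combine_without_compensation(phases, symbol)
print(abs(coherent), abs(incoherent))
```

With compensation the eight antenna outputs add in phase, so the combined amplitude is eight times that of a single antenna, whereas the uncompensated sum is smaller.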
Accordingly, there is a need for improved, cost effective ACE architectures.
In at least one aspect, a MIMO system having a continuous analog channel estimation (CACE) receiver is provided. The MIMO system includes a transmitter (TX) that transmits a transmitted signal that includes a data signal combined with a predetermined reference signal. The system also includes a receiver (RX) that includes a plurality of antennas wherein each antenna receives the transmitted signal and outputs an associated received signal. The receiver further includes a baseband conversion processor that either contains an independent oscillator or recovers the transmitted reference signal, including either or both of the signal amplitude and phase. Each associated received signal is then multiplied with the independent oscillator/recovered reference signal and with its quadrature component in the analog domain, resulting in processor output signals that are low-pass signals with at least partially compensated inter-antenna phase shift. The RX may further include an optional amplitude and phase compensation processor that adjusts outputs from the baseband conversion processor via analog phase shifters. The amplitude and phase compensation processor may utilize the baseband reference signal in the outputs from the baseband conversion processor as control signals to the phase-shifters, to further compensate the inter-antenna phase-shifts in the outputs from the baseband conversion processor. The RX also includes an analog adder that sums output signals from either the baseband conversion processor or the optional amplitude and phase compensation processor as a summed signal output, thereby emulating signal combining and beamforming without the RX applying explicit channel estimation.
In another aspect, a MIMO system having a periodic analog channel estimation (PACE) receiver is provided. The MIMO system includes a TX that transmits a transmitted signal that is a predetermined reference signal during a beamforming design phase and a data signal during a data transmission phase. The reference signal can be a reference tone with a predetermined frequency. The MIMO system also includes an RX that includes a plurality of antennas wherein each antenna receives the transmitted signal and outputs an associated received signal. The receiver also includes a phase and amplitude estimator circuit that recovers the reference signal during the beamforming design phase and then multiplies each associated received signal during the beamforming design phase with the recovered reference signal and a quadrature component of the reference signal in the analog domain to form a plurality of in-phase-derived control signals and a plurality of quadrature-derived control signals, with each antenna having an associated in-phase-derived control signal and an associated quadrature-derived control signal. The receiver also includes a plurality of variable gain phase shifters, with each antenna having an associated variable gain phase shifter. The associated variable gain phase shifter of each antenna receives the associated in-phase-derived (baseband) control signal and the associated quadrature-derived (baseband) control signal, and the data signal is processed through it during the data transmission phase. An analog adder sums outputs from the plurality of the variable gain phase shifters as a summed signal output.
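The two phases of PACE can be illustrated with a small numerical sketch. It is a simplified model under stated assumptions, not the circuit itself: the reference is a noiseless sampled tone, averaging over an integer number of periods stands in for the filter and hold stage, the variable gain phase shifter is modeled as a complex conjugate weight built from the two control signals, and all names and numeric values are hypothetical.

```python
import cmath
import math


def estimate_control_signals(gain, phase, ref_phase=0.0, periods=4, samples=64):
    """Design phase: mix the received tone with the recovered reference and its
    quadrature component, then average over an integer number of periods."""
    in_phase = 0.0
    quadrature = 0.0
    for n in range(samples):
        theta = 2 * math.pi * periods * n / samples
        received = gain * math.cos(theta + phase)
        in_phase += 2 * received * math.cos(theta + ref_phase)
        quadrature += -2 * received * math.sin(theta + ref_phase)
    # Averages equal gain*cos(phase - ref_phase) and gain*sin(phase - ref_phase).
    return in_phase / samples, quadrature / samples


def pace_combine(channel, data_symbol, ref_phase=0.0):
    """Data phase: apply the held control signals as conjugate weights and add."""
    total = 0j
    for gain, phase in channel:
        i_ctrl, q_ctrl = estimate_control_signals(gain, phase, ref_phase)
        weight = complex(i_ctrl, -q_ctrl)   # assumed model of the variable gain phase shifter
        received = gain * cmath.exp(1j * phase) * data_symbol
        total += weight * received
    return total


channel = [(1.0, 0.3), (0.8, -1.2), (0.5, 2.0)]   # (amplitude, phase) per antenna
symbol = 1 + 1j
combined = pace_combine(channel, symbol)
print(combined)
```

In this model the summed output equals the data symbol scaled by the sum of the squared per-antenna amplitudes, i.e., the antenna signals are weighted by their amplitudes and added in phase. A nonzero `ref_phase` only rotates the output by a common phase.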
In another aspect, a non-coherent MIMO system that applies a multiantenna frequency shift reference (MAFSR) receiver is provided. The MIMO system includes a transmitter (TX) that transmits a transmitted signal that includes a data signal combined with a predetermined reference signal. The reference signal can be a reference tone having a predetermined frequency. The MIMO system also includes a receiver (RX). The receiver includes a plurality of antennas wherein each antenna receives the transmitted signal and outputs an associated received signal. The receiver also includes a plurality of bandpass filters wherein each antenna is associated with a bandpass filter and each bandpass filter receives a corresponding associated received signal and outputs an associated filtered received signal. A squaring circuit squares each associated filtered received signal to form associated squared received signals, wherein each antenna is associated with a squaring circuit. The squared outputs involve, among other signals, a product between the reference signal and data signals with the inter-antenna phase shift compensated. Finally, an analog adder adds the squared received signals from all antennas to produce a summed signal.
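The cancellation of the inter-antenna phase shift by squaring can be checked with a short time-domain sketch. It is illustrative only, assuming a noiseless single path channel, a single data tone in place of a modulated data signal, no bandpass filter, and hypothetical tone positions `REF_BIN` and `DATA_BIN`.

```python
import cmath
import math

N = 256                        # samples in the observation window
REF_BIN, DATA_BIN = 40, 50     # hypothetical reference and data tone positions


def squared_sum(phases, ref_amp, data_amp, data_phase):
    """Square the real received signal at each antenna and add the results."""
    out = [0.0] * N
    for phi in phases:
        for n in range(N):
            ref = ref_amp * math.cos(2 * math.pi * REF_BIN * n / N + phi)
            data = data_amp * math.cos(2 * math.pi * DATA_BIN * n / N + phi + data_phase)
            out[n] += (ref + data) ** 2
    return out


def tone_at(signal, bin_index):
    """Complex amplitude of the cosine component at the given frequency bin."""
    acc = sum(x * cmath.exp(-2j * math.pi * bin_index * n / N) for n, x in enumerate(signal))
    return 2 * acc / N


phases = [0.0, 1.0, 2.5, -0.7]          # arbitrary inter-antenna phase shifts
summed = squared_sum(phases, ref_amp=1.0, data_amp=0.5, data_phase=0.8)
product_term = tone_at(summed, DATA_BIN - REF_BIN)
print(product_term)
```

The component at the difference frequency is the reference-data product. Its complex amplitude is the number of antennas times the product of the two amplitudes, with the data phase preserved and no dependence on the per-antenna phases, so the contributions of all antennas add coherently.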
In another aspect, a novel transmission scheme (CACE) for low-complexity massive MIMO systems, that does not require phase-shifters or explicit CSI estimation at the RX is provided.
In another aspect, a novel transmission scheme (CACE) for low-complexity massive MIMO systems that only requires baseband phase-shifters and does not require explicit CSI estimation at the RX is provided.
In another aspect, a novel transmission scheme (CACE) for low-complexity massive MIMO systems that can mitigate the impact of oscillator phase noise is provided.
In another aspect, an RX architecture for the CACE scheme and characterization of the achievable throughput in a wide-band channel, for a single spatial data-stream, is provided.
In still another aspect, a near-optimal power allocation for data streams and an IA procedure for CACE are provided.
In another aspect, a novel beamforming scheme (e.g., PACE aided beamformer) that enables receive beamforming in massive MIMO systems with reduced hardware and energy cost is provided, which alleviates one or more problems of the prior art. In PACE, the TX transmits a reference signal, which may be a sinusoidal tone at a known frequency, during periodic beamformer design phases. A carrier recovery circuit, such as a phase-locked loop (PLL), is used to recover the reference signal at one or a plurality of antennas. This recovered reference signal and its quadrature component are then used to estimate the phase offset and amplitude of the reference signal at each RX antenna, via a bank of ‘filter, sample and hold’ circuits (represented as integrators in
In another aspect, a novel PACE technique, that enables RX analog beamforming with low CE overhead is provided.
In another aspect, a novel PACE technique, that requires only one reference recovery circuit is provided.
In another aspect, a novel PACE technique with a reference recovery circuit that can extract the reference signal from multiple antennas is provided.
In yet another aspect, a novel PACE technique, that does not require continuous transmission of the reference signal is provided.
In still another aspect, a receiver architecture that supports the PACE technique, and characterizes the achievable system throughput in a wide-band channel is provided.
In another aspect, a non-coherent MIMO MA-FSR receiver architecture that does not require an oscillator at the receiver, and can perform beamforming with low CE overhead is provided.
In another aspect, a MIMO MA-FSR receiver architecture that suppresses transmit oscillator phase noise is provided.
In still another aspect, an MA-FSR scheme with a reference signal and data signals design is provided that ensures the product of the data signal with itself does not cause interference to the product between the reference signal and data signal at the outputs of the squaring circuits.
for simulations and
for analytic approximation. We assume A1(r)=1, A5(r)=0.7e^{jπ/3}, A15(r)=0.5e^{−jπ/3} and the remaining parameters are from Table 1.
Reference will now be made in detail to presently preferred compositions, embodiments and methods of the present invention, which constitute the best modes of practicing the invention presently known to the inventors. The Figures are not necessarily to scale. However, it is to be understood that the disclosed embodiments are merely exemplary of the invention that may be embodied in various and alternative forms. Therefore, specific details disclosed herein are not to be interpreted as limiting, but merely as a representative basis for any aspect of the invention and/or as a representative basis for teaching one skilled in the art to variously employ the present invention.
It is also to be understood that this invention is not limited to the specific embodiments and methods described below, as specific components and/or conditions may, of course, vary. Furthermore, the terminology used herein is used only for the purpose of describing particular embodiments of the present invention and is not intended to be limiting in any way.
It must also be noted that, as used in the specification and the appended claims, the singular form “a,” “an,” and “the” comprise plural referents unless the context clearly indicates otherwise. For example, reference to a component in the singular is intended to comprise a plurality of components.
The term “comprising” is synonymous with “including,” “having,” “containing,” or “characterized by.” These terms are inclusive and open-ended and do not exclude additional, unrecited elements or method steps.
The phrase “consisting of” excludes any element, step, or ingredient not specified in the claim. When this phrase appears in a clause of the body of a claim, rather than immediately following the preamble, it limits only the element set forth in that clause; other elements are not excluded from the claim as a whole.
The phrase “consisting essentially of” limits the scope of a claim to the specified materials or steps, plus those that do not materially affect the basic and novel characteristic(s) of the claimed subject matter.
With respect to the terms “comprising,” “consisting of,” and “consisting essentially of,” where one of these three terms is used herein, the presently disclosed and claimed subject matter can include the use of either of the other two terms.
Throughout this application, where publications are referenced, the disclosures of these publications in their entireties are hereby incorporated by reference into this application to more fully describe the state of the art to which this invention pertains.
With reference to
Still referring to
In the variation depicted in
In the variation depicted in
With reference to
Still referring to
Still referring to
In a refinement, the reference signal can be recovered from one antenna or a plurality of antennas.
In a variation, MIMO system 60 includes a low pass filter and/or downconverter 100 that receives as input the summed signal output and outputs a baseband signal. The baseband signal is sampled by analog-to-digital converter 102 to form a digital baseband signal. The digital baseband signal is then demodulated by OFDM demodulator 104.
With reference to
With reference to
With respect to the systems of
The following examples illustrate the various embodiments of the present invention. Those skilled in the art will recognize many variations that are within the spirit of the present invention and scope of the claims.
1. Continuous Analog Channel Estimation Aided Beamforming for Massive MIMO Systems
I. Introduction
In this embodiment, a more generalized ACE approach for RX beamforming, called continuous ACE (CACE) is explored, that does not require carrier recovery at the RX, mitigates oscillator phase noise and works in multi-path channels. The latter is accomplished by exploiting not only phase of the carrier signal at the RX but also its amplitude. In CACE, a reference tone, i.e. a sinusoidal tone at a known frequency, is continuously transmitted along with the data by the TX as illustrated in
Notation: scalars are represented by light-case letters; vectors by bold-case letters; and sets by calligraphic letters. Additionally, j=√(−1), a* is the complex conjugate of a complex scalar a, |a| represents the 2-norm of a vector a, A^T is the transpose of a matrix A and A† is the conjugate transpose of a complex matrix A. Finally, $\mathbb{I}_a$ is an a×a identity matrix, $\mathbb{O}_{a,b}$ is the a×b all zeros matrix, $\mathbb{E}\{\cdot\}$ represents the expectation operator, $\overset{d}{=}$ represents equality in distribution, Re{·}/Im{·} refer to the real/imaginary component, respectively, and $\mathcal{CN}(a, B)$ represents a circularly symmetric complex Gaussian vector with mean a and covariance matrix B.
II. General Assumptions and System Model
We consider a single cell system in downlink, where an Mtx-antenna base-station (BS) transmits data to multiple user equipments (UEs) simultaneously via spatial multiplexing. Since we mainly focus on the downlink, we shall use the terms BS/TX and UE/RX interchangeably. Each UE is assumed to have a hybrid architecture, with Mrx antennas and one down-conversion chain, and it performs CACE aided RX beamforming. On the other hand, the BS may have an arbitrary architecture and it transmits a single spatial data-stream to each scheduled UE. For convenience, we consider the use of noise-less and perfectly linear antennas, filters, amplifiers and mixers at both the BS and the representative UE. We assume the downlink BS-UE communication to be divided into three stages: (i) Initial Access (IA), (ii) TX beamformer design, where the TX acquires rCSI for all the UEs and uses it to perform UE scheduling, TX beamforming and power allocation, and (iii) Data transmission, wherein the BS transmits data signals and the scheduled UEs use CACE to adapt the RX beams and receive the data. Through a major portion of this paper, we assume that the IA and TX beamformer design have been performed a priori and shall focus on the data transmission stage. However, in Section V, we shall also discuss how CACE beamforming can help in stages (i) and (ii).
In stage (iii), we assume the BS to transmit spatially orthogonal signals to the scheduled UEs to mitigate inter-user interference. This can be achieved, for example, by careful UE scheduling and/or via avoiding transmission to common channel scatterers (A. Adhikary, E. A. Safadi, M. K. Samimi, R. Wang, G. Caire, T. S. Rappaport, and A. F. Molisch, “Joint spatial division and multiplexing for mm-wave channels,” IEEE Journal on Selected Areas in Communications, vol. 32, pp. 1239-1255, June 2014). For this system model and for a given TX beamformer and power allocation, we shall restrict the analysis to a single representative UE without loss of generality. The BS is assumed to transmit orthogonal frequency division multiplexing (OFDM) symbols to the representative UE, with K=K1+K2+1 sub-carriers indexed as 𝒦={−K1, . . . , K2}. The 0-th sub-carrier is used as a reference tone, i.e., a pure sinusoidal signal with a pre-determined frequency, while data is transmitted on the K1−g lower and K2−g higher sub-carriers, represented by the index set {−K1, . . . , −g−1, g+1, . . . , K2}. The 2g sub-carriers indexed as {−g, . . . , −1, 1, . . . , g} are blanked to act as a guard band between the reference and data sub-carriers as illustrated in
for −Tcp≤t≤Ts, where t is the Mtx×1 unit-norm TX beamforming vector for this UE (designed a priori in stage (ii)), E(r) is the energy-per-symbol allocated to the reference tone, 𝒢={−g, . . . , 0, . . . , g} defines the non-data sub-carriers, xk is the data signal on the k-th sub-carrier, fc is the reference frequency, fk=k/Ts represents the frequency offset of the k-th sub-carrier and Ts, Tcp are the symbol duration and the cyclic prefix duration, respectively. Here we define the complex equivalent signal such that the actual (real) transmit signal is given by stx(t)=Re{s̃tx(t)}. For the data sub-carriers (k∈𝒦∖𝒢), we assume the use of independent data streams with equal power allocation, and circularly symmetric Gaussian signaling, i.e., xk∼𝒞𝒩(0, E(d)). The transmit power constraint is then given by E(r)+(K−|𝒢|)E(d)≤Es, where Es is the total OFDM symbol energy (excluding the cyclic prefix). The channel to the representative UE is assumed to have L multi-path components (MPCs) with the Mrx×Mtx channel impulse response matrix and its Fourier transform, respectively, given as (M. Akdeniz, Y. Liu, M. Samimi, S. Sun, S. Rangan, T. Rappaport, and E. Erkip, “Millimeter wave channel modeling and cellular capacity evaluation,” IEEE Journal on Selected Areas in Communications, vol. 32, pp. 1164-1179, June 2014):
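$H(t) = \sum_{\ell=0}^{L-1} \alpha_\ell \, a_{rx}(\ell) \, a_{tx}(\ell)^{\dagger} \, \delta(t-\tau_\ell)$, (1.2a)
$\mathcal{H}(f) = \sum_{\ell=0}^{L-1} \alpha_\ell \, a_{rx}(\ell) \, a_{tx}(\ell)^{\dagger} \, e^{-j2\pi(f_c+f)\tau_\ell}$, (1.2b)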
where αℓ is the complex amplitude and τℓ is the delay and atx(ℓ), arx(ℓ) are the Mtx×1 TX and Mrx×1 RX array response vectors, respectively, of the ℓ-th MPC. As an illustration, the ℓ-th RX array response vector for a uniform planar array with MH horizontal and MV vertical elements (Mrx=MHMV) is given by arx(ℓ)=ārx(ψazirx(ℓ), ψelerx(ℓ)), where
for h∈{0, . . . , MH−1} and ν∈{1, . . . , MV}, ψazirx(ℓ), ψelerx(ℓ) are the azimuth and elevation angles of arrival for the ℓ-th MPC, ΔH, ΔV are the horizontal and vertical antenna spacings and λ is the carrier wavelength. Expressions for atx(ℓ) can be obtained similarly. Note that in (1.2) we implicitly ignore the frequency variation of individual MPC amplitudes (α0, . . . , αL−1) and the beam squinting effects (S. K. Garakoui, E. A. M. Klumperink, B. Nauta, and F. E. van Vliet, “Phased-array antenna beam squinting related to frequency dependency of delay circuits,” in European Microwave Conference, pp. 1304-1307, October 2011), which are reasonable assumptions for moderate system bandwidths. It is emphasized, however, that the complete channel response (including all MPCs) still experiences frequency selective fading. To prevent inter-symbol interference, we let the cyclic prefix be longer than the maximum channel delay: Tcp>τL−1. We also consider a generic temporal variation model, where the time for which the MPC parameters {αℓ, atx(ℓ), arx(ℓ), τℓ} stay approximately constant is much larger than the symbol duration Ts. Finally, we do not assume any distribution prior or side information on {αℓ, atx(ℓ), arx(ℓ), τℓ}.
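The sub-carrier layout and the transmit power constraint described earlier in this section can be made concrete with a short sketch. The function names and the numeric values are hypothetical, and the constraint is assumed to be met with equality.

```python
def subcarrier_layout(k1: int, k2: int, g: int):
    """Split the K = K1 + K2 + 1 sub-carriers into data and guard indices."""
    all_indices = list(range(-k1, k2 + 1))
    non_data = set(range(-g, g + 1))            # reference tone (index 0) plus 2g guard tones
    data = [k for k in all_indices if k not in non_data]
    guard = [k for k in sorted(non_data) if k != 0]
    return all_indices, data, guard


def data_energy_per_subcarrier(total_energy, reference_energy, k1, k2, g):
    """Largest E(d) that satisfies E(r) + (K - |G|) E(d) <= Es with equal data power."""
    _, data, _ = subcarrier_layout(k1, k2, g)
    return (total_energy - reference_energy) / len(data)


all_indices, data, guard = subcarrier_layout(k1=4, k2=4, g=1)
print(data)      # data sub-carriers on both sides of the guard band
print(guard)     # blanked sub-carriers around the reference tone
print(data_energy_per_subcarrier(10.0, 4.0, k1=4, k2=4, g=1))
```

The number of data sub-carriers equals K minus the 2g+1 non-data indices, which is the factor multiplying E(d) in the power constraint.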
Each RX antenna front-end is assumed to have a low noise amplifier (LNA) followed by a band-pass filter (BPF) that leaves the desired signal un-distorted but suppresses the out-of-band noise. The filtered signal is then converted to baseband using the in-phase and quadrature-phase components of an RX local oscillator, as depicted in
for 0≤t≤Ts, where the Re/Im parts of s̃rx,BB(t) are the outputs corresponding to the in-phase and quadrature-phase components of the RX oscillator, θ(t) is the phase-noise process of the RX oscillator and w̃(t) is the Mrx×1 complex equivalent, baseband, stationary, additive, vector Gaussian noise process, with individual entries being circularly symmetric, independent and identically distributed (i.i.d.), and having a power spectral density: w(f)=N0 for −fK
where wθ(t) is a real white Gaussian process with variance σθ². We assume the RX to have a priori knowledge of σθ. As illustrated in
where we define A(t) ≜ LPFĝ{e^{−jθ(t)}} and ŵ(t) is the Mrx×1 filtered Gaussian noise with power spectral density: w(f)=N0 for −fĝ≤f≤fĝ. An illustration of this filtering operation is provided in
where we define
Conventional OFDM demodulation follows on y[n] to obtain the different OFDM sub-carrier outputs, as analyzed in Section III.
III. Analysis of the Demodulation Outputs
Without loss of generality, we shall restrict the analysis to the representative 0-th OFDM symbol and thus, we only focus on {y[n]|0≤n<K}. Note that the sampled, band-limited additive noise w̃[n] and the sampled RX phase-noise e^{−jθ[n]} for 0≤n<K can be expressed using their normalized Discrete Fourier Transform (nDFT) expansions as:
$\tilde{w}[n] = \sum_{k\in\mathcal{K}} W[k] \, e^{j2\pi kn/K}$, (1.8a)
$e^{-j\theta[n]} = \sum_{k\in\mathcal{K}} \Omega[k] \, e^{j2\pi kn/K}$, (1.8b)
where
$W[k] = \frac{1}{K}\sum_{n=0}^{K-1} \tilde{w}[n] \, e^{-j2\pi kn/K}, \quad \Omega[k] = \frac{1}{K}\sum_{n=0}^{K-1} e^{-j\theta[n]} \, e^{-j2\pi kn/K}$
are the corresponding nDFT coefficients. Here nDFT is an unorthodox definition of the Discrete Fourier Transform, where the normalization by K is performed while finding W[k], Ω[k] instead of in (1.8). These nDFT coefficients are periodic with period K and satisfy the following lemma:
Lemma 3.1 The nDFT coefficients of e^{−jθ[n]} for 0≤n<K satisfy:
for arbitrary integers k1, k2, where $\delta^{K}_{a,b}=1$ if a=b (mod K) and $\delta^{K}_{a,b}=0$ otherwise.
Proof. See Appendix 1A.
To test the accuracy of the approximation in Lemma 3.1, the Monte-Carlo simulations of Δk,k, Δk,k+1 and Δk,k+100 for a typical phase-noise process (−93 dBc/Hz at 10 MHz offset) are compared to (1.9b) in
Lemma 3.2 The nDFT coefficients of w̃[n], i.e., {W[k]|∀k}, are jointly Gaussian with:
for arbitrary integers k1, k2, where $\delta^{K}_{a,b}=1$ if a=b (mod K) and $\delta^{K}_{a,b}=0$ otherwise.
Proof. See Appendix 1B.
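The nDFT convention used in this section, with the 1/K normalization applied when computing the coefficients, can be checked numerically. The sketch below is illustrative only: it generates a hypothetical phase-noise sequence as a discrete random walk with an arbitrary step size, and the helper names are not from the source.

```python
import cmath
import math
import random


def ndft(samples):
    """Return the nDFT coefficients X[0..K-1], with the 1/K factor applied here."""
    K = len(samples)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * n / K) for n, x in enumerate(samples)) / K
        for k in range(K)
    ]


def coefficient(coeffs, k):
    """Look up X[k] for any integer k using the period-K property."""
    return coeffs[k % len(coeffs)]


def expand(coeffs, n):
    """Evaluate the expansion sum_k X[k] exp(j 2 pi k n / K) at sample n."""
    K = len(coeffs)
    return sum(c * cmath.exp(2j * math.pi * k * n / K) for k, c in enumerate(coeffs))


random.seed(0)
K = 64
theta = 0.0
phase_noise = []
for _ in range(K):
    phase_noise.append(cmath.exp(-1j * theta))
    theta += random.gauss(0.0, 0.05)    # hypothetical per-sample phase increment

omega = ndft(phase_noise)
max_error = max(abs(expand(omega, n) - phase_noise[n]) for n in range(K))
total_energy = sum(abs(c) ** 2 for c in omega)
print(max_error, total_energy)
```

The expansion reproduces the samples without any further scaling, and because the phase-noise samples have unit magnitude, Parseval's relation under this normalization makes the squared coefficient magnitudes sum to one over any K consecutive indices.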
Note that using these nDFT coefficients, the low-pass filtered versions of w̃[n] and e^{−jθ[n]} in (1.7) can be approximated as:
$\hat{w}[n] \approx \sum_{k\in\hat{\mathcal{G}}} W[k] \, e^{j2\pi kn/K}$, (1.11a)
$A[n] \approx \sum_{k\in\hat{\mathcal{G}}} \Omega[k] \, e^{j2\pi kn/K}$, (1.11b)
where $\hat{\mathcal{G}}$={−ĝ, . . . , ĝ} and the approximations are obtained by replacing the linear convolution of s̃rx,BB(t) and the filter response LPFĝ{ } with a circular convolution. This is accurate when the filter response has a narrow support, i.e., for ĝ»1. The remaining results in this paper are based on the approximations in (1.9)-(1.11) and on an additional approximation discussed later in Remark 3.1. While we still use the ≤, =, ≥ operators in the following results for convenience of notation, it is emphasized that these equations are true in the strict sense only if the approximations in (1.9)-(1.11) and Remark 3.1 are met with equality. However, simulation results are also used in Section VI to test the validity of these approximations. Substituting (1.8) and (1.11) into (1.7), the k-th OFDM demodulation output can be expressed as:
We shall split Yk as Yk=Sk+Ik+Zk where Sk, referred to as the signal component, involves the terms in (1.12) containing xk and not containing the channel noise, Ik, referred to as the interference component, involves the terms containing E(r), {x
A. Signal Component Analysis
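From (1.12), the signal component for k∈𝒦∖𝒢 can be expressed as:
$S_k = M_{rx} \, \beta_{0,k} \sqrt{E^{(r)}} \, x_k \Big[\sum_{\dot{k}\in\hat{\mathcal{G}}} |\Omega[\dot{k}]|^2\Big]$, (1.13)
where we define βk
$\mathbb{E}\{|S_k|^2\} = M_{rx}^2 |\beta_{0,k}|^2 E^{(r)} E^{(d)} \, \mathbb{E}\Big\{\Big[\sum_{\dot{k}\in\hat{\mathcal{G}}} |\Omega[\dot{k}]|^2\Big]^2\Big\} \ge M_{rx}^2 |\beta_{0,k}|^2 E^{(r)} E^{(d)} \mu(0,\hat{g})^2$ (1.14)
where we define $\mu(a,\hat{g}) \triangleq \sum_{\dot{k}\in\hat{\mathcal{G}}} \Delta_{a+\dot{k},a+\dot{k}}$, and (1.14) follows from Jensen's inequality and (1.9b).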
B. Interference Component Analysis
From (1.12), the interference component for k∈\ can be expressed as:
Ik=Mrxβ0,
As is clear from above, the demodulation output Yk for k∈\ suffers inter-carrier interference (ICI) from other sub-carrier data streams due to the RX phase-noise. The first and second moments of Ik, averaged over the other sub-carrier data {
where , are obtained using the fact that {xk|k∈} have a zero-mean and are independently distributed; is obtained by defining βmaxmax|βk,k|, observing |β0,k|≤βmax, and using \[∪{k}]⊆\{k} in first term and using the Cauchy-Schwarz inequality for the second term; follows by changing the summation order in the first term and by using (11) for the second term; follows by using Ω[k]=Ω[k+K] and (11) for the first term and follows by using (12) and the Jensen's inequality. As shall be shown in Section VI, (21) may be a loose bound on ICI for lower subcarriers, i.e., |k|«K.
Remark 3.1 A tighter approximation for {|Ik|2} can be obtained by replacing μ(k,ĝ) in (21) with {tilde over (μ)}(k,ĝ)Δ{dot over (k)},{dot over (k)}Δ{dot over (k)}+k,{dot over (k)}+k.
This heuristic is obtained by assuming Ω[{dot over (k)}] and Ω[{umlaut over (k)}+k] to be independently distributed for {dot over (k)}, {umlaut over (k)}∈ and k∈\ in step of (21), but we skip the proof for brevity. As shall be verified in Section VI, Remark 3.1 offers a much better ICI approximation ∀k and hence we shall use {tilde over (μ)}(k,ĝ) instead of μ(k,ĝ) in the forthcoming derivations in Section VI.
C. Noise Component Analysis
From (16), the noise component of Yk for k∈\ can be expressed as:
Zk=Zk(1)+Zk(2)+Zk(3), where: (1.17)
Zk(1)=√{square root over (Ts)}W[{dot over (k)}]†((0)t√{square root over (E(r))}Ω[k+{dot over (k)}]+(f
Zk(2)=√{square root over (Ts)}t†(0)†W[k+{dot over (k)}]√{square root over (E(r))}Ω*[{dot over (k)}]
Zk(3)=TsW†[{dot over (k)}]w[k+{dot over (k)}]
Note that the noise consists of both signal-noise and noise-noise cross product terms. From Lemma 3.2, it can readily be verified that {Zk}=0 and {Zk(i)[Zk(j)]*}=0 for i≠j, where the expectation is taken over the noise realizations. Thus the second moment of Zk, averaged over the TX data, phase-noise and channel noise, can be expressed as {|Zk|2}={|Zk(1)|2}+{|Zk(2)|2}+{|Zk(3)|2}, where:
{|Zk(1)|2}N0|(0)t√{square root over (E(r))}Ω[{dot over (k)}+k]+(
{|Zk(2)|2}=(3)N0|(0)t|2E(r){|Ω[{dot over (k)}]|2}=(4)Mrxβ0,0N0E(r)Δk,k (1.18b)
{|Zk(3)|2}=Ts2{W[{dot over (k)}]†W[k+{dot over (k)}]W[k+{umlaut over (k)}]†W[{umlaut over (k)}]}=(5)Mrx||N02 (1.18c)
where $|\hat{\mathcal{G}}|$=2ĝ+1, , follow from Lemma 3.2; , follow from (12), and follows from Lemma 3.2, (1.9b) and the result on the expectation of the product of four Gaussian random variables (W. Bar and F. Dittrich, “Useful formula for moment computation of normal random variables with nonzero means,” IEEE Transactions on Automatic Control, vol. 16, pp. 263-265, June 1971). From (1.18), we can then upper-bound the noise power as:
$\mathbb{E}\{|Z_k|^2\} \le M_{rx} \beta_{max} N_0 \big[E^{(r)} + |\hat{\mathcal{G}}| E^{(d)}\big] + M_{rx} |\hat{\mathcal{G}}| N_0^2$ (1.19)
where we use the fact that |βk,k|≤βmax, [Δk+k,{dot over (k)}+k+Δk,{dot over (k)}]≤1 for k∈\ (as ĝ≤g/2) and Δk+{dot over (k)}−
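The noise power bound in (1.19) can be evaluated directly to see how its two parts compare. The sketch below is a plain transcription of that bound, assuming both set-size factors in (1.19) equal 2ĝ+1 as defined after (1.18); the function name and the numeric inputs are hypothetical.

```python
def noise_power_bound(m_rx, beta_max, n0, e_ref, e_data, g_hat):
    """Evaluate the upper bound (1.19) on the noise power.

    Returns the signal-noise cross term and the noise-noise cross term
    separately so that their relative sizes can be compared.
    """
    filter_taps = 2 * g_hat + 1                               # assumed size of the filter pass-band set
    signal_noise = m_rx * beta_max * n0 * (e_ref + filter_taps * e_data)
    noise_noise = m_rx * filter_taps * n0 ** 2
    return signal_noise, noise_noise


signal_noise, noise_noise = noise_power_bound(
    m_rx=4, beta_max=1.0, n0=0.5, e_ref=2.0, e_data=1.0, g_hat=1
)
print(signal_noise, noise_noise, signal_noise + noise_noise)
```

Both terms grow with 2ĝ+1, so under this bound a narrower reference filter admits less noise, and the noise-noise term grows quadratically with N0.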
IV. Performance Analysis
From (1.12)-(1.17), the effective single-input-single-output (SISO) channel between the k-th sub-carrier input and corresponding output can be expressed as:
$Y_k = M_{rx} \, \beta_{0,k} \sqrt{E^{(r)}} \Big[\sum_{\dot{k}\in\hat{\mathcal{G}}} |\Omega[\dot{k}]|^2\Big] x_k + I_k + Z_k$, for k∈𝒦∖𝒢 (1.20)
where Ik and Zk are analyzed in Sections III-B and III-C, respectively. As is evident from (1.20), the signal component suffers from two kinds of fading: (i) a frequency-selective and channel dependent slow fading component represented by β0,k and (ii) a frequency-flat and phase-noise dependent fast fading component, represented by $\sum_{\dot{k}\in\hat{\mathcal{G}}} |\Omega[\dot{k}]|^2$. The estimation of these fading coefficients is discussed later in this section. In this paper, we consider the simple demodulation approach where xk is estimated only from Yk, and Ik, Zk are treated as noise. For this demodulation approach, a lower bound to the signal-to-interference-plus-noise ratio (SINR) can be obtained from (1.14), (1.16b), Remark 3.1 and (1.19), as:
where β{β0,k|} and we use the fact that {|Ik+Zk|2}={|Ik|2}+{|Zk|2}.
Remark 4.1 If the RX array response vectors or the MPCs are mutually orthogonal i.e. arx(1)†arx(2)=Mrx then β{dot over (k)},{umlaut over (k)}=||2|atx()†t|2 and βmax=||2.
The orthogonality of array response vectors is approximately satisfied if the MPCs are well separated and Mrx»L (O. El Ayach, R. Heath, S. Abu-Surra, S. Rajagopal, and Z. Pi, “The capacity optimality of beam steering in large millimeter wave MIMO systems,” in IEEE International Workshop on Signal Processing Advances in Wireless Communications (SPAWC), pp. 100-104, June 2012). From Remark 4.1, note that even without explicit CE at the RX γkLB(β) scales with Mrx in the low SNR regime, which is a desired characteristic. While the ICI term also scales with Mrx, its contribution can be kept small in the desired SNR range by picking ĝ such that μ(0,ĝ)≈1. In a similar way, with perfect knowledge of the fading coefficients at the RX, an approximate lower bound to the ergodic capacity can be obtained as:
where is obtained by assuming Ik, Zk to be Gaussian distributed and using the expression for ergodic capacity (A. Goldsmith and P. Varaiya, “Capacity of fading channels with channel side information,” IEEE Transactions on Information Theory, vol. 43, no. 6, pp. 1986-1992, 1997), follows by sending the outer expectation into the log(·) functions and follows from (1.14), (1.16b) and (1.19). While is an approximation, it typically yields a lower bound since Variance{|Ω[{dot over (k)}]|2}≤μ(0,ĝ)[1−μ(0,ĝ)]«μ(0,ĝ)2 (from (1.9a) and (R. Bhatia and C. Davis, “A better bound on the variance,” The American Mathematical Monthly, vol. 107, p. 353, April 2000).
Note that for demodulating xk's and achieving the above SINR and capacity, the RX requires estimates of N0 and the SISO channel fading coefficients β and |Ω[{dot over (k)}]|2. Since the RX has a good beamforming gain (1.21), the channel parameters β, N0 can be tracked accurately at the RX with a low estimation overhead using pilot symbols and blanked symbols. These values, along with phase-noise parameter σθ, can further be fed back to the TX for rate and power allocation. Note that since these pilots are only used to estimate the SISO channel parameters and not the actual MIMO channel, the advantages of simplified CE are still applicable for a CACE based RX. On the other hand, the low variance albeit fast varying component |Ω[{dot over (k)}]|2 can be estimated for every symbol using the 0-th sub-carrier output Y0. It can be shown from (1.12) that Y0=Mrxβ0,0E(r)|Ω[{dot over (k)}]|2+I0+Mrx|N0+Z0, where we have {|I0|2}≤{|Ik|2} and {|Z0|2}≤2{|Zk|2} for any k∈\. Thus |Ω[{dot over (k)}]|2 can be estimated from Y0 with an
which is usually a large value.
A. Optimizing System Parameters
From (1.22), note that the approximate ergodic capacity Capprox(β) is a decreasing function of g for g≥2ĝ. Thus a Capprox(β) maximizing choice of g should satisfy g=2ĝ. From (1.22) and (1.21), we can also lower bound Capprox(β) as:
follows from the fact that log(1+γkLB(β))≥log(γkLB(β)) and by taking the summation over k in (1.22) into the denominator of the logarithm; and follows from the fact that {tilde over (μ)}(k,ĝ)≤μ(0,ĝ)Δk,k and E(d)(K−||)+E(r)=Es. It can be verified that the numerator of Θ(β) is a differentiable, strictly concave function of E(r), while the denominator is a positive, affine function of E(r). Thus Θ(β) is a strictly pseudo-concave function of E(r) (S. Schaible, “Fractional programming,” Zeitschrift für Operations Research, vol. 27, pp. 39-54, dec 1983), and the Capprox(β) maximizing power allocation can be obtained by setting
as:
where Q=Mrx|βmax|2[1−μ(0,ĝ)]μ(0,ĝ)Es+βmaxN0(K−||−||) and R=N0||[βmax+N0(K−||)/Es]. As evident from (1.23b), ĝ offers a trade-off between the phase-noise induced ICI and channel noise accumulation. While finding a closed form expression for (1.23b) maximizing ĝ is intractable, it can be computed numerically by performing a simple line search over 1≤ĝ≤min{K1, K2}/2, with g=2ĝ and E(r) as given by (1.24).
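The line search over ĝ described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: `objective` is a hypothetical callable that is assumed to return the value of (1.23b) for a given ĝ, with g=2ĝ and E(r) from (1.24) already substituted in.

```python
def line_search_g_hat(objective, k1, k2):
    """Search integer g_hat in [1, min(k1, k2) / 2] for the largest objective.

    `objective` is a hypothetical stand-in for the bound (1.23b) evaluated
    at a given g_hat. Returns the pair (g_hat, g) with g = 2 * g_hat.
    """
    upper = min(k1, k2) // 2
    if upper < 1:
        raise ValueError("need min(k1, k2) >= 2 for a non-empty search range")
    best_g_hat = max(range(1, upper + 1), key=objective)
    return best_g_hat, 2 * best_g_hat
```

Since the search range has at most min{K1, K2}/2 points, an exhaustive scan is cheap and avoids any assumption about the shape of the objective in ĝ.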
V. Initial Access, TX Beamforming and Uplink Beamforming
In this section we briefly discuss stages (i) and (ii) of downlink transmission (see Section 2), and uplink TX beamforming for CACE aided UEs. In the suggested IA protocol for stage (i), the BS performs beam sweeping along different angular directions, possibly with different beam widths, similar to the approach of 3GPP New Radio (NR). For each TX beam, the BS transmits primary (PSS) and secondary synchronization sequences (SSS) with the reference signal, in a form similar to (1.1). The UEs use CACE aided RX beamforming, and initiate uplink random access to the BS upon successfully detecting a PSS/SSS. As shall be shown in Section VI, the SINR expression (1.21) is resilient to frequency mismatches between TX and RX oscillators, and thus is also applicable for the PSS/SSSs where frequency synchronization may not exist. Since angular beam-sweeping is only performed at the BS, the IA latency does not scale with Mrx and yet the PSS/SSS symbols can exploit the RX beamforming gain, thus improving cell discovery radius and/or reducing IA overhead. This is in contrast to digital CE at the UE, which would require sweeping through many RX beam directions for each TX direction, necessitating several repetitions of the PSS/SSS for each TX beam. During downlink stage (ii), note that scheduling of UEs, designing TX beamformer and allocation of power requires knowledge of {|,atx()} for all the UEs. Such rCSI can be acquired at the BS either by downlink CE with CSI feedback from the UEs or by uplink CE. The protocol for downlink CE with feedback is similar to the IA protocol, with the BS transmitting pilot symbols instead of PSS and SSS. Uplink CE can be performed by transmitting orthogonal pilots from the UEs omni-directionally, and using any of the digital CE algorithms from Section I at the BS. Note that CACE cannot be used at the BS since the pilots from multiple UEs need to be separated via digital processing.
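The IA overhead argument made earlier in this section can be illustrated with a simple count of PSS/SSS transmissions per sweep. The sketch assumes one transmission per (TX beam, RX beam) pair; the function name and the beam counts in the usage note are hypothetical.

```python
def sync_signal_repetitions(num_tx_beams, num_rx_beams=1):
    """Count PSS/SSS transmissions needed for one full beam sweep.

    num_rx_beams=1 models a CACE aided UE, which needs no RX beam sweep.
    A UE that sweeps its own beams needs one repetition per RX beam for
    every TX beam.
    """
    if num_tx_beams < 1 or num_rx_beams < 1:
        raise ValueError("beam counts must be positive")
    return num_tx_beams * num_rx_beams
```

For example, with 64 TX beams a CACE aided UE would need 64 transmissions, whereas a UE sweeping a 16-beam RX codebook would need 1024.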
Note that the phase shifts used for RX beamforming at a CACE aided UE in downlink, can also be used for transmit beamforming in the uplink. However since the reference signal is not available at the UE during uplink transmission in time division duplexing systems, a mechanism for locking these phase shift values from a previous downlink transmission stage is required (similar to (V. V. Ratnam and A. F. Molisch, “Periodic analog channel estimation aided beamforming for massive MIMO systems,” IEEE Transactions on Wireless Communications, 2019, accepted to)). In contrast, frequency division duplexing can avoid such a mechanism due to continuous availability of the downlink reference, and consequently ŝrx,BB(t).
VI. Simulation Results
For the simulations, we consider a single cell scenario with a λ/2-spaced 32×8 (Mtx=256) antenna BS and one representative UE with a λ/2-spaced 16×4 (Mrx=64) antenna array, having perfect timing synchronization to the BS, one down-conversion chain, and using CACE aided beamforming. The BS has a priori rCSI and transmits one spatial OFDM data stream with Ts=1 μs, K1=K2+1=512 and fc=30 GHz along the strongest MPC, i.e., t=atx(
For testing the validity of the analytical results, we first consider a sample sparse channel matrix H(t) with L=3, ={0,20,40}ns, angles of arrival ψazirx={0, π/6, −π/6}, ψelerx={0.45π, π/2, π/2} and effective amplitudes
The UE uses ĝ=g/2=10, σθ2=1/Ts and E(r), E(d) from (1.24). For this model, the symbol error rates (SERs) for the sub-carriers, obtained by Monte-Carlo simulations, are compared to the analytical SERs for a Gaussian channel with SINR given by (1.21) (with/without Remark 3.1) in
(P. Sudarshan, N. Mehta, A. Molisch, and J. Zhang, “Channel statistics-based RF pre-processing with antenna selection,” IEEE Transactions on Wireless Communications, vol. 5, pp. 3501-3511, December 2006), which in turn is either (a) known a priori at the BS or (b) estimated by nested array based sampling (P. Pal and P. P. Vaidyanathan, “Nested arrays: A novel approach to array processing with enhanced degrees of freedom,” IEEE Transactions on Signal Processing, vol. 58, pp. 4167-4181, August 2010). To decouple the loss in beamforming gain due to CE errors from loss due to phase-noise, we assume σθ≈0. As is evident from
Note that the throughputs in
VII. Conclusions
This paper proposes the use of a novel CE technique called CACE for designing the RX beamformer in massive MIMO systems. In CACE, a reference tone is transmitted along with the data signals. At each RX antenna, the received signal is converted to baseband, the reference component is isolated, and is used to control the analog phase-shifter through which the data component is processed. The resulting baseband phase-shifted signals from all the antennas are then added, and fed to the down-conversion chain. This emulates using the received signal for reference as a matched filter for data, and enables both RX beamforming and phase-noise cancellation. The performance analysis suggests that in sparse channels and for ĝ»1, the SINR with CACE scales linearly with Mrx. The analysis and simulations also show that ĝ yields a trade-off between phase-noise induced ICI and noise accumulation. Simulations suggest that CACE suffers only a small degradation in beamforming gain in comparison to digital CE based beamforming in sparse channels, and is resilient to TX-RX oscillator frequency mismatch. In comparison to other ACE schemes, CACE performs marginally worse than PACE at high SNR but performs much better at lower SNR. It also performs much better than MA-FSR, albeit at a higher RX hardware complexity. Finally, CACE also provides phase-noise suppression unlike most other CE schemes. The CE overhead reduction with CACE is significant, especially when downlink CE with feedback is required. The IA latency reduction with CACE aided beamforming is also discussed. While baseband phase shifters are sufficient for a CACE based RX unlike in conventional analog beamforming, 2Mrx mixers may be required for the baseband conversion at the RX; thus adding to the hardware cost.
Proof of Lemma \refLemma_PN_properties. Note that from the definition of Ω[k], we have
where represents the nDFT Operation. Then using convolution property of the nDFT, we have:
which proves property (1.9a). Property (1.9b) can be obtained as follows:
where follows by using the expression for the characteristic function of the Gaussian random variable θ[{dot over (n)}]−θ[{umlaut over (n)}]; follows by defining u={dot over (n)}−{umlaut over (n)} and follows by changing the inner summation limits which is accurate for σθ2Ts»1 and follows from the expression for the sum of a geometric series.
Proof of Lemma 3.2. Note that each component of {tilde over (w)}(t) is independent and identically distributed as a circularly symmetric Gaussian random process. Hence its nDFT coefficients, obtained as
are also jointly Gaussian and circularly symmetric. For these coefficients at RX antennas a, b we obtain:
where we use the auto-correlation function of the channel noise at any RX antenna as: R{tilde over (w)}(t)=N0 sin(πKt/Ts)exp{−jπ(K1−K2)t/Ts}/πt.
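The noise auto-correlation function quoted above can be evaluated numerically as follows. This is a sketch with hypothetical names; it assumes K=K1+K2+1 and uses the identity sin(πKt/Ts)/(πt)=(K/Ts)·sinc(Kt/Ts) so that the t=0 limit is handled without a division by zero.

```python
import numpy as np

def noise_autocorrelation(t, n0, k1, k2, ts):
    """Evaluate N0 * sin(pi*K*t/Ts) * exp(-j*pi*(K1-K2)*t/Ts) / (pi*t).

    Assumes K = K1 + K2 + 1. np.sinc(x) is sin(pi*x)/(pi*x), so the
    expression stays finite at t = 0, where it equals N0 * K / Ts.
    """
    k = k1 + k2 + 1
    t = np.asarray(t, dtype=float)
    envelope = n0 * (k / ts) * np.sinc(k * t / ts)
    return envelope * np.exp(-1j * np.pi * (k1 - k2) * t / ts)
```

The envelope vanishes at non-zero multiples of Ts/K, which is why noise samples taken at that spacing are uncorrelated.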
Here we model the RX phase-noise θ(t) as a zero mean Ornstein-Uhlenbeck (OU) process (J. L. Doob, “The Brownian movement and stochastic equations,” The Annals of Mathematics, vol. 43, p. 351, April 1942), which is representative of the output of a type-1 phase-locked loop with a linear phase detector (A. Viterbi, Principles of coherent communication. McGraw-Hill series in systems science, McGraw-Hill, 1966; D. Petrovic, W. Rave, and G. Fettweis, “Effects of phase noise on OFDM systems with and without PLL: Characterization and compensation,” IEEE Transactions on Communications, vol. 55, pp. 1607-1616, August 2007; A. Mehrotra, “Noise analysis of phase-locked loops,” IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications, vol. 49, pp. 1309-1316, September 2002). For such a model, θ(t) satisfies:
where, wθ(t) is a standard real white Gaussian process, and ηθ, σθ are system parameters. From (6) it can be shown that θ(t) is a stationary Gaussian process (in steady state), with an auto-correlation function given by:
(D. Petrovic, W. Rave, and G. Fettweis, “Effects of phase noise on OFDM systems with and without PLL: Characterization and compensation,” IEEE Transactions on Communications, vol. 55, pp. 1607-1616, August 2007).
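For intuition, a sampled OU phase-noise path can be generated as sketched below. The governing equation is not legible in this copy, so the sketch assumes the common form dθ/dt = −ηθ·θ(t) + σθ·wθ(t), for which the stationary variance is σθ²/(2ηθ) and the auto-correlation decays as exp(−ηθ|τ|). All names are hypothetical, and the parameterization should be checked against the cited references before any quantitative use.

```python
import numpy as np

def simulate_ou_phase_noise(num_samples, dt, eta, sigma, seed=0):
    """Draw a stationary OU path using its exact discrete-time recursion.

    Assumed model: d(theta)/dt = -eta * theta + sigma * w(t), with w(t) a
    standard real white Gaussian process.
    """
    rng = np.random.default_rng(seed)
    stationary_var = sigma ** 2 / (2.0 * eta)
    decay = np.exp(-eta * dt)  # one-step correlation coefficient
    innovation_std = np.sqrt(stationary_var * (1.0 - decay ** 2))
    theta = np.empty(num_samples)
    theta[0] = rng.normal(0.0, np.sqrt(stationary_var))  # start in steady state
    noise = rng.normal(0.0, innovation_std, size=num_samples - 1)
    for n in range(1, num_samples):
        theta[n] = decay * theta[n - 1] + noise[n - 1]
    return theta
```

Starting the recursion from the stationary distribution avoids a transient, so sample statistics of the whole path can be compared against the assumed auto-correlation.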
Lemma 10.1 For phase-noise modeled as an OU process we have:
for arbitrary integers k1, k2, where δa,bK=1 if a=b (mod K) or δa,bK=0 otherwise, and
Proof of Lemma 10.1. Note that from the definition of Ω[k], we have
where represents the nDFT Operation. Then using convolution property of the nDFT, we have:
which proves property (1.9a). Property (1.9b) can be obtained as follows:
where follows by using the expression for the characteristic function of the Gaussian random variable θ[{dot over (n)}]−θ[{umlaut over (n)}]; follows by defining u={dot over (n)}−{umlaut over (n)} and follows by using the fact that Rθ[u] has a limited support around u=0 and hence Rθ[u]≈Rθ[u−K]≈0 for u>(K−1)/2. Note that since e−R
2. Periodic Analog Channel Estimation Aided Beamforming for Massive MIMO Systems
I. Introduction
In the present embodiment, a novel ACE scheme, referred to as periodic ACE (PACE) is provided. In this embodiment, the reference is transmitted judiciously, and its amplitude and phase are explicitly estimated to drive an RX phase shifter array. In contrast to CACE, PACE requires one carrier recovery circuit and Mrx phase shifters (see
On the flip side, PACE requires some additional analog hardware components, such as mixers and filters, in comparison to conventional digital CE. Additionally, the accumulation of power from multiple MPCs may cause frequency selective fading in a wide-band scenario, which can degrade performance. Finally, the proposed approach in its current suggested form does not support reception of multiple spatial data streams and can only be used for beamforming at one end of a communication link. This architecture is therefore more suitable for use at the user equipment (UEs). The possible extensions to multiple spatial stream reception shall be explored in future work. While the proposed architecture is also applicable in narrow-band scenarios, in this paper we shall focus on the analysis of a wide-band scenario where the repetition interval of PACE and beamformer update is of the order of aCSI coherence time, i.e. time over which the aCSI stays approximately constant (also called stationarity time in some literature). The contributions of the present embodiment include:
1. The development of a novel transmission technique, namely PACE, and a corresponding RX architecture that enable RX analog beamforming with low CE overhead.
2. To enable the RX operation, two novel reference recovery circuits are explored. These circuits are non-linear, making their analysis non-trivial. We provide an approximate analysis of their phase-noise and the resulting performance that is tight in the high SNR regime.
3. The achievable system throughput with PACE aided beamforming in a wide-band channel is analytically characterized.
4. Simulations with practically relevant channel models are used to support the analytical results and compare performance to existing schemes.
Notation: scalars are represented by light-case letters; vectors by bold-case letters; and sets by calligraphic letters. Additionally, j=√{square root over (−1)}, a* is the complex conjugate of a complex scalar a, |a| represents the 2-norm of a vector a and A† is the conjugate transpose of a complex matrix A. Finally, { } represents the expectation operator, ⊗ represents the Kronecker product, represents equality in distribution, Re{·}/Im{·} refer to the real/imaginary component, respectively, (a,B) represents a circularly symmetric complex Gaussian vector with mean a and covariance matrix B, Exp{a} represents an exponential distribution with mean a and Uni{a, b} represents a uniform distribution in range [a, b].
II. PACE General Assumptions and System Model
We consider the downlink of a single-cell MIMO system, wherein one base station (BS) with Mtx antennas transmits to several UEs with Mrx antennas each. Since focus is on the downlink, we shall use abbreviations BS & TX and UE & RX interchangeably. Each UE is assumed to have one up/down-conversion chain, while no assumptions are made regarding the BS architecture.
Here we assume the communication between the BS and UEs to involve three important phases: (i) initial access (IA)—where the BS and UEs find each other, timing/frequency synchronization is attained and spectral resources are allocated; (ii) analog beamformer design—where the BS and UEs obtain the required aCSI to update the analog precoding/combining beams; and (iii) data transmission. The relative time scale of these phases are illustrated in
The BS transmits one spatial data-stream to each scheduled UE, and all such scheduled UEs are served simultaneously via spatial multiplexing. Furthermore, the data to the UEs is assumed to be transmitted via orthogonal precoding beams, such that, there is no inter-user interference. Under these assumptions and given transmit precoding beams and power allocation, we shall restrict the analysis to one representative UE without loss of generality. For convenience, we shall also assume the use of noise-less and perfectly linear antennas, filters, amplifiers and mixers at both the BS and UE. An analysis including the non-linear effects of these components is beyond the scope of this paper. The BS transmits orthogonal frequency division multiplexing (OFDM) symbols with K sub-carriers, indexed as ={−K1, . . . , K2−1, K2} with K1+K2+1=K, to this representative UE. The BS transmits two kinds of symbols: reference symbols and data symbols. In a reference symbol, only a reference tone, i.e., a sinusoidal signal with a pre-determined frequency known both to the BS and UE, is transmitted on the 0-th subcarrier, and the remaining sub-carriers are all empty. On the other hand, in a data symbol all the K sub-carriers are used for data transmission. The purpose of the reference symbols is to aid PACE and beamformer design at the RX, as shall be explained later. Since the BS can afford an accurate oscillator, we shall assume that the BS suffers negligible phase noise. The Mtx×1 complex equivalent transmit signal for the 0-th symbol, if it is a reference or data symbol, respectively, can then be expressed as:
for −Tcp≤t≤Ts, where t is the Mtx×1 unit-norm TX beamforming vector for this UE with |t|=1, xk(d) is the data signal at the k-th OFDM sub-carrier, j=√{square root over (−1)}, fc is the carrier/reference frequency, fk=k/Ts represents the frequency offset of the k-th sub-carrier, Tcs=Tcp+Ts and Ts, Tcp are the symbol duration and the cyclic prefix duration, respectively. Here we define the complex equivalent signal such that the actual (real) transmit signal is given by stx(·)(t)=Re{{tilde over (s)}tx(t)}. For the data symbols, we assume the use of Gaussian signaling with Ek(d)={|xk|2}, for each k∈. The total average transmit OFDM symbol energy (including cyclic prefix) allocated to the UE is defined as Ecs, where Ecs≥E(r) and Ecs≥Ek(d). For convenience we also assume that fc is a multiple of 1/Tcs, which ensures that the reference tone has the same initial phase in consecutive reference symbols.
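The two symbol types described above can be sketched at complex baseband as follows. This is an illustration under stated assumptions: the up-conversion to fc and the beamforming vector t are omitted, the amplitude normalization is hypothetical, and the function name is not from the source. Only the sub-carrier indexing (−K1 to K2), the offsets fk=k/Ts and the interval starting at −Tcp follow the text.

```python
import numpy as np

def ofdm_baseband_symbol(x, k1, ts, tcp, num_samples):
    """Sample sum_k x[k] * exp(j*2*pi*f_k*t) on -tcp <= t < ts, f_k = k/ts.

    x holds the sub-carrier amplitudes ordered from index -k1 upwards, so
    x[k1] is the 0-th sub-carrier. Scaling is a hypothetical choice.
    """
    x = np.asarray(x, dtype=complex)
    k = np.arange(-k1, -k1 + x.size)  # sub-carrier indices -K1, ..., K2
    t = np.linspace(-tcp, ts, num_samples, endpoint=False)
    return t, np.exp(2j * np.pi * np.outer(t, k) / ts) @ x
```

A reference symbol corresponds to x being zero everywhere except at index k1, which yields a constant baseband waveform, i.e. a pure tone at fc after up-conversion. For a data symbol, the samples in −Tcp≤t<0 repeat the tail of the symbol, which is the cyclic prefix.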
The channel to the representative UE is assumed to be sparse with L resolvable MPCs (L«Mtx, Mrx), and the corresponding Mrx×Mtx channel impulse response matrix is given as (M. Akdeniz, Y. Liu, M. Samimi, S. Sun, S. Rangan, T. Rappaport, and E. Erkip, “Millimeter wave channel modeling and cellular capacity evaluation,” IEEE Journal on Selected Areas in Communications, vol. 32, pp. 1164-1179, June 2014):
H(t) = \sum_{\ell=0}^{L-1} \alpha_\ell \, a_{rx}(\ell) \, a_{tx}(\ell)^{\dagger} \, \delta(t-\tau_\ell), (2.2)
where αℓ is the complex amplitude, τℓ is the delay, and atx(ℓ), arx(ℓ) are the TX and RX array response vectors, respectively, of the ℓ-th MPC. As an illustration, the ℓ-th RX array response vector for a uniform planar array with MH horizontal and MV vertical elements (Mrx=MHMV) is given by arx(ℓ)=ārx(ψazirx(ℓ), ψelerx(ℓ)), where we define:
ψazirx(ℓ), ψelerx(ℓ) are the azimuth and elevation angles of arrival for the ℓ-th MPC, ΔH, ΔV are the horizontal and vertical antenna spacings and λ is the wavelength of the carrier signal. Expressions for atx(ℓ) can be obtained similarly. Note that in (2.2) we implicitly assume frequency-flat MPC amplitudes {α0, . . . , αL−1} and ignore beam squinting effects (S. K. Garakoui, E. A. M. Klumperink, B. Nauta, and F. E. van Vliet, “Phased-array antenna beam squinting related to frequency dependency of delay circuits,” in European Microwave Conference, pp. 1304-1307, October 2011), which are reasonable assumptions for moderate system bandwidths. To prevent inter symbol interference, we also let the cyclic prefix be longer than the maximum channel delay: Tcp>τL−1. To model a time varying channel, we treat {αℓ,atx(ℓ),arx(ℓ)} as aCSI parameters that remain constant within an aCSI coherence time and may change arbitrarily afterwards. However, since the channel is more sensitive to delay variations, the MPC delays {τ0, . . . , τL−1} are modeled as iCSI parameters that only remain constant within a shorter interval called the iCSI coherence time. Note that this time variation of delays is an equivalent representation of the Doppler spread experienced by the RX. Finally, we do not assume any distribution prior or side information on {αℓ,atx(ℓ),arx(ℓ),τℓ}.
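A uniform planar array response of the kind referred to above can be sketched as follows. The defining expression is not legible in this copy, so the phase convention used here (horizontal phase proportional to sin(azimuth)·sin(elevation), vertical phase proportional to cos(elevation)) and the element ordering are assumptions, and the function name is hypothetical. Spacings are given in wavelengths, i.e. ΔH/λ and ΔV/λ.

```python
import numpy as np

def upa_response(psi_azi, psi_ele, m_h, m_v, d_h=0.5, d_v=0.5):
    """Array response of an m_h x m_v planar array (assumed convention).

    Element (v, h) gets the phase
        2*pi*(d_h*h*sin(psi_azi)*sin(psi_ele) + d_v*v*cos(psi_ele)),
    and the result is flattened so that the index is v*m_h + h.
    """
    h = np.arange(m_h)
    v = np.arange(m_v)
    phase_h = 2 * np.pi * d_h * h * np.sin(psi_azi) * np.sin(psi_ele)
    phase_v = 2 * np.pi * d_v * v * np.cos(psi_ele)
    return np.exp(1j * (phase_v[:, None] + phase_h[None, :])).ravel()
```

Under this convention every entry has unit modulus, so the squared norm equals Mrx, and response vectors for well-separated angles have a small inner product relative to Mrx when the array is large.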
The RX front-end is assumed to have a low noise amplifier followed by a band-pass filter at each antenna element that leaves the desired signal un-distorted but suppresses the out-of-band noise. The Mrx×1 filtered complex equivalent received waveform for the 0-th symbol can then be expressed as:
{tilde over (s)}rx(·)(t)=arx()atx()†{tilde over (s)}tx(·)(t−)+√{square root over (2)}{tilde over (w)}(·)(t)ej2πf
for 0≤t≤Ts, where (·)=(r)/(d), {tilde over (w)}(·)(t) is the Mrx×1 complex equivalent, baseband, stationary, additive, vector Gaussian noise process, with individual entries being circularly symmetric, independent and identically distributed (i.i.d.), and having a power spectral density: w(f)=N0 for −fK
III. Analog Beamformer Design at the Receiver
During each beamformer design phase, the BS transmits D consecutive reference symbols to facilitate PACE at the RX. This process involves two steps: locking a local RX oscillator to the received reference tone and using this locked oscillator to estimate the amplitude and phase-offsets at each antenna. Here locking refers to ensuring that the phase difference between the oscillator and the received reference tone is approximately constant. The first D1 reference symbols are used for the former step and the remaining D2=D−D1 symbols are used for the latter step. Therefore D is independent of Mrx and is mainly determined by the time required for oscillator locking (see Remark 3.1). The first step shall be referred to as recovery of the reference tone and is analyzed in Section 3.1, while the latter step is discussed in Section 3.2. As shall be shown, both steps are significantly impaired by channel noise. Therefore in Section 3.3, we propose an improved architecture for reference tone recovery that provides better noise performance, albeit with a slightly higher hardware complexity. For convenience, we shall assume that the MPC delays do not change within the beamformer design phase, and are represented as {{circumflex over (τ)}0, . . . , {circumflex over (τ)}L−1} (see also Remark 3.2). However, the delays may be different during the data transmission phase, as shall be considered in Section 4. Without loss of generality, assuming the first reference symbol to be the 0-th OFDM symbol, the complex equivalent RX signal for the D reference symbols at antenna m can be expressed as:
is the amplitude of the reference tone at antenna m.
III-A Recovery of the Reference Tone—Using One PLL
For locking a local RX oscillator to the reference signal, we first consider the use of a type 2 analog PLL at RX antenna 1, as illustrated in
Here LF is assumed to be a first-order active low-pass filter with a transfer function (s)=1+∈/s and the loop gain G is assumed to adapt to the amplitude of the input such that G|A1(r)|=constant. For convenience, we also ignore the VCO's internal noise (A. Mehrotra, “Noise analysis of phase-locked loops,” IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications, vol. 49, pp. 1309-1316, September 2002; D. Petrovic, W. Rave, and G. Fettweis, “Effects of phase noise on OFDM systems with and without PLL: Characterization and compensation,” IEEE Transactions on Communications, vol. 55, pp. 1607-1616, August 2007). Without loss of generality, let the output of the VCO (i.e. the recovered reference tone) be expressed as:
sPLL(t)=svco(t)=√{square root over (2)} cos [2πfct+
where θ(t) may be arbitrary and we define
where fvco is the free running frequency of the VCO with no input, we use (2.5) and assume fc is much larger than the bandwidth of LF. In this subsection, we are interested in finding the time required for locking (D1 Tcs), i.e., for θ(t) to (nearly) converge to a constant and characterizing the distribution of the PLL output sPLL(t), or equivalently θ(t), during the last D2 reference symbols when the PLL is locked to the reference tone. The first part is answered by the following remark:
Remark 3.1 For the PLL considered, the phase lock acquisition time is
in the noise-free scenario (S. C. Gupta, “Phase-locked loops,” Proceedings of the IEEE, vol. 63, pp. 291-306, February 1975; A. Viterbi, Principles of coherent communication. McGraw-Hill series in systems science, McGraw-Hill, 1966). Thus ∈ and |A1(r)|G must be of the order of 1/Ts and 2π|fc−fvco|, respectively, to keep D1 small.
Numerous techniques (D. Messerschmitt, “Frequency detectors for PLL acquisition in timing and carrier recovery,” IEEE Transactions on Communications, vol. 27, pp. 1288-1295, September 1979; Y. Venkataramayya and B. S. Sonde, “Acquisition time improvement of PLLs using some aiding functions,” Indian Institute of Science Journal, vol. 63, pp. 73-88, March 1981) have been proposed to further reduce the lock acquisition time, but they are not explored here for brevity. In the locked state, it can be shown that θ(t) suffers from random fluctuations due to the input noise {tilde over (w)}1(r)(t) in (2.7), and that θ(t) (modulo 2π) is approximately a zero mean random process (S. C. Gupta, “Phase-locked loops,” Proceedings of the IEEE, vol. 63, pp. 291-306, February 1975; A. Viterbi, Principles of coherent communication. McGraw-Hill series in systems science, McGraw-Hill, 1966). This fluctuation manifests as phase noise of sPLL(t). While several attempts have been made to characterize the locked state θ(t) (see S. C. Gupta, “Phase-locked loops,” Proceedings of the IEEE, vol. 63, pp. 291-306, February 1975; A. Viterbi, Principles of coherent communication. McGraw-Hill series in systems science, McGraw-Hill, 1966; and references therein), closed form results are available only for a few simple scenarios that are not applicable here. Therefore, for analytical tractability, we linearize (2.7) using the following widely used approximations (A. Viterbi, Principles of coherent communication. McGraw-Hill series in systems science, McGraw-Hill, 1966):
1. We neglect cycle slips and assume that the deviations of θ(t) about its mean value are small, such that e−jθ(t)≈1−jθ(t) in the locked state.
2. We assume that the distribution of the baseband noise process {tilde over (w)}1(r)(t) is invariant to multiplication with e−j[
Approximation 1 is accurate in the locked state and in the large SNR regime, while Approximation 2 is accurate when the noise bandwidth is much larger than the loop filter bandwidth (A. Viterbi, Principles of coherent communication. McGraw-Hill series in systems science, McGraw-Hill, 1966; A. J. Viterbi, “Phase-locked loop dynamics in the presence of noise by Fokker-Planck techniques,” Proceedings of the IEEE, vol. 51, pp. 1737-1753, December 1963). Using these approximations and the definition of
where we replace θ(t) by θL(t) to denote use of the linear approximation. Note that for sufficient SNR, θ(t)≈θL(t) (modulo 2π) during the last D2 reference symbols. Assuming θL(0)=0 and the PLL input to be 0 for t≤0 and taking the Laplace transform on both sides of (2.8), we obtain:
where ΘL(s) and Ŵ1(r)(s) are the Laplace transforms of θL(t) and ŵ1(r)(t), respectively. It can be verified using the final value theorem that the contribution of the last term on the right hand side of (2.9) vanishes for t»0 (i.e., in locked state). Therefore ignoring this term in (2.9), we observe that θL(t) is a zero mean, stationary Gaussian process (A. Mehrotra, “Noise analysis of phase-locked loops,” IEEE Transactions on Circuits and Systems I: Fundamental Theory and Applications, vol. 49, pp. 1309-1316, September 2002), in the locked state. Furthermore, the locked state power spectral density, auto-correlation function and variance of θL(t) can then be computed, respectively, as:
where 2a=G|A1(r)|+√{square root over (G2|A1(r)|2−4G|A1(r)|∈)}, 2b=G|A1(r)|−√{square root over (G2|A1(r)|2−4G|A1(r)|∈)}, (2.12)-(2.13) follow from finding the inverse Fourier transform via partial fraction expansion and the final expressions follow by observing that w(f)≤N0 for all f. Since θL(t) is stationary and Gaussian in locked state, note that its distribution is completely characterized by (2.10)-(2.11).
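The constants a and b defined above satisfy a+b=G|A1(r)| and ab=G|A1(r)|∈, so −a and −b are the two roots of s²+G|A1(r)|s+G|A1(r)|∈. The sketch below computes them and lets the two identities be checked numerically. The function name is hypothetical, and complex arithmetic is used because the square root becomes imaginary when G|A1(r)|<4∈.

```python
import cmath

def pll_loop_constants(gain_times_amplitude, epsilon):
    """Return (a, b) with 2a = GA + sqrt(GA^2 - 4*GA*eps) and
    2b = GA - sqrt(GA^2 - 4*GA*eps), where GA stands for G*|A1(r)|.

    The values are complex conjugates when GA < 4*eps.
    """
    ga = gain_times_amplitude
    root = cmath.sqrt(ga ** 2 - 4.0 * ga * epsilon)
    return (ga + root) / 2.0, (ga - root) / 2.0
```

When G|A1(r)|>4∈ both constants are real and positive, so the associated exponentials decay without oscillation; otherwise they form a conjugate pair with real part G|A1(r)|/2.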
III-B Phase and Amplitude Offset Estimation
This subsection analyzes the procedure for reference signal phase and amplitude offset estimation at each RX antenna. As illustrated in
{tilde over (s)}PLL(t)=√{square root over (2)}ej[2πf
At each RX antenna, the received reference signal is multiplied by the in-phase and quadrature-phase components of the PLL signal, and the resulting outputs are fed to ‘filter, sample and hold’ circuits. This circuit involves a low pass filter with a bandwidth of ≈1/(D2Tcs), followed by a sample and hold circuit that samples the filtered output at the end of the D reference symbols. For convenience, in this paper we shall approximate this ‘filter, sample and hold’ by an integrate and hold operation as depicted in
where
is a scaling factor, T1D1Tcs−Tcp, T2DTcs−Tcp, (fk)arx()atx()† is the Mrx×Mtx frequency-domain channel matrix for the k-th subcarrier during beamformer design phase and ŵ(r)(t){tilde over (w)}(r)(t)e−j[
where follows from the fact that θ(t)θL(t) (modulo 2π) in locked state, follows by defining
and by using the characteristic function for the stationary Gaussian process θL(t). Since ŵ(r)(t) is i.i.d. Gaussian with a power spectral density w(f), it can be verified that ŵ(r)˜[M
«K1, K2. From (2.15), note that the signal component of the sample and hold output IPACE is directly proportional to the channel matrix at the reference frequency. The outputs are used as control signals to the RX phase-shifter array, to generate the RX analog beam to be used during the data transmission phase. From (2.15) and (2.12), note that either D2 or |A1(r)| can be increased to reduce the impact of the noise ŵ(r) on the analog beam. Since |A1(r)| is a non-decreasing function of E(r) (see (2.5)), this implies that E(r) should be kept as large as possible while satisfying E(r)≤Ecs and meeting the spectral mask regulations.
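One way to picture how the sample and hold outputs could drive the phase-shifter array is sketched below. This passage does not spell out the exact mapping from IPACE to phase-shifter settings, so the phase-only (unit-modulus) weighting used here is an assumption, and both function names are hypothetical.

```python
import numpy as np

def phase_only_weights(i_pace):
    """Unit-modulus weights whose phases match the per-antenna estimates.

    Assumption: each phase shifter is set from the phase of the matching
    entry of the sample and hold output vector.
    """
    return np.exp(1j * np.angle(np.asarray(i_pace, dtype=complex)))

def combine(weights, received):
    """Apply the analog beam: conjugate the weights and sum over antennas."""
    return np.vdot(weights, received)
```

In the noise-free case, where the estimate equals the per-antenna channel h, the combined output is the sum of |h_m| over the antennas, so all antenna contributions add coherently. Noise in the estimate perturbs the phases and reduces this gain, which is why a larger D2 or E(r) helps.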
Note that the results in this section are based on several approximations, including the linear phase noise analysis in Section 3.1. To test the accuracy of these results, the numerical values of |∫T
in
Remark 3.2 The preceding derivations assumed that the MPC delays are identical for the D reference symbols. However since the PLL continuously tracks the RX signal and phase/amplitude estimation at each antenna is performed simultaneously, these results are valid even if the delays change slowly within the beamformer design phase.
Remark 3.3 The RX phase-shifter array or the down-conversion chain are not utilized during the D reference symbols of the beamformer design phase. Therefore, data reception is also possible during these D reference symbols in parallel, as long as a sufficient guard band between the data sub-carriers and the reference sub-carrier is provided (similar to (2.27)) to reduce impact on the PLL performance.
Note that in a multi-cell scenario, use of the same reference tone in adjacent cells can cause reference tone contamination, i.e., IPACE may contain components corresponding to the channel from a neighboring BS. This is analogous to pilot contamination in conventional CE approaches (T. Marzetta, “Noncooperative cellular wireless with unlimited numbers of base station antennas,” IEEE Transactions on Wireless Communications, vol. 9, pp. 3590-3600, November 2010), and can be avoided by using different, well-separated reference frequencies in adjacent cells.
III-C. Recovery of the Reference Tone—Using Weighted Carrier Arraying
For reducing the PLL SNR threshold and improving performance, in this subsection we propose a new reference recovery technique called weighted carrier arraying, as illustrated in
In
svcop(t)=√{square root over (2)} cos [2π(fc−fIF)t+θ(t)]
svco,ms(t)=√{square root over (2)} cos [2πfIFt+
respectively, where θ(t), ϕm(t) are arbitrary, fIF is the common free running frequency of the secondary VCOs, and
where we define ŵm(r)(t){tilde over (w)}m(r)(t)e−j[
where fvcop is the free running frequency of the primary VCO, Gp is the loop gain and LFp is an active low pass filter with transfer function LFp(s)=(1+ϵp/s). Similar to Section 3.1, to obtain the locked state distribution of θ(t) we shall rely on the linear PLL analysis by using: 1) e−j[ϕ
where Ŵm(r)(s), ΘL(s) and ΦmL(s) are the Laplace transforms of ŵm(r)(t), linear approximation θL(t) and linear approximation ϕmL(t), respectively. We assume that the loop gains of the PLLs adapt to the amplitudes of the input such that |Am(r)|Gms=μ∀m∈ and Gp|Am(r)|2=constant. Then solving the system of equations in (2.18), we obtain:
It can be verified using the final value theorem that the last term in (2.19) only contributes a constant phase shift for t»0 (in locked state), say
where [Arss(r)]2=|Am(r)|2. Comparing (2.21) to (2.12), note that the PLL phase noise is essentially reduced by the maximal ratio combining gain corresponding to the antennas. As this variation in θL(t) manifests as phase noise of sPLL(t) in
IV. Data Transmission
During the data transmission phase, OFDM symbols of type (1 b) are transmitted and the corresponding received signals are processed via the phase-shifter array with IPACE as the control signals. Without loss of generality, again assuming the 0-th OFDM symbol as a representative data symbol, the combined data signal at the RX for 0≤t≤Ts can be expressed as:
where the 1/√{square root over (2)} is a scaling constant for convenience and we assume that the MPC delays for this representative data symbol are {τ0, . . . , τL−1}. This phase shifted and combined signal R(t) is then converted to baseband by a separate RX oscillator, and any resulting phase noise is assumed to be mitigated via some digital phase noise compensation techniques (D. Petrovic, W. Rave, and G. Fettweis, “Effects of phase noise on OFDM systems with and without PLL: Characterization and compensation,” IEEE Transactions on Communications, vol. 55, pp. 1607-1616, August 2007; P. Robertson and S. Kaiser, “Analysis of the effects of phase-noise in orthogonal frequency division multiplex (OFDM) systems,” in IEEE International Conference on Communications (ICC), vol. 3, pp. 1652-1657 vol. 3, June 1995; S. Wu, P. Liu, and Y. Bar-Ness, “Phase noise estimation and mitigation for OFDM systems,” IEEE Transactions on Wireless Communications, vol. 5, pp. 3616-3625, December 2006; S. Randel, S. Adhikari, and S. L. Jansen, “Analysis of RF-pilot-based phase noise compensation for coherent optical OFDM systems,” IEEE Photonics Technology Letters, vol. 22, pp. 1288-1290, September 2010). Therefore neglecting the down-conversion phase noise, the resulting baseband signal can be expressed as RBB(t)=R(t)e−j2πf
where (fk)arx()atx()† is the Mrx×Mtx frequency domain channel matrix for the k-th data subcarrier and
with {tilde over (W)}(d)[k] being independently distributed for each k∈ as {tilde over (W)}(d)[k]˜[M
where we neglect the cyclic prefix overhead in (2.24) for convenience. Note that the iSE maximizing data power allocation {Ek(d)|k∈} can be obtained via water-filling across the sub-carriers. While the exact expressions for (2.23)-(2.24) are involved, their expectations with respect to PACE can be bounded, as stated by the following theorem.
Theorem 4.1 If the RX array response vectors for the channel MPs are mutually orthogonal, i.e., arx()†arx(i)=0 for ≠i, the effective SNR and iSE, averaged over the beamformer noise Ŵ(r), can be bounded as in (2.25)
where β({dot over (f)},{umlaut over (f)})=||2|atx()†t|2 and ≳ represents a ≥ inequality at a high enough SNR such that the approximations in Section 3 are accurate.
Proof Substituting (2.15) in (2.22), and by treating the received signal component corresponding to {tilde over (W)}(r), i.e., [{tilde over (W)}(r)]†(fk)txk, as noise, we can obtain a lower bound to the mean SNR as:
where follows from Jensen's inequality and from the orthogonality of the array response vectors. Similarly, by treating [Ŵ(r)]†(fk)txk as Gaussian noise independent of xk, a lower bound on the mean iSE can be obtained as:
where we use similar steps to (2.26).
The array response orthogonality condition in Theorem 4.1 is satisfied if the scatterers corresponding to different MPCs are well separated and Mrx»L (O. El Ayach, R. Heath, S. Abu-Surra, S. Rajagopal, and Z. Pi, “The capacity optimality of beam steering in large millimeter wave MIMO systems,” in IEEE International Workshop on Signal Processing Advances in Wireless Communications (SPAWC), pp. 100-104, June 2012). Note that even though the RX does not explicitly estimate the array response vectors arx() for the MPCs, we still observe an RX beamforming gain of Mrx in (2.25a). The impact of imperfect MRC and the resulting frequency-selective fading is quantified by β(fc,fk), where |β(fc,fk)|≤|β(0,0)|. Another drawback of the fading is that it may cause a drastic drop in performance of the one PLL architecture in Section III-A if |A1(r)|, the reference signal strength at antenna 1, falls in a fading dip, as is evident from (2.12) and (2.25). Note however that the weighted arraying architecture in Section 3.3 enjoys diversity against such fading by recovering the reference tone from multiple antennas.
V. Initial Access and aCSI Estimation at the BS
In this section we suggest how aCSI can be acquired at the BS during the TX beamformer design phase and also propose a sample IA protocol that can utilize PACE. Note that power allocation, user-scheduling and design of the TX beamformer t require knowledge of the TX array response vectors and amplitudes {||,atx()} for the different UEs. Such aCSI can be acquired at the BS either via uplink CE, or by downlink CE with CSI feedback from the RX. Uplink CE can be performed by transmitting an orthogonal pilot from each UE omni-directionally, and using any of the digital CE algorithms from Section 1 at the BS. Note that PACE cannot be used at the BS since the pilots from multiple UEs need to be separated via digital processing. For downlink CE with feedback, the BS transmits reference signals sequentially along different transmit precoder beams (beam sweeping), with D reference symbols for each beam. The UEs perform PACE for each TX beam, and provide the BS with uplink feedback about the corresponding link strength for data transmission.
The suggested IA protocol is somewhat similar to the downlink CE with feedback, where the BS performs beam sweeping along different angular directions, possibly with different beam widths. For each TX beam, the BS transmits D reference symbols, followed by a sequence of primary (PSS) and secondary synchronization sequences (SSS). The RX performs PACE, and provides uplink feedback to the BS upon successfully detecting a PSS. However due to lack of prior timing synchronization during IA phase, the ‘filter, sample and hold’ circuit in Section 3.2 cannot be used directly for the PACE. One alternative is to allow continuous transmission of the reference tone even during the PSS and SSS with the following suggested symbol structure:
where defines a guard band around the reference tone, to reduce the impact of the data sub-carriers on the PLL output. The amplitude and phase estimation can then be performed similar to Section 3.2, by multiplying the received signal at each antenna with the PLL output and then filtering with a low pass filter with cut-off frequency 1/(D2Tcs). Due to the continuous availability of the reference tone, the filter outputs can be directly used to control the phase shifter at each antenna without the ‘sample and hold’ operation. Since D=O(1), the IA latency does not scale with Mrx and yet the PSS/SSS symbols can exploit the RX beamforming gain, thus improving cell discovery radius and/or reducing IA overhead.
VI. Simulation Results
For the simulation results, we consider a single cell scenario with a λ/2-spaced 32×8 (Mtx=256) antenna BS and one representative UE with a λ/2-spaced 16×4 (Mrx=64) antenna array, having one down-conversion chain and using PACE aided beamforming. The BS has perfect aCSI and transmits one spatial OFDM data stream to this UE with K=1024 sub-carriers and the beamformer t aligned with the strongest channel MPC. The RX beamformer design phase is assumed to last D=6 symbols with D2=2, where the BS transmits reference symbols with power E(r)=20Ecs/K (to satisfy spectral mask regulations). The system parameters for the one PLL and weighted arraying case, respectively, are as given in Table 1. For comparison to existing schemes, we include the performance of RTAT—the continuous ACE based beamforming scheme in (V. V. Ratnam and A. Molisch, “Reference tone aided transmission for massive MIMO: analog beamforming without CSI,” in IEEE International Conference on Communications (ICC), (Kansas City, USA), May 2018), and of statistical RX analog beamforming (P. Sudarshan, N. Mehta, A. Molisch, and J. Zhang, “Channel statistics-based RF pre-processing with antenna selection,” IEEE Transactions on Wireless Communications, vol. 5, pp. 3501-3511, December 2006), where the beamformer is the largest eigen-vector of the RX spatial correlation matrix:
For both these schemes we ignore the impact of phase noise. Additionally, for statistical beamforming we consider two cases: (a) perfect knowledge of Rrx(t) at the RX and (b) an estimate of Rrx(t) obtained using sparse-ruler sampling (P. Pal and P. P. Vaidyanathan, “Nested arrays: A novel approach to array processing with enhanced degrees of freedom,” IEEE Transactions on Signal Processing, vol. 58, pp. 4167-4181, August 2010), a reduced complexity digital CE technique. Note that PACE uses 6 reference symbols per beamformer update phase, RTAT avoids reference symbols but requires continuous transmission of the reference, and sparse-ruler sampling requires 21 pilot symbols for Mrx=64.
We first consider a sparse multi-path channel having L=3 MPCs with delays ={0,20,40}ns, angles of arrival ψazirx={0, π/6, −π/6}, ωelerx={0.45π, π/2, π/2} and effective amplitudes
respectively, during the RX beamformer design phase and =+{30,25,25} ps for one snapshot of the data transmission phase. For this channel, the mean iSE of PACE aided beamforming, obtained using Monte-Carlo simulations with the non-linear PLL equations (2.7), (2.16), (2.17), is compared to the analytical approximation (2.25b), and to the performance of other schemes in
To study the impact of more realistic channels and number of MPCs, we next model the channel as a rich scattering stochastic channel with L resolvable MPCs, each with 10 unresolved sub-paths. Here the MPCs and sub-paths are generated identically to the clusters and rays, respectively, in the 3GPP TR38.900 Rel 14 channel model (UMi NLoS scenario) (TR38.900, “Study on channel model for frequency spectrum above 6 GHz (release 14),” Tech. Rep. V14.3.1, 3GPP, 2017). The only difference from (TR38.900, “Study on channel model for frequency spectrum above 6 GHz (release 14),” Tech. Rep. V14.3.1, 3GPP, 2017) is that we use an intra-cluster delay spread of 1 ns and an intra-cluster angle spread of π/50 (for all elevation, azimuth, arrival and departure), to ensure that the sub-paths of each MPC are unresolvable. The channel SNR at each RX antenna (including the TX beamforming gain) is fixed at 0 dB, and the channel variation between the beamformer design phase and one snapshot of the data transmission phase is modeled by assuming that the RX moves a distance of d=2 cm in a random azimuth direction without changing its orientation. Note that this channel can also be represented by our system model by replacing L in (2.2) with 10L. For this stochastic channel model, the mean iSE for PACE aided beamforming, averaged over channel realizations, is compared to RTAT and statistical beamforming in
Note that for the iSE results in this section, we did not include the CE overhead. While digital approaches like sparse ruler sampling (S. Haghighatshoar and G. Caire, “Massive MIMO channel subspace estimation from low-dimensional projections,” IEEE Transactions on Signal Processing, vol. 65, pp. 303-318, January 2017; P. Pal and P. P. Vaidyanathan, “Nested arrays: A novel approach to array processing with enhanced degrees of freedom,” IEEE Transactions on Signal Processing, vol. 58, pp. 4167-4181, August 2010) require 21 pilots (for Mrx=64), PACE uses only D=6 pilots. The corresponding overhead reduction is significant when downlink CE with feedback is used for aCSI acquisition at the BS, such as in frequency division duplexing systems. For example with exhaustive beamscanning (C. Jeong, J. Park, and H. Yu, “Random access in millimeter-wave beamforming cellular networks: issues and approaches,” IEEE Communications Magazine, vol. 53, pp. 180-185, January 2015) at the TX and an aCSI coherence time of 10 ms, the BS aCSI acquisition overhead reduces from 40% for sparse ruler techniques to 11% for PACE (see Section 5 for protocol). The overhead reduction is expected to be higher if the additional time required for beam switching and settling (O. S. Sands, “Beam-switch transient effects in the RF path of the ICAPA receive phased array antenna,” tech. rep., NASA Technical Memorandum TM-2003-212588, February 2002; K. Venugopal, A. Alkhateeb, N. G. Prelcic, and R. W. Heath, “Channel estimation for hybrid architecture-based wideband millimeter wave systems,” IEEE Journal on Selected Areas in Communications, vol. 35, pp. 1996-2009, September 2017) is also taken into account. Thus, PACE aided beamforming shows potential in solving the CE overhead issue of hybrid massive MIMO systems, with minimal degradation in performance.
VII. Conclusions
This paper proposes the use of PACE for designing the RX beamformer in massive MIMO systems. This process involves transmission of a reference sinusoidal tone during each beamformer design phase, and estimation of its received amplitude and phase at each RX antenna using analog hardware. A one PLL based carrier recovery circuit is proposed to enable the PACE receiver, and its analysis suggests that the quality of the obtained channel estimates decays exponentially with the inverse of the SNR at the PLL input. To remedy this and also to obtain diversity against fading, a multiple PLL based weighted carrier arraying architecture is also proposed. The performance analysis suggests that PACE aided beamforming can be interpreted as using the channel estimates on one sub-carrier to perform beamforming on other sub-carriers, with an additional loss factor corresponding to the circuit phase-noise. Simulation results suggest that PACE aided beamforming suffers only a small beamforming loss in comparison to conventional analog beamforming in sparse channels, at sufficiently high SNR. This loss however increases with the number of channel MPCs L, and hence PACE is mostly suitable for sparse channels with few MPCs. The CE overhead reduction with PACE is significant when downlink CE with feedback is required. Benefits of PACE aided beamforming during the IA phase are also discussed, although a more detailed analysis will be a subject for future work. Similarly, the performance of PACE at very low SNR and with system mismatches/imperfections also requires more attention.
3. Multi-Antenna FSR Receivers: Low Complexity, Non-Coherent, Massive Antenna Receivers
I. Introduction
In the present embodiment, a novel multi-antenna frequency shift reference (MA-FSR) receiver is provided. The MA-FSR receiver (RX) uses only one down-conversion chain, supports wide-band transmission with non-coherent demodulation, and can perform receive beamforming without requiring phase-shifters, explicit channel estimation, or complicated signal processing—thus alleviating the drawbacks of the above mentioned schemes. Inspired by the frequency shift reference (FSR) schemes for single-input-single-output (SISO) ultra-wideband (UWB) systems, in this scheme the transmitter (TX) transmits a reference signal and several data signals on different frequency sub-carriers via orthogonal frequency division multiplexing (OFDM). At each RX antenna, the received waveform corresponding to the data sub-carriers is then correlated with the received waveform corresponding to the reference signal via a simple squaring operation. The outputs are then summed up and fed to a single down-conversion chain for data demodulation. As shall be shown later, this operation emulates maximal ratio combining (MRC) at the RX with imperfect channel estimates. Since the RX beamforming is enabled without channel estimation, MA-FSR is especially suitable for fast time-varying channels, such as in V2V or V2X networks. Furthermore, due to the non-coherent RX architecture, the phase noise of the transmit signal has negligible influence on the performance. The RX also exploits power from all the channel multi-path components (MPCs) and is therefore resilient to blocking of MPCs. Unlike conventional UWB FSR systems, there is no bandwidth spreading of the data signal involved. Therefore, the noise enhancement due to the non-linear RX architecture is significantly smaller, making it practically viable. 
On the flip side, the proposed scheme only uses 50% of frequency sub-carriers for data transmission, can only support a single spatial data-stream, cannot suppress interference and can only be used for beamforming in the receive mode of a node. Therefore, MA-FSR is more suitable for scenarios with abundant spectrum and where beamforming at the TX is unnecessary or where beamforming at the TX is achieved using conventional channel estimation methods. Examples include device-to-device networks where beamforming at the RX provides sufficient link margin or infrastructure based networks where down-link traffic is dominant. The contributions of the present embodiment include:
1. Development of an MA-FSR RX architecture for massive MIMO systems, that allows non-coherent transmission, lowers implementation cost and energy consumption at the cost of 50% bandwidth efficiency and that does not require phase-shifters or channel estimation at the RX.
2. Characterization of the achievable throughput for the proposed MA-FSR system, both analytically and via simulations, for the single-input-multiple-output (SIMO) scenario in a wide-band channel.
3. Presentation of a class of improved MA-FSR architectures that can further improve performance, albeit, with a higher hardware complexity.
Notation: scalars are represented by light-case letters; vectors by bold-case letters; and sets by calligraphic letters. Additionally, j=√{square root over (−1)}, 𝔼{·} represents the expectation operator, c* is the complex conjugate of a complex scalar c, c† is the Hermitian transpose of a complex vector c, δ(t) represents the Dirac delta function, δa,b represents the Kronecker delta function and Re{·}/Im{·} refer to the real/imaginary component, respectively. Furthermore a* and a† denote the complex conjugate and the conjugate transpose of a vector a, respectively.
II. General Assumptions and System Model
We consider a SIMO link (which can be part of a larger system) where the TX has a single antenna and the RX has M»1 antennas and one down-conversion chain. Note that this model also covers a MIMO link where the TX transmits a single spatial data stream, since the combination of TX precoding vector and propagation channel creates an effective SIMO link. The TX transmits OFDM symbols with 2K sub-carriers, indexed as (0, . . . , 2K−1). A reference signal is transmitted on the 0-th sub-carrier and K−g data signals are transmitted on the sub-carrier set {K, K+1, . . . , 2K−g−1}. Here g ensures that the transmit signal lies within the system bandwidth, and is usually small, determined by the TX phase noise. The remaining sub-carriers, i.e., {1, . . . , K−1}∪{2K−g, . . . , 2K−1} are unused. While it uses only ≈50% of the sub-carriers for data transmission, this OFDM structure is necessary to prevent inter-stream interference, as shall be shown in section 3. Then, the complex equivalent transmit signal for the 0-th symbol (for −Tp≤t≤Ts) can be expressed as:
where Er is the energy allocated to the reference signal, xk is the data signal for the k-th OFDM sub-carrier, fc is the carrier frequency, fk=k/Ts represents the frequency offset of the k-th sub-carrier from the reference signal, θ(t) represents the phase noise process at the TX and Ts, Tp are the symbol duration and the cyclic prefix duration, respectively. Here we define the complex equivalent signal such that the actual (real) transmit signal is given by Re{s(t)}. We assume further that the data signals {xk} on the data sub-carriers k∈{K, . . . , 2K−g−1} are mutually independent with zero means. The total average transmit symbol energy is then given by Es=Er+Σk Ed(k), where the sum runs over the data sub-carriers and Ed(k)=𝔼{|xk|2} is the energy allocated to the k-th sub-carrier.
The channel is assumed to have L«M scatterers with the M×1 channel impulse response vector given as (M. Akdeniz, Y. Liu, M. Samimi, S. Sun, S. Rangan, T. Rappaport, and E. Erkip, “Millimeter wave channel modeling and cellular capacity evaluation,” IEEE Journal on Selected Areas in Communications, vol. 32, pp. 1164-1179, June 2014):
$$h(t)=\sum_{\ell=0}^{L-1}\alpha_\ell\,\mathbf{a}_\ell\,\delta(t-\tau_\ell), \qquad (3.2)$$
where $\alpha_\ell$ is the complex gain, $\tau_\ell$ is the delay and $\mathbf{a}_\ell$ is the RX array response vector, respectively, of the $\ell$-th MPC. As an illustration, the array response vector for a λ/2-spaced uniform linear array is given by: []i=, where λ is the wavelength of the carrier signal and is the angle of arrival of the $\ell$-th MPC. Note that here we implicitly assume the system bandwidth is small enough to ignore beam squinting effects. For ease of analysis, we assume that the array response vectors for the scatterers are mutually orthogonal, i.e., $\mathbf{a}_\ell^\dagger\mathbf{a}_i=M\delta_{\ell,i}$. This assumption is reasonable if the scatterers are well separated and M»L (O. El Ayach, R. Heath, S. Abu-Surra, S. Rajagopal, and Z. Pi, “The capacity optimality of beam steering in large millimeter wave MIMO systems,” in IEEE International Workshop on Signal Processing Advances in Wireless Communications (SPAWC), pp. 100-104, June 2012). Later, in section 5 we shall also study the system performance when the above assumption is relaxed. To prevent inter symbol interference, we let the cyclic prefix be longer than the maximum channel delay: Tp≥τL−1. To model a generic time varying channel, we assume that the MPC parameters remain constant for at least a coherence time interval Tcoh, and may or may not change afterwards.
The RX front-end is assumed to have a low noise amplifier followed by a band-pass filter (BPF) at each antenna element, as depicted in
r(t)=sBB(t−+n(t)ej2πf
where sBB(t)s(t)e−j2πf
where rm(t)=[r(t)]m is the complex equivalent received signal at the m-th antenna. Note that both rm(t)2, rm*(t)2 are high pass signals with a carrier frequency of 2fc. This summed signal rsq(t) is then low-pass filtered (with a cut-off frequency of 2K/Ts) to get:
where we use the orthogonality assumption for the array response vectors. Finally, rLPF(t) is sampled by an ADC at a sampling rate of 4K/Ts samples/sec and conventional OFDM demodulation follows. Note that since rLPF(t) is a real signal with maximum frequency 2K/Ts, the ADC sampling rate must be at least 4K/Ts samples/sec to prevent aliasing. However it can be shown that the signal of interest i.e., the product between the reference and data sub-carriers only lies within the frequency range K/Ts≤|f|≤(2K−g−1)/Ts. Thus the same performance can also be obtained by replacing the low-pass filter by a band-pass filter with a pass-band of K/Ts≤|f|≤2K/Ts, and using an ADC sampling rate of 2K/Ts samples/sec.
III. Analysis of the Demodulation Outputs
Inspired by our similar analysis for the UWB FSR RX in (V. V. Ratnam, A. F. Molisch, A. Alasaad, F. Alawwad, and H. Behairy, “Bit and power allocation in QAM capable multi-differential frequency-shifted reference UWB radio,” in IEEE Global Communications Conference (GLOBECOM), pp. 1-7, December 2017), the current section analyzes the OFDM demodulation outputs. The OFDM demodulated output for the k-th subcarrier of the 0-th symbol can be expressed as (−2K≤k≤2K−1):
We shall express each demodulation output as Yk=Sk+Zk where Sk, referred to as the signal component, involves the terms in (3.6) not containing the channel noise and Zk, referred to as the noise component, contains the remaining terms. It can be verified from (3.5) and the expression for sBB(t) that only the demodulation outputs {Yk|K≤|k|≤(2K−g)} involve signal components. We shall therefore consider a sub-optimal, albeit simple, demodulation approach where only the outputs {Yk|k∈} are utilized for demodulating the data, and the noise components are treated as noise.
A. Signal Component Analysis
From (3.5)-(3.6), the signal components of the OFDM demodulated outputs Yk, for k∈, can be expressed as:
where follows from the sub-carrier allocation, which ensures that, despite the non-linear RX architecture, only the cross-product between the reference signal and the data on the k-th sub-carrier contributes to Sk, for k≥K. Essentially, the MA-FSR RX utilizes the received vector corresponding to the reference tone as weights to combine the received signal corresponding to the data, thus emulating maximal ratio combining of the contributions from the different antennas with imperfect channel estimates. Since this combining takes place via squaring in the analog domain, the proposed RX enables beamforming without channel estimation or use of phase-shifters. However, as is evident from (3.7), the signals from the L multi-path components do not add up in-phase at the demodulation outputs. This is due to the fact that the reference signal and the k-th data stream pass through slightly different channels owing to the difference in modulating frequency. This leads to some amount of frequency selective fading, as shall be explained later in Section IV.
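The role of the sub-carrier allocation can be checked with a small noise-free sketch of the square-and-sum receiver. The sketch assumes a flat single-path channel with unit gain at each antenna, drops constant scaling factors, and models the low-pass filtered output of the squarer at each antenna as the squared envelope of the complex equivalent signal. The sizes K, g and M, the QPSK data and the channel phases are illustrative choices, not values from the text.

```python
import cmath
import math

K, g, M = 4, 1, 8      # small illustrative sizes (assumed)
N = 4 * K              # samples per symbol at a rate of 4K/Ts
Er = 2.0               # reference energy (arbitrary)

# QPSK data on the sub-carriers K, ..., 2K-g-1; sub-carrier 0 carries the reference.
data = {k: cmath.exp(1j * math.pi * (0.25 + 0.5 * (k % 4)))
        for k in range(K, 2 * K - g)}

# Flat single-path channel: unit gain and a different phase at each antenna.
h = [cmath.exp(1j * 1.1 * m) for m in range(M)]


def baseband(n):
    """Transmit baseband sample n: reference tone plus data sub-carriers."""
    return math.sqrt(Er) + sum(
        x * cmath.exp(2j * math.pi * k * n / N) for k, x in data.items())


# Squaring at each antenna, summing and low-pass filtering leaves the
# sum of squared envelopes (constant scaling dropped).
r_lpf = [sum(abs(h_m * baseband(n)) ** 2 for h_m in h) for n in range(N)]


def demod(k):
    """OFDM demodulation (DFT bin k) of the real low-pass signal."""
    return sum(r_lpf[n] * cmath.exp(-2j * math.pi * k * n / N)
               for n in range(N)) / N


# Remove the known scaling M*sqrt(Er) to recover the data symbols.
recovered = {k: demod(k) / (M * math.sqrt(Er)) for k in data}
```

Under these assumptions the data-data cross products fall on bins below K, so each data bin contains only the reference-data cross product, scaled by the array gain M. The unknown per-antenna phases in `h` cancel without any channel estimation.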
B. Noise Component Analysis
From (3.6), the noise component of the OFDM demodulation output Yk, for k∈, can be expressed as:
Note that the noise consists of both: noise-noise cross product and data-noise cross product terms. Given the transmit data vector x and channel impulse response h(t), the conditional mean of the noise components can be computed as:
{Zk|x,h}=0,
where we use the fact that the noise process n(t) is stationary, has a zero mean and
Similarly, the conditional second order statistics of the noise components can be computed as (detailed steps are given in Appendix 8):
where a,b≥K, βa,b(h)||2, (a,b){(k1,k2)|k1, k2∈k1−k2=a−b, k2≥b} and (a,b){(k1, k2)|k1,k2 ∈, k1−k2=a+b−4K, k2≥b}. Clearly, from (3.9a)-(3.9b) we see that the noise components at the OFDM outputs are mutually correlated and are further dependent on the data vector x. For reducing computational complexity, we consider the sub-optimal approach where each sub-carrier data is decoded independently. Under this assumption, the noise variances, averaged over the transmit data vector x reduce to:
k,k(h)=M[(2K−k)N02
+N0⊕0,0(Er+Σk
k,k(h)=0, (3.10b)
where k∈ and we use the fact that the data streams have a zero mean and are mutually independent (see Section II). We shall henceforth approximate the noise component at each OFDM output {Zk|k∈} to be jointly Gaussian distributed. While this allows finding a lower bound to the system capacity, the accuracy of this assumption is also justified via simulations in Section V.
IV. Performance Analysis
Using (3.7) and (3.10), the effective SISO channel between the transmit data and the k-th demodulation output (k∈) can be expressed as:
Yk=M√{square root over (ErEd(k))}βk,0(h)
where
Note that even though the RX does not explicitly perform channel estimation, we still observe a beamforming gain of M in γk(h(t)). However since the different MPCs do not add up in phase at the RX and the noise power varies with k, the system suffers from frequency selective fading, which causes some loss in performance. Similarly, we define the instantaneous sum rate (iSR) as:
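The frequency selectivity described above can be visualized with a brief sketch. It assumes, consistently with the square-and-sum model and orthogonal array responses, that the signal coefficient on sub-carrier k is proportional to the sum over MPCs of the path power times exp(−j2πfkτ), with fk=k/Ts. This form, the delays, the path powers and the value of K are illustrative assumptions; only Ts=2 μs is taken from the simulation settings.

```python
import cmath
import math

Ts = 2e-6                          # symbol duration (from the simulation settings)
delays = [0.0, 50e-9, 100e-9]      # hypothetical MPC delays in seconds
powers = [0.6, 0.3, 0.1]           # hypothetical squared path gains
K = 64                             # hypothetical number of sub-carriers per half band


def beta(k):
    """Assumed signal coefficient between sub-carrier k and the reference tone."""
    f_k = k / Ts
    return sum(p * cmath.exp(-2j * math.pi * f_k * tau)
               for p, tau in zip(powers, delays))


# Squared magnitude of the coefficient across the data sub-carriers K, ..., 2K-1.
profile = [abs(beta(k)) ** 2 for k in range(K, 2 * K)]
```

The profile never exceeds the value at k=0, where all paths add in phase, and it varies across the data sub-carriers because the paths accumulate different phase rotations at different frequency offsets.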
where we neglect the cyclic prefix overhead for convenience.
A. Power Allocation
Since both the signal and noise variances in (3.11) are affected by the transmit powers in a non-linear way, finding the iSR maximizing power allocation to the data and reference tone is difficult. We shall therefore rely on the following sub-optimal solution.
Lemma 1 For any feasible power allocation {Er, Ed(k)|k∈}, we have: γk(h(t))≤2{tilde over (γ)}k(h(t)) for all k∈, where {tilde over (γ)}k(h(t)) is the effective SNR with:
Proof Case 1: Let Er≥Es/2. Then for any power allocation {Er, Ed(k)|k∈} and an k∈ we have:
where follows from the AM-GM inequality, and follows by noting that Σk
Case 2: If on the other hand Er<Es/2, then from (3.12), we can write for any k∈:
where follows from the fact that {tilde over (E)}r>Er and 2{tilde over (E)}d(k)>Ed(k)>{tilde over (E)}d(k) (see (3.14)). Therefore the lemma follows.
As a consequence of Lemma 1, using Er=Es/2 can at worst cause a 3 dB loss in the SNR of the data streams. Note that the SNR expression in (3.12) can be approximated as:
which is obtained by replacing Ed(k) by (Es−Er)/(K−g). Now using {circumflex over (γ)}k(h(t)) instead of γk(h(t)) in (3.13) (W. Bar and F. Dittrich, “Useful formula for moment computation of normal random variables with nonzero means,” IEEE Transactions on Automatic Control, vol. 16, pp. 263-265, June 1971) with Er=Es/2 and Σk Ed(k)=Es/2, a sub-optimal iSR maximizing power allocation for {Ed(k)|k∈} can be obtained by the water-filling algorithm. In fact, it can be shown that this allocation is optimal, as
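For reference, a generic water-filling routine over parallel sub-carriers can be sketched as follows. It is a textbook formulation, not the exact allocation derived here: the list `gains` is a hypothetical stand-in for the per-sub-carrier SNR per unit of data energy implied by the approximate SNR expression, and the water level is found by bisection.

```python
def water_filling(gains, total_energy, iters=100):
    """Maximise sum(log2(1 + g_k * e_k)) subject to sum(e_k) = total_energy
    and e_k >= 0, by bisection on the water level."""
    lo = 0.0
    # At this upper level the allocated energy is at least total_energy.
    hi = total_energy + max(1.0 / g for g in gains)
    for _ in range(iters):
        level = 0.5 * (lo + hi)
        used = sum(max(0.0, level - 1.0 / g) for g in gains)
        if used > total_energy:
            hi = level
        else:
            lo = level
    return [max(0.0, lo - 1.0 / g) for g in gains]


# Hypothetical gains for three sub-carriers and a unit energy budget.
alloc = water_filling([4.0, 1.0, 0.1], 1.0)
```

With these example gains the weakest sub-carrier receives no energy, and the remaining budget is split so that the stronger sub-carrier gets the larger share.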
V. Simulation Results
For simulations we consider a SIMO system, where the RX has a half-wavelength spaced uniform linear array (M=64) with one down-conversion chain and is equipped with a MA-FSR RX. The TX transmits OFDM symbols with Ts=2 μs, Tcp=0.2 μs, g=5 and fc=30 GHz. The phase noise at the TX is modeled as a Wiener process with {|θ(t+Ts)−θ(t)|2}=π2. We consider a sample channel impulse response h(t) with: L=3, =50|−1|ns, =(−1)π/10, ={−exp/σϕ}, σϕ=π/10 and []i=. We also assume perfect timing synchronization, and perfect knowledge of {βk,0|k∈} at the RX. For this h(t), the symbol error rate (SER) for (3.6), obtained by Monte-Carlo simulations, is compared to the analytical SER for the effective channel (3.11) in
We next compare the analytical iSR of MA-FSR (3.13) to the iSR of a coherent RX with analog beamforming that only occupies half the bandwidth, i.e., |f|≤K/Ts, in
The results show that for β0,0Es/N0≥10 dB, MA-FSR suffers from an SNR loss of ≤9 dB in comparison to analog beamforming. However, at lower values, the SNR loss increases significantly, as is also evident from (3.13). Note that β0,0Es/N0=10 dB corresponds to a per sub-carrier SNR of around −10 dB without the RX beamforming gain, and thus indeed represents a scenario where the RX beamforming gain is essential. Furthermore, we observe that the performance of equal data power allocation is comparable to that of water-filling. However, these results depend on L. Larger values of L intensify the frequency selective fading of MA-FSR, thereby possibly increasing the SNR loss in
VI. Improved MA-FSR Designs
Note that the MA-FSR RX performance degrades significantly below a certain threshold SNR. This is mainly due to the noise enhancement resulting from the squaring operation, which leads to the large noise-noise cross term in (3.8). Since the transmit signal is mainly restricted to frequencies
(ignoring phase noise), the impact of this noise enhancement can be significantly reduced by suppressing the noise at lower frequencies by a factor of √ϵ using a filter, as illustrated in
where Esum(k)=Er+Ed(k)+Σk
VII. Conclusion
In this work, a novel non-coherent massive antenna RX is proposed that requires only a single down-conversion chain, can support high data-rates and can perform RX beamforming without phase shifters or channel estimation. The MA-FSR RX essentially uses the received signal for a reference tone to combine the received signal corresponding to the data, via a squaring operation at each antenna. The carefully designed sub-carrier allocation prevents inter-carrier interference. The analysis suggests that the effective channel between the sub-carrier inputs and the demodulated outputs behaves like a parallel Gaussian channel with frequency selective fading, where the frequency selectivity arises from the modulating frequency mismatch between the reference and data sub-carriers and from the varying noise levels. These varying noise levels are caused by the noise enhancement introduced by the squaring operation at the RX. The simulation results show that MA-FSR suffers only ≈6 dB SNR loss in comparison to analog beamforming in sparse channels, as long as the mean received power is above a certain threshold. This threshold behavior is caused by the noise enhancement of the squaring operation, and several improved FSR designs that can reduce its impact are also proposed.
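The threshold behavior attributed to the squaring operation can be illustrated with a toy scalar model. This is only a sketch under simplifying assumptions and is not the effective channel (3.11): a single antenna receives a constant of power S plus unit-variance circularly symmetric complex Gaussian noise and then takes the squared magnitude. In this toy model the output contains a signal-noise cross term of variance 2S and a noise-noise term of variance 1, so the output SNR is S²/(2S+1). The function names below are hypothetical.

```python
import numpy as np

def square_law_output_snr(snr_in):
    """Analytical output SNR of the toy square-law model: S^2 / (2S + 1)."""
    return snr_in ** 2 / (2.0 * snr_in + 1.0)

def simulate_square_law_snr(snr_in, n=200_000, seed=0):
    """Monte-Carlo estimate of the same quantity."""
    rng = np.random.default_rng(seed)
    # Unit-variance circularly symmetric complex Gaussian noise.
    noise = (rng.standard_normal(n) + 1j * rng.standard_normal(n)) / np.sqrt(2)
    z = np.abs(np.sqrt(snr_in) + noise) ** 2   # squaring at the antenna
    return snr_in ** 2 / z.var()               # signal power / fluctuation power

if __name__ == "__main__":
    for snr_db in (-10, 0, 10, 20):
        s = 10 ** (snr_db / 10)
        out_db = 10 * np.log10(square_law_output_snr(s))
        print(f"input {snr_db:4d} dB -> output {out_db:6.1f} dB")
```

At high input SNR the output SNR grows linearly with S (a fixed loss), while at low input SNR it falls as S², so every 1 dB lost at the input costs about 2 dB at the output. This is the qualitative mechanism behind a threshold SNR.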
From (3.8), the conditional cross-covariance between the noise components at the a-th and b-th sub-carriers can further be computed as:
where follows from the fact that n(t) is zero-mean Gaussian and therefore the odd moments of n(t) are zero; follows by using the identity Re{A}Re{B}=½Re{AB*+AB} for any scalars A, B, and by ignoring the terms involving pseudo-covariance of the circularly symmetric Gaussian noise and follows by defining
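for any 1≤i≤M, using the results on expectation of a product of four Gaussian random variables [15] (W. Bar and F. Dittrich, “Useful formula for moment computation of normal random variables with nonzero means,” IEEE Transactions on Automatic Control, vol. 16, pp. 263-265, June 1971), and from the orthogonality of the array response vectors. Defining a new variable w=v−u and using change of variables, we can approximate a,b(x, h) as: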
where follows by assuming that the phase noise ejθ(t) is constant within the support of the noise auto-correlation function Rn[w] and follows by changing the summation limits since Rn[w] has a very narrow support of around O(1) and by defining βa,b||2. Note that is accurate as long as the system bandwidth is much larger than the phase noise bandwidth [16]. Now taking a summation over u, we obtain:
where we define (a, b){(k1,k2)|k1, k2 ∈, k1−k2=a−b} and the remaining terms vanish since |a−b|≤K−1<k1, k2≤2K−g−1. Note that the sampled noise auto-correlation function can be expressed in terms of the power spectral density as:
Thus, substituting Rn[w] we have:
where follows from the identity:
and using the fact that Sn(f) is non-zero only in the range 0≤f≤2K/Ts and we define (a, b){(k1, k2)∈(a, b)|k2≥b}. Using a similar sequence of steps the noise pseudo-covariance can be computed as:
a,b(x,h)=Mβk
where we define (a,b){(k1, k2)|k1,k2 ∈, k1−k2=a+b−4K, k2≥b}.
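Two ingredients of the derivation above can be checked numerically. The sketch below is an illustration only: it verifies the identity Re{A}Re{B}=½Re{AB*+AB} on random complex scalars, and it checks the simplest instance of the four-Gaussian-variable moment result, namely E{|n|⁴}=2σ⁴ for zero-mean circularly symmetric complex Gaussian n with variance σ². The function names and the sample sizes are hypothetical choices for the example.

```python
import numpy as np

def re_product_identity_gap(num=1000, seed=1):
    """Largest deviation between Re{A}Re{B} and 0.5*Re{A B* + A B}."""
    rng = np.random.default_rng(seed)
    a = rng.standard_normal(num) + 1j * rng.standard_normal(num)
    b = rng.standard_normal(num) + 1j * rng.standard_normal(num)
    lhs = a.real * b.real
    rhs = 0.5 * np.real(a * np.conj(b) + a * b)
    return float(np.max(np.abs(lhs - rhs)))

def fourth_moment_estimate(sigma2=1.5, num=400_000, seed=2):
    """Monte-Carlo estimate of E{|n|^4} for n ~ CN(0, sigma2)."""
    rng = np.random.default_rng(seed)
    n = np.sqrt(sigma2 / 2) * (rng.standard_normal(num)
                               + 1j * rng.standard_normal(num))
    return float(np.mean(np.abs(n) ** 4))

if __name__ == "__main__":
    print("identity gap:", re_product_identity_gap())
    print("E|n|^4 estimate:", fourth_moment_estimate(), "expected:", 2 * 1.5 ** 2)
```

The second check uses zero-mean noise, matching the assumption on n(t) in the derivation; the general nonzero-mean formula is not exercised here.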
While exemplary embodiments are described above, it is not intended that these embodiments describe all possible forms of the invention. Rather, the words used in the specification are words of description rather than limitation, and it is understood that various changes may be made without departing from the spirit and scope of the invention. Additionally, the features of various implementing embodiments may be combined to form further embodiments of the invention.
This application is the U.S. national phase of PCT Appln. No. PCT/US2019/025587 filed Apr. 3, 2019, which claims the benefit of U.S. Provisional Application No. 62/652,056 filed Apr. 3, 2018, the disclosures of which are hereby incorporated in their entirety by reference herein.
Filing Document | Filing Date | Country | Kind |
---|---|---|---|
PCT/US2019/025587 | 4/3/2019 | WO |
Publishing Document | Publishing Date | Country | Kind |
---|---|---|---|
WO2019/195426 | 10/10/2019 | WO | A |
Number | Date | Country | |
---|---|---|---|
20210167996 A1 | Jun 2021 | US |
Number | Date | Country | |
---|---|---|---|
62652056 | Apr 2018 | US |