METHOD CONFIGURING A PLURALITY OF TRANSMIT ANTENNAS

TECHNICAL FIELD

The present invention relates to the field of decoding digital communications in overloaded channels.

BACKGROUND

Spatial modulation (SM) is a promising technique that can reduce the hardware complexity and cost of massive multiple-input multiple-output (MIMO) wireless communication systems without sacrificing bit error rate (BER) and spectral efficiency (SE) performance. In particular, in SM schemes, information bits are embedded not only in the selection of transmit symbols (a.k.a. the constellation dimension), but also in the selection of the transmit antennas utilized during transmission (a.k.a. the spatial dimension).

Through this approach, the vast spatial resources associated with massive MIMO systems can be efficiently utilized without requiring an equally large number of radio-frequency (RF) chain components. This efficient utilization of RF chains makes SM schemes attractive for future wireless systems such as beyond fifth generation (B5G), which will continue to make extensive use of millimeter-wave (mmWave) bands, and sixth generation (6G) networks [6], which are expected to also incorporate Terahertz and visible light communications (VLC) bands.

A major drawback of early SM schemes is, however, that only one antenna is selected per transmission, which severely limits the achievable SE. In order to circumvent this limitation, a generalized spatial modulation (GSM) scheme was later developed, in which multiple antennas are selected at each transmission, leading to a substantial increase in SE. Another drawback of early SM methods, including GSM schemes, was the exclusive focus on increasing SE without a matching effort to reduce BER, e.g., via the exploitation of transmit diversity. That limitation motivated the idea of combining SM with space-time coding (STC), examples of which are the space-time shift keying (STSK) schemes based on linear dispersion (LD) coding, the methods incorporating space-time block coding (STBC), and the spatial modulation with cyclic structure (CSM).

Building on this knowledge, the SM transmitter design was further optimized, leading to the discovery of quadrature spatial modulation (QSM) approaches in which the SM concept is independently applied to each of the real and imaginary components of the modulated signals, via dedicated spatial-temporal dispersion matrices. The idea was further developed in a succession of QSM techniques with progressively enhanced dispersion matrix designs, which include the diversity-achieving quadrature spatial modulation (DA-QSM) scheme, obtained by the incorporation of Alamouti codes, and the more recent enhanced diversity-achieving quadrature spatial modulation (EDA-QSM) method, in which dispersion matrices are constructed using the full-diversity full-rate (FDFR) codes with block-by-block sphere-decodability of [24]. Among all aforementioned schemes, the EDA-QSM is the best QSM scheme known today, both in terms of BER and SE performance.

Despite these advantages, the EDA-QSM scheme, and as a consequence the preceding QSM schemes, still have two major shortcomings. The first is that the dispersion matrices used in QSM schemes proposed so far are based on 2×2 STBCs, which limits both the diversity and coding gains achieved by the methods. With regard to that first limitation, we will in fact show in this article that QSM designs based on STBCs of a size T that does not scale with n_T are fundamentally sub-optimal in the SE sense. The second shortcoming is that current QSM detection schemes are based either on exhaustive maximum likelihood (ML) or, at best, sphere detectors. Here, it is worth noting that it has actually been shown, contrary to previous claims, that sphere decoding has an average complexity that still grows exponentially with the number of jointly decoded symbol periods. That result was corroborated by other findings, where a cubic closed-form expression for the expected complexity of sphere detectors was derived, and where it was shown that lattice reduction does not improve the tail exponent of the complexity distribution of sphere detectors. With regard to that second limitation, it will be shown in this application that the complexities of ML- and sphere detection (SD)-based QSM receivers are both geometric in n_T and T, with P as exponent, such that these techniques are fundamentally non-scalable in the context of QSM systems. In other words, a severe two-fold scalability challenge exists among current QSM schemes, namely, the absence of scalable transmitter and receiver designs.

Motivated by this challenge, this application contributes a new QSM solution that is both scalable to arbitrary block sizes at the transmitter side (i.e., with no limits on n_T, T and P) and decodable in polynomial time at the receiver side (i.e., practical for large n_T and T, with moderate P). As a bonus, the proposed QSM scheme offers every possibility to optimize SE, diversity and coding gains. To that end, we first introduce the optimal FDFR Golden STBC in the design of the QSM dispersion matrices. The Golden code is a fast-decodable STBC known to be optimal (i.e., FDFR with the highest coding gain over Gaussian constellations), and it has been shown to be constructible for arbitrary block sizes. The resulting optimized scalable QSM (OS-QSM) scheme is the first method proposed so far that has this feature.

The new OS-QSM design is further enhanced with a new algorithm to select the indices of the dispersion matrices employed in the scheme, which ensures that all transmit antennas are utilized as often and with the same likelihood over the transmission of multiple blocks, thus ensuring optimally diverse utilization of all spatial-temporal resources. Finally, in order to also ensure feasible decodability to the scalable transmitter design, a new greedy boxed iterative shrinkage thresholding algorithm (GB-ISTA) QSM detector based on sparse recovery methods is proposed.

Thanks to its sparse signal processing approach, the proposed decoding scheme does not require any restriction on the core code design, unlike preceding sphere-detection methods, which require block-diagonal fast-decodability. In addition, and most importantly, a major advantage of the newly proposed GB-ISTA QSM receiver is that it does not require a search over the large codebook space, unlike the ML and state codewords matched block-by-block sphere decoding (SCMB-SD) receivers. In fact, the complexity order of the proposed receiver is shown to be cubic in T, quadratic in P, and only linear in n_T.

All in all, the contributions of the article can be summarized as follows:

- Spectral Efficiency-Optimality: A closed-form expression for the optimum number of encoded symbols P required for a QSM scheme to achieve SE optimality is given, which, combined with the rate-optimality condition of STBCs, highlights the importance of systematic scalability of the STBC size T in the design of SE-optimal QSM schemes.
- Optimal Diversity and Coding Gains: A new Golden code-based quadrature spatial modulation (GQSM) transmission scheme is obtained via the design of dispersion matrices based on the 2×2 Golden code, which is known to achieve optimal coding gain over integer symbol constellations.
- Scalability of Transmitter: The new GQSM design is generalized via the extension of the 2×2 Golden code into its T×T FDFR STBC variation, yielding the OS-QSM scheme, which is applicable to arbitrary n_T, T and P.

- Optimality of Resource Utilization: In method 1, a new mechanism to select the optimal set of dispersion matrix indices is offered, which ensures that all Q spatial-temporal resources are equally utilized over time, as required for optimal diversity gain.
- Scalability at Receiver: A new low-complexity greedy iterative shrinkage thresholding algorithm (ISTA)-based demodulation algorithm for QSM schemes is proposed, which not only is feasible at large scales due to its linear complexity, but also can be applied to other STBC-QSM schemes.
- Complexity of Receiver: A novel complexity expression of the proposed receiver is derived and shown to be cubic on T, quadratic on P, and linear on n_T, in contrast to the ML and SD receivers which are geometric on T and n_T, with P as exponent.

Complex matrices and vectors are denoted in bold-face uppercase and lowercase letters, with their elements denoted by indexed normal lowercase letters, as in X, x and x_i, respectively. The real and the imaginary parts of a complex number x are respectively denoted by x^R and x^I, and for the sake of future convenience we define, for a complex vector x=[x₁, x₂, . . . , x_n]^T, the associated decoupled vector [x₁^R, x₁^I, . . . , x_n^R, x_n^I]^T, and, for a complex scalar x, the corresponding quadrature representation

$\check{x} \triangleq \begin{bmatrix} x^{R} & - x^{I} \\ x^{I} & x^{R} \end{bmatrix} .$

The quadrature operator $\check{(\cdot)}$ will also be applied to m×n complex matrices X, for which it yields the corresponding 2m×2n matrix,

$\check{X} \triangleq \begin{bmatrix} \check{x}_{1, 1} & \dots & \check{x}_{1, n} \\ \vdots & \ddots & \vdots \\ \check{x}_{m, 1} & \dots & \check{x}_{m, n} \end{bmatrix} .$
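The quadrature representation can be sketched numerically as follows. This is a minimal illustration, assuming NumPy; the function name `quad` is hypothetical and not part of the application. It builds the 2m×2n real matrix block by block and can be used to check that the representation preserves matrix products.

```python
import numpy as np

def quad(X):
    """Real 2m x 2n quadrature representation of an m x n complex array.

    Hypothetical helper: each entry x is replaced by [[x^R, -x^I], [x^I, x^R]].
    """
    X = np.atleast_2d(np.asarray(X, dtype=complex))
    m, n = X.shape
    out = np.empty((2 * m, 2 * n))
    out[0::2, 0::2] = X.real    # top-left of each 2x2 block:     x^R
    out[0::2, 1::2] = -X.imag   # top-right of each 2x2 block:   -x^I
    out[1::2, 0::2] = X.imag    # bottom-left of each 2x2 block:  x^I
    out[1::2, 1::2] = X.real    # bottom-right of each 2x2 block: x^R
    return out
```

Because each 2×2 block behaves like the complex number it encodes, `quad(A @ B)` agrees with `quad(A) @ quad(B)` for conformable complex matrices, which is what allows a complex model to be handled with real-valued tools.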

In turn, the complex conjugate, transpose, Hermitian, trace, vectorization, and diagonalization operators are denoted by (·)*, (·)^T, (·)^H, tr(·), vec(·), and diag(·), respectively, while the n×n identity and the m×n-sized all-zero and all-one matrices are respectively denoted by I_n, 0_{m×n}, and 1_{m×n}. The p-norm with p≥0 is denoted by ∥·∥_p, while |·| denotes either the element-wise absolute value (for vectors) or the cardinality (for sets), and the sets of real, complex, and integer numbers are denoted by ℝ, ℂ, and ℤ, respectively. Expectation is denoted by 𝔼[·], the floor to the nearest power of 2 is represented by $\lfloor \cdot \rfloor_{2^{\times}}$, and the conversion of a left-most-significant binary vector to the corresponding base-10 integer is denoted by [·]₍₁₀₎. The binomial coefficient is denoted by

$(\begin{matrix} Q \\ P \end{matrix}),$

and ⊗ denotes the Kronecker product. The projection of a scalar v onto the set χ is denoted by $\mathcal{P}_{χ}(v)$, and the complex Gaussian distribution with mean μ and variance σ² is denoted by $\mathcal{CN}(μ, σ^{2})$.

All previous state-of-the-art spatial modulation (SM) schemes decode SM signals either with ML detection, as in most SM schemes, or with modified tree-search algorithms, as in the state-of-the-art EDA-QSM scheme, and there have been no notable attempts at sparse detection or greedy approaches.

The tree-search algorithm is extremely unrealistic due to the highly combinatorial nature of SM and the resulting search space size. Furthermore, SM imposes various restrictions on the structure of the decoded signal, such that naïve low-complexity decoding is impossible.

The problem associated with the prior art is that there is no sparse detection solution for this problem, and solutions using other approaches feature prohibitive complexity.

Massive multiple-input multiple-output (MIMO) systems in beyond-fifth generation (B5G) and sixth generation (6G) wireless communications expect incorporation of many transmit and receive antennas.

Utilization of spatial modulation (SM) and its variants, such as quadrature spatial modulation (QSM), is one promising candidate for massive MIMO systems. However, as the system size grows, the classic decoding complexity of SM schemes becomes infeasible (i.e., the complexity is not affordable), as the SotA uses maximum likelihood (ML) decoding or ML-based tree-search algorithms.

The proposed detection method is presented as a flowchart in FIG. 12, which highlights the introduction of four novel method concepts:

- A) The notion of applying sparse detection (compressed sensing) algorithms to decode SM signals.
- B) (Marked B) in the flowchart of FIG. 12)
  - Building on the above idea of applying sparse detection, a modification of the iterative shrinkage-thresholding algorithm (ISTA) via boxing (range limiting) and hard-thresholding.
- C) (Marked C) in the flowchart of FIG. 12)
  - Using the boxed-hard ISTA from above, a greedy selection of the positions of the antenna indices and the symbol estimates, and the independent decoding of the corresponding “antenna modulated” and “symbol modulated” bits.
- D) (Marked D) in the flowchart of FIG. 12)
  - A process working in parallel with the greedy detections, to ensure that valid estimates of the index vectors (from the given finite set of index vectors) are produced at the output of the algorithm, and to apply interference cancellation with the confirmed values.
    - While keeping track of which indices have been retrieved from the greedy selections, check before every iteration whether a final confirmation can be made from the currently decoded indices.
    - If it cannot be made, remove the interference of the previous greedy selection and proceed to the next iteration.
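Concepts A) and B) above can be sketched as follows. This is only an illustrative sketch under stated assumptions: the exact thresholding function of the application is not reproduced in this section, so `boxed_hard_threshold` is a hypothetical stand-in that zeroes small entries and clips the remainder to a box, and `boxed_hard_ista` is a generic real-valued proximal-gradient loop built around it. The greedy selection and interference cancellation of concepts C) and D) are not shown.

```python
import numpy as np

def boxed_hard_threshold(s, tau, box):
    """Zero entries with |s| < tau, then clip the rest to [-box, box].

    Hypothetical stand-in for the boxed hard-thresholding step.
    """
    s = np.asarray(s, dtype=float)
    kept = np.where(np.abs(s) >= tau, s, 0.0)  # hard threshold
    return np.clip(kept, -box, box)            # boxing (range limiting)

def boxed_hard_ista(y, A, tau, box, n_iter=50):
    """Gradient step on ||y - A x||^2 followed by the boxed hard threshold."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2     # 1 / (largest singular value)^2
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = boxed_hard_threshold(x + step * A.T @ (y - A @ x), tau, box)
    return x
```

In this sketch `tau` and `box` are free parameters; in practice they would be tied to the constellation amplitude and the noise level, which is a design choice outside the scope of the illustration.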

The proposed decoder has significantly lower complexity than the existing ML and tree-search algorithms. The decoder has a quadratic complexity in the number of transmit antennas.

Furthermore, the proposed decoder also does not require design restrictions such as block-diagonality or orthogonality in the transmission scheme, only the sparsity which is inherent with spatial modulation schemes, therefore providing larger freedom in transmitter development as well. In other words, the proposed decoder is capable of coping with many different encoding structures (flexibility).

The decoder can be used in any MIMO system where the number of antennas is expected to be large, such that ML approaches are infeasible.

The scheme can be used to increase efficiency in cellular networks and V2X communications (eMBB).

These and other objects, features and advantages of the present invention will become clearer when the drawings as well as the detailed description are taken into consideration.

One embodiment of the computer-implemented decoding method comprises configuring a plurality of transmit antennas to each represent an in-phase spatial constellation symbol within an in-phase spatial constellation and a quadrature spatial constellation symbol within a quadrature spatial constellation, and mapping source data to the in-phase spatial constellation symbols and the quadrature spatial constellation symbols represented by the plurality of transmit antennas, wherein sparse detection (compressed sensing) algorithms are applied to decode the SM signals.

Another embodiment of the computer-implemented decoding method is characterized by a modification of the iterative shrinkage-thresholding algorithm (ISTA) via boxing (range limiting) and hard-thresholding.

Another embodiment of the computer-implemented decoding method is characterized by using the boxed-hard ISTA to perform a greedy selection of the positions of the antenna indices and the symbol estimates, and by independently decoding the corresponding antenna-modulated and symbol-modulated bits.

Another embodiment of the computer-implemented decoding method is characterized by a process working in parallel with the greedy detections, which ensures that valid estimates of the index vectors from the given finite set of index vectors are produced as an output and applies interference cancellation with the confirmed values,

- while keeping track of which indices have been retrieved from the greedy selections, checking before every iteration whether a final confirmation can be made from the currently decoded indices, and
- if it cannot be made, removing the interference of the previous greedy selection and proceeding to the next iteration.

Another embodiment is characterized by a receiver (R) of a communication system having a processor, volatile and/or non-volatile memory, and at least one interface adapted to receive a signal in a communication channel, wherein the non-volatile memory stores computer program instructions which, when executed by the processor, configure the receiver to implement the decoding method of one or more embodiments cited above.

Another embodiment is characterized by a computer program product comprising computer executable instructions, which, when executed on a computer, cause the computer to perform the decoding method of one or more embodiments cited above.

Another embodiment is characterized by a computer-readable medium storing and/or transmitting the computer program product cited above.

Another embodiment is characterized by a vehicle unit comprising a communication system with a receiver (R) in a vehicle, wherein the system is adapted to execute the decoding method of one or more embodiments cited above.

Another embodiment is characterized by a vehicle having one or more vehicle units cited above.

All aspects in this application can be integrated in mobile devices, base stations and components in wireless systems. All the described components can be integrated in vehicles.

BRIEF DESCRIPTION OF THE DRAWINGS

For a fuller understanding of the nature of the present invention, reference should be had to the following detailed description taken in connection with the accompanying drawings in which:

FIG. 1 shows a schematic diagram depicting the generic structure of a QSM transmission scheme.

FIG. 2 shows the Spectral efficiency of OS-QSM scheme with T=2, 4, and 8, for a given system with n_T=8 and M=4.

FIG. 3 shows the Effect of T and M on the optimum ratio P*/T between the number of transmit symbols and epochs.

FIG. 4 shows the Behavior of fractional peak spectral efficiency as a function of n_T, for different sizes of T and M.

FIG. 5 shows the Bipartite graph representing the spatial-temporal resource usage associated with each index vector k_n for a QSM system with P=3 and Q=8. The particular examples of k₁, k₃₇ and k₅₄ are explicitly illustrated.

FIG. 6 shows the Comparison of ISTA thresholding and BH-ISTA thresholding functions Λ(s; τ) as per [29], and Π(s; τ), as per equation (29).

FIG. 7 shows the Convergence of $\hat{u}_{Π}^{(η)}$ and $\hat{u}_{Λ}^{(η)}$, as per equations (28) and (30), respectively, as a function of iterations η;

FIG. 7(a) shows the Sparsity convergence with various threshold values;

FIG. 7(b) shows the MSE convergence with optimal threshold values.

FIG. 8 shows the Schematic diagram depicting the structure of the proposed GB-ISTA receiver for QSM demodulation.

FIG. 9 shows the Effect of scalable parameters on the complexity of QSM receivers;

FIG. 9 (a) shows the Fixed P, various T, as a function of n_T;

FIG. 9 (b) shows the Fixed T, various P, as a function of n_T;

FIG. 9 (c) shows the Fixed n_T, various T, as a function of P.

FIG. 10 shows the Effect of scalability on BER performance of GB-ISTA-detected OS-QSM schemes with fixed SE.

FIG. 11 shows the Effect of scaling P on BER performance of GB-ISTA-detected OS-QSM schemes.

FIG. 12 illustrates the proposed computer-implemented detection method represented as a flowchart.

FIG. 13 shows the Comparison of ISTA thresholding and BH-ISTA thresholding functions Λ(s; τ), and Π(s; τ).

Quadrature spatial modulation (QSM) schemes are considered, which are capable of conveying large numbers of bits via a combination of transmitting a relatively small number P of M-ary modulated symbols from a dynamic selection of the n_T transmit antennas, according to a designed dispersion pattern.

DETAILED DESCRIPTION

Following are detailed descriptions of concepts, system/network architectures, and detailed designs for many aspects of a wireless communications network targeted to address the requirements and use cases for 5G. The terms “requirement,” “need,” or similar language are to be understood as describing a desirable feature or functionality of the system in the sense of an advantageous design of certain embodiments, and not as indicating a necessary or essential element of all embodiments. As such, in the following each requirement and each capability described as required, important, needed, or described with similar language, is to be understood as optional.

In the discussion that follows, this wireless communications network, which includes wireless devices, radio access networks, and core networks, is referred to as “NX.” It should be understood that the term “NX” is used herein as simply a label, for convenience. Implementations of wireless devices, radio network equipment, network nodes, and networks that include some or all of the features detailed herein may, of course, be referred to by any of various names. In future development of specifications for 5G, for example, the terms “New Radio,” or “NR,” or “NR multi-mode” may be used—it will be understood that some or all of the features described here in the context of NX may be directly applicable to these specifications for NR. Likewise, while the various technologies and features described herein are targeted to a “5G” wireless communications network, specific implementations of wireless devices, radio network equipment, network nodes, and networks that include some or all of the features detailed herein may or may not be referred to by the term “5G.” The present invention relates to all individual aspects of NX, but also to developments in other technologies, such as LTE, in the interaction and interworking with NX. Furthermore, each such individual aspect and each such individual development constitutes a separable embodiment of the invention.

FIG. 1 shows a schematic diagram depicting the generic structure of a QSM transmission scheme.

A. System Model

Consider a point-to-point (P2P) MIMO communication system in which a transmitter equipped with n_T transmit antennas exchanges information with a receiver equipped with n_R receive antennas employing SM. The received signal corresponding to T consecutive time slots, during which the channel is assumed to be constant, can be compactly written as

$\begin{matrix} Y = HX + V \in ℂ^{n_{R} \times T}, & (1) \end{matrix}$

where $Y \in ℂ^{n_{R} \times T}$ is the matrix collecting the signals received at each antenna and time slot, $H \in ℂ^{n_{R} \times n_{T}}$ is the flat-fading channel matrix with elements $h_{i,j} \sim \mathcal{CN}(0, 1)$, $X \in ℂ^{n_{T} \times T}$ is the space-time transmit signal, and $V \in ℂ^{n_{R} \times T}$ is the additive white Gaussian noise (AWGN) matrix with elements $v_{i,j} \sim \mathcal{CN}(0, N_{0})$, where N₀ is the noise variance.

It is assumed hereafter that the quasi-static Rayleigh fading channel matrix H is known at the receiver but not at the transmitter, and we remark that since the channel power per matrix entry is unitary, the fundamental signal-to-noise ratio (SNR) is given by

$ρ \overset{Δ}{=} \frac{1}{N_{0}} 𝔼 [tr (X^{H} X)] .$
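The system model of equation (1) can be simulated with a few lines of code. The following is a minimal sketch, assuming NumPy; the helper names `crandn`, `receive` and `block_energy` are hypothetical and chosen only for illustration.

```python
import numpy as np

def crandn(rng, shape, var=1.0):
    """i.i.d. circularly-symmetric complex Gaussian samples CN(0, var)."""
    re = rng.standard_normal(shape)
    im = rng.standard_normal(shape)
    return np.sqrt(var / 2) * (re + 1j * im)

def receive(X, n_R, N0, rng):
    """Return (Y, H) with Y = H X + V as in equation (1)."""
    n_T, T = X.shape
    H = crandn(rng, (n_R, n_T))         # flat-fading channel, CN(0, 1) entries
    V = crandn(rng, (n_R, T), var=N0)   # AWGN, CN(0, N0) entries
    return H @ X + V, H

def block_energy(X):
    """tr(X^H X) for one realization; the SNR averages this and divides by N0."""
    return np.trace(X.conj().T @ X).real
```

Averaging `block_energy` over many transmit blocks and dividing by N₀ gives an empirical estimate of the SNR ρ defined above.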

In turn, in accordance with related QSM literature and as illustrated in FIG. 1, the transmit signal matrix X is constructed in a manner to convey the information of a bit sequence b, both in the form of P digitally modulated signals, as well as in the form of the allocation of such transmissions to different antennas and time instances, as described by

$\begin{matrix} X = \sum_{p = 1}^{P} (s_{p}^{R} A_{k_{p}^{R}} + s_{p}^{I} B_{k_{p}^{I}}), & (2) \end{matrix}$

where $s_{p} = s_{p}^{R} + j s_{p}^{I}$, with p∈{1, . . . , P}, are transmit symbols chosen from a complex constellation $\mathcal{S}$ of cardinality $|\mathcal{S}| = M$; $A_{k_{p}^{R}}$ and $B_{k_{p}^{I}}$ are dispersion matrices belonging to the sets $\mathcal{A} = \{A_{q}\}_{q=1}^{Q}$ and $\mathcal{B} = \{B_{q}\}_{q=1}^{Q}$ of complex $n_{T} \times T$ matrices, with $Q \triangleq T \times n_{T}$; and the indices $k_{p}^{R}$ and $k_{p}^{I}$ are the p-th elements of the index vectors k^R and k^I, respectively, which are selected from an optimized set of index vectors $\mathcal{K} = \{k_{n}\}_{n=1}^{N}$, each of length P, with

$N \overset{Δ}{=} {⌊ (\begin{matrix} Q \\ P \end{matrix}) ⌋}_{2^{\times}} .$

With regard to equation (2), and again referring to FIG. 1, we clarify that in QSM schemes the bit sequence b is subdivided into a sequence b^S, of length $P \log_{2} |\mathcal{S}| = P \log_{2} M$, which corresponds to the information encoded in the symbols s={s₁, . . . , s_P}, taken from the constellation $\mathcal{S}$, and the conjugate sequences b^R and b^I, both of length $\log_{2} |\mathcal{K}| = \log_{2} N$, which correspond to the information encoded in the selection of spatial-temporal resources according to the dispersion matrix index vectors k^R and k^I from the index vector set $\mathcal{K}$.
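The bit budget described above can be made concrete with a short calculation. The sketch below uses only the Python standard library; the function names are hypothetical, and M is assumed to be a power of two.

```python
from math import comb, log2

def floor_pow2(x):
    """Largest power of two not exceeding the positive integer x."""
    return 1 << (x.bit_length() - 1)

def bit_budget(Q, P, M):
    """Return (N, len(b^S), len(b^R) = len(b^I), total number of bits)."""
    N = floor_pow2(comb(Q, P))        # number of usable index vectors
    index_bits = N.bit_length() - 1   # log2(N), exact since N is a power of two
    symbol_bits = P * int(log2(M))    # assumes M is a power of two
    return N, symbol_bits, index_bits, symbol_bits + 2 * index_bits
```

For example, with Q=8, P=3 and M=4 there are 56 index combinations, of which N=32 are usable, so that 6 symbol bits and 5+5 index bits are conveyed per block.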

In view of the above, it can be said that the design of a specific QSM scheme amounts essentially to the method employed in the construction of each of the Q dispersion matrices A_q and B_q in the sets $\mathcal{A}$ and $\mathcal{B}$, and the selection of the set $\mathcal{K}$ containing the index vectors k^R and k^I which inform the choices of dispersion matrices used in each transmission.

To exemplify how state-of-the-art (SotA) QSM schemes can be cast into the general framework described by equation (2), consider first the QSM scheme proposed in [19]. In this case, the dispersion matrices reduce to dispersion vectors (i.e., T=1 and Q=n_T) which are given by

$\begin{matrix} A_{q} = e_{q} and B_{q} = j e_{q}, & (3) \end{matrix}$

where e_q is the q-th column of I_Q, and no specific design criteria are given for the selection of the indices in the index vectors k^R and k^I.
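As an illustration of how equation (2) specializes under equation (3), the following sketch builds the transmit vector of the basic QSM scheme. It assumes NumPy; the function name is hypothetical, and indices are taken as 1-based to match the text.

```python
import numpy as np

def qsm_transmit_vector(symbols, k_R, k_I, n_T):
    """Equation (2) with A_q = e_q and B_q = j e_q (T = 1, Q = n_T).

    symbols: P complex symbols; k_R, k_I: P one-based antenna indices each.
    """
    I = np.eye(n_T)
    X = np.zeros((n_T, 1), dtype=complex)
    for s, kr, ki in zip(symbols, k_R, k_I):
        X[:, 0] += s.real * I[:, kr - 1]        # s_p^R A_{k_p^R}
        X[:, 0] += s.imag * 1j * I[:, ki - 1]   # s_p^I B_{k_p^I}
    return X
```

With a single symbol s=1−j, k^R=[2] and k^I=[4] on n_T=4 antennas, the real part is radiated from antenna 2 and the imaginary part from antenna 4, while the remaining antennas stay silent.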

In turn, in the DA-QSM scheme of [20], two-column dispersion matrices (i.e., T=2 and Q=n_T) are employed so as to exploit transmit diversity. In particular, in this scheme

$\begin{matrix} A_{q} = M_{n_{T}}^{q - 1} \tilde{A} and B_{q} = j M_{n_{T}}^{q - 1} \tilde{B} & (4) \end{matrix}$

with

$\tilde{A} \overset{△}{=} [\begin{matrix} I_{2} \\ 0_{(n_{T} - 2) \times 2} \end{matrix}] and \tilde{B} \overset{△}{=} [\begin{matrix} M_{2} \\ 0_{(n_{T} - 2) \times 2} \end{matrix}],$

where M_n is the n×n cyclic lower-shift matrix obtained by circularly shifting the bottom row of I_n to the top, such that, e.g.,

$M_{3} = [\begin{matrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{matrix}],$

such that its (q−1)-th power pre-multiplied to a given matrix results in a shift of the bottom (q−1) rows of the latter to the top.
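The cyclic shift matrix and the DA-QSM dispersion matrices of equation (4) can be sketched as follows. This is an illustrative construction assuming NumPy, with hypothetical function names and a 1-based index q.

```python
import numpy as np

def shift_matrix(n):
    """n x n cyclic lower-shift matrix: bottom row of I_n moved to the top."""
    return np.roll(np.eye(n, dtype=int), 1, axis=0)

def da_qsm_dispersion(q, n_T):
    """A_q and B_q of equation (4) for a one-based index q."""
    zeros = np.zeros((n_T - 2, 2), dtype=int)
    A_tilde = np.vstack([np.eye(2, dtype=int), zeros])
    B_tilde = np.vstack([shift_matrix(2), zeros])
    Mq = np.linalg.matrix_power(shift_matrix(n_T), q - 1)
    return Mq @ A_tilde, 1j * (Mq @ B_tilde)
```

Here `shift_matrix(3)` reproduces the example M₃ given above, and increasing q by one moves the non-zero 2×2 core one row further down, wrapping around cyclically.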

From the above it is visible that the DA-QSM scheme improves over the QSM scheme of [19] essentially by adding diversity, i.e., by extending the transmission instances from T=1 to T=2. However, the dispersion matrices of the DA-QSM method are still real, just as those of the QSM scheme, implying that no additional multiplexing capability is gained and that the coding gain is not optimized.

In contrast, the EDA-QSM method improves over the latter on both aspects. In particular, in this scheme the dispersion matrices are more elaborately designed as

$\begin{matrix} A_{q} = A_{4 (ℓ - 1) + i} = e_{ℓ} \otimes C_{i} and B_{q} = B_{4 (ℓ - 1) + i} = e_{ℓ} \otimes D_{i}, & (5) \end{matrix}$

where e_ℓ is the ℓ-th column of I_L, with L ≜ n_T/2, the indices i∈{1, . . . , 4} and ℓ∈{1, . . . , L}, and the core matrices C_i and D_i are based on the Sezginer-Sari-Biglieri (SSB) STBC of [24] described by

$\begin{matrix} S_{S S B} = [\begin{matrix} s_{1} + b s_{3} & - s_{2}^{*} + {jbs}_{4}^{*} \\ s_{2} + b s_{4} & s_{1}^{*} - {jbs}_{3}^{*} \end{matrix}] = \sum_{i = 1}^{4} s_{i}^{R} C_{i} + s_{i}^{I} D_{i} & (6) \end{matrix}$

$\text{with } b \overset{△}{=} \frac{(1 - \sqrt{7}) + j (1 + \sqrt{7})}{4}, \text{ and}$

$\begin{matrix} C_{1} \overset{△}{=} I_{2}, C_{2} \overset{△}{=} Z, C_{3} \overset{△}{=} bW, and C_{4} \overset{△}{=} Z \cdot C_{3} & (7 a) \end{matrix}$

$\begin{matrix} D_{1} \overset{△}{=} {jM}_{2} \cdot Z, D_{2} \overset{△}{=} Z \cdot D_{1}, D_{3} \overset{△}{=} - {bM}_{2} \cdot W \cdot M_{2}, and D_{4} \overset{△}{=} Z \cdot D_{3}, & (7 b) \end{matrix}$

$where Z \overset{△}{=} [\begin{matrix} 0 & - 1 \\ 1 & 0 \end{matrix}] and W \overset{△}{=} [\begin{matrix} 1 & 0 \\ 0 & - j \end{matrix}]$
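The decomposition in equation (6) can be checked numerically from the definitions in equations (7a) and (7b). The sketch below assumes NumPy; the function names are hypothetical, and the Kronecker construction follows equation (5) with 1-based indices ℓ and i.

```python
import numpy as np

b = ((1 - np.sqrt(7)) + 1j * (1 + np.sqrt(7))) / 4
Z = np.array([[0, -1], [1, 0]], dtype=complex)
W = np.array([[1, 0], [0, -1j]])
M2 = np.array([[0, 1], [1, 0]], dtype=complex)

C = [np.eye(2, dtype=complex), Z, b * W, Z @ (b * W)]   # equation (7a)
D1 = 1j * M2 @ Z
D3 = -b * M2 @ W @ M2
D = [D1, Z @ D1, D3, Z @ D3]                            # equation (7b)

def ssb_codeword(s):
    """Left-hand side of equation (6) for four complex symbols."""
    s1, s2, s3, s4 = s
    return np.array([
        [s1 + b * s3, -np.conj(s2) + 1j * b * np.conj(s4)],
        [s2 + b * s4, np.conj(s1) - 1j * b * np.conj(s3)],
    ])

def ssb_from_dispersion(s):
    """Right-hand side of equation (6): sum of s_i^R C_i + s_i^I D_i."""
    return sum(si.real * Ci + si.imag * Di for si, Ci, Di in zip(s, C, D))

def eda_qsm_dispersion(l, i, n_T):
    """A_q and B_q of equation (5), with one-based l and i and L = n_T / 2."""
    e_l = np.eye(n_T // 2)[:, [l - 1]]
    return np.kron(e_l, C[i - 1]), np.kron(e_l, D[i - 1])
```

Comparing `ssb_codeword(s)` with `ssb_from_dispersion(s)` for random symbols is a quick way to confirm that the core matrices have been transcribed consistently with the codeword.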

Through the concise description above it becomes easy to see that the fundamental distinction between the DA-QSM and the EDA-QSM methods is that the dispersion matrices of EDA-QSM are complex-valued, such that the orthogonality between the real and imaginary dimensions is better exploited in order to reap multiplexing and coding gains.

Two fair criticisms that can be made of the aforementioned schemes—and in fact, to the best of our knowledge of all existing SotA QSM methods proposed so far—are, however: a) that the scheme does not scale systematically simultaneously over space and time, for arbitrary T>2; and b) that the coding gain achieved is not optimum. Mitigating these two limitations is the objective of our first contribution described in the following section.

III. Optimized Scalable Quadrature Spatial Modulation Transmitter Design
A. Spectral Efficiency Optimality of QSM Schemes

Given the number of bits carried by the transmission of each QSM transmit symbol X as per equation (2), and the fact that such transmission requires T successive channel uses, the SE ζ of any QSM scheme is given by

$\begin{matrix} ζ (P, T; M, n_{T}) = \frac{B}{T} = \frac{1}{T} (2 ⌊ \log_{2} (\begin{matrix} Q \\ P \end{matrix}) ⌋ + P \log_{2} M) & (8) \end{matrix}$

where we recall that Q ≜ Tn_T and adopt a notation meant to emphasize that P and T are seen as fundamental QSM design parameters, while M and n_T are considered to be system constraints.

FIG. 2 shows the Spectral efficiency of OS-QSM scheme with T=2, 4, and 8, for a given system with n_T=8 and M=4

The presence of the binomial coefficient

$(\begin{matrix} Q \\ P \end{matrix})$

in equation (8) implies that the SE function ζ(P, T; M, n_T) is monotonically decreasing in T, for a fixed P, and concave in P, for a fixed T, as well as in the ratio P/T. This is well illustrated in the plots offered in FIG. 2, from which it can be seen that in a system with n_T=8 and M=4, the highest attainable SEs, denoted by ζ*, are achieved with P=11, 21 and 42, for T=2, 4 and 8, respectively.
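Equation (8) is simple enough to evaluate exhaustively. The following sketch uses only the Python standard library; the function names are hypothetical, and `M_bits` stands for log₂M under the assumption that M is a power of two. Because of the floor in equation (8), several values of P can tie for the maximum, so the helper returns the smallest maximizer.

```python
from math import comb

def spectral_efficiency(P, T, M_bits, n_T):
    """Equation (8); M_bits = log2(M), assuming M is a power of two."""
    Q = T * n_T
    index_bits = comb(Q, P).bit_length() - 1   # floor(log2 C(Q, P))
    return (2 * index_bits + P * M_bits) / T

def best_P(T, M_bits, n_T):
    """Smallest P in 1..Q that maximizes equation (8)."""
    Q = T * n_T
    return max(range(1, Q + 1),
               key=lambda P: (spectral_efficiency(P, T, M_bits, n_T), -P))
```

For n_T=8, M=4 and T=2, the search returns P=11 with an SE of 23 bits per channel use, in line with the value quoted above for T=2.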

Motivated by the discussion above, we seek analytical expressions for the optimum ratio P/T that maximizes the SE, given n_T and M, which in turn can be used to determine the relative SE reduction incurred in setting T<n_T for large-scale systems with n_T→∞. To this end, consider the upper and lower bounds on the binomial coefficient, namely

$\begin{matrix} \frac{e^{- \frac{1}{8}}}{\sqrt{2 π P}} \left(\frac{Q}{Q - P}\right)^{Q + \frac{1}{2}} \left(\frac{Q - P}{P}\right)^{P} < \binom{Q}{P} < \overbrace{\frac{1}{\sqrt{2 π P}} \left(\frac{Q}{Q - P}\right)^{Q + \frac{1}{2}} \left(\frac{Q - P}{P}\right)^{P}}^{\triangleq β (P; Q)}, \quad \forall\, 1 \leq P < Q, & (9) \end{matrix}$

where for future convenience we implicitly defined the upper-bounding function β(P; Q).
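The bounds of equation (9) can be spot-checked numerically. The sketch below uses only the Python standard library; the function names are hypothetical, and the check is an illustration for specific values of Q, not a proof.

```python
from math import comb, exp, pi, sqrt

def beta(P, Q):
    """Upper-bounding function beta(P; Q) of equation (9), for 1 <= P < Q."""
    return (Q / (Q - P)) ** (Q + 0.5) * ((Q - P) / P) ** P / sqrt(2 * pi * P)

def bounds_hold(Q):
    """True if exp(-1/8) * beta < C(Q, P) < beta for every 1 <= P < Q."""
    return all(exp(-1 / 8) * beta(P, Q) < comb(Q, P) < beta(P, Q)
               for P in range(1, Q))
```

For instance, `bounds_hold(16)` covers every P available in a system with n_T=8 and T=2.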

Substituting equation (9) into equation (8) yields the bound

$\begin{matrix} ζ (P, T; M, n_{T}) < \overbrace{\frac{1}{T} \log_{2} \left(β^{2} (P; Q) \cdot M^{P}\right)}^{\triangleq ζ^{+} (P, T; M, n_{T})} . & (10) \end{matrix}$

Taking the derivative of the latter expression with respect to P yields

$\begin{matrix} \begin{aligned} \frac{\partial ζ^{+}}{\partial P} &= \frac{1}{T} \cdot \frac{\partial}{\partial P} \left[ 2 \log_{2} \left( \frac{1}{\sqrt{2 π P}} \left(\frac{Q}{Q - P}\right)^{Q + \frac{1}{2}} \left(\frac{Q - P}{P}\right)^{P} \right) + P \log_{2} (M) \right] \\ &= \frac{1}{T \ln (2)} \left[ 2 \left( \ln \left(\frac{1 - ε}{ε}\right) + \frac{1 - 2 ε}{2 Q (ε - 1) ε} \right) + \ln (M) \right] \\ &= \frac{1}{T \ln (2)} \left[ \ln \left( \frac{(1 - ε)^{2} \cdot M}{ε^{2}} \right) + \frac{1 - 2 ε}{Q ε (ε - 1)} \right], \end{aligned} & (11) \end{matrix}$

where in the second line we relaxed the constraint that P be an integer and expressed it more generally as P=εQ, introducing the positive quantity ε≤1.

Equating the expression in equation (11) to zero yields the following analytical implicit expression to determine the optimal number of symbols P* that maximizes the SE of a QSM system with Q=Tn_Tspatial-temporal resources and employing an M-ary constellation

$\begin{matrix} P^{*} = ⌊ ε Q ⌋ ❘ \frac{2 ε - 1}{(ε - 1) \ln ({(\frac{1 - ε}{ε})}^{2} M)} = εQ \overset{△}{=} P, & (12) \end{matrix}$

where we emphasize that the quantity on the right-hand side of the expression is in fact the sought-after number of transmitted symbols P.

But recalling that the desired P* is also the largest possible, equation (12) implies that

$\begin{matrix} P^{*} = ⌊ \max \left( \frac{2 ε - 1}{(ε - 1) \ln \left( {(\frac{1 - ε}{ε})}^{2} M \right)} \right) ⌋, & (13) \end{matrix}$

which in turn implies that the optimum ε is such that

${(\frac{1 - ε}{ε})}^{2} M = 1,$

i.e., the root of the quadratic equation (M−1)ε² − 2Mε + M = 0 that satisfies ε ≤ 1, which finally yields, simply

$\begin{matrix} P^{*} = ⌊ \frac{M - \sqrt{M}}{M - 1} Q ⌋ = ⌊ ε_{M}^{*} T n_{T} ⌋, & (14) \end{matrix}$

where we introduced the implicitly-defined optimum gradient

$ε_{M}^{*} \overset{△}{=} \frac{M - \sqrt{M}}{M - 1} .$
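The closed form above is easy to check numerically. The following is a minimal sketch, assuming M > 1; the function names `optimum_fraction`, `optimum_symbols` and `quadratic_residual` are illustrative and not part of the original text. It evaluates ε*_M, confirms that it is a root of the quadratic (M−1)ε² − 2Mε + M, and computes P* as in equation (14).

```python
import math


def optimum_fraction(M: int) -> float:
    """Closed-form eps*_M = (M - sqrt(M)) / (M - 1); requires M > 1."""
    return (M - math.sqrt(M)) / (M - 1)


def optimum_symbols(M: int, T: int, n_T: int) -> int:
    """P* = floor(eps*_M * T * n_T), following equation (14)."""
    return math.floor(optimum_fraction(M) * T * n_T)


def quadratic_residual(M: int, eps: float) -> float:
    """Value of (M - 1) eps^2 - 2 M eps + M, which vanishes at eps*_M."""
    return (M - 1) * eps ** 2 - 2 * M * eps + M


if __name__ == "__main__":
    for M in (4, 16, 64):
        eps = optimum_fraction(M)
        print(M, round(eps, 4), optimum_symbols(M, T=2, n_T=8))
```

Note that ε*_M can equivalently be written as √M/(√M+1), which makes it apparent that the optimum fraction grows towards 1 as the constellation size M increases.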

We emphasize that the elegant result offered in equation (14) is general for any QSM scheme. From this result it is seen that the optimum ratio P/T that maximizes the SE of the QSM scheme is linear in the number of transmit antennas n_T. In other words, for any given M and n_T, an SE-optimum QSM must be such that P/T scales linearly with n_T, as illustrated and confirmed by the simulation results shown in FIG. 3.

FIG. 3 shows the effect of T and M on the optimum ratio P*/T between the number of transmit symbols and epochs.

Recall also that QSM dispersion matrices are generally constructed on the basis of STBCs characterized by T×T square encoding matrices. Consequently, it follows that if P must scale with n_T in order for the QSM to be SE-optimal, so must the size T of the code, in order for the underlying STBC itself to retain SE-optimality. In other words, equation (14) also implies that in order to achieve SE-optimality, a QSM scheme conveying M-ary symbols must employ an underlying full-rate STBC of a size that scales proportionally to the number of transmit antennas n_T.

It must be remarked that setting T = n_T is not a scalable proposition, not only because it implies furnishing the transmitter with an equal number of RF chains, which can be prohibitively expensive, but also because it results in fully dense signals, which in turn require prohibitively complex ML receivers. This observation motivates the comparisons given in FIG. 4, which shows the fraction of the maximum attainable spectral efficiencies ζ* occurring at P*, obtained by QSM schemes employing STBCs of different sizes, as a function of n_T and for different M. It can be seen that QSM schemes with T sufficiently large, but still significantly less than n_T, also asymptotically achieve near-optimal SE as long as n_T is sufficiently large.

FIG. 4 depicts the behavior of the fractional peak spectral efficiency as a function of n_T, for different sizes of T and M.

In view of these results, the next section introduces a new QSM transmitter design, including both a description of how to construct QSM dispersion matrices based on optimal STBCs of arbitrary size and a new systematic mechanism to obtain the associated set of index vectors used in their selection during transmission. For clarity of explanation, we first take the simpler example of the 2×2 case to introduce the construction of the dispersion index set for optimal diversity gain. The extension of the scheme to general T follows subsequently.

B. Golden (2×2) Dispersion Matrices and Optimal Index Sets

Before we proceed to describing the proposed dispersion matrix construction, let us, without loss of generality and to facilitate comparison with existing methods, impose the assumption that the transmit signal matrix X as per equation (2) satisfies the unity average transmission power constraint for each active transmit antenna, such that 𝔼[tr(X^H X)] = PT under constellations with unity average power symbols. Then, consider the 2×2 Golden code, which compactly encodes four symbols {s₁, s₂, s₃, s₄} into the matrix,

$\begin{matrix} S_{G} = \frac{1}{\sqrt{5}} [\begin{matrix} α (s_{1} + s_{2} θ) & α (s_{3} + s_{4} θ) \\ j \overline{α} (s_{3} + s_{4} \bar{θ}) & \overline{α} (s_{1} + s_{2} \bar{θ}) \end{matrix}], & (15) \end{matrix}$

where θ and θ̄ denote the complementary Golden numbers θ = (1+√5)/2 and θ̄ = (1−√5)/2, respectively, and α = 1+j(1−θ) and ᾱ = 1+j(1−θ̄) are the optimized coefficients for the Gaussian integer constellation sets.

The construction of QSM dispersion matrices based on the latter Golden code follows from the decomposition of S_G into the auxiliary matrices C_i and D_i, which are used to modulate the real part s_i^R and imaginary part s_i^I of each i-th symbol encoded, respectively, such that

$\begin{matrix} S_{G} = \frac{1}{\sqrt{5}} \sum_{i = 1}^{4} (s_{i}^{R} C_{i} + s_{i}^{I} D_{i}), & (16) \end{matrix}$

where

$\begin{matrix} C_{1} \overset{Δ}{=} [\begin{matrix} α & 0 \\ 0 & \overline{α} \end{matrix}], C_{2} \overset{Δ}{=} \boldsymbol{Θ} \cdot C_{1}, C_{3} \overset{Δ}{=} J_{2, 1} \cdot C_{1} \cdot M_{2}, C_{4} \overset{Δ}{=} J_{2, 1} \cdot C_{2} \cdot M_{2}, \text{ and } D_{i} \overset{Δ}{=} j C_{i}, & (17) \end{matrix}$

with

$\boldsymbol{Θ} \overset{Δ}{=} [\begin{matrix} θ & 0 \\ 0 & \overline{θ} \end{matrix}] \text{ and } J_{2, 1} \overset{Δ}{=} [\begin{matrix} 1 & 0 \\ 0 & j \end{matrix}],$

and note that post-multiplying the circular lower-shift matrix M_n to a given matrix X results in a column-wise circular shift of X to the left. In possession of the above auxiliary matrices, the Golden dispersion matrices are then built using Kronecker product operations following a similar strategy, namely

$\begin{matrix} A_{q} = A_{4 (ℓ - 1) + i} = \sqrt{\frac{2}{5}} e_{ℓ} \otimes C_{i} and B_{q} = B_{4 (ℓ - 1) + i} = \sqrt{\frac{2}{5}} e_{ℓ} \otimes D_{i} . & (18) \end{matrix}$

Regarding the scaling factor in equation (18), the factor 1/√5 is passed over from the coefficient of the Golden code as in equations (15) and (16), while the numerator √2 is the result of the power scaling required to ensure that the transmit power constraint 𝔼[tr(X^H X)] = PT is satisfied. To elaborate further, from equations (2) and (18) it follows that 𝔼[tr(X^H X)] = PT implies that the dispersion matrices must satisfy tr(A_q^H A_q) = T and tr(B_q^H B_q) = T, for all q ∈ {1, …, Q}, whereas from the construction of the auxiliary matrices C_i and D_i as per equation (17) it is evident that, once the factor 1/√5 is included, tr(C_i^H C_i) = 1 and tr(D_i^H D_i) = 1, such that a power scaling of T = 2 onto C_i and D_i, i.e., an amplitude scaling of √2 onto C_i and D_i, is needed.
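The decomposition and the power normalization discussed above can be verified numerically. The following is a minimal sketch in plain Python, assuming 2×2 matrices stored as nested lists; the helper names `matmul`, `gram_trace`, `golden_direct` and `golden_decomposed` are illustrative only. It builds C₁, …, C₄ and D_i = jC_i as defined above, checks that the sum of equation (16) reproduces the Golden codeword of equation (15), and evaluates tr(C_i^H C_i) for the unscaled matrices.

```python
import math

theta = (1 + math.sqrt(5)) / 2
theta_bar = (1 - math.sqrt(5)) / 2
alpha = 1 + 1j * (1 - theta)
alpha_bar = 1 + 1j * (1 - theta_bar)


def matmul(A, B):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(A[r][k] * B[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]


def gram_trace(A):
    """tr(A^H A), i.e. the sum of squared magnitudes of all entries."""
    return sum(abs(x) ** 2 for row in A for x in row)


# Auxiliary matrices of equation (17), without the 1/sqrt(5) factor.
C1 = [[alpha, 0], [0, alpha_bar]]
Theta = [[theta, 0], [0, theta_bar]]
J21 = [[1, 0], [0, 1j]]
M2 = [[0, 1], [1, 0]]
C2 = matmul(Theta, C1)
C3 = matmul(matmul(J21, C1), M2)
C4 = matmul(matmul(J21, C2), M2)
C = [C1, C2, C3, C4]
D = [[[1j * x for x in row] for row in Ci] for Ci in C]


def golden_direct(s):
    """Golden codeword written out entry by entry, as in equation (15)."""
    s1, s2, s3, s4 = s
    k = 1 / math.sqrt(5)
    return [[k * alpha * (s1 + s2 * theta), k * alpha * (s3 + s4 * theta)],
            [k * 1j * alpha_bar * (s3 + s4 * theta_bar),
             k * alpha_bar * (s1 + s2 * theta_bar)]]


def golden_decomposed(s):
    """Same codeword assembled from C_i and D_i, as in equation (16)."""
    k = 1 / math.sqrt(5)
    S = [[0j, 0j], [0j, 0j]]
    for si, Ci, Di in zip(s, C, D):
        for r in range(2):
            for c in range(2):
                S[r][c] += k * (si.real * Ci[r][c] + si.imag * Di[r][c])
    return S


if __name__ == "__main__":
    print([round(gram_trace(Ci), 6) for Ci in C])
```

For the unscaled C_i the trace tr(C_i^H C_i) evaluates to 5, so that the factor √(2/5) of equation (18) yields tr(A_q^H A_q) = (2/5)·5 = 2 = T.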

The Golden codes are known to outperform the SSB codes employed in the EDA-QSM scheme, while having structure very similar to the latter, such that their utilization in the construction of dispersion matrices as described above is, in and of itself, bound to improve the performance of QSM schemes over those briefly described in Subsection II-B, as shall be demonstrated later via simulated comparisons.

There is, however, another mechanism to improve the performance of QSM schemes employing STBCs, namely, to optimize the selection of the index vectors in 𝒦 = {k_n}_{n=1}^N, with k_n ∈ ℕ^P, that determine which dispersion matrices are assigned to the real and imaginary parts of each encoded symbol. This is because each index vector k_n is, according to equations (17) and (18), associated with a different subset of the spatial-temporal resources utilized by the QSM scheme in the transmission of a given set of spatially encoded bits. To illustrate the issue, define the set 𝒦* of all $\binom{Q}{P}$ distinct index vectors for a given pair (P, Q), and consider the corresponding example compiled in Table I for the case P = 3 and Q = 2n_T = 8.

Recall also that each dispersion matrix in the transmission of s_p^R or s_p^I uses two given pairs of antennas and time slots, as per equations (17) and (18), such that for the sake of conciseness we hereafter refer to each pair of one antenna and one time slot simply as a spatial-temporal resource r_q, defining also for future convenience the sets of all available and of all utilized spatial-temporal resources, denoted respectively by ℛ* and ℛ. Then, if resources and dispersion matrix indices are represented by rectangular and circular nodes, respectively, a bipartite graph such as the one shown in FIG. 5 for the case in question (i.e., P = 3 and Q = 8) can be built, in which an edge connecting a circular and a rectangular node indicates that the corresponding resource is used by the given dispersion matrix.

FIG. 5 is the bipartite graph representing the spatial-temporal resource usage associated with each index vector k_n for a QSM system with P = 3 and Q = 8. The particular examples of k₁, k₃₇ and k₅₄ are explicitly illustrated.

As illustrated by the graph, the inclusion of a given index vector k_n from 𝒦* into the set 𝒦 is associated with the use of certain resources, occasionally with multiplicity, identified by the graph edges intercepted by the enclosure encircling the corresponding indices. We shall therefore use the notation k_n ⇒ r_n to indicate that the index vector k_n implies the utilization of the set of resources r_n, and μ_{k_n}(r_q) to denote the multiplicity of the resource r_q in the set associated with k_n.

For example, the use of the resources r₁ = {2×(1,1), (1,2), (2,1), 2×(2,2)} results from having k₁ = [1, 2, 3] in 𝒦, such that we may write concisely k₁ ⇒ r₁, with μ_{k₁}(1,1) = μ_{k₁}(2,2) = 2. Similarly, k₃₇ = [3, 4, 5] ⇒ r₃₇ = {(1,1), (1,2), (2,1), (2,2), (3,1), (4,2)}, and k₅₄ = [5, 6, 8] ⇒ r₅₄ = {(3,1), 2×(3,2), 2×(4,1), (4,2)}, with μ_{k₅₄}(3,2) = μ_{k₅₄}(4,1) = 2.

It is evident from all the above that in order to avoid redundancy and uneven utilization of spatial-temporal resources, so as to optimize the performance of QSM schemes, the set of dispersion matrix index vectors 𝒦 (with corresponding resource set ℛ) must satisfy the following conditions:

- a) no two index vectors k_n and k_m in the set can be equal (i.e., k_n ≠ k_m, ∀ n ≠ m);
- b) no two elements in each index vector can be equal (i.e., [k_n]_i ≠ [k_n]_j, ∀ k_n and i ≠ j);
- c) the utilization of all available resources must be ensured (i.e., μ(r_q) > 0 ∀ r_q ∈ ℛ*);
- d) all resources are utilized equally often (i.e., μ(r₁) = … = μ(r_Q)), and finally
- e) the cardinality of the set must be a power of 2 in order to enable the encoding of codewords

$(i . e ., N = ❘ 𝒦 ❘ = {⌊ (\begin{matrix} Q \\ P \end{matrix}) ⌋}_{2^{\times}}) .$

As an example, we highlight in Table I the set of index vectors 𝒦 = {k₁, k₂, k₃, k₅, …, k₈, k₁₀, k₁₁, k₁₉, …, k₂₃, k₂₆, k₂₇, k₂₈, k₃₅, …, k₃₈, k₄₁, k₄₂, k₄₇, k₄₈, k₅₀, …, k₅₆}. The reader can verify that by this choice of 𝒦, all resources in the associated set ℛ have multiplicity 24. In contrast, a naive truncation to the first 32 index vectors in Table I, i.e., 𝒦 = {k₁, …, k₃₂}, as suggested e.g. in [23], leads to an uneven utilization pattern in which μ(1,1) = μ(2,2) = 32, μ(1,2) = μ(2,1) = 28, μ(3,1) = μ(4,2) = 19 and μ(3,2) = μ(4,1) = 17, which is obviously sub-optimum as it leads to antennas 1 and 2 being used far more often than antennas 3 and 4.
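The multiplicities quoted above can be reproduced with a short script. The following is a sketch, assuming the index-to-resource mapping transcribed from Table I (the dictionary `RESOURCES`) and the lexicographic numbering k₁, …, k₅₆ of the index vectors; the names `RESOURCES`, `ALL_VECTORS`, `BALANCED`, `NAIVE` and `resource_multiplicity` are illustrative only.

```python
from collections import Counter
from itertools import combinations

# Resources (antenna, time slot) used by each dispersion matrix index,
# transcribed from Table I (P = 3, n_T = 4, T = 2).
RESOURCES = {
    1: [(1, 1), (2, 2)], 2: [(1, 2), (2, 1)],
    3: [(1, 1), (2, 2)], 4: [(1, 2), (2, 1)],
    5: [(3, 1), (4, 2)], 6: [(3, 2), (4, 1)],
    7: [(3, 1), (4, 2)], 8: [(3, 2), (4, 1)],
}

# k_1 ... k_56 in the lexicographic order used by Table I.
ALL_VECTORS = list(combinations(range(1, 9), 3))

# Numbers n of the index vectors k_n highlighted in the text.
BALANCED = [1, 2, 3, 5, 6, 7, 8, 10, 11, 19, 20, 21, 22, 23, 26, 27, 28,
            35, 36, 37, 38, 41, 42, 47, 48, 50, 51, 52, 53, 54, 55, 56]

# Naive truncation to the first 32 index vectors.
NAIVE = list(range(1, 33))


def resource_multiplicity(numbers):
    """Count how often each resource is used by the index vectors k_n."""
    counts = Counter()
    for n in numbers:
        for q in ALL_VECTORS[n - 1]:
            counts.update(RESOURCES[q])
    return counts


if __name__ == "__main__":
    print(sorted(resource_multiplicity(BALANCED).items()))
    print(sorted(resource_multiplicity(NAIVE).items()))
```

In both cases the multiplicities sum to 32 × 3 × 2 = 192 resource uses; only the balanced selection spreads them evenly over the eight resources.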

The problem of selecting the optimum set 𝒦 as described and illustrated above relates to a classic problem in combinatorics and graph theory known as the Vertex Cover Problem. In the present context, however, the problem has the additional difficulties that: a) the graph in question is bipartite, b) coverage with equal multiplicity is required, and c) nodes must be selected in subsets of three at a time.

TABLE I

Sets of All Possible Index Vectors 𝒦* and Resources ℛ* (P = 3, n_T = 4, T = 2)

| Index vector | Elements of 𝒦* | Elements of ℛ* |
|---|---|---|
| k₁ | [1, 2, 3] | (1, 1), (2, 2), (1, 2), (2, 1), (1, 1), (2, 2) |
| k₂ | [1, 2, 4] | (1, 1), (2, 2), (1, 2), (2, 1), (1, 2), (2, 1) |
| k₃ | [1, 2, 5] | (1, 1), (2, 2), (1, 2), (2, 1), (3, 1), (4, 2) |
| k₄ | [1, 2, 6] | (1, 1), (2, 2), (1, 2), (2, 1), (3, 2), (4, 1) |
| k₅ | [1, 2, 7] | (1, 1), (2, 2), (1, 2), (2, 1), (3, 1), (4, 2) |
| k₆ | [1, 2, 8] | (1, 1), (2, 2), (1, 2), (2, 1), (3, 2), (4, 1) |
| k₇ | [1, 3, 4] | (1, 1), (2, 2), (1, 1), (2, 2), (1, 2), (2, 1) |
| k₈ | [1, 3, 5] | (1, 1), (2, 2), (1, 1), (2, 2), (3, 1), (4, 2) |
| k₉ | [1, 3, 6] | (1, 1), (2, 2), (1, 1), (2, 2), (3, 2), (4, 1) |
| k₁₀ | [1, 3, 7] | (1, 1), (2, 2), (1, 1), (2, 2), (3, 1), (4, 2) |
| k₁₁ | [1, 3, 8] | (1, 1), (2, 2), (1, 1), (2, 2), (3, 2), (4, 1) |
| k₁₂ | [1, 4, 5] | (1, 1), (2, 2), (1, 2), (2, 1), (3, 1), (4, 2) |
| k₁₃ | [1, 4, 6] | (1, 1), (2, 2), (1, 2), (2, 1), (3, 2), (4, 1) |
| k₁₄ | [1, 4, 7] | (1, 1), (2, 2), (1, 2), (2, 1), (3, 1), (4, 2) |
| k₁₅ | [1, 4, 8] | (1, 1), (2, 2), (1, 2), (2, 1), (3, 2), (4, 1) |
| k₁₆ | [1, 5, 6] | (1, 1), (2, 2), (3, 1), (4, 2), (3, 2), (4, 1) |
| k₁₇ | [1, 5, 7] | (1, 1), (2, 2), (3, 1), (4, 2), (3, 1), (4, 2) |
| k₁₈ | [1, 5, 8] | (1, 1), (2, 2), (3, 1), (4, 2), (3, 2), (4, 1) |
| k₁₉ | [1, 6, 7] | (1, 1), (2, 2), (3, 2), (4, 1), (3, 1), (4, 2) |
| k₂₀ | [1, 6, 8] | (1, 1), (2, 2), (3, 2), (4, 1), (3, 2), (4, 1) |
| k₂₁ | [1, 7, 8] | (1, 1), (2, 2), (3, 1), (4, 2), (3, 2), (4, 1) |
| k₂₂ | [2, 3, 4] | (1, 2), (2, 1), (1, 1), (2, 2), (1, 2), (2, 1) |
| k₂₃ | [2, 3, 5] | (1, 2), (2, 1), (1, 1), (2, 2), (3, 1), (4, 2) |
| k₂₄ | [2, 3, 6] | (1, 2), (2, 1), (1, 1), (2, 2), (3, 2), (4, 1) |
| k₂₅ | [2, 3, 7] | (1, 2), (2, 1), (1, 1), (2, 2), (3, 1), (4, 2) |
| k₂₆ | [2, 3, 8] | (1, 2), (2, 1), (1, 1), (2, 2), (3, 2), (4, 1) |
| k₂₇ | [2, 4, 5] | (1, 2), (2, 1), (1, 2), (2, 1), (3, 1), (4, 2) |
| k₂₈ | [2, 4, 6] | (1, 2), (2, 1), (1, 2), (2, 1), (3, 2), (4, 1) |
| k₂₉ | [2, 4, 7] | (1, 2), (2, 1), (1, 2), (2, 1), (3, 1), (4, 2) |
| k₃₀ | [2, 4, 8] | (1, 2), (2, 1), (1, 2), (2, 1), (3, 2), (4, 1) |
| k₃₁ | [2, 5, 6] | (1, 2), (2, 1), (3, 1), (4, 2), (3, 2), (4, 1) |
| k₃₂ | [2, 5, 7] | (1, 2), (2, 1), (3, 1), (4, 2), (3, 1), (4, 2) |
| k₃₃ | [2, 5, 8] | (1, 2), (2, 1), (3, 1), (4, 2), (3, 2), (4, 1) |
| k₃₄ | [2, 6, 7] | (1, 2), (2, 1), (3, 2), (4, 1), (3, 1), (4, 2) |
| k₃₅ | [2, 6, 8] | (1, 2), (2, 1), (3, 2), (4, 1), (3, 2), (4, 1) |
| k₃₆ | [2, 7, 8] | (1, 2), (2, 1), (3, 1), (4, 2), (3, 2), (4, 1) |
| k₃₇ | [3, 4, 5] | (1, 1), (2, 2), (1, 2), (2, 1), (3, 1), (4, 2) |
| k₃₈ | [3, 4, 6] | (1, 1), (2, 2), (1, 2), (2, 1), (3, 2), (4, 1) |
| k₃₉ | [3, 4, 7] | (1, 1), (2, 2), (1, 2), (2, 1), (3, 1), (4, 2) |
| k₄₀ | [3, 4, 8] | (1, 1), (2, 2), (1, 2), (2, 1), (3, 2), (4, 1) |
| k₄₁ | [3, 5, 6] | (1, 1), (2, 2), (3, 1), (4, 2), (3, 2), (4, 1) |
| k₄₂ | [3, 5, 7] | (1, 1), (2, 2), (3, 1), (4, 2), (3, 1), (4, 2) |
| k₄₃ | [3, 5, 8] | (1, 1), (2, 2), (3, 1), (4, 2), (3, 2), (4, 1) |
| k₄₄ | [3, 6, 7] | (1, 1), (2, 2), (3, 2), (4, 1), (3, 1), (4, 2) |
| k₄₅ | [3, 6, 8] | (1, 1), (2, 2), (3, 2), (4, 1), (3, 2), (4, 1) |
| k₄₆ | [3, 7, 8] | (1, 1), (2, 2), (3, 1), (4, 2), (3, 2), (4, 1) |
| k₄₇ | [4, 5, 6] | (1, 2), (2, 1), (3, 1), (4, 2), (3, 2), (4, 1) |
| k₄₈ | [4, 5, 7] | (1, 2), (2, 1), (3, 1), (4, 2), (3, 1), (4, 2) |
| k₄₉ | [4, 5, 8] | (1, 2), (2, 1), (3, 1), (4, 2), (3, 2), (4, 1) |
| k₅₀ | [4, 6, 7] | (1, 2), (2, 1), (3, 2), (4, 1), (3, 1), (4, 2) |
| k₅₁ | [4, 6, 8] | (1, 2), (2, 1), (3, 2), (4, 1), (3, 2), (4, 1) |
| k₅₂ | [4, 7, 8] | (1, 2), (2, 1), (3, 1), (4, 2), (3, 2), (4, 1) |
| k₅₃ | [5, 6, 7] | (3, 1), (4, 2), (3, 2), (4, 1), (3, 1), (4, 2) |
| k₅₄ | [5, 6, 8] | (3, 1), (4, 2), (3, 2), (4, 1), (3, 2), (4, 1) |
| k₅₅ | [5, 7, 8] | (3, 1), (4, 2), (3, 1), (4, 2), (3, 2), (4, 1) |
| k₅₆ | [6, 7, 8] | (3, 2), (4, 1), (3, 1), (4, 2), (3, 2), (4, 1) |

Method 1: Greedy Construction of Optimal Set of Index Vectors 𝒦

Internal Parameters: Number of resources Q = T · n_T and set of all possible index vectors 𝒦*.

Inputs: Number of symbols P, number of transmit antennas n_T and dimension T of the FDFR STBC.

Outputs: Optimized set of index vectors 𝒦.

1: Choose a random seed n ∈ {1, ···, $\binom{Q}{P}$} and start with 𝒦 = Ø;

2: while |𝒦| ≠ ⌊$\binom{Q}{P}$⌋_{2^×} do

3: Insert k_n into the set 𝒦 of selected index vectors;

4: Sort all indices k ∈ {1, ···, Q} in ascending order of their multiplicities in 𝒦;

5: Set D = P and construct/clear the empty set 𝒦̃ = Ø of candidate index vectors;

6: while |𝒦̃| = 0 do

7: Construct a list κ of candidate indices with the D lowest multiplicities in 𝒦;

8: Construct the set 𝒦̃ of all $\binom{D}{P}$ index vectors k̃_m with indices in κ;

9: Remove from 𝒦̃ all index vectors already in 𝒦;

10: if |𝒦̃| = 0 then

11: Increment D by 1;

12: end if

13: end while

14: Select the next n ∈ {1, ···, $\binom{Q}{P}$} as the position in 𝒦* of the first candidate index vector k̃_m;

15: end while

Due to these peculiarities, the problem itself is, to the best of our knowledge, original and cannot be solved by known variations of the Vertex Cover algorithm. Fortunately, the highly symmetric structure of the associated bipartite graph can be exploited to design an efficient algorithm to solve the selection problem at hand. To that end, let us commit a slight abuse of notation and define the multiplicity of a dispersion matrix index q in the set 𝒦 as μ(q). Then, by virtue of the symmetry of the graph (see FIG. 5), a solution in which μ(1) = … = μ(Q) implies a solution in which each of the spatial-temporal resources {(1,1), (1,2), (2,1), (2,2), (3,1), (3,2), (4,1), (4,2)} has the same multiplicity. Consequently, the problem can be solved efficiently by the greedy selection of indices, as described in Method 1.
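The greedy selection can be sketched in a few lines of Python. The following is an illustrative sketch rather than a reference implementation of Method 1: it assumes a lexicographic ordering of 𝒦*, a deterministic seed (position 0) instead of a random one, and ties between equally used indices broken in favour of the smaller index. The function name `greedy_index_set` and its arguments are hypothetical.

```python
from itertools import combinations
from math import comb


def greedy_index_set(P, Q, seed=0):
    """Greedily pick a power-of-two number of index vectors out of C(Q, P).

    Returns the selected vectors and the multiplicity of every index.
    Tie-breaking (smaller index first) is an assumption of this sketch.
    """
    all_vectors = list(combinations(range(1, Q + 1), P))   # the set K*
    position = {k: n for n, k in enumerate(all_vectors)}
    target = 1 << (comb(Q, P).bit_length() - 1)            # largest 2^x <= C(Q, P)
    selected, chosen = [], set()
    mult = {q: 0 for q in range(1, Q + 1)}
    n = seed
    while True:
        k = all_vectors[n]                                 # step 3
        selected.append(k)
        chosen.add(k)
        for q in k:
            mult[q] += 1
        if len(selected) == target:
            return selected, mult
        ranked = sorted(mult, key=lambda q: (mult[q], q))  # step 4
        D = P                                              # step 5
        candidates = []
        while not candidates:                              # steps 6-13
            kappa = sorted(ranked[:D])
            candidates = [c for c in combinations(kappa, P)
                          if c not in chosen]
            D += 1
        n = position[candidates[0]]                        # step 14


if __name__ == "__main__":
    vectors, multiplicity = greedy_index_set(P=3, Q=8)
    print(len(vectors), multiplicity)
```

How evenly the multiplicities end up spread depends on the seed and on the tie-breaking rule, which Method 1 leaves open; this sketch fixes both only to keep the example deterministic.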

C. Optimal Generalized Design (T×T)

Thanks to the greedy optimal index vector selection algorithm described above, which is general in P, T and n_T, the last limiting factor preventing the generalization of QSM to arbitrary T is the construction of the dispersion matrices on the basis of STBCs of arbitrary size. This obstacle is eliminated by considering the design of QSM dispersion matrices based on the Perfect FDFR STBC.

A T×T FDFR STBC encodes T² symbols such that the average energy transmitted per antenna is normalized to unity, an energy efficiency-shaping constraint is enforced, and an SE-preserving lower bound on the coding gain (a.k.a. non-vanishing determinant) is maximized. Ultimately, for a given T ∈ ℕ⁺ the design can be described by

$\begin{matrix} S_{P} = \sum_{t = 1}^{T} diag (R \cdot s_{t}) \cdot J_{T, t - 1} \cdot N_{T}^{t - 1} & (19) \end{matrix}$

where s_t = [s_{1+(t−1)T}, s_{2+(t−1)T}, …, s_{tT}]^T, with t ∈ {1, …, T}, are vectors each carrying T distinct transmit symbols, R is a T×T optimum lattice generating matrix, J_{T,n} is a T×T matrix constructed by replacing the last n diagonal entries of the identity matrix by the imaginary unit j, and N_T is a T×T cyclic upper-shift matrix (notice that J_{T,n} generalizes J_{2,1} used in equation (17)). In turn, oppositely to M_n, N_n is obtained by circularly shifting the top row of I_n to the bottom. Some examples are

$J_{2, 0} = [\begin{matrix} 1 & 0 \\ 0 & 1 \end{matrix}], J_{3, 2} = [\begin{matrix} 1 & 0 & 0 \\ 0 & j & 0 \\ 0 & 0 & j \end{matrix}] and N_{3} = [\begin{matrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{matrix}],$

such that post-multiplying N_n to a given matrix X results in a column-wise circular shift of X to the right.

Notice that the Perfect FDFR STBC of equation (19) fully generalizes the 2×2 Golden code of [28]. To see that, it suffices to consider the case T = 2 with the corresponding lattice generating matrix

$R = \frac{1}{\sqrt{5}} [\begin{matrix} α & αθ \\ \overline{α} & \overline{α} \overline{θ} \end{matrix}],$

such that equation (19) yields

$\begin{matrix} \begin{matrix} S_{P} = diag (\frac{1}{\sqrt{5}} [\begin{matrix} α & αθ \\ \overline{α} & \overline{α} \overline{θ} \end{matrix}] [\begin{matrix} s_{1} \\ s_{2} \end{matrix}]) \cdot J_{2, 0} \cdot N_{2}^{0} + diag (\frac{1}{\sqrt{5}} [\begin{matrix} α & αθ \\ \overline{α} & \overline{α} \overline{θ} \end{matrix}] [\begin{matrix} s_{3} \\ s_{4} \end{matrix}]) \cdot J_{2, 1} \cdot N_{2}^{1} \\ = \frac{1}{\sqrt{5}} [\begin{matrix} α (s_{1} + s_{2} θ) & α (s_{3} + s_{4} θ) \\ j \overline{α} (s_{3} + s_{4} \bar{θ}) & \overline{α} (s_{1} + s_{2} \bar{θ}) \end{matrix}] = S_{G} \end{matrix} & (20) \end{matrix}$
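The construction of equation (19) and its reduction to the Golden code in equation (20) can be checked numerically. The following is a minimal sketch in plain Python with matrices stored as nested lists; the helper names `J`, `N`, `perfect_code` and `golden_direct` are illustrative, and only the T = 2 lattice generating matrix given above is used (no generating matrix for T > 2 is assumed).

```python
import math


def identity(T):
    return [[1 if r == c else 0 for c in range(T)] for r in range(T)]


def matmul(A, B):
    n = len(A)
    return [[sum(A[r][k] * B[k][c] for k in range(n)) for c in range(n)]
            for r in range(n)]


def J(T, n):
    """Identity matrix with its last n diagonal entries replaced by j."""
    M = identity(T)
    for d in range(T - n, T):
        M[d][d] = 1j
    return M


def N(T):
    """Cyclic upper-shift matrix: top row of I_T moved to the bottom."""
    return [[1 if c == (r + 1) % T else 0 for c in range(T)] for r in range(T)]


def perfect_code(R, s):
    """S_P = sum_t diag(R s_t) J_{T,t-1} N_T^{t-1}, as in equation (19)."""
    T = len(R)
    S = [[0j] * T for _ in range(T)]
    shift = identity(T)                      # holds N_T^{t-1}
    for t in range(1, T + 1):
        s_t = s[(t - 1) * T: t * T]
        d = [sum(R[r][k] * s_t[k] for k in range(T)) for r in range(T)]
        diag = [[d[r] if r == c else 0 for c in range(T)] for r in range(T)]
        term = matmul(matmul(diag, J(T, t - 1)), shift)
        S = [[S[r][c] + term[r][c] for c in range(T)] for r in range(T)]
        shift = matmul(shift, N(T))
    return S


theta = (1 + math.sqrt(5)) / 2
theta_bar = (1 - math.sqrt(5)) / 2
alpha = 1 + 1j * (1 - theta)
alpha_bar = 1 + 1j * (1 - theta_bar)
k5 = 1 / math.sqrt(5)
R_GOLDEN = [[k5 * alpha, k5 * alpha * theta],
            [k5 * alpha_bar, k5 * alpha_bar * theta_bar]]


def golden_direct(s):
    """Golden codeword written out entry by entry, as in equation (15)."""
    s1, s2, s3, s4 = s
    return [[k5 * alpha * (s1 + s2 * theta), k5 * alpha * (s3 + s4 * theta)],
            [k5 * 1j * alpha_bar * (s3 + s4 * theta_bar),
             k5 * alpha_bar * (s1 + s2 * theta_bar)]]


if __name__ == "__main__":
    print(J(3, 2))
    print(N(3))
```

Keeping `perfect_code` generic in T means that a different generating matrix R can be supplied for larger codes without changing the rest of the sketch.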

It follows that in order to employ Perfect FDFR STBCs in the design of QSM, it suffices to decompose the core code structure of equation (19) in terms of the corresponding auxiliary dispersion matrices C_i and D_i, which, due to symmetry, are given by

$\begin{matrix} C_{i} = C_{T (w - 1) + t} = diag (R \cdot e_{t}) \cdot J_{T, w - 1} \cdot N_{T}^{w - 1} \text{ and } D_{i} = j C_{i}, & (21) \end{matrix}$

where the generalized indices i ∈ {1, …, T²} are constructed systematically on t ∈ {1, …, T} and w ∈ {1, …, T}, and e_t is the t-th column of I_T.

Following this, the full sets of dispersion matrices 𝒜 and ℬ can be built, i.e.,

$\begin{matrix} A_{q} = A_{T^{2} (ℓ - 1) + i} = γ e_{ℓ} \otimes C_{i} and B_{q} = B_{T^{2} (ℓ - 1) + i} = γ e_{ℓ} \otimes D_{i}, & (22) \end{matrix}$

where again q ∈ {1, …, Q} and i ∈ {1, …, T²} as in equation (21), e_ℓ is the ℓ-th column of I_L, with ℓ ∈ {1, …, L} and L = ⌊n_T/T⌋, and γ is a generalized scaling factor determined depending on the specific STBC in order to adjust the powers of the dispersion matrices such that tr(A_q^H A_q) = T and tr(B_q^H B_q) = T.

Next, we turn our attention to the construction of the optimal set of index vectors 𝒦, via a straightforward generalization of the method described in Subsection III-B. Indeed, as can be learned by inspecting equation (21), each auxiliary matrix C_i and D_i is a T×T sparse matrix obtained from a cyclic rotation of a diagonal matrix containing only T non-zero elements of R.

Method 2: QSM Signal Generation

Internal Parameters: Number of symbols P, transmit antennas n_T, time slots T and spatial-temporal resources Q = T · n_T; and cardinalities M = |𝒮| and N = |𝒦| = ⌊$\binom{Q}{P}$⌋_{2^×};

Global Quantities: Symbol constellation 𝒮, optimum lattice generating matrix R, sets of dispersion matrices 𝒜 = {A_q}_{q=1}^Q and ℬ = {B_q}_{q=1}^Q with A_q and B_q as in eq. (22), and set of index vectors 𝒦 obtained from Method 1.

Input: Information bit sequence b = [b^R, b^I, b^S];

Output: Transmitted signal X.

1: Select index vector k^R as the ([b^R]₍₁₀₎ + 1)-th vector in 𝒦;

2: Select index vector k^I as the ([b^I]₍₁₀₎ + 1)-th vector in 𝒦;

3: Assign the bits b^S to P symbols {s₁, ···, s_P} selected from 𝒮, with s_p = s_p^R + j s_p^I;

4: Construct X = Σ_{p=1}^P (s_p^R A_{k_p^R} + s_p^I B_{k_p^I}) as per equation (2).

Consequently, the associated dispersion matrices obtained from equation (22) are all sparse matrices with only T non-zero entries, corresponding to the T spatial-temporal resources utilized. In other words, while in the Golden QSM scheme of Subsection III-B each dispersion matrix index q is associated with 2 resources, in the Perfect STBC-based construction described here each index q is associated with T resources, such that the corresponding bipartite graph illustrated in FIG. 5 is merely expanded to a similar graph with T·n_T index (circular) nodes and T·n_T resource (rectangular) nodes, with each resource node connected to T index nodes and vice versa. As a result, the greedy strategy described earlier remains valid, as evidenced by the fact that Method 1 applies to general T. For the convenience of the reader, we summarize the structure of the proposed scalable QSM scheme in Method 2.

IV. Proposed Receiver Design
A. Sparse Formulation of QSM Receivers

Together, Methods 1 and 2 introduced above demonstrate that the design of OS-QSM transmitters is possible and tractable. There is, however, no true scalability without feasibility, such that in order to complete the task it is also necessary to show that the proposed OS-QSM design is effectively decodable at reasonable complexity.

To put the challenge into context, for given P, M, T and n_T, with Q = T·n_T, an ML receiver would have to go through

${({⌊ (\begin{matrix} Q \\ P \end{matrix}) ⌋}_{2^{\times}})}^{2} \cdot M^{P}$

combinations of symbols and selected spatial-temporal resources in order to detect a sequence of

$2 \cdot ⌊ \log_{2} (\begin{matrix} Q \\ P \end{matrix}) ⌋ + P \cdot \log_{2} M$

bits. That means that even for the minimal setting of T = 2, P = 2 and M = 4, a system with n_T = 8 transmit antennas (i.e., Q = 16 and $\binom{16}{2} = 120$) would require the receiver to go through

${({⌊ 120 ⌋}_{2^{\times}})}^{2} \cdot {(4)}^{2} = 64^{2} \cdot 16 = 65{,}536$

combinations in order to decode the corresponding log₂(65,536) = 16 bits, and this number grows combinatorially with P and n_T: already for P = 3, the same system requires 512² · 4³ = 16,777,216 combinations to decode log₂(16,777,216) = 24 bits. In other words, ML decoding is highly impractical in QSM systems, especially in the context of massive MIMO systems.
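The size of the ML search space follows directly from the two expressions above and can be evaluated for any parameter set. The following is a small sketch, assuming M is a power of two; the names `pow2_floor` and `ml_search_space` are illustrative. For T = 2, n_T = 8 and M = 4 it gives 65,536 combinations (16 bits) with P = 2 and 16,777,216 combinations (24 bits) with P = 3.

```python
from math import comb


def pow2_floor(x: int) -> int:
    """Largest power of two not exceeding x (x >= 1)."""
    return 1 << (x.bit_length() - 1)


def ml_search_space(T: int, n_T: int, P: int, M: int):
    """Return (number of ML hypotheses, number of bits per QSM codeword).

    Assumes M is a power of two so that log2(M) is an integer.
    """
    Q = T * n_T
    N = pow2_floor(comb(Q, P))          # usable index vectors
    combos = N ** 2 * M ** P            # (k^R, k^I) pairs times symbol choices
    bits = 2 * (N.bit_length() - 1) + P * (M.bit_length() - 1)
    return combos, bits


if __name__ == "__main__":
    for P in (2, 3, 4):
        print(P, ml_search_space(T=2, n_T=8, P=P, M=4))
```

Since the number of hypotheses equals 2 raised to the number of bits, every additional transmitted symbol multiplies the ML search effort by M and, typically, by a further factor from the larger index set.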

We emphasize that this challenge applies not only to the OS-QSM scheme of Subsection III-B but also to current SotA QSM methods such as those in [20]-[23], as the example given above is for T=2, which is the size of the core codes used in the latter. We furthermore stress that the utilization of SD receivers is also not viable in scaled cases, because the nature of tree search algorithms still requires excessive computational complexity in large systems. Finally, we also remark that since convenient properties such as fast-decodability and block-diagonality are known not to be retainable without sacrifice of optimality for STBC of arbitrary size, a scalable detector for QSM schemes cannot rely on such features.

In light of the above, we introduce hereafter a new detection method for QSM schemes which relies neither on tree-search, nor on specific properties of STBCs and which is completely independent of the infeasible combinatorial factor

${⌊ (\begin{matrix} Q \\ P \end{matrix}) ⌋}_{2^{\times}} .$

In addition, given prior information on the encoding construction, the proposed decoder is valid to detect any QSM signal.

The core idea of our approach is to take full advantage of a sparse representation of QSM signals over the entire channel (i.e., for all spatial temporal resources available), assumed known at the receiver. The proposed decoding method then leverages the iterative shrinkage thresholding algorithm (ISTA) to greedily extract symbol and dispersion index estimates, resulting in significantly lower complexities compared to ML and SD-based methods. To that end, first combine equations (1) and (2), and consider the vectorized form of the QSM received signal

$\begin{matrix} y \overset{Δ}{=} vec (Y) = \overset{\overset{Δ}{=} Φ_{H}}{\overbrace{(I_{T} \otimes H)}} (Ξ_{A} u^{R} + Ξ_{B} u^{I}) + \overset{\overset{Δ}{=} v}{\overbrace{vec (V)}} = Φ_{H} \cdot (Ξ_{A} u^{R} + Ξ_{B} u^{I}) + v \in ℂ^{T n_{R} \times 1}, & (23) \end{matrix}$

where we implicitly defined the block-diagonal channel matrix Φ_H and the vectorized noise v; the dispersion matrices in 𝒜 and ℬ are also vectorized into a_q ≜ vec(A_q) and b_q ≜ vec(B_q) and concatenated into Ξ_A ≜ [a₁, …, a_Q] ∈ ℂ^{Q×Q} and Ξ_B ≜ [b₁, …, b_Q] ∈ ℂ^{Q×Q}, respectively; and u^R ∈ ℝ^{Q×1} (respectively u^I ∈ ℝ^{Q×1}) is set to zero everywhere, except for its elements of indices k^R ∈ 𝒦 (respectively k^I ∈ 𝒦), which are set to {s₁^R, …, s_P^R} (respectively {s₁^I, …, s_P^I}).

Equation (23) can be further simplified by defining the combined and real-imaginary decoupled information and noise vectors

$\begin{matrix} u \overset{Δ}{=} {[u_{1}^{R}, u_{1}^{I}, \dots, u_{Q}^{R}, u_{Q}^{I}]}^{T} \in ℝ^{2 Q \times 1} and v \overset{Δ}{=} [v_{1}^{R}, v_{1}^{I}, \dots, v_{T n_{R}}^{R}, v_{T n_{R}}^{I}], & (24) \end{matrix}$

as well as the decoupled versions of a_qand b_q, namely

$\begin{matrix} a_{q} \overset{Δ}{=} {[a_{q_{1}}^{R}, a_{q_{1}}^{I}, \dots, a_{q_{Q}}^{R}, a_{q_{Q}}^{I}]}^{T} and b_{q} \overset{Δ}{=} {[b_{q_{1}}^{R}, b_{q_{1}}^{I}, \dots, b_{q_{Q}}^{R}, b_{q_{Q}}^{I}]}^{T}, & (25) \end{matrix}$

which in turn can be combined into a single dispersion matrix Ψ_D ∈ ℝ^{2Q×2Q}, namely

$\begin{matrix} Ψ_{D} \overset{Δ}{=} [a_{1}, b_{1}, a_{2}, b_{2}, \dots, a_{Q}, b_{Q}], & (26) \end{matrix}$

such that the vectorized system model of equation (23) can be re-written as

$\begin{matrix} y = {\overset{⋁}{Φ}}_{H} Ψ_{D} \cdot u + v = Φ_{H} \cdot Ψ_{D} \cdot u + v = G \cdot u + v \in ℝ^{2 T n_{R} \times 1} & (27) \end{matrix}$

where Φ̌_H is the quadrature-operated version of the block-diagonal channel matrix Φ_H, which we implicitly relabeled Φ_H, and where the effective channel matrix G ≜ Φ_H·Ψ_D is defined for future convenience.

To elaborate on equation (27) with an example, consider a system with P = 3, T = 2 and n_T = 4, and assume that for a particular bit sequence b = [b^R, b^I, b^S], the selected index vectors are given by k^R = k₁₀ = [1, 3, 7] and k^I = k₄₈ = [4, 5, 7]. Then, the corresponding combined information vector becomes u = [s₁^R, 0, 0, 0, s₂^R, 0, 0, s₁^I, 0, s₂^I, 0, 0, s₃^R, s₃^I, 0, 0]^T. Notice that while u carries in the entries s_p^R and s_p^I the P log₂M bits corresponding to the b^S subsequence, the remaining 2 log₂N bits corresponding to the subsequences b^R and b^I are encoded merely by the positions of the non-zero elements in u, regardless of what the values of s_p^R and s_p^I might be, which suggests that the detection of b^S could be done separately from that of b^R and b^I.
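The placement rule in this example can be written down compactly. The following is a minimal sketch, assuming the interleaved ordering u = [u₁^R, u₁^I, …, u_Q^R, u_Q^I]^T of equation (24); the function name `build_u` and the numeric symbol values used below are purely illustrative.

```python
def build_u(k_R, k_I, symbols, Q):
    """Place real and imaginary symbol parts into the sparse vector u.

    k_R, k_I : index vectors (1-based dispersion matrix indices)
    symbols  : list of P complex symbols s_p = s_p^R + j s_p^I
    Q        : number of spatial-temporal resources
    """
    u = [0.0] * (2 * Q)
    for p, (q_r, q_i) in enumerate(zip(k_R, k_I)):
        u[2 * (q_r - 1)] = symbols[p].real       # entry u_q^R
        u[2 * (q_i - 1) + 1] = symbols[p].imag   # entry u_q^I
    return u


if __name__ == "__main__":
    # Placeholder symbol values chosen only to make positions visible.
    s = [complex(1, 2), complex(3, 4), complex(5, 6)]
    print(build_u([1, 3, 7], [4, 5, 7], s, Q=8))
```

With the placeholder symbols 1+2j, 3+4j and 5+6j, the non-zero entries of u appear exactly at the positions listed in the example above, and only 2P = 6 of the 2Q = 16 entries are occupied.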

In principle, the latter feature could be utilized to design an SD receiver for the OS-QSM method proposed above, similarly to how block-separability was exploited to do so for the EDA-QSM scheme. The problem with that idea is, of course, the prohibitively large number of ways in which the 2P elements of the decoupled symbol vector s = [s₁^R, s₁^I, …, s_P^R, s_P^I] can be placed among the 2Q entries of u. In order to circumvent this challenge, we instead seek to exploit the facts demonstrated in Subsection III-A, namely, that: a) the optimum number P* of symbols maximizing SE is a fraction of the total spatial-temporal resources Q = T·n_T, as per equation (12); and b) in large-scale systems with n_T ≫ 1, a significantly smaller block size T suffices to asymptotically achieve SE optimality, as shown in FIG. 4. Together, these facts imply that the sparsity of u becomes increasingly prominent in large-scale SE-optimal QSM schemes, which in turn favors sparse recovery algorithms. It is also evident from the inspection of equation (27) that the matrices Φ_H and Ψ_D can be respectively interpreted as the sensing and dictionary matrices typical of compressive sensing (CS) models, such that recent progress on sparse and discrete-aware receivers can be leveraged.

Taking into account the focus on scalability, which at the receiver side translates to controlling complexity, two suitable candidate methods to be applied for OS-QSM demodulation are the generalized approximate message-passing (GAMP) method and the iterative shrinkage thresholding algorithm (ISTA), both of which possess quadratic complexity in the size 2Tn_T of the signal vector u. It is well known, however, that the GAMP algorithm relies strongly on the particular structure of the measurement matrix and on the independence of the received signal, which in the case of QSM cannot be generally assumed, as a direct consequence of the utilization of STBCs in the dispersion matrices. In the absence of the required conditions, GAMP receivers yield poor performance, characterized by error floors at high SNRs.

Motivated by this fact, an ISTA-based approach is followed in the design of a low-complexity demodulator for QSM systems, which is described in the sequel. In particular, a method to detect QSM signals is introduced, which is based on a purpose-built variation of ISTA that incorporates modifications both to the thresholding function and to the index vector estimation process, tailored specifically to QSM detection.

B. Greedy Boxed ISTA-Based QSM Decoder

Consider the standard ISTA recursion,

$\begin{matrix} {\hat{u}}^{(η + 1)} = Λ ({\hat{u}}^{(η)} + \frac{1}{α} G^{T} (y - G {\hat{u}}^{(η)}); \frac{λ}{2 α}), & (28) \end{matrix}$

where û^(η)is the estimate of u at the η-th iteration, α=maxeig (G^TG) is the shrinkage step size, (the actual requirement is that α>maxeig (G^TG), however, we will assume the minimum step-size, which is sufficient), λ is the threshold factor, and Λ(s; τ) is the soft-thresholding function. A first meaningful modification to such standard ISTA recursions in equation (28) is to account for the fact that the symbols in the real-valued projected constellation custom-character _R() are finite, such that in addition to the lower limit τ in the vicinity of the origin used to enforce sparsity in the solution, an upper limit max(_R) can be introduced into the thresholding function. In other words, for the case at hand we replace ISTA s standard soft-thresholding function Λ(s; τ) by a hard-thresholding function leading to the “boxed” hard-thresholding function Π(s; τ) illustrated in FIG. 6 and defined by

$\begin{matrix} Π (s; τ) \overset{Δ}{=} \begin{cases} \min (𝒮_{R}) & s \leq \min (𝒮_{R}), \\ s & \min (𝒮_{R}) \leq s \leq - τ, \\ 0 & | s | \leq τ, \\ s & τ \leq s \leq \max (𝒮_{R}), \\ \max (𝒮_{R}) & \max (𝒮_{R}) \leq s . \end{cases} & (29) \end{matrix}$

FIG. 6 is the comparison of the ISTA and BH-ISTA thresholding functions Λ(s; τ) and Π(s; τ), the latter as per equation (29).
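The difference between the two thresholding functions can be illustrated with a minimal Python sketch. The soft-thresholding function is the standard one; the boxed hard-thresholding function follows equation (29). The constellation limits ±1 and the threshold τ=0.1 are hypothetical values chosen only for illustration.

```python
def soft_threshold(s, tau):
    """Conventional ISTA soft-thresholding: shrink s toward zero by tau."""
    if s > tau:
        return s - tau
    if s < -tau:
        return s + tau
    return 0.0


def boxed_hard_threshold(s, tau, s_min, s_max):
    """Boxed hard-thresholding in the spirit of equation (29).

    Entries with |s| <= tau are zeroed; all others are kept unshrunk but
    clipped to the interval [s_min, s_max] spanned by the constellation.
    """
    if abs(s) <= tau:
        return 0.0
    return min(max(s, s_min), s_max)


# Hypothetical constellation limits and threshold, for illustration only.
S_MIN, S_MAX, TAU = -1.0, 1.0, 0.1
for s in (-2.3, -0.5, 0.05, 0.5, 1.7):
    print(s, soft_threshold(s, TAU), boxed_hard_threshold(s, TAU, S_MIN, S_MAX))
```

Unlike soft-thresholding, the boxed variant leaves surviving entries unshrunk and never returns a value outside the constellation range.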

Incorporating this modification yields the boxed-hard ISTA (BH-ISTA) receiver described by

$\begin{matrix} {\hat{u}}^{(η + 1)} = Π ({\hat{u}}^{(η)} + \frac{1}{α} G^{T} (y - G {\hat{u}}^{(η)}); \frac{λ}{2 α}) . & (30) \end{matrix}$

Notice that the computational cost of repeatedly evaluating equation (30) is dominated by the term Gû^(η), and is therefore quadratic on the number of non-zero entries (i.e., the ℓ₀-norm) of û^(η), which reduces with the iterations η, as illustrated in FIG. 7.
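The BH-ISTA recursion of equation (30) can be sketched in a few lines of plain Python. This is only an illustrative sketch: the 4×3 matrix G, the vector u_true, the factor λ=0.4 and the fixed iteration count are hypothetical, and α=4 is the largest eigenvalue of G^TG for this particular toy matrix.

```python
def boxed_hard(s, tau, s_min, s_max):
    # Equation (29): zero small entries, clip the rest to [s_min, s_max].
    return 0.0 if abs(s) <= tau else min(max(s, s_min), s_max)


def matvec(A, x):
    # Matrix-vector product for a matrix stored as a list of rows.
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def bh_ista(G, y, alpha, lam, s_min, s_max, n_iter=60):
    """Iterate equation (30) from the all-zero vector for n_iter steps."""
    Gt = [list(col) for col in zip(*G)]      # G^T
    tau = lam / (2.0 * alpha)                # threshold lambda / (2 alpha)
    u = [0.0] * len(G[0])
    for _ in range(n_iter):
        residual = [yi - gi for yi, gi in zip(y, matvec(G, u))]
        grad = matvec(Gt, residual)          # G^T (y - G u)
        u = [boxed_hard(ui + gi / alpha, tau, s_min, s_max)
             for ui, gi in zip(u, grad)]
    return u


# Toy, noise-free example (hypothetical values).
G = [[1.0, 0.0, 0.0],
     [0.0, 1.0, 0.0],
     [0.0, 0.0, 1.0],
     [1.0, 1.0, 1.0]]
u_true = [1.0, 0.0, -1.0]
y = matvec(G, u_true)
u_hat = bh_ista(G, y, alpha=4.0, lam=0.4, s_min=-1.0, s_max=1.0)
print(u_hat)
```

In this toy case the zero entry of u_true stays exactly zero throughout, while the two non-zero entries approach ±1 without ever leaving the box.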

FIG. 7 shows the convergence of û_Λ^(η) and û_Π^(η), as per equations (28) and (30), respectively, as a function of the iterations η.

FIG. 7(a) is the sparsity convergence with various threshold values.

FIG. 7(b) is the MSE convergence with optimal threshold values.

In particular, FIG. 7(a) shows a comparison of the convergence of |û^(η)|₀ as a function of η for various values of the threshold parameter τ, with û^(η) obtained both from equations (28) and (30), i.e., via conventional ISTA and BH-ISTA, respectively, which for convenience will be hereafter denoted û_Λ^(η) and û_Π^(η).

It can in fact be seen that, as a result of boxing and hard-thresholding, |û_Π^(η)|₀<|û_Λ^(η)|₀, such that the expected order of complexity associated with evaluating equation (30) is lower than that of evaluating equation (28), and can be bounded below and above by the lower and upper limits 𝒪(4P²) and 𝒪(4Q²), respectively.

More details will be given in Section V-A. In turn, FIG. 7(b) shows that the mean-squared error (MSE) obtained with the proposed BH-ISTA approach is lower than that obtained with conventional ISTA, which illustrates the effectiveness of the boxed and hard-thresholding modification here proposed for the demodulation of QSM signals. It remains to address, however, how the bits associated with the choices of dispersion matrix indices {k^R, k^I}∈𝒦 can be efficiently detected. To that end, another addition is introduced to the ISTA-based sparse detector, namely, a greedy hard-detection procedure for each symbol recovered, with a concomitant update of equation (30), which can be described as follows.

Let us consider that multiple runs of the BH-ISTA iterations described by equation (30) are performed, such that prior to the m-th run a modification is made to y, G and u, which can be expressed by rewriting equation (30) as

$\begin{matrix} {\hat{u}}_{m}^{(η + 1)} = Π ({\hat{u}}_{m}^{(η)} + \frac{1}{α} G_{m}^{T} (y_{m} - G_{m} {\hat{u}}_{m}^{(η)}); \frac{λ}{2 α}), & (31) \end{matrix}$

where we adopt the convention that for the first run (m=1) we set y₁=y, G₁=G and û₁⁽¹⁾=0_2Q.

Let η* be the last iteration of the m-th run of the latter estimator, with its corresponding outcome denoted by û_m^(η*). Finally, let s̃_{q̂_m} be the entry of û_m^(η*) with the largest amplitude, whose position is denoted by q̂_m, such that we may write

$\begin{matrix} {\tilde{s}}_{{\hat{q}}_{m}} = {[{\hat{u}}_{m}^{(η^{*})}]}_{{\hat{q}}_{m}} \; \big| \; | {[{\hat{u}}_{m}^{(η^{*})}]}_{{\hat{q}}_{m}} | > | {[{\hat{u}}_{m}^{(η^{*})}]}_{ℓ} |, \; \forall ℓ \in \{1, \dots, 2 Q\}, \; ℓ \neq {\hat{q}}_{m}, & (32) \end{matrix}$

where [x]_ℓ denotes the ℓ-th element of a generic vector x. It is emphasized that in the greedy procedure summarized by equation (32), two distinct pieces of information on the bit sequence b are obtained, namely, a soft estimate s̃_{q̂_m} of one of the modulated symbols {s_p^R, s_p^I}∈𝒮_R, and a hard estimate q̂_m of one of the indices contained in the selected index sets {k^R, k^I}∈𝒦. In possession of such information, the following steps are then executed in order to produce the modified quantities required to perform the next run of the BH-ISTA recursion described by equation (31).

First, a hard-detected version of s̃_{q̂_m} is obtained by projecting it onto 𝒮_R, that is

$\begin{matrix} {\hat{s}}_{{\hat{q}}_{m}} = 𝒫_{𝒮_{R}} ({\tilde{s}}_{{\hat{q}}_{m}}) . & (33) \end{matrix}$

Then, the remaining quantities are updated following

$\begin{matrix} {\hat{u}}_{m + 1}^{(1)} = (I_{2 Q} - diag (e_{{\hat{q}}_{m}})) {\hat{u}}_{m}^{(η^{*})}, y_{m + 1} = y_{m} - G_{m} \cdot e_{{\hat{q}}_{m}} {\hat{s}}_{{\hat{q}}_{m}}, & (34) \end{matrix}$

$\text{and} \quad G_{m + 1} = G_{m} (I_{2 Q} - \mathrm{diag} (e_{{\hat{q}}_{m}})),$

where I_2Q is the identity matrix of size 2Q and e_{q̂_m} its q̂_m-th column. Recall that due to the quadrature-decomposed structure of the sparse vector u, all odd index estimates q̂_m correspond to the real parts of modulated symbols, while even q̂_m correspond to the imaginary parts. It is therefore sensible that, as equations (31) through (34) are evaluated iteratively, the obtained index estimates {q̂₁, q̂₂, . . . , q̂_m} be split and collected accordingly into the subsequences

$\begin{matrix} {\hat{q}}_{m}^{R} \overset{Δ}{=} \{ {\hat{q}}_{m} \mid \mathrm{mod} ({\hat{q}}_{m}, 2) = 1, \forall m \} \quad \text{and} & (35) \end{matrix}$

${\hat{q}}_{m}^{I} \overset{Δ}{=} \{ {\hat{q}}_{m} \mid \mathrm{mod} ({\hat{q}}_{m}, 2) = 0, \forall m \},$

where mod(x, 2) denotes the modulo-2 operation onto x.

If there are no errors during the detection process, after exactly m=2P runs, the sequences {q̂_m^R, q̂_m^I} can be perfectly mapped to {k^R, k^I}, in particular via

$\begin{matrix} {\hat{k}}^{R} = \frac{1}{2} ({\hat{q}}_{m}^{R} + 1) \quad \text{and} \quad {\hat{k}}^{I} = \frac{1}{2} {\hat{q}}_{m}^{I}, & (36) \end{matrix}$

at which point the procedure comes to a stop.
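The splitting of equation (35) and the mapping of equation (36) amount to simple integer arithmetic on the 1-based position estimates, as the following sketch shows. The function names and the example sequence are hypothetical.

```python
def split_indices(q_hat):
    """Equation (35): odd positions carry real parts, even positions imaginary parts."""
    q_real = [q for q in q_hat if q % 2 == 1]
    q_imag = [q for q in q_hat if q % 2 == 0]
    return q_real, q_imag


def map_to_dispersion_indices(q_real, q_imag):
    """Equation (36): k^R = (q^R + 1) / 2 and k^I = q^I / 2, element-wise."""
    k_real = [(q + 1) // 2 for q in q_real]
    k_imag = [q // 2 for q in q_imag]
    return k_real, k_imag


# Hypothetical sequence of greedy position estimates (1-based), with P = 2.
q_hat = [5, 2, 1, 8]
q_real, q_imag = split_indices(q_hat)
k_real, k_imag = map_to_dispersion_indices(q_real, q_imag)
print(q_real, q_imag, k_real, k_imag)
```

The order of detection is preserved within each subsequence, which matters later when several candidate index vectors compete.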

More generally, however, errors may occur, such that either q̂_m^R, or q̂_m^I, or both, contain incorrect indices even with cardinality P. In such cases, the procedure continues until either subsequence contains a first P-tuple of indices included in the dispersion matrix index vector set 𝒦, at which point a modification of the update equations is required, which can be described as follows.

Let 𝒫_𝒦(q) denote the projection of a sequence q onto the set 𝒦, such that either a sequence k∈𝒦 or the empty set Ø is returned by the projection, depending on whether or not q contains within it a sequence from 𝒦. If multiple valid k∈𝒦 exist among the combinations of elements in q, the viable elements with lower indices in q (i.e., lower positions in q, not lower element values) take priority, such that the notion of greedy selection is coherent. Then, equation (34) can be expanded into

$\begin{matrix} {\hat{u}}_{m + 1}^{(1)} = \begin{cases} [I_{2 Q} - \sum_{q \in \text{odd}} \mathrm{diag} (e_{q})] {\hat{u}}_{m}^{(η^{*})} & \text{upon confirmation of } {\hat{k}}^{R} \text{ from } {\hat{q}}_{m}^{R}, \text{ or} \\ [I_{2 Q} - \sum_{q \in \text{even}} \mathrm{diag} (e_{q})] {\hat{u}}_{m}^{(η^{*})} & \text{upon confirmation of } {\hat{k}}^{I} \text{ from } {\hat{q}}_{m}^{I}, \text{ or} \\ [I_{2 Q} - \mathrm{diag} (e_{{\hat{q}}_{m}})] {\hat{u}}_{m}^{(η^{*})} & \text{otherwise} . \end{cases} & (37) \end{matrix}$

In plain words, equation (37) establishes that after the m-th run of the BH-ISTA detector, the initial state of the estimate vector û_m+1⁽¹⁾ for the next run is either:

- a) updated by removing the latest estimated symbol, when neither q̂_m^R nor q̂_m^I can be projected onto 𝒦, which happens either when the number of indices acquired is insufficient (less than P) to decide on valid estimates of k^R or k^I, or when the number of indices is sufficient (P or larger) but none of the subsequences contains a valid combination of indices for any k∈𝒦; or
- b) updated by nulling all odd entries q∈{1, 3, . . . , 2Q−3, 2Q−1}, when a hard decision on k̂^R is confirmed from the projection of q̂_m^R onto 𝒦, which can happen only once throughout the demodulation procedure; or
- c) updated by nulling all even entries q∈{2, 4, . . . , 2Q−2, 2Q}, when a hard decision on k̂^I is confirmed from the projection of q̂_m^I onto 𝒦, which also can happen only once throughout the demodulation procedure.

The only other alternative to those above is when both k̂^R and k̂^I have been acquired, and consequently the entire set of symbol estimates {ŝ₁^R, ŝ₁^I, . . . , ŝ_P^R, ŝ_P^I} has also been obtained, in which case the procedure is terminated.
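One possible reading of the projection 𝒫_𝒦(q) with the priority rule described above can be sketched as follows. Two assumptions are made here and are not taken from the source: index vectors are compared as unordered (sorted) tuples, and "lower indices in q take priority" is interpreted as the lexicographic order over positions produced by `itertools.combinations`. The function name and the example set are hypothetical.

```python
from itertools import combinations


def project_onto_index_set(q, index_set, P):
    """Return the first P-subset of q that belongs to index_set, or None.

    combinations() enumerates subsets in lexicographic order of positions
    in q, so candidates built from earlier detections are tried first.
    None plays the role of the empty set returned by the projection.
    """
    valid = {tuple(sorted(k)) for k in index_set}
    for combo in combinations(q, P):
        candidate = tuple(sorted(combo))
        if candidate in valid:
            return candidate
    return None


# Hypothetical set of valid index vectors with P = 2.
K = [(1, 2), (3, 4), (1, 4)]
print(project_onto_index_set([3, 1, 4, 2], K, 2))  # several valid candidates
print(project_onto_index_set([2, 3], K, 2))        # enough indices, none valid
print(project_onto_index_set([1], K, 2))           # too few indices
```

In the first call the sequence contains three valid index vectors, and the one built from the earliest-detected element is returned; the other two calls correspond to the two situations of case a) above.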

Similarly to the above, the updates of y_m and G_m must also be revised to account for the effect of hard decisions on k̂^R and k̂^I, so as to cancel the effect of hard-decided indices and symbols, and to nullify the channel corresponding to confirmed indices, yielding respectively

$\begin{matrix} y_{m + 1} = \begin{cases} y - G \cdot \sum_{q \in [2 {\hat{k}}^{R} - 1, {\hat{q}}_{m}^{I}]} e_{q} {\hat{s}}_{q} & \text{upon confirmation of } {\hat{k}}^{R} \text{ from } {\hat{q}}_{m}^{R}, \text{ or} \\ y - G \cdot \sum_{q \in [{\hat{q}}_{m}^{R}, 2 {\hat{k}}^{I}]} e_{q} {\hat{s}}_{q} & \text{upon confirmation of } {\hat{k}}^{I} \text{ from } {\hat{q}}_{m}^{I}, \text{ or} \\ y_{m} - G_{m} \cdot e_{{\hat{q}}_{m}} {\hat{s}}_{{\hat{q}}_{m}} & \text{otherwise}, \end{cases} & (38) \end{matrix}$

$\begin{matrix} G_{m + 1} = \begin{cases} G_{m} [I_{2 Q} - \sum_{q \in \text{odd}} \mathrm{diag} (e_{q})] & \text{upon confirmation of } {\hat{k}}^{R} \text{ from } {\hat{q}}_{m}^{R}, \text{ or} \\ G_{m} [I_{2 Q} - \sum_{q \in \text{even}} \mathrm{diag} (e_{q})] & \text{upon confirmation of } {\hat{k}}^{I} \text{ from } {\hat{q}}_{m}^{I}, \text{ or} \\ G_{m} [I_{2 Q} - \mathrm{diag} (e_{{\hat{q}}_{m}})] & \text{otherwise} . \end{cases} & (39) \end{matrix}$

FIG. 8 is the Schematic diagram depicting the structure of the proposed GB-ISTA receiver for QSM demodulation

The procedure described by equations (31) through (33) and (35) through (39) amounts to a greedy (i.e., symbol-by-symbol and index set-by-index set) modification of the BH-ISTA detector introduced earlier, for which reason it is dubbed the greedy boxed iterative shrinkage thresholding algorithm (GB-ISTA) for QSM demodulation.

Notice that at the end of the process, estimates k̂^R and k̂^I of the selected dispersion matrix index vectors, as well as hard-decision estimates ŝ=[ŝ₁^R, ŝ₁^I, . . . , ŝ_P^R, ŝ_P^I] of the modulated symbols are obtained, from which the corresponding encoded bits b=[b^R, b^I, b^S] can be retrieved at a fraction of the complexity of sphere detection or exhaustive maximum likelihood searches. A diagram illustrating the proposed GB-ISTA QSM receiver is offered in FIG. 8, and the procedure is summarized in the form of pseudo-code in method 3.

Method 3 Greedy Boxed-(Hard) ISTA Receiver for QSM Schemes

Global Quantities: Real-valued projected symbol constellation 𝒮_R, set of index vectors 𝒦 and threshold factor λ;

Inputs: Received signal y and effective channel matrix G.

Outputs: Estimated index and symbol vectors k̂^R, k̂^I and ŝ.

1: Set m = 1 and α = maxeig(G^TG);

2: Initialize y_m = y, G_m = G and û_m⁽¹⁾ = 0_2Q;

3: while 𝒫_𝒦(½[q̂_m^R + 1]) = Ø or 𝒫_𝒦(½ q̂_m^I) = Ø do

4: Iterate equation (31) until convergence, obtaining û_m^(η*);

5: Obtain symbol soft estimate s̃_{q̂_m} and index hard estimate q̂_m via equation (32);

6: Obtain hard symbol estimate ŝ_{q̂_m} via equation (33);

7: Insert index estimate q̂_m into its subsequence q̂_m^R or q̂_m^I, as per equation (35);

8: Construct û_m+1⁽¹⁾, y_m+1 and G_m+1 via equations (37), (38) and (39);

9: Increment m by 1;

10: end while

11: Output the estimate index vectors as k̂^R ← 𝒫_𝒦(½[q̂_m^R + 1]) and k̂^I ← 𝒫_𝒦(½ q̂_m^I), respectively, and the estimate symbol vector ŝ as the intercalation of {ŝ_{2k̂^R−1}} and {ŝ_{2k̂^I}}.
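The steps of method 3 can be sketched end to end in plain Python as below. This is an illustrative sketch under several assumptions that are not part of the source: a fixed iteration count replaces "until convergence", a cap of 2Q runs is added as a safeguard, index vectors are treated as sorted tuples (so the p-th real part is paired with the p-th imaginary part in sorted order), and the toy setting (identity channel, Q=4, P=2, 𝒮_R={−1, +1}, 𝒦={(1, 2), (3, 4)}) is hypothetical.

```python
from itertools import combinations


def matvec(A, x):
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def boxed_hard(s, tau, lo, hi):
    # Equation (29): zero small entries, clip the rest to [lo, hi].
    return 0.0 if abs(s) <= tau else min(max(s, lo), hi)


def run_bh_ista(G, y, u, alpha, tau, lo, hi, n_iter=20):
    # Equation (31); a fixed iteration count stands in for "until convergence".
    Gt = [list(col) for col in zip(*G)]
    for _ in range(n_iter):
        r = [a - b for a, b in zip(y, matvec(G, u))]
        g = matvec(Gt, r)
        u = [boxed_hard(ui + gi / alpha, tau, lo, hi) for ui, gi in zip(u, g)]
    return u


def project(k_seq, index_set, P):
    # P_K(.): first P-subset of k_seq (earliest positions first) found in K.
    valid = {tuple(sorted(k)) for k in index_set}
    for combo in combinations(k_seq, P):
        if tuple(sorted(combo)) in valid:
            return tuple(sorted(combo))
    return None


def gb_ista(G, y, alpha, lam, S_R, index_set, P):
    n = len(G[0])  # n = 2Q
    lo, hi, tau = min(S_R), max(S_R), lam / (2.0 * alpha)
    G_m, y_m, u = [row[:] for row in G], list(y), [0.0] * n
    q_R, q_I, s_hat = [], [], {}
    k_R = k_I = None
    for _ in range(n):  # safety cap of 2Q runs (an assumption, not in method 3)
        if k_R is not None and k_I is not None:
            break
        u = run_bh_ista(G_m, y_m, u, alpha, tau, lo, hi)
        j = max(range(n), key=lambda i: abs(u[i]))        # equation (32), 0-based
        q = j + 1                                         # 1-based position
        s_hat[q] = min(S_R, key=lambda c: abs(c - u[j]))  # equation (33)
        (q_R if q % 2 else q_I).append(q)                 # equation (35)
        new_R = project([(v + 1) // 2 for v in q_R], index_set, P) if k_R is None else None
        new_I = project([v // 2 for v in q_I], index_set, P) if k_I is None else None
        if new_R is None and new_I is None:
            # "Otherwise" branches of equations (37) to (39).
            null = [j]
            y_m = [a - row[j] * s_hat[q] for a, row in zip(y_m, G_m)]
        else:
            # Confirmation branches: restart from y and cancel kept symbols.
            if new_R is not None:
                k_R = new_R
            else:
                k_I = new_I
            parity = 1 if new_R is not None else 0
            real = [2 * k - 1 for k in k_R] if k_R is not None else q_R
            imag = [2 * k for k in k_I] if k_I is not None else q_I
            x = [s_hat[i + 1] if (i + 1) in real + imag else 0.0 for i in range(n)]
            y_m = [a - b for a, b in zip(y, matvec(G, x))]
            null = [i for i in range(n) if (i + 1) % 2 == parity]
        for i in null:  # null entries of u and the matching columns of G_m
            u[i] = 0.0
            for row in G_m:
                row[i] = 0.0
    if k_R is None or k_I is None:
        return k_R, k_I, None
    s = [s_hat[p] for kr, ki in zip(k_R, k_I) for p in (2 * kr - 1, 2 * ki)]
    return k_R, k_I, s


# Toy example (hypothetical): Q = 4, P = 2, identity channel, S_R = {-1, +1}.
G = [[1.0 if i == j else 0.0 for j in range(8)] for i in range(8)]
u_true = [1.0, 0.0, -1.0, 0.0, 0.0, 1.0, 0.0, -1.0]  # k^R = (1, 2), k^I = (3, 4)
y = matvec(G, u_true)
print(gb_ista(G, y, alpha=1.0, lam=0.1, S_R=[-1.0, 1.0], index_set=[(1, 2), (3, 4)], P=2))
```

In a real receiver, G would be the effective channel matrix of equation (27) and α its largest Gram eigenvalue; the identity channel is used here only so that each run can be followed by hand.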

V. Complexity and Performance Analysis

In this section we analyze the performance of the proposed OS-QSM via computer simulations. Given that our focus is on the scalability of the system, all simulation results to be shown are for relatively large numbers of transmit antennas (i.e., n_T≥6) and for increasing numbers of transmission slots (i.e., T≥2), with the number of digitally-modulated transmit symbols P and the cardinality M of the corresponding constellation adjusted on a case-by-case basis in order to highlight the main finding of each simulated experiment. To the best of our knowledge, simulation results on QSM schemes with such parameters have not appeared so far in the literature, due to the prohibitive computational complexity of existing receivers.

A. Complexity: GB-ISTA Versus ML and SCMB-SD Receivers

In light of the latter remark, let us start by assessing the decoding complexity of scaled QSM systems, in particular by deriving the complexity orders of the conventional ML and SCMB-SD approaches, and of the proposed GB-ISTA algorithm described in Section IV.

For any given n_T, T and P, the brute-force ML decoder requires a search among all possible

$2^{\lfloor \log_{2} \binom{T n_{T}}{P} \rfloor}$

antenna activation patterns, independently selected according to {k^R, k^I}∈𝒦 to transmit the real and imaginary parts of the P digitally modulated symbols in s∈𝒮^P, as well as another search, for each possible activation pattern, of all possible P-tuples of symbols selected from the constellation 𝒮, of cardinality M.

Assuming, idealistically and for simplicity, that each search consumes a single floating point operation (flop), the ML search process alone yields a complexity order lower-bounded by

$𝒪 \left( 4^{\lfloor \log_{2} \binom{T n_{T}}{P} \rfloor} \times M^{P} \right),$

in order to detect

$2 \cdot \left\lfloor \log_{2} \binom{T n_{T}}{P} \right\rfloor + P \cdot \log_{2} M$

bits, which even for moderately small P and M quickly becomes infeasible. For example, a search over 16,777,216 combinations is required to detect the 24 bits of each transmit signal in a relatively small system with n_T=6, T=3, P=3 and M=4. Just doubling the number of transmit antennas to n_T=12, with the other parameters unchanged, the ML search space already surges to about 10⁹ combinations, for a mild increase to 30 bits per transmission, while keeping n_T=6 and doubling the number of transmit symbols to P=6 requires a search over more than 10¹² combinations in order to detect only 40 bits. Taking into account the most significant operations required to perform each ML search, the order of complexity of the ML receiver to decode each bit of QSM schemes becomes

$\begin{matrix} 𝒪 \Big( \big( \overbrace{12 T^{3} n_{T} n_{R}}^{\text{construction of } Φ_{H} Ξ_{A} \text{ and } Φ_{H} Ξ_{B} \text{ as in equation (23)}} + n_{R} (2 P + 1) - 1 \big) \cdot 4^{\lfloor \log_{2} \binom{T n_{T}}{P} \rfloor} \cdot M^{P} \Big) > 𝒪 (n_{R} \cdot P \cdot T^{P + 1} \cdot M^{P} \cdot n_{T}^{P - 1}), & (40) \end{matrix}$

with the lower bound obtained by keeping only the higher-order terms and neglecting coefficients.

The practical infeasibility of ML-based detection of QSM systems is clearly highlighted by equation (40), as it exposes the fact that the number of transmit symbols P is a complexity order exponent of all theoretically scalable quantities n_T, T and M. Next, let us show that this challenge cannot be satisfactorily mitigated by the SD approach. To that end, consider again, idealistically and for simplicity, that SD can reduce the search radius to a single symbol, such that the factor M^P in equation (40) can be neglected. In other words, we find that the order of complexity associated with SD-based QSM receivers can, at best, be reduced to

$\begin{matrix} 𝒪 (n_{R} \cdot P \cdot T^{P + 1} \cdot n_{T}^{P - 1}) & (41) \end{matrix}$

From the latter it can be concluded that, in the context of scalable QSM schemes, the only advantage of SD is to enable scaling of the digital constellation cardinality M, which not only impacts negatively on the corresponding BER, but also is not the most significant factor in increasing the SE of the system, since the total number of bits conveyed by a QSM scheme is

$B = 2 \cdot ⌊ \log_{2} (\begin{matrix} {Tn}_{T} \\ P \end{matrix}) ⌋ + P \cdot \log_{2} M,$

such that even for mildly large n_T, T and P we have

$2 \cdot \left\lfloor \log_{2} \binom{T n_{T}}{P} \right\rfloor \gg P \cdot \log_{2} M .$

In summary, it can be concluded that sphere detection is not particularly useful as an enabler of spectrally-efficient scalable QSM, from a receiver perspective.
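The bit counts and ML search-space sizes quoted above for the three example systems can be checked with a short Python sketch. It assumes, as in the examples, that M is a power of two; the function names are hypothetical.

```python
from math import comb


def floor_log2(n):
    """Exact floor(log2(n)) for a positive integer n."""
    return n.bit_length() - 1


def qsm_bits(n_T, T, P, M):
    """Return (spatial bits, symbol bits) per transmission."""
    spatial = 2 * floor_log2(comb(T * n_T, P))
    symbol = P * floor_log2(M)  # assumes M is a power of two
    return spatial, symbol


def ml_search_space(n_T, T, P, M):
    """Number of candidates: 4^floor(log2 C(T n_T, P)) * M^P."""
    return 4 ** floor_log2(comb(T * n_T, P)) * M ** P


for n_T, T, P, M in [(6, 3, 3, 4), (12, 3, 3, 4), (6, 3, 6, 4)]:
    spatial, symbol = qsm_bits(n_T, T, P, M)
    print(n_T, T, P, M, spatial, symbol, spatial + symbol, ml_search_space(n_T, T, P, M))
```

For these settings the spatially encoded bits (18, 24 and 28) clearly outnumber the symbol bits (6, 6 and 12), which is the reason why scaling M alone contributes comparatively little to the SE.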

Finally, let us address the computational complexity of the proposed GB-ISTA. For starters, observe from equations (31) through (36) that the GB-ISTA receiver obtains the spatially encoded bits b^R and b^I not from a search, but directly from the sparse-recovery process, i.e., from the values and locations of the non-zero elements of û. As a consequence of removing such combinatorial search, the impact of the scalable parameters n_T, T and P on GB-ISTA is significantly smaller, as demonstrated by the following complexity analysis of the steps of method 3.

- 1) Method 3 takes as input the effective matrix G given in equation (27), whose construction requires evaluating the product of the sparse block-diagonal matrix Φ_H∈ℝ^(2Tn_R×2Tn_T), which contains 2n_T non-zero entries per row, against the matrix Ψ_D∈ℝ^(2Tn_T×2Tn_T), which has T non-zero entries per column, yielding a cost of 2T(2Tn_R)(2Tn_T)=8T³n_Tn_R flops, since typically we have (in scaled QSM schemes) T<<2n_T.
- 2) Next, the GB-ISTA receiver performs multiple runs of evaluating equation (31), the first step of which is computing

${\hat{u}}^{(η)} + \frac{1}{α} G^{T} (y - G {\hat{u}}^{(η)}) .$

The cost of computing α as per line 1 of method 3 is ignored, under the argument that for large systems, the largest eigenvalue of G^TG converges almost surely to a constant dependent only on the structure and the energy of G.

That operation would cost (2Tn_T)(2Tn_R)+(2Tn_R)+(2Tn_R)(2Tn_T)+(2Tn_R)=8T²n_Tn_R+4Tn_R flops to accomplish, but since the number of non-zero entries of û^(η) quickly reduces to the actual value 2P, as shown in FIG. 7(a), the complexity of that step is more precisely estimated at 8PTn_R+4Tn_R=4Tn_R(2P+1) flops. Then, including the 2Tn_T flops required for the boxed-hard thresholding function Π, and remembering that η* iterations are necessary, the total cost associated with each evaluation run of equation (31) can be estimated at η*(4Tn_R(2P+1)+2Tn_T) flops.

- 3) After convergence of equation (31), the receiver obtains the sparse estimate vector û_m^(η*), from which the soft symbol estimate s̃_{q̂_m} is extracted at negligible cost via maximum value search [44], along with the estimate index q̂_m given by the position of s̃_{q̂_m}, as expressed in equation (32). With these quantities at hand, up to √M flops are consumed to obtain the hard symbol estimate ŝ_{q̂_m} as per equation (33).
- 4) Considering that the costs of the element, interference and column removals expressed by equations (37) through (39) are negligible, the next significant cost of the receiver is the validation of acquired indices. In particular, after at least P runs, when a sufficient number of position indices q̂_m have been detected to construct either or both of the index subsequences q̂_m^R and q̂_m^I, and map them to the corresponding estimate index vectors k̂^R and/or k̂^I as per equation (36), said estimates need to be validated against the optimal set of index vectors 𝒦. Assuming the cost of such operation is of order 𝒪(1), this step contributes to the total complexity of the GB-ISTA detector with an additional cost of P flops.
- 5) Lastly, as described in line 11 of method 3, the GB-ISTA outputs the pair of estimate index vectors k̂^R and k̂^I, as well as the digitally modulated symbol vector estimate ŝ∈𝒮^P, which requires the intercalation of the real and imaginary parts, at an estimated cost of P flops.

From the above, the total complexity order of the GB-ISTA can be estimated at

$\begin{matrix} 𝒪 \Big( \overbrace{8 T^{3} n_{T} n_{R}}^{\text{construction of } G} + \underbrace{2 P}_{\text{number of runs}} \Big( \overbrace{η^{*} (4 T n_{R} (2 P + 1) + 2 T n_{T})}^{\text{evaluation of eq. (31)}} + \underbrace{\sqrt{M} + \frac{1}{2}}_{\text{evaluation of eq. (33) and validation of } {\hat{q}}_{m}^{I} \text{ or } {\hat{q}}_{m}^{R}} \Big) + \overbrace{P}^{\text{intercalation of } \{{\hat{s}}_{2 {\hat{k}}^{R} - 1}\} \text{ and } \{{\hat{s}}_{2 {\hat{k}}^{I}}\} \text{ into } \hat{s}} \Big) & (42) \end{matrix}$

The per-bit complexity orders of the ML and the proposed GB-ISTA decoders, obtained by dividing the expressions in equations (40) and (42) by the number of bits detected per transmission

$B = 2 ⌊ \log_{2} (\begin{matrix} {Tn}_{T} \\ P \end{matrix}) ⌋ + P \log_{2} M,$

are compared in FIG. 9 for various settings in terms of the scalable parameters n_T, T and P.
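The expressions inside equations (40) and (42) can be evaluated numerically to get a feeling for the gap between the two receivers. The sketch below simply plugs numbers into those expressions; the number of BH-ISTA iterations η*=20 and the number of receive antennas n_R=12 are assumed values for illustration, M is assumed to be a power of two, and the function names are hypothetical.

```python
from math import comb, sqrt


def floor_log2(n):
    """Exact floor(log2(n)) for a positive integer n."""
    return n.bit_length() - 1


def bits_per_transmission(n_T, T, P, M):
    return 2 * floor_log2(comb(T * n_T, P)) + P * floor_log2(M)


def ml_complexity(n_T, n_R, T, P, M):
    """Left-hand expression of equation (40)."""
    patterns = 4 ** floor_log2(comb(T * n_T, P))
    return (12 * T**3 * n_T * n_R + n_R * (2 * P + 1) - 1) * patterns * M**P


def gb_ista_complexity(n_T, n_R, T, P, M, eta_star):
    """Expression of equation (42); eta_star is an assumed iteration count."""
    per_run = eta_star * (4 * T * n_R * (2 * P + 1) + 2 * T * n_T) + sqrt(M) + 0.5
    return 8 * T**3 * n_T * n_R + 2 * P * per_run + P


n_T, n_R, T, P, M = 6, 12, 3, 3, 4
B = bits_per_transmission(n_T, T, P, M)
ml = ml_complexity(n_T, n_R, T, P, M)
gb = gb_ista_complexity(n_T, n_R, T, P, M, eta_star=20)
print("bits:", B, "ML flops/bit:", ml / B, "GB-ISTA flops/bit:", gb / B)
```

Even for this small setting, the ML expression exceeds the GB-ISTA expression by several orders of magnitude, and the gap widens as n_T, T or P grow because P no longer appears as an exponent.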

FIG. 9 shows the Effect of scalable parameters on the complexity of QSM receivers.

FIG. 9 (a) shows the Fixed P, various T, as a function of n_T

FIG. 9 (b) shows the Fixed T, various P, as a function of n_T

FIG. 9 (c) shows the Fixed n_T, various T, as a function of P.

B. BER Performance of the Proposed OS-QSM Scheme

Empowered by the significant reduction in complexity obtained by GB-ISTA over ML detection, as shown above, we proceed to assess the BER performance of the proposed OS-QSM scheme, decoded via the GB-ISTA. In general, our simulated experiments aim to further demonstrate that the proposed OS-QSM schemes are feasible with relatively large numbers of transmit antennas, and can achieve very low BER at very low E_b/N₀ with rather high spectral efficiencies, whilst using relatively few spatial-temporal resources per transmission.

FIG. 10 is the effect of scalability on the BER performance of GB-ISTA-detected OS-QSM schemes with fixed SE.

To that end, our first set of results, shown in FIG. 10, compares the BER performances of the proposed method for various values of n_T, T and P, with the ratio P/T kept constant and M adjusted such that all curves correspond to systems with the same spectral efficiency. We remark that the curves for T=2 can be considered as a reference corresponding to the SotA EDA-QSM scheme of [22], [23], although the results shown actually incorporate improvements due to the enhancements described in Subsection III-B, namely, the utilization of: a) optimal Golden codes [28], as opposed to the block-by-block sphere-decodable FDFR STBC of [24]; and b) the optimal construction of the index vector set 𝒦, given in method 1.

Two important facts can be learned from the results of FIG. 10. The first is that the significant improvement in BER achieved by scaling occurs in spite of the fact that the number of receive antennas is rather large (n_R=12). This indicates that the gains are not only due to an increase in diversity (since receive diversity is already large), but also due to the coding gain reaped from the optimal FDFR STBCs employed in the OS-QSM design. The second is that the results shown are actually simulated, down to rather low BERs, using an ordinary computer (i.e., no particularly powerful machine was required), and for settings which are virtually impossible to simulate with ML or SD based receivers. The latter point is strengthened by the results on the right side of FIG. 10, which include a curve for a system with n_T=12, and which serve to further highlight the true feasibility of the proposed GB-ISTA receiver.

One criticism that could be raised about the results of FIG. 10 is, however, that the values of P adopted therein are not the optimal ones for the corresponding T and n_T, as per equation (14). We once again clarify that the parameterization used in FIG. 10 is such that all systems have the same SE, so as to allow their direct comparison under equivalent conditions, which is, incidentally, also the reason why all curves are plotted against E_b/N₀ as opposed to SNR.

FIG. 11 illustrates the Effect of scaling P on BER performance of GB-ISTA-detected OS-QSM schemes.

In any case, in order to dispel any doubts about the ability of the proposed OS-QSM design and of the corresponding GB-ISTA receiver to actually achieve feasible and optimized spectral efficiency combined with low BERs, we show in FIG. 11 additional results obtained by varying P up to the optimal value given by equation (14). We remark that, due to the floor operation in the expression of the achievable SE given by equation (8), values of P adjacent to that given by equation (14), i.e., P*, are also optimum, as they result in exactly the same SE. For instance, with n_T=6, T=2 and M=4, as is the case of FIG. 10, equation (14) yields P*=8, but the values P={7, 8, 9} all result in ζ=16 in equation (8). Similarly, P={10, 11, 12, 13} all yield the largest SE of ζ=17 with n_T=6, T=3 and M=4, as is the case of FIG. 11.

With these remarks made, turning to the results of FIG. 11, it can be seen that only a very mild degradation of BER is observed when up-scaling P, and that this degradation is in fact smaller for larger T, which is a small and fair price to pay for almost doubling the spectral efficiency of the system. We clarify that the slight BER degradation observed when up-scaling the ratio P/T towards SE optimality is a consequence of the corresponding reduction of sparsity in the vectorized received signal, which tends to be less critical in systems with higher diversity and coding gains, as a result of up-scaling n_T, T or both. That trend is in fact observable in FIG. 11, as the gap between BER curves narrows when T=2 is increased to T=3.

FIG. 12 is the proposed computer-implemented detection method represented as a flowchart.

This application highlights four approaches:

The notion of applying sparse detection (compressed sensing) process steps to decode SM signals (marked B in the flowchart of FIG. 12).

From the above idea of applying sparse detection, a modification to the iterative shrinkage-thresholding process steps (ISTA) [2] via boxing (range limiting) and hard-thresholding (marked C in the flowchart of FIG. 12).

Using the boxed-hard ISTA from above, a greedy selection of the antenna index positions and of the symbol estimates, and the independent decoding of the corresponding “antenna modulated” and “symbol modulated” bits (marked D in the flowchart of FIG. 12).

A process working in parallel to the greedy detections, to ensure that valid estimates of the index vectors (from the given finite set of index vectors) are produced at the output of the algorithm, and to apply interference cancellation with the confirmed values.

- While keeping track of which indices have been retrieved from the greedy selections, before every iteration check whether from the currently decoded indices, a final confirmation can be made.
- If it cannot be made, remove the interference by the previous greedy selection and make the next iteration.

The proposed decoder possesses significantly lower complexity than the existing ML and tree-search algorithms. The decoder possesses a quadratic complexity on the number of transmit antennas.

The decoder can be used in any MIMO system where the number of antennas is expected to be large, such that ML approaches are infeasible. The scheme can be used to increase efficiency in cellular networks and V2X communications (eMBB).

FIG. 13 shows the Comparison of ISTA thresholding and BH-ISTA thresholding functions Λ(s; τ) as per [29], and Π(s; τ), as per equation (29).

This application presents new transmitter and receiver designs for QSM schemes, focusing on their scalability in terms of the number of transmit antennas n_T, the number of transmit instances T and the number of encoded M-ary symbols P, as well as on their performance optimization in terms of SE, diversity and coding gains. The contributions are motivated by the demonstrated fact that, in order for SE optimality to be achieved, QSM schemes must scale n_T, T and P, which is not possible with SotA methods. At the transmitter, the newly proposed OS-QSM scheme differs from SotA alternatives in that its dispersion matrices are designed based on FDFR STBCs, and in that dispersion matrix index selection is performed via a new greedy algorithm, which ensures that all spatial-temporal resources of the transmitter are utilized evenly over multiple transmissions. In turn, at the receiver, the application contributes a new ISTA-based receiver which, thanks to its reliance on the sparse structure of QSM signaling, eliminates the combinatorial nature of existing ML or SD based approaches, further enabling the scaling of the system from a feasibility perspective. In fact, a complexity analysis is offered, which shows that the proposed GB-ISTA receiver enjoys a complexity order that is cubic on T, quadratic on P, and only linear on n_T, in contrast to the ML and SD detectors, which have geometric complexities on T and n_T, with P as exponent, rendering them infeasible in the scaled scenario. Simulation results for set-ups of scales never before shown in related literature corroborate both the high performance and the feasibility of the proposed OS-QSM scheme and GB-ISTA receiver.
