OPTIMAL INDEX VECTOR SELECTION METHOD FOR SPATIAL MODULATION

Information

  • Patent Application
  • Publication Number
    20240380452
  • Date Filed
    July 22, 2022
  • Date Published
    November 14, 2024
Abstract
A computer-implemented decoding method configures a plurality of transmit antennas to each represent an in-phase spatial constellation symbol within an in-phase spatial constellation, and a quadrature spatial constellation symbol within a quadrature spatial constellation, and maps source data to the in-phase spatial constellation symbols and the quadrature spatial constellation symbols represented by the plurality of transmit antennas, wherein the method constructs the set which has equal multiplicities of the transmit antenna activation, which ensures maximum possible transmit diversity.
Description
TECHNICAL FIELD

The present invention relates to the field of decoding digital communications in overloaded channels.


BACKGROUND

In advanced spatial modulation (SM) schemes, only a fraction of the transmit antennas is activated per symbol slot, via dispersion matrices which determine the activation patterns. This selection is determined by so-called index vectors, which hold integer values corresponding to the indices of the dispersion matrices to be simultaneously activated. In the state of the art (SotA), the set of index vectors is constructed naively in combinatorial order; however, this leads to unequal allocations for the transmit antennas and hence a degradation in transmit diversity.


Spatial modulation (SM) is a promising technique that can reduce the hardware complexity and cost of massive multiple-input multiple-output (MIMO) wireless communication systems without sacrificing bit error rate (BER) and spectral efficiency (SE) performance. In particular, in SM schemes, information bits are embedded not only in the selection of transmit symbols (a.k.a. the constellation dimension), but also in the selection of the transmit antennas utilized during transmission (a.k.a. the spatial dimension).


Through this approach, the vast spatial resources associated with massive MIMO systems can be efficiently utilized without requiring an equally large number of radio-frequency (RF) chain components. This efficient utilization of RF chains makes SM schemes attractive for future wireless systems such as beyond fifth generation (B5G), which will continue to make extensive use of millimeter-wave (mmWave) bands, and sixth generation (6G) networks [6], which are expected to also incorporate Terahertz and visible light communications (VLC) bands.


A major drawback of early SM schemes is, however, that only one antenna is selected per transmission, which severely limits achievable SEs. In order to circumvent this limitation, a generalized spatial modulation (GSM) scheme was later developed, where multiple antennas are selected at each transmission, leading to substantial increase in SE. But another drawback of early SM methods including GSM schemes, was the exclusive focus on increasing SE without a matching effort to reduce BER, e.g., via the exploitation of transmit diversity. That limitation motivated the idea of combining SM with space-time coding (STC), examples of which are the space-time shift keying (STSK) schemes based on linear dispersion (LD) coding, the methods incorporating space-time block coding (STBC) and the spatial modulation with cyclic structure (CSM).


Building on this knowledge, the SM transmitter design was further optimized, leading to the discovery of quadrature spatial modulation (QSM) approaches in which the SM concept is independently applied to each of the real and imaginary components of the modulated signals, via dedicated spatial-temporal dispersion matrices. The idea was further developed in a succession of QSM techniques with progressively enhanced dispersion matrix designs, which include the diversity-achieving quadrature spatial modulation (DA-QSM) scheme, which incorporates Alamouti codes, and the more recent enhanced diversity-achieving quadrature spatial modulation (EDA-QSM) method, in which dispersion matrices are constructed using the full-diversity full-rate (FDFR) codes with block-by-block sphere-decodability of [24]. Among all aforementioned schemes, the EDA-QSM is the best QSM scheme known today, both in terms of BER and SE performance.


Despite these advantages, the EDA-QSM scheme, and as a consequence the preceding QSM schemes, still have two major shortcomings. The first is that the dispersion matrices used in the QSM schemes proposed so far are based on 2×2 STBCs, which limits both the diversity and coding gains achieved by the methods. With regard to that first limitation, we will in fact show in this article that QSM designs based on STBCs of a size T that does not scale with nT are fundamentally sub-optimal in the SE sense. The second shortcoming is that current QSM detection schemes are based either on exhaustive maximum likelihood (ML) or, at best, sphere detectors. Here, it is worth noting that it has actually been shown, contrary to previous claims, that sphere decoding has an average complexity that still grows exponentially with the number of jointly decoded symbol periods. That result was corroborated by further findings, in which a cubic closed-form expression for the expected complexity of sphere detectors was derived, and in which it was shown that lattice reduction does not improve the tail exponent of the complexity distribution of sphere detectors. With regard to that second limitation, it will be shown in this application that the complexities of ML- and sphere detection (SD)-based QSM receivers are in fact both geometric on nT and T, with P as exponent, such that these techniques are fundamentally non-scalable in the context of QSM systems. In other words, a severe and two-fold scalability challenge exists among current QSM schemes, namely, the absence of scalable transmitter and receiver designs.


Motivated by this challenge, this application contributes a new QSM solution that is both scalable to arbitrary block sizes at the transmitter side (i.e., with no limits on nT, T and P) and decodable in polynomial time at the receiver side (i.e., practical for large nT and T, with moderate P). As a bonus, the proposed QSM scheme offers every possibility to optimize SE, diversity and coding gains. To that end, we first introduce the optimal FDFR Golden STBC code in the design of the QSM dispersion matrices. The Golden code is a fast-decodable STBC known to be optimal (i.e., FDFR with the highest coding gain over Gaussian constellations), and it has been shown to be constructible generally for arbitrary block sizes. The resulting optimized scalable QSM (OS-QSM) scheme is the first method proposed so far which has this feature.


The new OS-QSM design is further enhanced with a new algorithm to select the indices of the dispersion matrices employed in the scheme, which ensures that all transmit antennas are utilized as often and with the same likelihood over the transmission of multiple blocks, thus ensuring optimally diverse utilization of all spatial-temporal resources. Finally, in order to also ensure feasible decodability to the scalable transmitter design, a new greedy boxed iterative shrinkage thresholding algorithm (GB-ISTA) QSM detector based on sparse recovery methods is proposed.


Thanks to its sparse signal processing approach, the proposed decoding scheme does not require any restriction on the core code design, unlike preceding sphere-detection methods, which require block-diagonal fast-decodability. In addition, and most importantly, a major advantage of the newly proposed GB-ISTA QSM receiver is that it does not require a search over the large codebook space, unlike the ML and state codewords matched block-by-block sphere decoding (SCMB-SD) receivers. In fact, the complexity order of the proposed receiver is shown to be cubic on T, quadratic on P, and only linear on nT.


All in all, the contributions of the article can be summarized as follows:

    • Spectral Efficiency-Optimality: A closed-form expression for the optimum number of encoded symbols P required for a QSM scheme to achieve SE optimality is given, which, combined with the rate-optimality condition of STBCs, highlights the importance of systematic scalability of the STBC size T in the design of SE-optimal QSM schemes.
    • Optimal Diversity and Coding Gains: A new Golden code-based quadrature spatial modulation (GQSM) transmission scheme is obtained via the design of dispersion matrices based on the 2×2 Golden code, which is known to achieve optimal coding gain over integer symbol constellations.
    • Scalability of Transmitter: the new GQSM design is generalized via the extension of the 2×2 Golden code into its T×T FDFR STBC variation, yielding the OS-QSM scheme, which is applicable to arbitrary nT, T and P.
    • Optimality of Resource Utilization: In method 1, a new mechanism to select the optimal set of dispersion matrix indices is offered, which ensures that all Q spatial-temporal resources are equally utilized over time, as required for optimal diversity gain.
    • Scalability at Receiver: A new low-complexity greedy iterative shrinkage thresholding algorithm (ISTA)-based demodulation algorithm for GSM schemes is proposed, which not only is feasible at large scales due to its linear complexity, but also can be applied to other STBC-QSM schemes.
    • Complexity of Receiver: A novel complexity expression of the proposed receiver is derived and shown to be cubic on T, quadratic on P, and linear on nT, in contrast to the ML and SD receivers, which are geometric on T and nT, with P as exponent.


Hence, this application presents a method to construct the index vectors that abides by the construction restrictions while also keeping the average activation of the antennas equal.


This problem was not addressed in any previous spatial modulation scheme leading up to the enhanced diversity-achieving quadrature spatial modulation (EDA-QSM) [1], which is the SotA, and no previous solution has been found.


The inventive method iteratively constructs the optimal set of index vectors given the system parameters, such as the number of transmit antennas, the number of consecutive symbol periods, and the total number of symbols transmitted.


Specifically, in the dotted highlighted section of the QSM signal generation flowchart/block diagram in FIG. 1 represent the.


The target function is the equal allocation of transmit antenna indices in the final optimal set. Beginning from an empty set, the algorithm adds one vector to the set at each iteration. The vector added at each iteration consists of the indices with the least multiplicity in the set built so far.


The pseudocode of the method is listed as method 1 on page 19, and a visual flowchart of the method is illustrated in FIG. 12.


Complex matrices and vectors are denoted by bold-face uppercase and lowercase letters, with their elements denoted by indexed normal lowercase letters, as in X, x and xi, respectively. The real and the imaginary parts of a complex number x are denoted by xR and xI, respectively, and for the sake of future convenience we define for a complex vector x=[x1, x2, . . . , xn]T the associated decoupled vector x≙[x1R, x1I, . . . , xnR, xnI]T and corresponding quadrature representation

$$\check{x} \triangleq \begin{bmatrix} x^{R} & -x^{I} \\ x^{I} & x^{R} \end{bmatrix}.$$

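The quadrature operator will also be applied to m×n complex matrices X, for which it yields the corresponding 2m×2n matrix,

$$\check{X} \triangleq \begin{bmatrix} \check{x}_{1,1} & \cdots & \check{x}_{1,n} \\ \vdots & \ddots & \vdots \\ \check{x}_{m,1} & \cdots & \check{x}_{m,n} \end{bmatrix}.$$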

In turn, the complex conjugate, transpose, Hermitian, trace, vectorization, and diagonalization operators are denoted by (·)*, (·)T, (·)H, tr(·), vec(·), and diag(·), respectively, while the n×n identity and the m×n-sized all-zero and all-one matrices are respectively denoted by In, 0m×n, and 1m×n. The p-norm with p≥0 is denoted by ∥·∥p, while |·| denotes either the element-wise absolute value (for vectors) or the cardinality (for sets), and the sets of real, complex, and integer numbers are denoted by ℝ, ℂ, and ℤ, respectively. Expectation is denoted by 𝔼[·], the floor to the nearest power of 2 is represented by $\lfloor\cdot\rfloor_{2^x}$, and the conversion of a left-most-significant binary vector to the corresponding base-10 integer is denoted by [·](10). The binomial coefficient is denoted by $\binom{Q}{P}$ and ⊗ denotes the Kronecker product. The projection of a scalar v onto the set 𝒳 is denoted by $\mathcal{P}_{\mathcal{X}}(v)$, and the complex Gaussian distribution with mean μ and variance σ² is denoted by 𝒞𝒩(μ, σ²).


All previous state-of-the-art spatial modulation (SM) schemes decode SM signals using either ML detection, as in most SM schemes, or modified tree-search algorithms, as in the state-of-the-art EDA-QSM scheme, and there have been no notable attempts at sparse detection or greedy approaches.


The tree-search algorithm is extremely unrealistic due to the highly combinatoric nature of the SM and the resulting search space size. Furthermore, the SM imposes various restrictions on the structure of the decoded signal such that naïve low-complexity decoding is impossible.


The problem associated with the prior art is that there is no sparse detection solution for this problem. Solutions using other approaches feature prohibitive complexity.


Massive multiple-input multiple-output (MIMO) systems in beyond-fifth generation (B5G) and sixth generation (6G) wireless communications expect incorporation of many transmit and receive antennas.


The utilization of spatial modulation (SM) and its variants, such as quadrature spatial modulation (QSM), is one promising candidate for massive MIMO systems. However, as the system size grows, the classic decoding complexity of SM schemes becomes infeasible (i.e., the complexity is not affordable), as the SotA uses maximum likelihood (ML) decoding or ML-based tree-search algorithms.


Proposed method is presented in flowchart in FIG. 12:


The method constructs the set which has equal multiplicities of the transmit antenna activation (i.e., all the antennas are used by the same amount), which ensures maximum possible transmit diversity, as shown in FIG. 13. In the previous SotA this was not the case: there was no equal multiplicity, and hence only non-optimal transmit diversity was exploited from the number of available antennas.


All existing spatial modulation schemes benefit, as all spatial modulation schemes already utilize index vector selection (or can be reformulated as such). The scheme can be used to increase the data rate and efficiency in cellular networks and V2X communications (eMBB).


These and other objects, features and advantages of the present invention will become clearer when the drawings as well as the detailed description are taken into consideration.


The proposed decoder has significantly lower complexity than the existing ML and tree-search algorithms. The decoder has a quadratic complexity on the number of transmit antennas.


Furthermore, the proposed decoder also does not require design restrictions such as block-diagonality or orthogonality in the transmission scheme, only the sparsity which is inherent with spatial modulation schemes, therefore providing larger freedom in transmitter development as well. In other words, the proposed decoder is capable of coping with many different encoding structures (flexibility).


The decoder can be used in any MIMO system where the number of antennas is expected to be large, such that ML approaches are infeasible.


The scheme can be used to increase efficiency in cellular networks and V2X communications (eMBB).


One embodiment of the computer-implemented Optimal Index Vector Selection method comprises configuring a plurality of transmit antennas to each represent an in-phase spatial constellation symbol within an in-phase spatial constellation, and a quadrature spatial constellation symbol within a quadrature spatial constellation, and mapping source data to the in-phase spatial constellation symbols and the quadrature spatial constellation symbols represented by the plurality of transmit antennas, wherein the method constructs the set which has equal multiplicities of the transmit antenna activation, which ensures maximum possible transmit diversity.


Another embodiment of the computer-implemented decoding method is characterized by a modification to the iterative shrinkage-thresholding algorithm (ISTA) via boxing, range limiting and hard-thresholding.


Another embodiment of the computer-implemented decoding method is characterized by running the iterative shrinkage-thresholding algorithm with boxing and hard-thresholding (BH-ISTA), followed by a greedy selection of the positions of the antenna indices and the symbol estimates, and the independent decoding of the corresponding antenna-modulated and symbol-modulated bits.


Another embodiment of the computer-implemented decoding method is characterized by a process working in parallel to the greedy detections, which ensures that valid estimates of the index vectors from the given finite set of index vectors are produced as an output, and which applies interference cancellation with the confirmed values,

    • while keeping track of which indices have been retrieved from the greedy selections, checking before every iteration whether a final confirmation can be made from the currently decoded indices;
    • if it cannot be made, removing the interference of the previous greedy selection and proceeding to the next iteration.


Another embodiment of the method is characterized by proceeding with the input of the number of symbols P, the number of symbol slots T and the number of transmit antennas nT, wherein

    • in a first step, an empty set of valid vectors is generated,
    • in a second step, a random seed vector is added to the set, after which a routine to find the least used indices in the set is carried out,
    • in the next step, a vector with these indices is picked and added to the set,
    • wherein it is checked whether the set has reached its maximum size; if this is not the case, the method returns to the step in which the least used indices in the set are found, and if the set has reached its maximum size, the method comes to an end.
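
The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the listing of method 1: the function names (`select_index_vectors`, `naive_index_vectors`, `multiplicities`), the 0-based indices, the tie-breaking by index order, and the fallback to the next-least-used combination when a vector is already in the set are all choices made for the example. The maximum set size is assumed to be the largest power of two not exceeding the binomial coefficient of Q = T·nT over P.

```python
from itertools import combinations
from math import comb
import random


def floor_pow2(n):
    """Largest power of two that does not exceed n (n >= 1)."""
    return 1 << (n.bit_length() - 1)


def multiplicities(vectors, Q):
    """Count how often each of the Q resource indices appears in the set."""
    counts = [0] * Q
    for vec in vectors:
        for q in vec:
            counts[q] += 1
    return counts


def select_index_vectors(P, T, n_T, seed_vector=None, rng=None):
    """Greedy construction of N index vectors with balanced index usage."""
    Q = T * n_T                      # number of spatial-temporal resources
    N = floor_pow2(comb(Q, P))       # assumed maximum size of the set
    rng = rng or random.Random(0)
    if seed_vector is None:          # random seed vector, as in the text
        seed_vector = rng.sample(range(Q), P)
    first = tuple(sorted(seed_vector))
    chosen, used = [first], {first}
    counts = multiplicities(chosen, Q)
    while len(chosen) < N:           # stop once the set has reached size N
        # Order the indices from least used to most used (ties by index).
        order = sorted(range(Q), key=lambda q: (counts[q], q))
        # Take the first not-yet-used vector built from the least-used indices.
        for cand in combinations(order, P):
            vec = tuple(sorted(cand))
            if vec not in used:
                break
        chosen.append(vec)
        used.add(vec)
        for q in vec:
            counts[q] += 1
    return chosen, counts


def naive_index_vectors(P, T, n_T):
    """Baseline: the first N index vectors in plain combinatorial order."""
    Q = T * n_T
    N = floor_pow2(comb(Q, P))
    return list(combinations(range(Q), P))[:N]
```

For the small case P=2, T=1, nT=4 with the seed vector (0, 1), this sketch ends with multiplicities [2, 2, 2, 2], whereas the first four vectors in plain combinatorial order give [3, 2, 2, 1].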


Another embodiment is characterized by a receiver (R) of a communication system having a processor, volatile and/or non-volatile memory, and at least one interface adapted to receive a signal in a communication channel, wherein the non-volatile memory stores computer program instructions which, when executed by the processor, configure the receiver to implement the decoding method of one or more embodiments cited above.


Another embodiment is characterized by a computer program product comprising computer executable instructions, which, when executed on a computer, cause the computer to perform the decoding method of one or more embodiments cited above.


Another embodiment is characterized by a computer-readable medium storing and/or transmitting the computer program product cited above.


Another embodiment is characterized by a vehicle unit comprising a communication system with a receiver (R) in a vehicle, wherein the system is adapted to execute the decoding method of one or more embodiments cited above.


Another embodiment is characterized by a vehicle having one or more vehicle units cited above.


All aspects in this application can be integrated in mobile devices, base stations and components in wireless systems. All the described components can be integrated in vehicles.





BRIEF DESCRIPTION OF THE DRAWINGS

For a fuller understanding of the nature of the present invention, reference should be had to the following detailed description taken in connection with the accompanying drawings in which:



FIG. 1: is a schematic diagram depicting the generic structure of a QSM transmission scheme.



FIG. 2: is the Spectral efficiency of OS-QSM scheme with T=2, 4, and 8, for a given system with nT=8 and M=4



FIG. 3: is the Effect of T and M on the optimum ratio P*/T between the number of transmit symbols and epochs



FIG. 4: is the Behavior of fractional peak spectral efficiency as a function of nT, for different sizes of T and M.



FIG. 5 is the bipartite graph representing the spatial-temporal resource usage associated with each index vector kn for a QSM system with P=3 and Q=8. The particular examples of k1, k37 and k54 are explicitly illustrated



FIG. 6 is the Comparison of ISTA thresholding and BH-ISTA thresholding functions Λ(s; τ) as per [29], and Π(s; τ), as per equation (29)



FIG. 7: is the Convergence of u{circumflex over ( )}nΠ(η) and μΛ(η), as per equations (28) and (30), respectively, as a function of iterations η.



FIG. 7(a) is the Sparsity convergence with various threshold values.



FIG. 7(b) is the MSE convergence with optimal threshold values.



FIG. 8 is the Schematic diagram depicting the structure of the proposed GB-ISTA receiver for QSM demodulation.



FIG. 9 is the Effect of scalable parameters on the complexity of QSM receivers.



FIG. 9(a) is the Fixed P, various T, as a function of nT



FIG. 9(b) is the Fixed T, various P, as a function of nT



FIG. 9(c) is the Fixed nT, various T, as a function of P.



FIG. 10: is the Effect of scalability on BER performance of GB-ISTA-detected OS-QSM schemes with fixed SE.



FIG. 11: is the Effect of scaling P on BER performance of GB-ISTA-detected OS-QSM schemes.



FIG. 12 is the proposed computer-implemented Optimal Index Vector Selection method for Spatial Modulation represented as a flowchart.



FIG. 13 is the bipartite graph representing the spatial-temporal resource usage associated with each index vector kn for a QSM system with P=3 and Q=8. The particular examples of k1, k37 and k54 are explicitly illustrated.





Quadrature spatial modulation (QSM) schemes are considered, which are capable of conveying large numbers of bits via a combination of transmitting a relatively small number P of M-ary modulated symbols from a dynamic selection of nT transmit antennas, according to a designed dispersion pattern.


DETAILED DESCRIPTION

Following are detailed descriptions of concepts, system/network architectures, and detailed designs for many aspects of a wireless communications network targeted to address the requirements and use cases for 5G. The terms “requirement,” “need,” or similar language are to be understood as describing a desirable feature or functionality of the system in the sense of an advantageous design of certain embodiments, and not as indicating a necessary or essential element of all embodiments. As such, in the following each requirement and each capability described as required, important, needed, or described with similar language, is to be understood as optional.


In the discussion that follows, this wireless communications network, which includes wireless devices, radio access networks, and core networks, is referred to as “NX.” It should be understood that the term “NX” is used herein as simply a label, for convenience. Implementations of wireless devices, radio network equipment, network nodes, and networks that include some or all of the features detailed herein may, of course, be referred to by any of various names. In future development of specifications for 5G, for example, the terms “New Radio,” or “NR,” or “NR multi-mode” may be used—it will be understood that some or all of the features described here in the context of NX may be directly applicable to these specifications for NR. Likewise, while the various technologies and features described herein are targeted to a “5G” wireless communications network, specific implementations of wireless devices, radio network equipment, network nodes, and networks that include some or all of the features detailed herein may or may not be referred to by the term “5G.” The present invention relates to all individual aspects of NX, but also to developments in other technologies, such as LTE, in the interaction and interworking with NX. Furthermore, each such individual aspect and each such individual development constitutes a separable embodiment of the invention.



FIG. 1 shows a schematic diagram depicting the generic structure of a QSM transmission scheme.


A. System Model

Consider a point-to-point (P2P) MIMO communication system in which a transmitter equipped with nT transmit antennas exchanges information with a receiver equipped with nR receive antennas employing SM. The received signal corresponding to T consecutive time slots during which the channel is assumed to be constant can be compactly written as

$$Y = HX + V \in \mathbb{C}^{n_R \times T}, \tag{1}$$

where $Y \in \mathbb{C}^{n_R \times T}$ is the matrix collecting the signals received at each antenna and time slot, $H \in \mathbb{C}^{n_R \times n_T}$ is the flat-fading channel matrix with elements $h_{i,j} \sim \mathcal{CN}(0,1)$, $X \in \mathbb{C}^{n_T \times T}$ is the space-time transmit signal, and $V \in \mathbb{C}^{n_R \times T}$ is the additive white Gaussian noise (AWGN) matrix with elements $v_{i,j} \sim \mathcal{CN}(0, N_0)$, where $N_0$ is the noise variance.
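
The received-signal model of equation (1) can be simulated with a few lines of standard-library Python. This is only a sketch: the helper names (`complex_gaussian`, `random_matrix`, `matmul`, `received_signal`) are hypothetical, matrices are stored as nested lists, and the circularly symmetric Gaussian samples are assumed to split their variance equally between the real and imaginary parts.

```python
import random


def complex_gaussian(rng, variance=1.0):
    """Draw one sample from CN(0, variance): variance/2 per real dimension."""
    sigma = (variance / 2.0) ** 0.5
    return complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))


def random_matrix(rows, cols, rng, variance=1.0):
    """rows x cols matrix with i.i.d. CN(0, variance) entries."""
    return [[complex_gaussian(rng, variance) for _ in range(cols)]
            for _ in range(rows)]


def matmul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def received_signal(H, X, N0, rng):
    """Return Y = H X + V, where V has i.i.d. CN(0, N0) entries."""
    HX = matmul(H, X)
    V = random_matrix(len(HX), len(HX[0]), rng, variance=N0)
    return [[hx + v for hx, v in zip(row_hx, row_v)]
            for row_hx, row_v in zip(HX, V)]


rng = random.Random(1)
H = random_matrix(2, 4, rng)                       # n_R = 2, n_T = 4
X = [[1 + 0j, 0j], [0j, 1j], [0j, 0j], [0j, 0j]]   # n_T x T with T = 2
Y = received_signal(H, X, N0=0.1, rng=rng)         # n_R x T received block
```

Setting `N0=0.0` removes the noise term, which is a convenient sanity check that `Y` then equals the product of `H` and `X`.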


It is assumed hereafter that the quasi-static Rayleigh fading channel matrix H is known at the receiver but not at the transmitter, and we remark that since the channel power per matrix entry is unitary, the fundamental signal-to-noise ratio (SNR) is given by

$$\rho \triangleq \frac{1}{N_0}\,\mathbb{E}\!\left[\mathrm{tr}\!\left(X^{H}X\right)\right].$$


In turn, in accordance with related QSM literature and as illustrated in FIG. 1, the transmit signal matrix X is constructed in a manner to convey the information of a bit sequence b, both in the form of P digitally modulated signals, as well as in the form of the allocation of such transmissions to different antennas and time instances, as described by

$$X = \sum_{p=1}^{P}\left( s_p^{R} A_{k_p^{R}} + s_p^{I} B_{k_p^{I}} \right), \tag{2}$$

where $s_p = s_p^{R} + j s_p^{I}$, with $p \in \{1, \ldots, P\}$, are transmit symbols chosen from a complex constellation $\mathcal{S}$ of cardinality $|\mathcal{S}| = M$; $A_{k_p^{R}}$ and $B_{k_p^{I}}$ are dispersion matrices belonging to the sets $\mathcal{A} = \{A_q\}_{q=1}^{Q} \subset \mathbb{C}^{n_T \times T}$ and $\mathcal{B} = \{B_q\}_{q=1}^{Q} \subset \mathbb{C}^{n_T \times T}$, with $Q \triangleq T \times n_T$; and the indices $k_p^{R}$ and $k_p^{I}$ are the p-th elements of the index vectors $k^{R}$ and $k^{I}$, respectively, which are selected from an optimized set of index vectors $\mathcal{K} = \{k_n\}_{n=1}^{N} \subset \mathbb{Z}^{P}$, with $N \triangleq \lfloor \binom{Q}{P} \rfloor_{2^x}$.


With regard to equation (2), and again referring to FIG. 1, we clarify that in QSM schemes the bit sequence b is subdivided into a sequence $b_S$, of length $B_S \triangleq P \log_2 |\mathcal{S}| = P \log_2 M$, which corresponds to the information encoded in the symbols $s = \{s_1, \ldots, s_P\}$, taken from $\mathcal{S}$, and the conjugate sequences $b_R$ and $b_I$, both of length $\log_2 |\mathcal{K}| = \log_2 N$, which correspond to the information encoded in the selection of spatial-temporal resources according to the dispersion matrix index vectors $k^{R}$ and $k^{I}$ from $\mathcal{K}$.


In view of the above, it can be said that the design of a specific QSM scheme amounts essentially to the method employed in the construction of each of the Q dispersion matrices $A_q$ and $B_q$ in the sets $\mathcal{A}$ and $\mathcal{B}$, and the selection of the set $\mathcal{K}$ containing the index vectors $k^{R}$ and $k^{I}$ which inform the choices of dispersion matrices used in each transmission.


To exemplify how state-of-the-art (SotA) QSM schemes can be cast into the general framework described by equation (2), consider first the QSM scheme proposed in [19]. In this case, the dispersion matrices reduce to dispersion vectors (i.e., T=1 and Q=nT) which are given by

$$A_q = e_q \quad\text{and}\quad B_q = j e_q, \tag{3}$$

where eq is the q-th column of IQ, and no specific design criteria are given for the selection of the indices in the index vectors kR and kI.
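
As a small worked illustration of equation (2) with the dispersion vectors just described, the following Python sketch assembles X for a single symbol. The function names (`basic_qsm_dispersion`, `qsm_transmit`), the nested-list matrix storage and the 0-based indices (q = 0 here corresponds to q = 1 in the text) are assumptions made for the example only.

```python
def zeros(rows, cols):
    """rows x cols all-zero complex matrix stored as nested lists."""
    return [[0j for _ in range(cols)] for _ in range(rows)]


def basic_qsm_dispersion(n_T):
    """Dispersion vectors with T = 1: A_q = e_q and B_q = j * e_q."""
    A, B = [], []
    for q in range(n_T):
        e_q = zeros(n_T, 1)
        e_q[q][0] = 1 + 0j
        A.append(e_q)
        B.append([[1j * entry for entry in row] for row in e_q])
    return A, B


def qsm_transmit(symbols, k_R, k_I, A, B):
    """Assemble X = sum_p (s_p^R * A[k_p^R] + s_p^I * B[k_p^I])."""
    n_T, T = len(A[0]), len(A[0][0])
    X = zeros(n_T, T)
    for s, kr, ki in zip(symbols, k_R, k_I):
        for i in range(n_T):
            for t in range(T):
                X[i][t] += s.real * A[kr][i][t] + s.imag * B[ki][i][t]
    return X


A, B = basic_qsm_dispersion(n_T=4)
# One symbol s = 3 + 2j: real part on antenna 0, imaginary part on antenna 2.
X = qsm_transmit(symbols=[3 + 2j], k_R=[0], k_I=[2], A=A, B=B)
print(X)  # [[(3+0j)], [0j], [2j], [0j]]
```

The real part of the symbol is radiated from the antenna selected by the real index, and the imaginary part (multiplied by j) from the antenna selected by the imaginary index, so both index choices carry information in addition to the symbol itself.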


In turn, in the DA-QSM scheme of [20], two-column dispersion matrices (i.e., T=2 and Q=nT) are employed so as to exploit transmit diversity. In particular, in this scheme

$$A_q = M_{n_T}^{q-1}\tilde{A} \quad\text{and}\quad B_q = jM_{n_T}^{q-1}\tilde{B}, \tag{4}$$

with

$$\tilde{A} \triangleq \begin{bmatrix} I_2 \\ 0_{(n_T-2)\times 2} \end{bmatrix} \quad\text{and}\quad \tilde{B} \triangleq \begin{bmatrix} M_2 \\ 0_{(n_T-2)\times 2} \end{bmatrix},$$

where Mn is the n×n cyclic lower-shift matrix obtained by circularly shifting the bottom row of In to the top, e.g.,

$$M_3 = \begin{bmatrix} 0 & 0 & 1 \\ 1 & 0 & 0 \\ 0 & 1 & 0 \end{bmatrix},$$

such that its (q−1)-th power pre-multiplied to a given matrix results in a shift of the bottom (q−1) rows of the latter to the top.
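
The row-shifting property can be checked with a short Python sketch, which also builds the matrices A_q of equation (4). The helper names (`cyclic_shift_matrix`, `matpow`, `da_qsm_A`) and the nested-list matrix storage are assumptions made for this illustration.

```python
def identity(n):
    """n x n identity matrix stored as nested lists."""
    return [[1 if r == c else 0 for c in range(n)] for r in range(n)]


def cyclic_shift_matrix(n):
    """M_n: the identity matrix with its bottom row moved to the top."""
    eye = identity(n)
    return [eye[-1]] + eye[:-1]


def matmul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def matpow(M, k):
    """k-th power of a square matrix (k >= 0)."""
    result = identity(len(M))
    for _ in range(k):
        result = matmul(M, result)
    return result


def da_qsm_A(q, n_T):
    """A_q = M_{n_T}^(q-1) * A_tilde with A_tilde = [I_2; 0], q = 1, ..., n_T."""
    A_tilde = identity(2) + [[0, 0] for _ in range(n_T - 2)]
    return matmul(matpow(cyclic_shift_matrix(n_T), q - 1), A_tilde)


print(cyclic_shift_matrix(3))   # [[0, 0, 1], [1, 0, 0], [0, 1, 0]]
print(da_qsm_A(2, 4))           # [[0, 0], [1, 0], [0, 1], [0, 0]]
```

In this sketch, increasing q by one moves the 2×2 identity block of the dispersion matrix down by one antenna, wrapping around cyclically at the last antenna.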


From the above it is visible that the DA-QSM scheme improves over the QSM scheme of [19] essentially by adding diversity, i.e., by extending the transmission instances from T=1 to T=2. However, the dispersion matrices of the DA-QSM method are still real, just as those of the QSM scheme, implying that no additional multiplexing capability is added and that the coding gain is not optimized.


In contrast, the EDA-QSM method improves over the latter on both aspects. In particular, in this scheme the dispersion matrices are more elaborately designed as

$$A_q = A_{4(\ell-1)+i} = e_{\ell} \otimes C_i \quad\text{and}\quad B_q = B_{4(\ell-1)+i} = e_{\ell} \otimes D_i, \tag{5}$$

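where $e_{\ell}$ is the $\ell$-th column of $I_L$, with $L \triangleq n_T/2$, the indices $i \in \{1, \ldots, 4\}$ and $\ell \in \{1, \ldots, L\}$, and the core matrices $C_i$ and $D_i$ are based on the Sezginer-Sari-Biglieri (SSB) STBC of [24] described by

$$S_{\mathrm{SSB}} = \begin{bmatrix} s_1 + b s_3 & -s_2^{*} + j b s_4^{*} \\ s_2 + b s_4 & s_1^{*} - j b s_3^{*} \end{bmatrix} = \sum_{i=1}^{4}\left( s_i^{R} C_i + s_i^{I} D_i \right), \tag{6}$$

with

$$b \triangleq \frac{(1-\sqrt{7}) + j(1+\sqrt{7})}{4},$$

$$C_1 \triangleq I_2, \quad C_2 \triangleq Z, \quad C_3 \triangleq bW, \quad\text{and}\quad C_4 \triangleq Z \cdot C_3, \tag{7a}$$

$$D_1 \triangleq jM_2 \cdot Z, \quad D_2 \triangleq Z \cdot D_1, \quad D_3 \triangleq -bM_2 \cdot W \cdot M_2, \quad\text{and}\quad D_4 \triangleq Z \cdot D_3, \tag{7b}$$

where

$$Z \triangleq \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix} \quad\text{and}\quad W \triangleq \begin{bmatrix} 1 & 0 \\ 0 & -j \end{bmatrix}.$$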


Through the concise description above it becomes easy to see that the fundamental distinction between the DA-QSM and the EDA-QSM methods is that the dispersion matrices of EDA-QSM are complex-valued, such that the orthogonality between the real and imaginary dimensions are better exploited in order to reap multiplexing and coding gains.
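
The decomposition of the SSB codeword into the core matrices C_i and D_i can be verified numerically. The sketch below assumes the definitions as read from equations (6), (7a) and (7b), i.e., b = ((1−√7) + j(1+√7))/4, the matrices Z and W, and the 2×2 cyclic shift matrix M_2; the helper names (`ssb_direct`, `ssb_from_core`, `max_abs_difference`) are hypothetical.

```python
def matmul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def scale(c, A):
    """Multiply every entry of A by the scalar c."""
    return [[c * a for a in row] for row in A]


b = complex(1 - 7 ** 0.5, 1 + 7 ** 0.5) / 4
M2 = [[0, 1], [1, 0]]
Z = [[0, -1], [1, 0]]
W = [[1, 0], [0, -1j]]

C = [[[1, 0], [0, 1]], Z, scale(b, W)]            # C1 = I2, C2 = Z, C3 = b W
C.append(matmul(Z, C[2]))                          # C4 = Z C3
D = [scale(1j, matmul(M2, Z))]                     # D1 = j M2 Z
D.append(matmul(Z, D[0]))                          # D2 = Z D1
D.append(scale(-b, matmul(matmul(M2, W), M2)))     # D3 = -b M2 W M2
D.append(matmul(Z, D[2]))                          # D4 = Z D3


def ssb_direct(s):
    """Left-hand form of equation (6), written entry by entry."""
    s1, s2, s3, s4 = s
    return [[s1 + b * s3, -s2.conjugate() + 1j * b * s4.conjugate()],
            [s2 + b * s4, s1.conjugate() - 1j * b * s3.conjugate()]]


def ssb_from_core(s):
    """Right-hand form of equation (6): sum of s_i^R C_i + s_i^I D_i."""
    S = [[0j, 0j], [0j, 0j]]
    for s_i, C_i, D_i in zip(s, C, D):
        for r in range(2):
            for c in range(2):
                S[r][c] += s_i.real * C_i[r][c] + s_i.imag * D_i[r][c]
    return S


def max_abs_difference(s):
    """Largest entry-wise gap between the two forms of the codeword."""
    direct, core = ssb_direct(s), ssb_from_core(s)
    return max(abs(direct[r][c] - core[r][c])
               for r in range(2) for c in range(2))
```

Calling `max_abs_difference` on any four complex symbols should return a value at floating-point rounding level, confirming that the two forms agree entry by entry under the assumed definitions.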


Two fair criticisms that can be made of the aforementioned schemes (and in fact, to the best of our knowledge, of all SotA QSM methods proposed so far) are, however: a) that the schemes do not scale systematically and simultaneously over space and time, for arbitrary T>2; and b) that the coding gain achieved is not optimum. Mitigating these two limitations is the objective of our first contribution, described in the following section.


III. Optimized Scalable Quadrature Spatial Modulation Transmitter Design
A. Spectral Efficiency Optimality of QSM Schemes

Given the number of bits carried by the transmission of each QSM transmit symbol X as per equation (2), and the fact that such transmission requires T successive channel uses, the SE ζ of any QSM scheme is given by











$$
\zeta(P, T; M, n_T) = \frac{B}{T} = \frac{1}{T}\left( 2\left\lfloor \log_2 \binom{Q}{P} \right\rfloor + P \log_2 M \right), \tag{8}
$$







where we recall that Q≙TnT and adopt a notation meant to emphasize that P and T are seen as fundamental QSM design parameters, while M and nT are considered to be system constraints.
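Equation (8) is straightforward to evaluate numerically. The sketch below assumes that M is a power of two and that the index-selection term is floored to an integer number of bits, as in the bit count quoted later for the receiver; the function names are illustrative.

```python
from math import comb

def floor_log2(n):
    """Exact floor(log2(n)) for a positive integer n."""
    return n.bit_length() - 1

def spectral_efficiency(P, T, M, n_T):
    """SE of equation (8) in bits per channel use (M assumed a power of two)."""
    Q = T * n_T
    index_bits = floor_log2(comb(Q, P))   # bits carried by one index vector
    symbol_bits = P * floor_log2(M)       # bits carried by the P symbols
    return (2 * index_bits + symbol_bits) / T

def best_P(T, M, n_T):
    """Smallest P that attains the maximum SE for the given T, M and n_T."""
    Q = T * n_T
    return max(range(1, Q + 1),
               key=lambda P: spectral_efficiency(P, T, M, n_T))

P_star = best_P(T=2, M=4, n_T=8)
print(P_star, spectral_efficiency(P_star, T=2, M=4, n_T=8))
```

Because of the floor operation, several values of P may attain the same maximum for larger T; `best_P` then returns the smallest of them.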



FIG. 2 shows the spectral efficiency of the OS-QSM scheme with T=2, 4, and 8, for a given system with nT=8 and M=4.


The presence of the binomial coefficient $\binom{Q}{P}$ in equation (8) implies that the SE function ζ(P, T; M, nT) is monotonically decreasing in T, for a fixed P, and concave in P, for a fixed T, as well as in the ratio P/T. This is well illustrated in the plots offered in FIG. 2, from which it can be seen that in a system with nT=8 and M=4, the highest attainable SEs, denoted by ζ*, are achieved with P=11, 21 and 42, for T=2, 4 and 8, respectively.


Motivated by the discussion above, we seek analytical expressions for the optimum ratio P/T that maximizes the SE, given nT and M, which in turn can be used to determine the relative SE reduction incurred in setting T<nT for large-scale systems with nT→∞. To this end, consider the upper and lower-bounds on the binomial coefficient discovered in [32], namely













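$$
\frac{e^{-1/8}}{\sqrt{2\pi P}} \left( \frac{Q}{Q-P} \right)^{Q+\frac{1}{2}} \cdot \left( \frac{Q-P}{P} \right)^{P} < \binom{Q}{P} < \frac{1}{\sqrt{2\pi P}} \left( \frac{Q}{Q-P} \right)^{Q+\frac{1}{2}} \left( \frac{Q-P}{P} \right)^{P} \triangleq \beta(P; Q), \quad 1 \le P < Q, \tag{9}
$$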







where for future convenience we implicitly defined the upper-bounding function β(P; Q).


Substituting equation (9) into equation (8) yields the bound










$$
\zeta(P, T; M, n_T) < \frac{1}{T} \log_2\left( \beta^{2}(P; Q) \cdot M^{P} \right) \triangleq \zeta^{+}(P, T; M, n_T). \tag{10}
$$







Taking the derivative of the latter expression with respect to P yields













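$$
\begin{aligned}
\frac{\partial \zeta^{+}}{\partial P} &= \frac{1}{T} \cdot \frac{\partial}{\partial P} \left[ 2 \log_2\left( \frac{1}{\sqrt{2\pi P}} \left( \frac{Q}{Q-P} \right)^{Q+\frac{1}{2}} \left( \frac{Q-P}{P} \right)^{P} \right) + P \log_2(M) \right] \\
&= \frac{1}{T \ln(2)} \left[ 2 \left( \ln\left( \frac{1-\varepsilon}{\varepsilon} \right) + \frac{1-2\varepsilon}{2Q(\varepsilon-1)\varepsilon} \right) + \ln(M) \right] \\
&= \frac{1}{T \ln(2)} \left[ \ln\left( \frac{(1-\varepsilon)^{2} \cdot M}{\varepsilon^{2}} \right) + \left( \frac{1-2\varepsilon}{Q\varepsilon(\varepsilon-1)} \right) \right],
\end{aligned} \tag{11}
$$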







where in the second line we relax the constraint that P∈ℕ and express more generally P=εQ, introducing the positive quantity ε≤1.
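For the reader's convenience, the intermediate step between the first and second lines of equation (11) can be written out as follows, using only the definition of β(P; Q) in equation (9) and the substitution P=εQ:

$$
\begin{aligned}
\frac{\partial}{\partial P} \ln \beta(P; Q) &= -\frac{1}{2P} + \frac{Q+\frac{1}{2}}{Q-P} + \ln\left( \frac{Q-P}{P} \right) - \frac{P}{Q-P} - 1 \\
&= \ln\left( \frac{Q-P}{P} \right) + \frac{1}{2(Q-P)} - \frac{1}{2P} \\
&= \ln\left( \frac{1-\varepsilon}{\varepsilon} \right) + \frac{2\varepsilon-1}{2Q\varepsilon(1-\varepsilon)} = \ln\left( \frac{1-\varepsilon}{\varepsilon} \right) + \frac{1-2\varepsilon}{2Q(\varepsilon-1)\varepsilon}.
\end{aligned}
$$

Multiplying by 2, adding ln(M) and dividing by T ln(2) gives the bracketed expression in equation (11).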


Equating the expression in equation (11) to zero yields the following analytical implicit expression to determine the optimal number of symbols P* that maximizes the SE of a QSM system with Q=TnT spatial-temporal resources and employing an M-ary constellation











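$$
P^{*} = \varepsilon Q \;\Bigg|\; \frac{2\varepsilon - 1}{(\varepsilon - 1) \ln\left( \left( \frac{1-\varepsilon}{\varepsilon} \right)^{2} M \right)} = \varepsilon Q \triangleq P, \tag{12}
$$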







where we emphasize that the quantity on the right-hand side of the expression is in fact the sought-after number of transmitted symbols P.


But recalling that the desired P* is also the largest possible, equation (12) implies that











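$$
P^{*} = \max_{\varepsilon} \left( \frac{2\varepsilon - 1}{(\varepsilon - 1) \ln\left( \left( \frac{1-\varepsilon}{\varepsilon} \right)^{2} M \right)} \right), \tag{13}
$$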







which in turn implies that the optimum ε is such that

$$
\left( \frac{1-\varepsilon}{\varepsilon} \right)^{2} M = 1,
$$

i.e., the solution of the quadratic equation (M−1)ε²−2Mε+M=0 with ε≤1, which finally yields, simply











$$
P^{*} = \frac{M - \sqrt{M}}{M - 1}\, Q = \varepsilon_{M}^{*}\, T n_T, \tag{14}
$$







where we introduced the implicitly-defined optimum gradient

$$
\varepsilon_{M}^{*} \triangleq \frac{M - \sqrt{M}}{M - 1}.
$$





We emphasize that the elegant result offered in equation (14) is general for any QSM scheme. From this result it is seen that the optimum ratio P/T that maximizes the SE of the QSM scheme is linear in the number of transmit antennas nT. In other words, for any given M and nT, an SE-optimum QSM must be such that P/T scales linearly with nT, as illustrated and confirmed by the simulation results shown in FIG. 3.


FIG. 3 shows the effect of T and M on the optimum ratio P*/T between the number of transmit symbols and epochs.
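The closed form of equation (14) can be evaluated directly. The following sketch, with illustrative function names, computes the optimum gradient for a given constellation size M and checks that it is a root of the quadratic (M−1)ε²−2Mε+M.

```python
from math import sqrt

def optimum_fraction(M):
    """Optimum gradient of equation (14); requires M >= 2."""
    return (M - sqrt(M)) / (M - 1)

def optimum_symbols(M, T, n_T):
    """Real-valued optimum number of symbols P* = eps* T n_T."""
    return optimum_fraction(M) * T * n_T

def quadratic_residual(M):
    """Value of (M-1) eps^2 - 2 M eps + M at the optimum gradient."""
    eps = optimum_fraction(M)
    return (M - 1) * eps ** 2 - 2 * M * eps + M

for M in (2, 4, 16, 64):
    print(M, optimum_fraction(M), quadratic_residual(M))
print(optimum_symbols(M=4, T=2, n_T=8), optimum_symbols(M=4, T=4, n_T=8))
```

For M=4 and nT=8 the real-valued optimum is P*=32/3 for T=2 and P*=64/3 for T=4, which round to the integer values 11 and 21 read from FIG. 2.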


Recall also that QSM dispersion matrices are generally constructed on the basis of STBCs characterized by T×T square encoding matrices. Consequently, it follows that if P must scale with nT in order for the QSM to be SE-optimal, so must the size T of the code, in order for the underlying STBC itself to retain SE-optimality. In other words, equation (14) also implies that in order to achieve SE-optimality, a QSM scheme conveying M-ary symbols must employ an underlying full-rate STBC of a size that scales proportionally to the number of transmit antennas nT.


It must be remarked that setting T=nT is not a scalable proposition, not only because it implies furnishing the transmitter with an equal number of RF chains, which can be prohibitively expensive, but also because it results in fully dense signals, which in turn require prohibitively complex ML receivers. This observation motivates the comparisons given in FIG. 4, which shows the fraction of the maximum attainable spectral efficiency ζ*, occurring at P*, obtained by QSM schemes employing STBCs of different sizes, as a function of nT and for different M. It can be seen that QSM schemes with T sufficiently large, but still significantly less than nT, also asymptotically achieve near-optimal SE as long as nT is sufficiently large.



FIG. 4 depicts the behavior of the fractional peak spectral efficiency as a function of nT, for different sizes of T and M.


In view of these results, the next section introduces a new QSM transmitter design, including both a description of how to construct QSM dispersion matrices based on optimal STBCs of arbitrary size and a new systematic mechanism to obtain the associated set of index vectors used in their selection during transmission. For clarity of explanation, we first take the simpler example of the 2×2 case to introduce the construction of the dispersion index set for optimal diversity gain. The extension of the scheme to general T follows subsequently.


B. Golden (2×2) Dispersion Matrices and Optimal Index Sets

Before we proceed to describing the proposed dispersion matrix construction let us, without loss of generality and to facilitate comparison with existing methods, impose the assumption that the transmit signal matrix X as per equation (2) satisfies the unity average transmission power constraint for each active transmit antenna, such that 𝔼[tr(XᴴX)]=PT under constellations with unity average power symbols. Then, consider the 2×2 Golden code, which compactly encodes four symbols {s1, s2, s3, s4} into the matrix











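$$
S_G = \frac{1}{\sqrt{5}} \begin{bmatrix} \alpha (s_1 + s_2 \theta) & \alpha (s_3 + s_4 \theta) \\ j \bar{\alpha} (s_3 + s_4 \bar{\theta}) & \bar{\alpha} (s_1 + s_2 \bar{\theta}) \end{bmatrix}, \tag{15}
$$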







where θ and θ̄ denote the complementary Golden numbers θ=(1+√5)/2 and θ̄=(1−√5)/2, respectively, and α=1+j(1−θ) and ᾱ=1+j(1−θ̄) are the optimized coefficients for the Gaussian integer constellation sets.


The construction of QSM dispersion matrices based on the latter Golden code follows from the decomposition of SG into the auxiliary matrices Ci and Di, which are used to modulate the real part siR and imaginary part siI of each i-th symbol encoded, respectively, such that


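$$
S_G = \frac{1}{\sqrt{5}} \sum_{i=1}^{4} \left( s_i^{R} C_i + s_i^{I} D_i \right), \tag{16}
$$

where

$$
C_1 \triangleq \begin{bmatrix} \alpha & 0 \\ 0 & \bar{\alpha} \end{bmatrix}, \quad C_2 \triangleq \Theta \cdot C_1, \quad C_3 \triangleq J_{2,1} \cdot C_1 \cdot M_2, \quad C_4 \triangleq J_{2,1} \cdot C_2 \cdot M_2, \quad \text{and} \quad D_i \triangleq jC_i, \tag{17}
$$

with

$$
\Theta \triangleq \begin{bmatrix} \theta & 0 \\ 0 & \bar{\theta} \end{bmatrix} \quad \text{and} \quad J_{2,1} \triangleq \begin{bmatrix} 1 & 0 \\ 0 & j \end{bmatrix},
$$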




and note that post-multiplying a given matrix X by the circular lower-shift matrix Mn results in a column-wise circular shift of X to the left. In possession of the above auxiliary matrices, the Golden dispersion matrices are then built using Kronecker product operations following a similar strategy, namely










$$
A_q = A_{4(\ell-1)+i} = \sqrt{\frac{2}{5}}\; e_{\ell} \otimes C_i \quad \text{and} \quad B_q = B_{4(\ell-1)+i} = \sqrt{\frac{2}{5}}\; e_{\ell} \otimes D_i. \tag{18}
$$







Regarding the scaling factor in equation (18), the factor 1/√5 is carried over from the coefficient of the Golden code as in equations (15) and (16), while the numerator √2 is the result of the power scaling required to ensure that the transmit power constraint 𝔼[tr(XᴴX)]=PT is satisfied. To elaborate further, from equations (2) and (18) it follows that 𝔼[tr(XᴴX)]=PT implies that the dispersion matrices must satisfy tr(AqᴴAq)=T and tr(BqᴴBq)=T, for all q∈{1, . . . , Q}, whereas from the construction of the auxiliary matrices Ci and Di as per equations (17), together with the 1/√5 normalization, it is evident that tr(CiᴴCi)=1 and tr(DiᴴDi)=1, such that a power scaling of T=2 onto Ci and Di, i.e., an amplitude scaling of √2 onto Ci and Di, is needed.
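Both the decomposition of equation (16) and the power normalization discussed above can be verified numerically. The sketch below assumes M2=[[0, 1], [1, 0]] and uses arbitrary test symbols; the helper names are illustrative only.

```python
from math import sqrt

theta, theta_bar = (1 + sqrt(5)) / 2, (1 - sqrt(5)) / 2
alpha, alpha_bar = 1 + 1j * (1 - theta), 1 + 1j * (1 - theta_bar)

def matmul(A, B):
    """Product of two small matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def gram_trace(A):
    """tr(A^H A), i.e. the sum of squared magnitudes of all entries."""
    return sum(abs(x) ** 2 for row in A for x in row)

Theta = [[theta, 0], [0, theta_bar]]
J21 = [[1, 0], [0, 1j]]
M2 = [[0, 1], [1, 0]]          # assumed 2x2 circular shift matrix

# Auxiliary matrices of equation (17)
C1 = [[alpha, 0], [0, alpha_bar]]
C2 = matmul(Theta, C1)
C3 = matmul(matmul(J21, C1), M2)
C4 = matmul(matmul(J21, C2), M2)
C = [C1, C2, C3, C4]
D = [[[1j * x for x in row] for row in C_i] for C_i in C]

def golden_direct(s):
    """Golden code matrix of equation (15)."""
    s1, s2, s3, s4 = s
    return [[alpha * (s1 + s2 * theta) / sqrt(5),
             alpha * (s3 + s4 * theta) / sqrt(5)],
            [1j * alpha_bar * (s3 + s4 * theta_bar) / sqrt(5),
             alpha_bar * (s1 + s2 * theta_bar) / sqrt(5)]]

def golden_from_dispersion(s):
    """Right-hand side of equation (16)."""
    S = [[0j, 0j], [0j, 0j]]
    for s_i, C_i, D_i in zip(s, C, D):
        for r in range(2):
            for c in range(2):
                S[r][c] += (s_i.real * C_i[r][c] + s_i.imag * D_i[r][c]) / sqrt(5)
    return S

def max_abs_diff(A, B):
    return max(abs(x - y) for ra, rb in zip(A, B) for x, y in zip(ra, rb))

symbols = (1 + 1j, -1 + 1j, 1 - 1j, -1 - 1j)   # arbitrary test symbols
print(max_abs_diff(golden_direct(symbols), golden_from_dispersion(symbols)))
print([gram_trace(C_i) / 5 for C_i in C])       # traces after 1/sqrt(5) scaling
print([gram_trace(C_i) * 2 / 5 for C_i in C])   # traces after sqrt(2/5) scaling
```

The second printed list is numerically equal to 1 for every Ci and the third to T=2, which is the normalization that the factor √(2/5) in equation (18) is meant to enforce.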


The Golden codes are known to outperform the SSB codes employed in the EDA-QSM scheme, while having structure very similar to the latter, such that their utilization in the construction of dispersion matrices as described above is, in and of itself, bound to improve the performance of QSM schemes over those briefly described in Subsection II-B, as shall be demonstrated later via simulated comparisons.


There is, however, another mechanism to improve the performance of QSM schemes employing STBCs, namely, to optimize the selection of the index vectors in 𝒦={kn}, n=1, . . . , N, each of length P, that determine which dispersion matrices are assigned to the real and imaginary parts of each encoded symbol. This is because each index vector kn is, according to equations (17) and (18), associated with a different subset of the spatial-temporal resources utilized by the QSM scheme in the transmission of a given set of spatially encoded bits. To illustrate the issue, define the set 𝒦* of all $\binom{Q}{P}$ distinct index vectors for a given pair (P,Q), and consider the corresponding example compiled in Table I for the case P=3 and Q=2nT=8.


Recall also that each dispersion matrix used in the transmission of spR or spI occupies two given pairs of antennas and time slots, as per equations (17) and (18). For the sake of conciseness we hereafter refer to each pair of one antenna and one time slot simply as a spatial-temporal resource rq, and define for future convenience the sets of all available and of all utilized spatial-temporal resources, denoted respectively by ℛ* and ℛ. Then, if resources and dispersion matrix indices are represented by rectangular and circular nodes, respectively, a bipartite graph such as the one shown in FIG. 5 for the case in question (i.e., P=3 and Q=8) can be built, in which an edge connecting a circular and a rectangular node indicates that the corresponding resource is used by the given dispersion matrix.



FIG. 5 is the bipartite graph representing the spatial-temporal resource usage associated with each index vector kn for a QSM system with P=3 and Q=8. The particular examples of k1, k37 and k54 are explicitly illustrated.


As illustrated by the graph, the inclusion of a given index vector kn from 𝒦* into the set 𝒦 is associated with the use of certain resources, occasionally with multiplicity, identified by the graph edges intercepted by the enclosure encircling the corresponding indices. We shall therefore use the notation kn⇒rn to indicate that the index vector kn implies the utilization of the set of resources rn, and μkn(rq) to denote the multiplicity of the resource rq associated with kn.


For example, the use of the resources r1={2×(1,1), (1,2), (2,1), 2×(2,2)} results from having k1=[1,2,3] in 𝒦, such that we may write concisely k1⇒r1, with μk1(1,1)=μk1(2,2)=2. Similarly, k37=[3,4,5]⇒r37={(1,1), (1,2), (2,1), (2,2), (3,1), (4,2)}, and k54=[5,6,8]⇒r54={(3,1), 2×(3,2), 2×(4,1), (4,2)}, with μk54(3,2)=μk54(4,1)=2.


It is evident from all the above that in order to avoid redundancy and uneven utilization of spatial-temporal resources, so as to optimize the performance of QSM schemes, the set of dispersion matrix index vectors 𝒦 (with corresponding resource set ℛ) must satisfy the following conditions:

    • a) no two index vectors kn and km in the set can be equal (i.e., kn≠km, ∀n≠m);
    • b) no two elements in each index vector can be equal (i.e., [kn]i≠[kn]j, ∀i≠j);
    • c) the utilization of all resources available must be ensured (i.e., μℛ(rq)>0 ∀rq∈ℛ*);
    • d) all resources are utilized equally often (i.e., μℛ(r1)= . . . =μℛ(rQ)), and finally
    • e) the cardinality of the set must be a power of 2 in order to enable the encoding of codewords (i.e., N=|𝒦|=$\lfloor\binom{Q}{P}\rfloor_{2^x}$, the largest power of 2 not exceeding $\binom{Q}{P}$).


As an example, we highlight in Table I the set of index vectors 𝒦={k1, k2, k3, k5, . . . , k8, k10, k11, k19, . . . , k23, k26, k27, k28, k35, . . . , k38, k41, k42, k47, k48, k50, . . . , k56}. The reader can verify that with this choice of 𝒦, all resources in the associated set ℛ have multiplicity 24. In contrast, a naive truncation to the first 32 index vectors in Table I, i.e., 𝒦={k1, . . . , k32} as suggested e.g. in [23], leads to an uneven utilization pattern in which μℛ(1,1)=μℛ(2,2)=32, μℛ(1,2)=μℛ(2,1)=28, μℛ(3,1)=μℛ(4,2)=19 and μℛ(3,2)=μℛ(4,1)=17, which is obviously sub-optimum as it leads to antennas 1 and 2 being used far more often than antennas 3 and 4.
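The multiplicities quoted above can be reproduced with a few lines of code. The sketch below takes the index-to-resource mapping from Table I and reads the highlighted set as k1, k2, k3, k5 to k8, k10, k11, k19 to k23, k26 to k28, k35 to k38, k41, k42, k47, k48 and k50 to k56, which is the reading that yields exactly 32 vectors; the names used are illustrative.

```python
from collections import Counter
from itertools import combinations

# Index -> (antenna, slot) resources, read off Table I (T = 2, nT = 4).
INDEX_RESOURCES = {
    1: ((1, 1), (2, 2)), 2: ((1, 2), (2, 1)),
    3: ((1, 1), (2, 2)), 4: ((1, 2), (2, 1)),
    5: ((3, 1), (4, 2)), 6: ((3, 2), (4, 1)),
    7: ((3, 1), (4, 2)), 8: ((3, 2), (4, 1)),
}

# k1 ... k56 in the lexicographic order used by Table I.
ALL_VECTORS = list(combinations(range(1, 9), 3))

def resource_multiplicity(vector_numbers):
    """Count how often each resource is used by the vectors k_n, n in vector_numbers."""
    counts = Counter()
    for n in vector_numbers:
        for q in ALL_VECTORS[n - 1]:
            counts.update(INDEX_RESOURCES[q])
    return counts

BALANCED = [1, 2, 3, 5, 6, 7, 8, 10, 11, 19, 20, 21, 22, 23, 26, 27, 28,
            35, 36, 37, 38, 41, 42, 47, 48, 50, 51, 52, 53, 54, 55, 56]
NAIVE = list(range(1, 33))   # plain truncation k1 ... k32

print(sorted(resource_multiplicity(BALANCED).items()))
print(sorted(resource_multiplicity(NAIVE).items()))
```

The first line reports a multiplicity of 24 for all eight resources, while the second reproduces the uneven pattern 32, 28, 19 and 17 of the naive truncation.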


The problem of selecting the optimum set 𝒦 as described and illustrated above relates to a classic problem in combinatorial graph theory known as the Vertex Cover Problem. In the present context, however, the problem has the additional difficulties that: a) the graph in question is bipartite, b) coverage with equal multiplicity is required, and c) nodes must be selected in subsets of three at a time.









TABLE I
Sets of All Possible Index Vectors 𝒦* and Resources ℛ* (P = 3, nT = 4, T = 2)

Elements of 𝒦*      Elements of ℛ*
k1   [1, 2, 3]      (1, 1), (2, 2), (1, 2), (2, 1), (1, 1), (2, 2)
k2   [1, 2, 4]      (1, 1), (2, 2), (1, 2), (2, 1), (1, 2), (2, 1)
k3   [1, 2, 5]      (1, 1), (2, 2), (1, 2), (2, 1), (3, 1), (4, 2)
k4   [1, 2, 6]      (1, 1), (2, 2), (1, 2), (2, 1), (3, 2), (4, 1)
k5   [1, 2, 7]      (1, 1), (2, 2), (1, 2), (2, 1), (3, 1), (4, 2)
k6   [1, 2, 8]      (1, 1), (2, 2), (1, 2), (2, 1), (3, 2), (4, 1)
k7   [1, 3, 4]      (1, 1), (2, 2), (1, 1), (2, 2), (1, 2), (2, 1)
k8   [1, 3, 5]      (1, 1), (2, 2), (1, 1), (2, 2), (3, 1), (4, 2)
k9   [1, 3, 6]      (1, 1), (2, 2), (1, 1), (2, 2), (3, 2), (4, 1)
k10  [1, 3, 7]      (1, 1), (2, 2), (1, 1), (2, 2), (3, 1), (4, 2)
k11  [1, 3, 8]      (1, 1), (2, 2), (1, 1), (2, 2), (3, 2), (4, 1)
k12  [1, 4, 5]      (1, 1), (2, 2), (1, 2), (2, 1), (3, 1), (4, 2)
k13  [1, 4, 6]      (1, 1), (2, 2), (1, 2), (2, 1), (3, 2), (4, 1)
k14  [1, 4, 7]      (1, 1), (2, 2), (1, 2), (2, 1), (3, 1), (4, 2)
k15  [1, 4, 8]      (1, 1), (2, 2), (1, 2), (2, 1), (3, 2), (4, 1)
k16  [1, 5, 6]      (1, 1), (2, 2), (3, 1), (4, 2), (3, 2), (4, 1)
k17  [1, 5, 7]      (1, 1), (2, 2), (3, 1), (4, 2), (3, 1), (4, 2)
k18  [1, 5, 8]      (1, 1), (2, 2), (3, 1), (4, 2), (3, 2), (4, 1)
k19  [1, 6, 7]      (1, 1), (2, 2), (3, 2), (4, 1), (3, 1), (4, 2)
k20  [1, 6, 8]      (1, 1), (2, 2), (3, 2), (4, 1), (3, 2), (4, 1)
k21  [1, 7, 8]      (1, 1), (2, 2), (3, 1), (4, 2), (3, 2), (4, 1)
k22  [2, 3, 4]      (1, 2), (2, 1), (1, 1), (2, 2), (1, 2), (2, 1)
k23  [2, 3, 5]      (1, 2), (2, 1), (1, 1), (2, 2), (3, 1), (4, 2)
k24  [2, 3, 6]      (1, 2), (2, 1), (1, 1), (2, 2), (3, 2), (4, 1)
k25  [2, 3, 7]      (1, 2), (2, 1), (1, 1), (2, 2), (3, 1), (4, 2)
k26  [2, 3, 8]      (1, 2), (2, 1), (1, 1), (2, 2), (3, 2), (4, 1)
k27  [2, 4, 5]      (1, 2), (2, 1), (1, 2), (2, 1), (3, 1), (4, 2)
k28  [2, 4, 6]      (1, 2), (2, 1), (1, 2), (2, 1), (3, 2), (4, 1)
k29  [2, 4, 7]      (1, 2), (2, 1), (1, 2), (2, 1), (3, 1), (4, 2)
k30  [2, 4, 8]      (1, 2), (2, 1), (1, 2), (2, 1), (3, 2), (4, 1)
k31  [2, 5, 6]      (1, 2), (2, 1), (3, 1), (4, 2), (3, 2), (4, 1)
k32  [2, 5, 7]      (1, 2), (2, 1), (3, 1), (4, 2), (3, 1), (4, 2)
k33  [2, 5, 8]      (1, 2), (2, 1), (3, 1), (4, 2), (3, 2), (4, 1)
k34  [2, 6, 7]      (1, 2), (2, 1), (3, 2), (4, 1), (3, 1), (4, 2)
k35  [2, 6, 8]      (1, 2), (2, 1), (3, 2), (4, 1), (3, 2), (4, 1)
k36  [2, 7, 8]      (1, 2), (2, 1), (3, 1), (4, 2), (3, 2), (4, 1)
k37  [3, 4, 5]      (1, 1), (2, 2), (1, 2), (2, 1), (3, 1), (4, 2)
k38  [3, 4, 6]      (1, 1), (2, 2), (1, 2), (2, 1), (3, 2), (4, 1)
k39  [3, 4, 7]      (1, 1), (2, 2), (1, 2), (2, 1), (3, 1), (4, 2)
k40  [3, 4, 8]      (1, 1), (2, 2), (1, 2), (2, 1), (3, 2), (4, 1)
k41  [3, 5, 6]      (1, 1), (2, 2), (3, 1), (4, 2), (3, 2), (4, 1)
k42  [3, 5, 7]      (1, 1), (2, 2), (3, 1), (4, 2), (3, 1), (4, 2)
k43  [3, 5, 8]      (1, 1), (2, 2), (3, 1), (4, 2), (3, 2), (4, 1)
k44  [3, 6, 7]      (1, 1), (2, 2), (3, 2), (4, 1), (3, 1), (4, 2)
k45  [3, 6, 8]      (1, 1), (2, 2), (3, 2), (4, 1), (3, 2), (4, 1)
k46  [3, 7, 8]      (1, 1), (2, 2), (3, 1), (4, 2), (3, 2), (4, 1)
k47  [4, 5, 6]      (1, 2), (2, 1), (3, 1), (4, 2), (3, 2), (4, 1)
k48  [4, 5, 7]      (1, 2), (2, 1), (3, 1), (4, 2), (3, 1), (4, 2)
k49  [4, 5, 8]      (1, 2), (2, 1), (3, 1), (4, 2), (3, 2), (4, 1)
k50  [4, 6, 7]      (1, 2), (2, 1), (3, 2), (4, 1), (3, 1), (4, 2)
k51  [4, 6, 8]      (1, 2), (2, 1), (3, 2), (4, 1), (3, 2), (4, 1)
k52  [4, 7, 8]      (1, 2), (2, 1), (3, 1), (4, 2), (3, 2), (4, 1)
k53  [5, 6, 7]      (3, 1), (4, 2), (3, 2), (4, 1), (3, 1), (4, 2)
k54  [5, 6, 8]      (3, 1), (4, 2), (3, 2), (4, 1), (3, 2), (4, 1)
k55  [5, 7, 8]      (3, 1), (4, 2), (3, 1), (4, 2), (3, 2), (4, 1)
k56  [6, 7, 8]      (3, 2), (4, 1), (3, 1), (4, 2), (3, 2), (4, 1)



















Method 1 Greedy Construction of Optimal Set of Index Vectors 𝒦

 Internal Parameters: Number of resources Q = T · nT and set of all possible index vectors 𝒦*.
 Inputs: Number of symbols P, of transmit antennas nT and dimension T of FDFR STBC.
 Outputs: Optimized set of index vectors 𝒦.

  1: Choose a random seed n ∈ {1, . . . , $\binom{Q}{P}$} and start with 𝒦 = Ø;
  2: while |𝒦| < $\lfloor\binom{Q}{P}\rfloor_{2^x}$ do
  3:  Insert kn into the set 𝒦 of selected index vectors;
  4:  Sort all indices k ∈ {1, . . . , Q} in ascending order of their multiplicities in 𝒦;
  5:  Set D = P and construct/clear the empty set 𝒦̃ = Ø of candidate index vectors;
  6:  while |𝒦̃| = 0 do
  7:   Construct a list κ of candidate indices with the D lowest multiplicities in 𝒦;
  8:   Construct the set 𝒦̃ of all $\binom{D}{P}$ index vectors k̃m with indices in κ;
  9:   Remove from 𝒦̃ all index vectors already in 𝒦;
 10:   if |𝒦̃| = 0 then
 11:    Increment D by 1;
 12:   end if
 13:  end while
 14:  Select next n ∈ {1, . . . , $\binom{Q}{P}$} as the position of the first index vector k̃m of 𝒦̃ in 𝒦*;
 15: end while









Due to these peculiarities, the problem itself is, to the best of our knowledge, original and cannot be solved by known variations of the Vertex Cover algorithm. Fortunately, the highly symmetric structure of the associated bipartite graph can be exploited to design an efficient algorithm to solve the selection problem at hand. To that end, let us commit a slight abuse of notation and define the multiplicity of a dispersion matrix index q in the set 𝒦 as μ𝒦(q). Then, by virtue of the symmetry of the graph (see FIG. 5), a solution 𝒦 in which μ𝒦(1)= . . . =μ𝒦(Q) implies a solution ℛ in which each of the spatial-temporal resources {(1,1), (1,2), (2,1), (2,2), (3,1), (3,2), (4,1), (4,2)} has the same multiplicity. Consequently, the problem can be solved efficiently by the greedy selection of indices, as described in Method 1.
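The greedy selection of Method 1 can be sketched as follows. This is one possible reading of the listing, not a definitive implementation: the seed is a fixed 0-based position rather than a random draw, 𝒦* is assumed to be enumerated in lexicographic order as in Table I, and ties between indices of equal multiplicity are broken in favor of the lower index. All function and variable names are illustrative.

```python
from itertools import combinations
from math import comb

def greedy_index_set(P, Q, seed=0):
    """Greedy construction of an index vector set, following Method 1.

    Returns the selected index vectors and the multiplicity of each index.
    """
    all_vectors = list(combinations(range(1, Q + 1), P))   # K*, lexicographic
    position = {k: n for n, k in enumerate(all_vectors)}
    target = 1 << (comb(Q, P).bit_length() - 1)   # largest power of 2 <= C(Q, P)
    selected, chosen = [], set()
    multiplicity = {q: 0 for q in range(1, Q + 1)}
    n = seed
    while True:
        k = all_vectors[n]                         # step 3: insert k_n
        selected.append(k)
        chosen.add(k)
        for q in k:
            multiplicity[q] += 1
        if len(selected) == target:
            return selected, multiplicity
        # Step 4: indices in ascending order of multiplicity (stable sort).
        order = sorted(multiplicity, key=multiplicity.get)
        D, candidates = P, []
        while not candidates:                      # steps 6 to 13
            kappa = sorted(order[:D])              # D least-used indices
            candidates = [c for c in combinations(kappa, P) if c not in chosen]
            D += 1                                 # widen the pool if nothing is new
        n = position[candidates[0]]                # step 14: first candidate in K*

selected, multiplicity = greedy_index_set(P=3, Q=8)
print(len(selected), multiplicity)
```

Under these assumptions the returned set always satisfies conditions a), b) and e); how evenly the indices are used for a given P and Q can be inspected from the spread between `max(multiplicity.values())` and `min(multiplicity.values())`.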


C. Optimal Generalized Design (T×T)

Due to the greedy optimal index vector selection algorithm described above, which is general in P, T and nT, the last limiting factor preventing the generalization of QSM to arbitrary T is the construction of the dispersion matrices on the basis of STBCs of arbitrary size. This obstacle is eliminated by considering the design of QSM dispersion matrices based on the Perfect FDFR STBC.


A T×T FDFR STBC encodes T² symbols such that the average energy transmitted per antenna is normalized to unity, an energy efficiency-shaping constraint is enforced, and an SE-preserving lower bound on the coding gain (a.k.a. non-vanishing determinant) is maximized. Ultimately, for given T∈ℕ+ the design can be described by










$$
S_P = \sum_{t=1}^{T} \operatorname{diag}(R \cdot s_t) \cdot J_{T,t-1} \cdot N_T^{\,t-1} \tag{19}
$$







where st=[s1+(t−1)T, s2+(t−1)T, . . . , stT]ᵀ, with t∈{1, . . . , T}, are vectors each carrying T distinct transmit symbols, R is a T×T optimum lattice generating matrix, JT,n is a T×T matrix constructed by replacing the last n diagonal entries of the identity matrix by the elementary complex number j, and NT is a T×T cyclic upper-shift matrix (notice that JT,n generalizes J2,1 used in equation (17)). In turn, oppositely to Mn, Nn is obtained by circularly shifting the top row of In to the bottom. Some examples are








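$$
J_{2,0} = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, \quad J_{3,2} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & j & 0 \\ 0 & 0 & j \end{bmatrix} \quad \text{and} \quad N_3 = \begin{bmatrix} 0 & 1 & 0 \\ 0 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix},
$$

such that post-multiplying a given matrix X by NT results in a column-wise circular shift of X to the right.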


Notice that the Perfect FDFR STBC fully generalizes the 2×2 Golden code of [28]. To see this, it suffices to consider the case T=2 with the corresponding lattice generating matrix







$$
R = \frac{1}{\sqrt{5}} \begin{bmatrix} \alpha & \alpha\theta \\ \bar{\alpha} & \bar{\alpha}\bar{\theta} \end{bmatrix},
$$




such that equation (19) yields













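$$
\begin{aligned}
S_P &= \operatorname{diag}\left( \frac{1}{\sqrt{5}} \begin{bmatrix} \alpha & \alpha\theta \\ \bar{\alpha} & \bar{\alpha}\bar{\theta} \end{bmatrix} \begin{bmatrix} s_1 \\ s_2 \end{bmatrix} \right) \cdot J_{2,0} \cdot N_2^{0} + \operatorname{diag}\left( \frac{1}{\sqrt{5}} \begin{bmatrix} \alpha & \alpha\theta \\ \bar{\alpha} & \bar{\alpha}\bar{\theta} \end{bmatrix} \begin{bmatrix} s_3 \\ s_4 \end{bmatrix} \right) \cdot J_{2,1} \cdot N_2^{1} \\
&= \frac{1}{\sqrt{5}} \begin{bmatrix} \alpha (s_1 + s_2 \theta) & \alpha (s_3 + s_4 \theta) \\ j \bar{\alpha} (s_3 + s_4 \bar{\theta}) & \bar{\alpha} (s_1 + s_2 \bar{\theta}) \end{bmatrix} = S_G
\end{aligned} \tag{20}
$$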
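The construction of equation (19), together with the matrices JT,n and NT, can be sketched in a few lines, which also allows the identity SP=SG of equation (20) to be checked numerically for T=2. The helper names below are illustrative, and the symbols are arbitrary test values.

```python
from math import sqrt

theta, theta_bar = (1 + sqrt(5)) / 2, (1 - sqrt(5)) / 2
alpha, alpha_bar = 1 + 1j * (1 - theta), 1 + 1j * (1 - theta_bar)

def matmul(A, B):
    """Product of two small matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def diag(v):
    """Square matrix with the entries of v on its diagonal."""
    return [[v[i] if i == j else 0 for j in range(len(v))] for i in range(len(v))]

def J(T, n):
    """Identity matrix with its last n diagonal entries replaced by j."""
    return diag([1j if i >= T - n else 1 for i in range(T)])

def N_power(T, p):
    """p-th power of the T x T cyclic upper-shift matrix N_T."""
    return [[1 if j == (i + p) % T else 0 for j in range(T)] for i in range(T)]

def perfect_code(R, symbols):
    """Code matrix of equation (19) for a T x T generating matrix R."""
    T = len(R)
    S = [[0j] * T for _ in range(T)]
    for t in range(1, T + 1):
        s_t = symbols[(t - 1) * T:t * T]
        Rs = [sum(R[i][k] * s_t[k] for k in range(T)) for i in range(T)]
        term = matmul(matmul(diag(Rs), J(T, t - 1)), N_power(T, t - 1))
        S = [[S[r][c] + term[r][c] for c in range(T)] for r in range(T)]
    return S

R_golden = [[alpha / sqrt(5), alpha * theta / sqrt(5)],
            [alpha_bar / sqrt(5), alpha_bar * theta_bar / sqrt(5)]]

def golden_direct(s):
    """Golden code matrix of equation (15)."""
    s1, s2, s3, s4 = s
    return [[alpha * (s1 + s2 * theta) / sqrt(5),
             alpha * (s3 + s4 * theta) / sqrt(5)],
            [1j * alpha_bar * (s3 + s4 * theta_bar) / sqrt(5),
             alpha_bar * (s1 + s2 * theta_bar) / sqrt(5)]]

def max_abs_diff(A, B):
    return max(abs(x - y) for ra, rb in zip(A, B) for x, y in zip(ra, rb))

symbols = [1 + 1j, -1 + 1j, 1 - 1j, -1 - 1j]   # arbitrary test symbols
print(max_abs_diff(perfect_code(R_golden, symbols), golden_direct(symbols)))
```

The printed difference is at rounding level. The same `perfect_code` routine accepts any T×T generating matrix R, although suitable matrices for T>2 are not given in this section and would have to be supplied separately.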







It follows that in order to employ Perfect FDFR STBCs in the design of QSM, it suffices to decompose the core code structure of equation (19) in terms of the corresponding auxiliary dispersion matrices Ci and Di which, due to symmetry, are given by











$$
C_i = C_{T(w-1)+t} = \operatorname{diag}(R \cdot e_t) \cdot J_{T,w-1} \cdot N_T^{\,w-1} \quad \text{and} \quad D_i = jC_i \tag{21}
$$







where the generalized indices i∈{1, . . . , T²} are constructed systematically on t∈{1, . . . , T} and w∈{1, . . . , T}, and et is the t-th column of IT.


Following this, the full sets of dispersion matrices 𝒜 and ℬ can be built, i.e.,











$$
A_q = A_{T^{2}(\ell-1)+i} = \gamma\, e_{\ell} \otimes C_i \quad \text{and} \quad B_q = B_{T^{2}(\ell-1)+i} = \gamma\, e_{\ell} \otimes D_i, \tag{22}
$$







where again q∈{1, . . . , Q} and i∈{1, . . . , T²} as in equation (21), eℓ is the ℓ-th column of IL, but now with ℓ∈{1, . . . , L} and L=⌊nT/T⌋, and γ is a generalized scaling factor determined depending on the specific STBC in order to adjust the powers of the dispersion matrices such that tr(AqᴴAq)=T and tr(BqᴴBq)=T.
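As an illustration of equations (21) and (22), the sketch below builds the dispersion matrices for T=2 and nT=4 from the Golden generating matrix R (including its 1/√5 factor), for which the scaling γ=√2 is assumed. It then checks that each matrix has exactly T non-zero entries and satisfies the trace constraint. The helper names are illustrative.

```python
from math import sqrt

theta, theta_bar = (1 + sqrt(5)) / 2, (1 - sqrt(5)) / 2
alpha, alpha_bar = 1 + 1j * (1 - theta), 1 + 1j * (1 - theta_bar)
R = [[alpha / sqrt(5), alpha * theta / sqrt(5)],
     [alpha_bar / sqrt(5), alpha_bar * theta_bar / sqrt(5)]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def diag(v):
    return [[v[i] if i == j else 0 for j in range(len(v))] for i in range(len(v))]

def J(T, n):
    """Identity matrix with its last n diagonal entries replaced by j."""
    return diag([1j if i >= T - n else 1 for i in range(T)])

def N_power(T, p):
    """p-th power of the T x T cyclic upper-shift matrix."""
    return [[1 if j == (i + p) % T else 0 for j in range(T)] for i in range(T)]

def auxiliary_matrices(R):
    """C_i and D_i of equation (21), ordered by i = T(w-1) + t."""
    T = len(R)
    C = []
    for w in range(1, T + 1):
        for t in range(1, T + 1):
            column_t = [R[r][t - 1] for r in range(T)]        # R · e_t
            C.append(matmul(matmul(diag(column_t), J(T, w - 1)),
                            N_power(T, w - 1)))
    D = [[[1j * x for x in row] for row in C_i] for C_i in C]
    return C, D

def kron_column(L, ell, C, gamma):
    """gamma * (e_ell ⊗ C): C occupies the ell-th of L stacked T x T blocks."""
    T = len(C)
    rows = []
    for block in range(1, L + 1):
        for row in C:
            rows.append([gamma * x if block == ell else 0j for x in row])
    return rows

def dispersion_matrices(R, n_T, gamma):
    """Sets A and B of equation (22), ordered by q = T^2 (ell - 1) + i."""
    T = len(R)
    L = n_T // T
    C, D = auxiliary_matrices(R)
    A = [kron_column(L, ell, C_i, gamma) for ell in range(1, L + 1) for C_i in C]
    B = [kron_column(L, ell, D_i, gamma) for ell in range(1, L + 1) for D_i in D]
    return A, B

def nonzeros(X):
    return sum(1 for row in X for x in row if abs(x) > 1e-12)

def gram_trace(X):
    return sum(abs(x) ** 2 for row in X for x in row)

A, B = dispersion_matrices(R, n_T=4, gamma=sqrt(2))
print(len(A), [nonzeros(A_q) for A_q in A], [round(gram_trace(A_q), 12) for A_q in A])
```

The example yields Q=T·nT=8 matrices of size nT×T, each with T=2 non-zero entries, in line with the sparsity argument made for the generalized construction.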


Next, we turn our attention to the construction of the optimal set of index vectors 𝒦, via a straightforward generalization of the method described in Subsection III-B. Indeed, as can be learned by inspecting equation (21), each auxiliary matrix Ci and Di is a T×T sparse matrix obtained from a cyclic rotation of a diagonal matrix containing only T non-zero elements of R.












Method 2 QSM Signal Generation

 Internal Parameters: Number of symbols P, transmit antennas nT, time slots T and spatial-temporal resources Q = T · nT; and cardinalities M ≙ |𝒮| and N = |𝒦| = $\lfloor\binom{Q}{P}\rfloor_{2^x}$;
 Global Quantities: Symbol constellation 𝒮, optimum lattice generating matrix R, sets of dispersion matrices 𝒜 = {Aq}, q = 1, . . . , Q, and ℬ = {Bq}, q = 1, . . . , Q, with Aq and Bq as in eq. (22), set of index vectors 𝒦 obtained from Method 1.
 Input: Information bit sequence b = [bR, bI, bS];
 Outputs: Transmitted signal X.

 1: Select index vector kR as the ([bR](10) + 1)-th vector in 𝒦;
 2: Select index vector kI as the ([bI](10) + 1)-th vector in 𝒦;
 3: Assign the bits bS to P symbols {s1, . . . , sP} selected from 𝒮, with sp = spR + j spI;
 4: Construct $X = \sum_{p=1}^{P} \left( s_p^{R} A_{k_p^{R}} + s_p^{I} B_{k_p^{I}} \right)$ as per equation (2).















Consequently, the associated dispersion matrices obtained from equation (22) are all sparse matrices with only T non-zero entries, corresponding to the T spatial-temporal resources utilized. In other words, while in the Golden QSM scheme of Subsection III-B each dispersion matrix index q is associated with 2 resources, in the Perfect STBC-based construction described here each index q is associated with T resources, such that the corresponding bipartite graph illustrated in FIG. 5 is merely expanded to a similar graph with T·nT index (circular) nodes and T·nT resource (rectangular) nodes, with each resource node connected to T index nodes and vice versa. As a result, the greedy strategy described earlier remains valid, as evidenced by the fact that Method 1 applies to general T. For the convenience of the reader, we summarize the structure of the proposed scalable QSM scheme in Method 2.


IV. Proposed Receiver Design
A. Sparse Formulation of QSM Receivers

Together, Methods 1 and 2 introduced above demonstrate that the design of OS-QSM transmitters is possible and tractable. There is, however, no true scalability without feasibility, such that in order to complete the task it is also necessary to show that the proposed OS-QSM design is effectively decodable at reasonable complexity.


To put the challenge into context, for given P, M, T and nT, with Q=T·nT, an ML receiver would have to go through $(\lfloor\binom{Q}{P}\rfloor_{2^x})^{2} \cdot M^{P}$ combinations of symbols and selected spatial-temporal resources in order to detect a sequence of $2\lfloor\log_2\binom{Q}{P}\rfloor + P\log_2 M$ bits. That means that even for the small setting of T=2, P=3 and M=4, a system with nT=8 transmit antennas would require the receiver to go through $(\lfloor\binom{16}{3}\rfloor_{2^x})^{2} \cdot 4^{3} = 512^{2} \cdot 64 = 16{,}777{,}216$ combinations in order to decode the corresponding log2(16,777,216)=24 bits. In other words, ML decoding is highly impractical in QSM systems, especially in the context of massive MIMO systems.
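The size of the ML search space is easily evaluated. The sketch below, with illustrative function names and M assumed to be a power of two, computes the number of combinations and the corresponding number of bits for T=2, M=4, nT=8 and small values of P.

```python
from math import comb

def floor_pow2(n):
    """Largest power of two not exceeding the positive integer n."""
    return 1 << (n.bit_length() - 1)

def ml_search_size(P, M, T, n_T):
    """Number of (index vector pair, symbol tuple) hypotheses of an ML receiver."""
    N = floor_pow2(comb(T * n_T, P))   # usable index vectors
    return N * N * M ** P

def bits_per_block(P, M, T, n_T):
    """log2 of the search size; exact when M is a power of two."""
    return ml_search_size(P, M, T, n_T).bit_length() - 1

for P in (2, 3, 4):
    print(P, ml_search_size(P, 4, 2, 8), bits_per_block(P, 4, 2, 8))
```

With these parameters, P=2 gives 65,536 hypotheses (16 bits) and P=3 gives 16,777,216 hypotheses (24 bits), which illustrates how quickly the search space grows with P.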


We emphasize that this challenge applies not only to the OS-QSM scheme of Subsection III-B but also to current SotA QSM methods such as those in [20]-[23], as the example given above is for T=2, which is the size of the core codes used in the latter. We furthermore stress that the utilization of SD receivers is also not viable in scaled cases, because the nature of tree search algorithms still requires excessive computational complexity in large systems. Finally, we also remark that since convenient properties such as fast-decodability and block-diagonality are known not to be retainable without sacrifice of optimality for STBC of arbitrary size, a scalable detector for QSM schemes cannot rely on such features.


In light of the above, we introduce hereafter a new detection method for QSM schemes which relies neither on tree search nor on specific properties of STBCs, and which is completely independent of the infeasible combinatorial factor $\lfloor\binom{Q}{P}\rfloor_{2^x}$. In addition, given prior information on the encoding construction, the proposed decoder is valid to detect any QSM signal.


The core idea of our approach is to take full advantage of a sparse representation of QSM signals over the entire channel (i.e., for all spatial temporal resources available), assumed known at the receiver. The proposed decoding method then leverages the iterative shrinkage thresholding algorithm (ISTA) to greedily extract symbol and dispersion index estimates, resulting in significantly lower complexities compared to ML and SD-based methods. To that end, first combine equations (1) and (2), and consider the vectorized form of the QSM received signal










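$$
y \triangleq \operatorname{vec}(Y) = \underbrace{(I_T \otimes H)}_{\triangleq\, \Phi_H} \left( \Xi_A u^{R} + \Xi_B u^{I} \right) + \underbrace{\operatorname{vec}(V)}_{\triangleq\, v} = \Phi_H \cdot \left( \Xi_A u^{R} + \Xi_B u^{I} \right) + v \in \mathbb{C}^{T n_R \times 1}, \tag{23}
$$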







where we implicitly defined the block-diagonal channel matrix ΦH and the vectorized noise v; the dispersion matrices in 𝒜 and ℬ are also vectorized into

    • aq≙vec(Aq) and bq≙vec(Bq) and concatenated into
    • ΞA≙[a1, . . . , aQ]∈ℂ^{Q×Q} and ΞB≙[b1, . . . , bQ]∈ℂ^{Q×Q}, respectively;
    • and uR∈ℝ^{Q×1} (respectively uI∈ℝ^{Q×1}) is set to zero everywhere, except for its elements of indices kR∈𝒦 (respectively kI∈𝒦), which are set to {s1R, . . . , sPR} (respectively {s1I, . . . , sPI}).


Equation (23) can be further simplified by defining the combined and real-imaginary decoupled information and noise vectors










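$$
u \triangleq \left[ u_1^{R}, u_1^{I}, \ldots, u_Q^{R}, u_Q^{I} \right]^{T} \in \mathbb{R}^{2Q \times 1} \quad \text{and} \quad v \triangleq \left[ v_1^{R}, v_1^{I}, \ldots, v_{T n_R}^{R}, v_{T n_R}^{I} \right]^{T}, \tag{24}
$$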







as well as the decoupled versions of aq and bq, namely











$$
a_q \triangleq \left[ a_{q1}^{R}, a_{q1}^{I}, \ldots, a_{qQ}^{R}, a_{qQ}^{I} \right]^{T} \quad \text{and} \quad b_q \triangleq \left[ b_{q1}^{R}, b_{q1}^{I}, \ldots, b_{qQ}^{R}, b_{qQ}^{I} \right]^{T}, \tag{25}
$$







which in turn can be combined into a single dispersion matrix ΨD∈ℝ^{2Q×2Q}, namely











$$
\Psi_D \triangleq \left[ a_1, b_1, a_2, b_2, \ldots, a_Q, b_Q \right], \tag{26}
$$







such that the vectorized system model of equation (23) can be re-written as









$$
y = \check{\Phi}_H \Psi_D \cdot u + v = \Phi_H \cdot \Psi_D \cdot u + v = G \cdot u + v \in \mathbb{R}^{2 T n_R \times 1} \tag{27}
$$







where Φ̌H is the quadrature-operated version of the block-diagonal channel matrix ΦH, which we implicitly relabel as ΦH, and where we define the effective channel matrix G≙ΦH·ΨD for future convenience.


To elaborate on equation (27) with an example, consider a system with P=3, T=2 and nT=4, and assume that for a particular bit sequence b=[bR, bI, bS], the selected index vectors are given by kR=k10=[1,3,7] and kI=k48=[4,5,7]. Then, the corresponding combined information vector becomes u=[s1R,0,0,0,s2R,0,0,s1I,0,s2I,0,0,s3R,s3I,0,0]ᵀ. Notice that while u carries in the entries spR and spI the P log2 M bits corresponding to the bS subsequence, the remaining 2 log2 N bits corresponding to the subsequences bR and bI are encoded merely by the positions of the non-zero elements in u, regardless of what the values of spR and spI might be, which suggests that the detection of bS could be done separately from that of bR and bI.
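The placement of the symbol components in u follows directly from the interleaved ordering of equation (24). The sketch below uses an illustrative function name and numeric placeholders for the symbol components, and reproduces the example for kR=[1,3,7] and kI=[4,5,7].

```python
def combined_information_vector(k_R, k_I, s_R, s_I, Q):
    """Build u = [u_1^R, u_1^I, ..., u_Q^R, u_Q^I] from index vectors and symbols."""
    u = [0] * (2 * Q)
    for q, value in zip(k_R, s_R):
        u[2 * (q - 1)] = value        # real part goes to the slot of u_q^R
    for q, value in zip(k_I, s_I):
        u[2 * (q - 1) + 1] = value    # imaginary part goes to the slot of u_q^I
    return u

# Placeholder values standing in for (s1R, s2R, s3R) and (s1I, s2I, s3I).
u = combined_information_vector(k_R=[1, 3, 7], k_I=[4, 5, 7],
                                s_R=[1, 2, 3], s_I=[-1, -2, -3], Q=8)
print(u)
```

Only 2P=6 of the 2Q=16 entries are non-zero, and their positions alone identify the pair of index vectors.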


In principle, the latter feature could be utilized to design an SD receiver for the OS-QSM method proposed above, similarly to how block-separability was exploited in to do so for the EDA-QSM scheme. The problem with that idea is, of course, the prohibitively large number of combinations how the 2P elements of the decoupled symbol vector s=[s1R, s1I, . . . , sPR, sPI] can be placed among the 20 entries of u. In order to circumvent this challenge, we instead seek to exploit the facts demonstrated in Subsection III-A namely, that: a) the optimum number P* of symbols maximizing SE is a fraction of the total spatial-temporal resources Q=T·nT, as per equation (12); and b) that in large-scale systems with nT»1, a significantly smaller block size T suffices to asymptotically achieve SE optimality, as shown in FIG. 4 Together, these facts imply that the sparsity of u becomes increasingly more prominent in large-scale SE-optimal QSM schemes, which in turn favors sparse recovery algorithms. It is also evident from the inspection of equation (27) that the matrices ΦH and ΨD can be respectively interpreted as the sensing and dictionary matrices typical of compressive sensing (CS) models such that recent progress on sparse and discrete-aware receivers can be leveraged.


Taking into account the focus on scalability, which at the receiver side translates to controlling complexity, two suitable candidate methods to be applied for OS-QSM demodulation are the generalized approximate message-passing (GAMP) method and the iterative shrinkage thresholding algorithm (ISTA), both of which possess quadratic complexity on the size 2TnT of the signal vector u. It is well known, however, that the GAMP algorithm relies strongly on the particular structure of the measurement matrix and on the independence of the received signal, which in the case of QSM cannot be generally assumed, as a direct consequence of the utilization of STBCs in the dispersion matrices. In the absence of the required conditions, GAMP receivers yield poor performance, characterized by error floors at high SNRs.


Motivated by this fact, an ISTA-based approach is chosen for the design of a low-complexity demodulator for QSM systems, which is described in the sequel. In particular, a method to detect QSM signals is introduced, which is based on a purpose-built variation of ISTA that incorporates modifications both to the thresholding function and to the index vector estimation process, specific to QSM detection.


B. Greedy Boxed ISTA-based QSM Decoder

Consider the standard ISTA recursion,












$$\hat{u}^{(\eta+1)} = \Lambda\!\left(\hat{u}^{(\eta)} + \frac{1}{\alpha}\, G^{T}\!\left(y - G\,\hat{u}^{(\eta)}\right);\ \frac{\lambda}{2\alpha}\right), \tag{28}$$







where û(η) is the estimate of u at the η-th iteration, α=maxeig(GTG) is the shrinkage step size (the actual requirement is that α>maxeig(GTG); however, we will assume the minimum step size, which is sufficient), λ is the threshold factor, and Λ(s; τ) is the soft-thresholding function.


A first meaningful modification to the standard ISTA recursion in equation (28) is to account for the fact that the symbols in the real-valued projected constellation $\mathcal{S}_R \triangleq \Re(\mathcal{S})$ are finite, such that, in addition to the lower limit τ in the vicinity of the origin used to enforce sparsity in the solution, an upper limit $\max(\mathcal{S}_R)$ can be introduced into the thresholding function.


In other words, for the case at hand we replace ISTA's standard soft-thresholding function Λ(s; τ) by a hard-thresholding function, leading to the “boxed” hard-thresholding function Π(s; τ) illustrated in FIG. 6 and defined by










$$\Pi(s;\tau) = \begin{cases} \min(\mathcal{S}_R) & s \le \min(\mathcal{S}_R), \\ s & \min(\mathcal{S}_R) < s < -\tau, \\ 0 & |s| \le \tau, \\ s & \tau < s < \max(\mathcal{S}_R), \\ \max(\mathcal{S}_R) & \max(\mathcal{S}_R) \le s. \end{cases} \tag{29}$$








FIG. 6 is a comparison of the ISTA and BH-ISTA thresholding functions Λ(s; τ) and Π(s; τ), the latter as per equation (29).
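To make the difference between the two thresholding functions concrete, the following minimal Python sketch implements both for a scalar input. The function names and the example constellation limits (−1 and 1) are assumptions for illustration; the soft-thresholding function is the textbook definition, and the boxed variant assumes the threshold τ lies strictly inside the constellation limits.

```python
def soft_threshold(s, tau):
    """Standard soft-thresholding: shrink s towards zero by tau."""
    if s > tau:
        return s - tau
    if s < -tau:
        return s + tau
    return 0.0


def boxed_hard_threshold(s, tau, s_min, s_max):
    """Boxed hard-thresholding in the spirit of equation (29).

    Values within [-tau, tau] are set to zero; all other values are kept
    unchanged but clipped to the constellation limits [s_min, s_max].
    Assumes s_min < -tau and tau < s_max.
    """
    if abs(s) <= tau:
        return 0.0
    return min(max(s, s_min), s_max)


# Example with assumed limits min(S_R) = -1 and max(S_R) = 1
for value in (-2.3, -0.4, 0.05, 0.4, 1.7):
    print(value, soft_threshold(value, 0.1), boxed_hard_threshold(value, 0.1, -1.0, 1.0))
```

Unlike soft-thresholding, the boxed function does not bias the surviving entries towards zero, and it never returns a value outside the range of valid symbol amplitudes.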


Incorporating this modification yields the boxed-hard ISTA (BH-ISTA) receiver described by











$$\hat{u}^{(\eta+1)} = \Pi\!\left(\hat{u}^{(\eta)} + \frac{1}{\alpha}\, G^{T}\!\left(y - G\,\hat{u}^{(\eta)}\right);\ \frac{\lambda}{2\alpha}\right). \tag{30}$$







Notice that the computational cost of repeatedly evaluating equation (30) is dominated by the term Gû(η), and is therefore quadratic in the number of non-zero entries (i.e., the ℓ0-norm) of û(η), which reduces with the iterations η, as illustrated in FIG. 7.
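The BH-ISTA recursion of equation (30) can be sketched as follows in plain Python. This is a sketch under stated assumptions, not the reference implementation: the function names are hypothetical, the step size α is passed in by the caller (it must be at least the largest eigenvalue of GᵀG), a fixed number of iterations replaces a convergence test, and the toy diagonal channel is chosen only so that the result is easy to follow by hand.

```python
def boxed_hard_threshold(s, tau, s_min, s_max):
    """Zero small entries, clip the rest to [s_min, s_max] (cf. equation (29))."""
    if abs(s) <= tau:
        return 0.0
    return min(max(s, s_min), s_max)


def matvec(A, x):
    """Matrix-vector product for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def bh_ista(G, y, alpha, lam, s_min, s_max, num_iters=50):
    """Run the recursion u <- Pi(u + G^T (y - G u) / alpha; lam / (2 alpha))."""
    tau = lam / (2.0 * alpha)
    G_t = [list(col) for col in zip(*G)]           # transpose of G
    u = [0.0] * len(G[0])                          # all-zero initial estimate
    for _ in range(num_iters):
        residual = [yi - gi for yi, gi in zip(y, matvec(G, u))]
        gradient = matvec(G_t, residual)
        u = [boxed_hard_threshold(ui + gi / alpha, tau, s_min, s_max)
             for ui, gi in zip(u, gradient)]
    return u


# Toy example (assumed values): true u = [1, 0], small perturbation on y
G = [[2.0, 0.0], [0.0, 1.0]]
y = [2.0, 0.1]
u_hat = bh_ista(G, y, alpha=4.0, lam=0.8, s_min=-1.0, s_max=1.0)
print(u_hat)
```

In this toy case the perturbed second entry stays below the threshold and is zeroed at every iteration, so the estimate remains sparse.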



FIG. 7 shows the convergence of ûΛ(η) and ûΠ(η), as per equations (28) and (30), respectively, as a function of iterations η.



FIG. 7(a) is the Sparsity convergence with various threshold values



FIG. 7(b) is the MSE convergence with optimal threshold values


In particular, FIG. 7(a) shows a comparison of the convergence of |û(η)|0 as a function of η for various values of threshold parameter τ, with û(η) obtained both from equations (28) and (30), i.e., via conventional ISTA and BH-ISTA, respectively, which for convenience will be hereafter denoted ûΛ(η) and ûΠ(η).


It can in fact be seen that, as a result of boxing and hard-thresholding, |ûΠ(η)|0<|ûΛ(η)|0, such that the expected order of complexity associated with evaluating equation (30) is lower than that of evaluating equation (28), and can be bounded both below and above by the lower and upper limits 𝒪(4P²) and 𝒪(4Q²), respectively.


More details will be given in Section V-A. In turn, FIG. 7(b) shows that the mean-squared error (MSE) obtained with the proposed BH-ISTA approach is better than that obtained with conventional ISTA, which illustrates the effectiveness of the boxed and hard-thresholding modification here proposed for the demodulation of QSM signals. It is left for us to address, however, how the bits associated with the choices of dispersion matrix indices {kR, kI}∈𝒦 can be efficiently detected. To that end, another addition is introduced to the ISTA-based sparse detector, namely, a greedy hard-detection procedure for each symbol recovered, with a concomitant update of equation (30), which can be described as follows.


Let us consider that multiple runs of the BH-ISTA iterations described by equation (30) are performed, such that prior to the m-th run a modification is made to y, G and u, which can be expressed by rewriting equation (30) as












$$\hat{u}_m^{(\eta+1)} = \Pi\!\left(\hat{u}_m^{(\eta)} + \frac{1}{\alpha}\, G_m^{T}\!\left(y_m - G_m\,\hat{u}_m^{(\eta)}\right);\ \frac{\lambda}{2\alpha}\right), \tag{31}$$







where we adopt the convention that for the first run (m=1) we set $y_1 = y$, $G_1 = G$ and $\hat{u}_1^{(1)} = 0_{2Q}$.


Let η* be the last iteration of the m-th run of the latter estimator, with its corresponding outcome denoted by $\hat{u}_m^{(\eta^*)}$. Finally, let $\tilde{s}_{\hat{q}_m}$ be the entry of $\hat{u}_m^{(\eta^*)}$ with the largest amplitude, whose position is denoted by $\hat{q}_m$, such that we may write
















$$\tilde{s}_{\hat{q}_m} = \left\{ \left[\hat{u}_m^{(\eta^*)}\right]_{\hat{q}_m} \;:\; \left|\left[\hat{u}_m^{(\eta^*)}\right]_{\hat{q}_m}\right| > \left|\left[\hat{u}_m^{(\eta^*)}\right]_{\ell}\right|,\ \forall\, \ell \in \{1, \dots, 2Q\} \right\}, \tag{32}$$







where $[x]_\ell$ denotes the $\ell$-th element of a generic vector x. It is emphasized that in the greedy procedure summarized by equation (32), two distinct pieces of information on the bit sequence b are obtained, namely, a soft estimate $\tilde{s}_{\hat{q}_m}$ of one of the modulated symbols $\{s_p^R, s_p^I\} \in \mathcal{S}_R$, and a hard estimate $\hat{q}_m$ of one of the indices contained in the selected index sets $\{k^R, k^I\} \in \mathcal{K}$. In possession of such information, the following steps are then executed in order to produce the modified quantities required to perform the next run of the BH-ISTA recursion described by equation (31).


First, a hard-detected version of $\tilde{s}_{\hat{q}_m}$ is obtained by projecting it onto $\mathcal{S}_R$, that is,











$$\hat{s}_{\hat{q}_m} = \mathcal{P}_{\mathcal{S}_R}\!\left(\tilde{s}_{\hat{q}_m}\right). \tag{33}$$







Then, the remaining quantities are updated following













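$$\hat{u}_{m+1}^{(1)} = \left(I_{2Q} - \mathrm{diag}(e_{\hat{q}_m})\right)\hat{u}_m^{(\eta^*)}, \quad y_{m+1} = y_m - G_m \cdot e_{\hat{q}_m}\,\hat{s}_{\hat{q}_m}, \quad \text{and} \quad G_{m+1} = G_m\left(I_{2Q} - \mathrm{diag}(e_{\hat{q}_m})\right), \tag{34}$$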







where $I_{2Q}$ is the identity matrix of size 2Q and $e_{\hat{q}_m}$ its $\hat{q}_m$-th column. Recall that, due to the quadrature-decomposed structure of the sparse vector u, all odd index estimates $\hat{q}_m$ correspond to the real parts of modulated symbols, while even $\hat{q}_m$ correspond to the imaginary parts. It is therefore sensible that, as equations (31) through (35) are evaluated iteratively, the obtained index estimates $\{\hat{q}_1, \hat{q}_2, \dots, \hat{q}_m\}$ be split and collected accordingly into the subsequences












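$$\hat{q}_m^R = \left\{ \hat{q}_m \;\middle|\; \mathrm{mod}(\hat{q}_m, 2) = 1,\ \forall m \right\} \quad \text{and} \quad \hat{q}_m^I = \left\{ \hat{q}_m \;\middle|\; \mathrm{mod}(\hat{q}_m, 2) = 0,\ \forall m \right\}, \tag{35}$$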







where mod(x, 2) denotes the modulo-2 operation onto x.


If there are no errors during the detection process, after exactly m=2P runs, the sequences $\{\hat{q}_m^R, \hat{q}_m^I\}$ can be perfectly mapped to $\{k^R, k^I\}$, in particular via












$$\hat{k}^R = \frac{1}{2}\left(\hat{q}_m^R + 1\right) \quad \text{and} \quad \hat{k}^I = \frac{1}{2}\,\hat{q}_m^I, \tag{36}$$







such that the procedure comes to a stop.
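The splitting of equation (35) and the mapping of equation (36) can be sketched as follows. The function names are hypothetical, and sorting the recovered indices into ascending order is an assumption made here so that the output matches the way index vectors are written in the earlier example (kR=[1,3,7], kI=[4,5,7]); the greedy detection order itself is arbitrary.

```python
def split_indices(q_hat):
    """Equation (35): odd positions carry real parts, even positions imaginary parts."""
    q_real = [q for q in q_hat if q % 2 == 1]
    q_imag = [q for q in q_hat if q % 2 == 0]
    return q_real, q_imag


def to_index_vectors(q_real, q_imag):
    """Equation (36), applied element-wise; sorting is an assumed convention."""
    k_real = sorted((q + 1) // 2 for q in q_real)
    k_imag = sorted(q // 2 for q in q_imag)
    return k_real, k_imag


# Positions of the non-zero entries of u from the earlier example,
# listed in an arbitrary (assumed) detection order
detected = [13, 8, 1, 10, 5, 14]
q_real, q_imag = split_indices(detected)
k_real, k_imag = to_index_vectors(q_real, q_imag)
print(k_real, k_imag)
```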


More generally, however, errors may occur, such that either $\hat{q}_m^R$ or $\hat{q}_m^I$, or both, contain incorrect indices even with cardinality P. In such cases, the procedure continues until both subsequences contain the first P-tuple of indices included in the dispersion matrix index vector set $\mathcal{K}$, at which point a modification of the update equations is required, which can be described as follows.


Let $\mathcal{P}_{\mathcal{K}}(q)$ denote the projection of a sequence q onto the set $\mathcal{K}$, such that either a sequence $k \in \mathcal{K}$ or the empty set Ø is returned by the projection, depending on whether or not q contains within it a sequence from $\mathcal{K}$. If multiple valid $k \in \mathcal{K}$ exist in the combination of elements in q, the viable elements with lower indices in q (not of the element values themselves) take priority, such that the notion of greedy selection is coherent. Then, equation (34) can be expanded into











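$$\hat{u}_{m+1}^{(1)} = \begin{cases} \left[I_{2Q} - \sum_{q\ \mathrm{odd}} \mathrm{diag}(e_q)\right]\hat{u}_m^{(\eta^*)} & \text{upon confirmation of } \hat{k}^R \text{ from } \hat{q}_m^R, \text{ or} \\[4pt] \left[I_{2Q} - \sum_{q\ \mathrm{even}} \mathrm{diag}(e_q)\right]\hat{u}_m^{(\eta^*)} & \text{upon confirmation of } \hat{k}^I \text{ from } \hat{q}_m^I, \text{ or} \\[4pt] \left[I_{2Q} - \mathrm{diag}(e_{\hat{q}_m})\right]\hat{u}_m^{(\eta^*)} & \text{otherwise}. \end{cases} \tag{37}$$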





In plain words, equation (37) establishes that after the m-th run of the BH-ISTA detector, the initial state of the estimate vector ûm+1(1) for the next run is either:

    • a) updated by removing the latest estimated symbol, when neither $\hat{q}_m^R$ nor $\hat{q}_m^I$ can be projected onto $\mathcal{K}$, which happens either when the number of indices acquired is insufficient (less than P) to decide on valid estimates of kR or kI, or when the number of indices is sufficient (P or larger) but none contains a valid combination of indices for any $k \in \mathcal{K}$; or
    • b) updated by nulling all odd entries q∈{1, 3, . . . , 2Q−3, 2Q−1}, when a hard decision of $\hat{k}^R$ is confirmed from the projection of $\hat{q}_m^R$ onto $\mathcal{K}$, which will only happen once throughout the demodulation procedure; or
    • c) updated by nulling all even entries q∈{2, 4, . . . , 2Q−2, 2Q}, when a hard decision of $\hat{k}^I$ is confirmed from the projection of $\hat{q}_m^I$ onto $\mathcal{K}$, which also can happen only once.


Obviously, the only other alternative to those above is when both $\hat{k}^R$ and $\hat{k}^I$ have been acquired, and consequently the entire set of symbol estimates {ŝ1R, ŝ1I, . . . , ŝPR, ŝPI} has also been obtained, in which case the procedure is terminated.
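The three cases of equation (37) can be illustrated with the short Python sketch below. It is only a sketch: the function name and the boolean flags `confirmed_real` and `confirmed_imag` are hypothetical stand-ins for the outcome of the projection onto the index vector set, which is not implemented here, and the case in which both index vectors are already confirmed is left out because the procedure terminates at that point.

```python
def next_initial_estimate(u_hat, q_hat_m, confirmed_real=False, confirmed_imag=False):
    """Build the initial estimate for run m+1 from the outcome of run m.

    Positions follow the 1-based convention of the text, so position q
    corresponds to list index q - 1.
    """
    u_next = list(u_hat)
    if confirmed_real:
        # Real-part index vector confirmed: null all odd positions
        for position in range(1, len(u_next) + 1, 2):
            u_next[position - 1] = 0.0
    elif confirmed_imag:
        # Imaginary-part index vector confirmed: null all even positions
        for position in range(2, len(u_next) + 1, 2):
            u_next[position - 1] = 0.0
    else:
        # No confirmation yet: remove only the latest detected entry
        u_next[q_hat_m - 1] = 0.0
    return u_next


u_hat = [0.9, 0.1, 0.0, -1.0]
print(next_initial_estimate(u_hat, q_hat_m=4))
print(next_initial_estimate(u_hat, q_hat_m=4, confirmed_real=True))
print(next_initial_estimate(u_hat, q_hat_m=4, confirmed_imag=True))
```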


Similarly to the above, the updates of ym and Gm must also be revised to account for the effect of hard decisions on $\hat{k}^R$ and $\hat{k}^I$, so as to cancel the effect of hard-decided indices and symbols, and to nullify the channel corresponding to confirmed indices, yielding respectively










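$$y_{m+1} = \begin{cases} y - G \cdot \sum_{q \in [2\hat{k}^R - 1,\ \hat{q}_m^I]} e_q\,\hat{s}_q & \text{upon confirmation of } \hat{k}^R \text{ from } \hat{q}_m^R, \text{ or} \\[4pt] y - G \cdot \sum_{q \in [\hat{q}_m^R,\ 2\hat{k}^I]} e_q\,\hat{s}_q & \text{upon confirmation of } \hat{k}^I \text{ from } \hat{q}_m^I, \text{ or} \\[4pt] y_m - G_m \cdot e_{\hat{q}_m}\,\hat{s}_{\hat{q}_m} & \text{otherwise}, \end{cases} \tag{38}$$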














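$$G_{m+1} = \begin{cases} G_m\left[I_{2Q} - \sum_{q\ \mathrm{odd}} \mathrm{diag}(e_q)\right] & \text{upon confirmation of } \hat{k}^R \text{ from } \hat{q}_m^R, \text{ or} \\[4pt] G_m\left[I_{2Q} - \sum_{q\ \mathrm{even}} \mathrm{diag}(e_q)\right] & \text{upon confirmation of } \hat{k}^I \text{ from } \hat{q}_m^I, \text{ or} \\[4pt] G_m\left[I_{2Q} - \mathrm{diag}(e_{\hat{q}_m})\right] & \text{otherwise}. \end{cases} \tag{39}$$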






FIG. 8 is the Schematic diagram depicting the structure of the proposed GB-ISTA receiver for QSM demodulation


The procedure described by equations (31) through (33) and (35) through (39) amounts to a greedy (i.e., symbol-by-symbol and index set-by-index set) modification of the BH-ISTA detector introduced earlier, for which reason it is dubbed the greedy boxed iterative shrinkage thresholding algorithm (GB-ISTA) for QSM demodulation.


Notice that at the end of the process, estimates $\hat{k}^R$ and $\hat{k}^I$ of the selected dispersion matrix index vectors, as well as hard-decision estimates ŝ=[ŝ1R, ŝ1I, . . . , ŝPR, ŝPI] of the modulated symbols are obtained, from which the corresponding encoded bits b=[bR, bI, bS] can be retrieved at a fraction of the complexity of sphere detection or exhaustive maximum likelihood searches. A diagram illustrating the proposed GB-ISTA QSM receiver is offered in FIG. 8, and the receiver is summarized in the form of pseudo-code in method 3.












 Method 3 Greedy Boxed-(Hard) ISTA Receiver for QSM Schemes

 Global Quantities: Real-valued projected symbol constellation $\mathcal{S}_R$, set of index vectors $\mathcal{K}$ and threshold factor λ;
 Inputs: Received signal y and effective channel matrix G.
 Outputs: Estimated index and symbol vectors $\hat{k}^R$, $\hat{k}^I$ and $\hat{s}$.

  1: Set m = 1 and α = maxeig($G^T G$);
  2: Initialize $y_m = y$, $G_m = G$ and $\hat{u}_m^{(1)} = 0_{2Q}$;
  3: while $\mathcal{P}_{\mathcal{K}}\!\left(\tfrac{1}{2}[\hat{q}_m^R + 1]\right) = \emptyset$ or $\mathcal{P}_{\mathcal{K}}\!\left(\tfrac{1}{2}\hat{q}_m^I\right) = \emptyset$ do
  4:   Iterate equation (31) until convergence, obtaining $\hat{u}_m^{(\eta^*)}$;
  5:   Obtain symbol soft estimate $\tilde{s}_{\hat{q}_m}$ and index hard estimate $\hat{q}_m$ via equation (32);
  6:   Obtain hard symbol estimate $\hat{s}_{\hat{q}_m}$ via equation (33);
  7:   Insert index estimate $\hat{q}_m$ into its subsequence $\hat{q}_m^R$ or $\hat{q}_m^I$ as per equation (35);
  8:   Construct $\hat{u}_{m+1}^{(1)}$, $y_{m+1}$ and $G_{m+1}$ via equations (37), (38) and (39);
  9:   Increment m by 1;
 10: end while
 11: Output the estimate index vectors as $\hat{k}^R \leftarrow \mathcal{P}_{\mathcal{K}}\!\left(\tfrac{1}{2}[\hat{q}_m^R + 1]\right)$ and $\hat{k}^I \leftarrow \mathcal{P}_{\mathcal{K}}\!\left(\tfrac{1}{2}\hat{q}_m^I\right)$, respectively, and the estimate symbol vector $\hat{s}$ as the intercalation of $\{\hat{s}_{2\hat{k}^R - 1}\}$ and $\{\hat{s}_{2\hat{k}^I}\}$.














V. Complexity and Performance Analysis

In this section we analyze the performance of the proposed OS-QSM via computer simulations. Given that our focus is on the scalability of the system, all simulation results to be shown are for relatively large number of transmit antennas (i.e., nT≥6) and for increasing number of transmission slots (i.e., T≥2), with the number of digitally-modulated transmit symbols P and the cardinality of corresponding constellation M adjusted on a case-by-case basis in order to highlight the main finding of each simulated experiment. To the best of our knowledge, simulation results on QSM schemes with such parameters have not appeared so far in the literature, due to the prohibitive computational complexity of existing receivers.


A. Complexity: GB-ISTA Versus ML and SCMB-SD Receivers

In light of the latter remark, let us start by assessing the decoding complexity of scaled QSM systems, in particular by deriving the complexity orders of the conventional ML and SCMB-SD approaches, and of the proposed GB-ISTA algorithm described in Section IV.


For any given nT, T and P, the brute-force ML decoder requires a search among all possible $\binom{T n_T}{P}^{2}$ antenna activation patterns, independently selected according to $\{k^R, k^I\} \in \mathcal{K}$ to transmit the real and imaginary parts of the P digitally modulated symbols in $s \in \mathcal{S}^P$, as well as another search, for each possible activation pattern, of all possible P-tuples of symbols selected from the constellation of cardinality M.


Assuming, idealistically and for simplicity, that each search consumes a single floating point operation (flop), the ML search process alone yields a complexity order lower-bounded by

$$\mathcal{O}\!\left(4^{\left\lfloor \log_2 \binom{T n_T}{P} \right\rfloor} \times M^{P}\right),$$

in order to detect

$$2 \cdot \left\lfloor \log_2 \binom{T n_T}{P} \right\rfloor + P \cdot \log_2 M \ \text{bits},$$

which even for moderately small P and M quickly becomes unfeasible. For example, a search over 16,777,216 combinations is required to detect the 24 bits of each transmit signal in a relatively small system with nT=6, T=3, P=3 and M=4. Just doubling the number of transmit antennas to nT=12, with other parameters unchanged, the size of the ML search space already surges to more than $10^{9}$ combinations, for a mild increase to 30 bits per transmission, while keeping nT=6 and doubling the number of transmit symbols to P=6 requires a search over more than $10^{12}$ combinations in order to detect only 40 bits.
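The figures quoted in this example can be reproduced with the short Python sketch below. The function name `qsm_bits` is hypothetical; the formula is the bit count stated above, evaluated with exact integer arithmetic and assuming that M is a power of two.

```python
from math import comb


def qsm_bits(n_t, t, p, m):
    """Bits per transmission: 2 * floor(log2(C(T*n_T, P))) + P * log2(M)."""
    spatial_bits = comb(t * n_t, p).bit_length() - 1   # exact floor(log2(.))
    symbol_bits = m.bit_length() - 1                   # log2(M), M a power of two
    return 2 * spatial_bits + p * symbol_bits


for n_t, t, p, m in [(6, 3, 3, 4), (12, 3, 3, 4), (6, 3, 6, 4)]:
    bits = qsm_bits(n_t, t, p, m)
    print(f"nT={n_t}, T={t}, P={p}, M={m}: {bits} bits, {2 ** bits} combinations")
```

The size of the ML search space equals two to the power of the number of bits, which is why a modest gain in bits per transmission translates into an exponentially larger search.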


Taking the most significant operations required to perform each ML search into account, the order of complexity of the ML receiver to decode each bit of QSM schemes becomes

$$\mathcal{O}\Big(\big(\underbrace{12\,T^{3} n_T n_R}_{\text{matrix constructions as in equation (23)}} + n_R(2P+1) - 1\big) \cdot 4^{\left\lfloor \log_2 \binom{T n_T}{P} \right\rfloor} \cdot M^{P}\Big) > \mathcal{O}\big(n_R \cdot P \cdot T^{P+1} \cdot M^{P} \cdot n_T^{P-1}\big), \tag{40}$$

with the lower bound obtained by keeping only the higher-order terms and neglecting coefficients.







The practical unfeasibility of ML-based detection of QSM systems is clearly highlighted by equation (40), as it exposes the fact that the number of transmit symbols P is a complexity order exponent of all theoretically scalable quantities nT, T and M. Next, let us show that this challenge cannot be satisfactorily mitigated by the SD approach. To that end, consider, again idealistically and for simplicity, that SD can reduce the search radius to a single symbol, such that the factor $M^P$ in equation (40) can be neglected. In other words, we find that the order of complexity associated with SD-based QSM receivers can, at best, be reduced to









$$\mathcal{O}\big(n_R \cdot P \cdot T^{P+1} \cdot n_T^{P-1}\big). \tag{41}$$







From the latter it can be concluded that, in the context of scalable QSM schemes, the only advantage of SD is to enable scaling of the digital constellation cardinality M, which not only impacts negatively on the corresponding BER, but also is not the most significant factor in increasing the SE of the system, since the total number of bits conveyed by a QSM scheme is







$$B = 2 \cdot \left\lfloor \log_2 \binom{T n_T}{P} \right\rfloor + P \cdot \log_2 M,$$

such that even for mildly large nT, T and P we have

$$2 \cdot \left\lfloor \log_2 \binom{T n_T}{P} \right\rfloor \gg P \cdot \log_2 M.$$






In summary, it can be concluded that sphere detection is not particularly useful as an enabler of spectrally-efficient scalable QSM, from a receiver perspective.


Finally, let us address the computational complexity of the proposed GB-ISTA. For starters, observe from equations (31) through (36) that the GB-ISTA receiver obtains the spatially encoded bits bR and bI not from a search, but directly from the sparse-recovery process, i.e., from the values and locations of the non-zero elements of û. As a consequence of removing such a combinatorial search, the impact of the scalable parameters nT, T and P on GB-ISTA is significantly smaller, as demonstrated by the following complexity analysis of the steps of method 3.

    • 1) Method 3 takes as input the effective matrix G given in equation (27), whose construction requires evaluating the product of the sparse block-diagonal matrix $\Phi_H \in \mathbb{R}^{2Tn_R \times 2n_T}$, which contains 2nT non-zero entries per row, against the matrix $\Psi_D \in \mathbb{R}^{2Tn_R \times 2n_T}$, which has T non-zero entries per column, yielding a cost of $2T(2Tn_R)(2Tn_T) = 8T^3 n_T n_R$ flops, since typically we have (in scaled QSM schemes) T≪2nT.
    • 2) Next, the GB-ISTA receiver performs multiple runs of evaluating equation (31), the first step of which is computing $\hat{u}^{(\eta)} + \frac{1}{\alpha} G^{T}\!\left(y - G\,\hat{u}^{(\eta)}\right)$. (The cost of computing α as per line 1 of method 3 is ignored, under the argument that, for large systems, the largest eigenvalue of $G^T G$ converges almost surely to a constant dependent only on the structure and the energy of G.) That operation would cost $(2Tn_T)(2Tn_R) + (2Tn_R) + (2Tn_R)(2Tn_T) + (2Tn_R) = 8T^2 n_T n_R + 4Tn_R$ flops to accomplish, but since the sparsity of û(η) quickly reduces to the actual value 2P, as shown in FIG. 7(a), the complexity of that step is more precisely estimated at $8PTn_R + 4Tn_R = 4Tn_R(2P+1)$ flops. Then, including the 2TnT flops required for the boxed-hard thresholding function Π, and remembering that η* iterations are necessary, the total cost associated with each evaluation run of equation (31) can be estimated at $\eta^*\left(4Tn_R(2P+1) + 2Tn_T\right)$ flops.
    • 3) After convergence of equation (31), the receiver obtains the sparse estimate vector $\hat{u}_m^{(\eta^*)}$, from which the soft symbol estimate $\tilde{s}_{\hat{q}_m}$ is extracted at negligible cost via maximum value search [44], along with the estimate index $\hat{q}_m$ given by the position of $\tilde{s}_{\hat{q}_m}$ in $\hat{u}_m^{(\eta^*)}$, as expressed in equation (32). With these quantities at hand, up to $\sqrt{M}$ flops are consumed to obtain the hard symbol estimate $\hat{s}_{\hat{q}_m}$ as per equation (33).
    • 4) Considering that the costs of the element, interference and column removals expressed by equations (37) through (39) are negligible, the next significant cost of the receiver is the validation of acquired indices. In particular, after at least P runs, when a sufficient number of position indices $\hat{q}_m$ have been detected to construct either or both of the index subsequences $\hat{q}_m^R$ and $\hat{q}_m^I$, and to map them to the corresponding estimate index vectors $\hat{k}^R$ and/or $\hat{k}^I$ as per equation (36), said estimates need to be validated against the optimal set of index vectors $\mathcal{K}$. Assuming the cost of such an operation is of order $\mathcal{O}(1)$, this step contributes to the total complexity of the GB-ISTA detector with an additional cost of P flops.
    • 5) Lastly, as described in line 11 of method 3, the GB-ISTA outputs both the pair of estimate index vectors $\hat{k}^R$ and $\hat{k}^I$, as well as the digitally modulated symbol vector estimate $\hat{s} \in \mathcal{S}^P$, which requires the intercalation of the real and imaginary parts, at an estimated cost of P flops.


From the above, the total complexity order of the GB-ISTA can be estimated at









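$$\mathcal{O}\Big(\underbrace{8T^{3} n_T n_R}_{\text{construction of } G} + \underbrace{2P}_{\text{number of runs}}\Big(\underbrace{\eta^{*}\left(4Tn_R(2P+1) + 2Tn_T\right)}_{\text{evaluation of eq. (31)}} + \underbrace{\sqrt{M} + \tfrac{1}{2}}_{\text{evaluation of eq. (35) and validation of } \hat{q}_m^I \text{ or } \hat{q}_m^R}\Big) + \underbrace{P}_{\text{intercalation of } \{\hat{s}_{2\hat{k}^R - 1}\} \text{ and } \{\hat{s}_{2\hat{k}^I}\} \text{ into } \hat{s}}\Big). \tag{42}$$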







The per-bit complexity orders of the ML and the proposed GB-ISTA decoders, obtained by dividing the expressions in equations (40) and (42) by the number of bits detected per transmission $B = 2\left\lfloor \log_2 \binom{T n_T}{P} \right\rfloor + P \log_2 M$, are compared in FIG. 9 for various settings in terms of the scalable parameters nT, T and P.
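As a rough numerical illustration of this comparison, the sketch below evaluates the left-hand expression of equation (40) and the expression of equation (42) for one parameter set. The function names are hypothetical, nR=12 is taken from the simulation set-up discussed later in the text, and the number of iterations per run (η*=10) is purely an assumed value, since it depends on the convergence behavior.

```python
from math import comb, sqrt


def ml_flops(n_t, n_r, t, p, m):
    """Left-hand expression of equation (40)."""
    spatial_bits = comb(t * n_t, p).bit_length() - 1   # floor(log2(C(T*n_T, P)))
    per_candidate = 12 * t**3 * n_t * n_r + n_r * (2 * p + 1) - 1
    return per_candidate * 4**spatial_bits * m**p


def gb_ista_flops(n_t, n_r, t, p, m, eta_star):
    """Expression of equation (42); eta_star is the number of iterations per run."""
    per_run = eta_star * (4 * t * n_r * (2 * p + 1) + 2 * t * n_t) + sqrt(m) + 0.5
    return 8 * t**3 * n_t * n_r + 2 * p * per_run + p


def bits_per_transmission(n_t, t, p, m):
    """B = 2 * floor(log2(C(T*n_T, P))) + P * log2(M), with M a power of two."""
    return 2 * (comb(t * n_t, p).bit_length() - 1) + p * (m.bit_length() - 1)


params = dict(n_t=6, n_r=12, t=3, p=3, m=4)
B = bits_per_transmission(6, 3, 3, 4)
print("ML flops per bit:     ", ml_flops(**params) / B)
print("GB-ISTA flops per bit:", gb_ista_flops(**params, eta_star=10) / B)
```

Even for this small configuration the two estimates differ by several orders of magnitude, and the gap widens as nT, T and P grow because only the ML expression contains the combinatorial factor.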



FIG. 9 shows the Effect of scalable parameters on the complexity of QSM receivers.



FIG. 9(a) shows the Fixed P, various T, as a function of nT



FIG. 9(b) shows the Fixed T, various P, as a function of nT



FIG. 9(c) shows the Fixed nT, various T, as a function of P.


B. BER Performance of the Proposed OS-QSM Scheme

Empowered by the significant reduction in complexity obtained by GB-ISTA over ML detection, as shown above, we proceed to assess the BER performance of the proposed OS-QSM scheme, decoded via the GB-ISTA. In general, our simulated experiments aim to further demonstrate that the proposed OS-QSM schemes are feasible with relatively large numbers of transmit antennas, and can achieve very low BER at very low Eb/N0 with rather high spectral efficiencies, whilst using relatively few spatial-temporal resources per transmission.



FIG. 10 shows the effect of scalability on the BER performance of GB-ISTA-detected OS-QSM schemes with fixed SE.


To that end, our first set of results, shown in FIG. 10, compares the BER performances of the proposed method for various values of nT, T and P, with the ratio P/T kept constant and M adjusted such that all curves correspond to systems with the same spectral efficiency. We remark that the curves for T=2 can be considered as a reference corresponding to the SotA EDA-QSM scheme of [22], [23], although the results shown actually incorporate improvements due to the enhancements described in Subsection III-B, namely, the utilization of: a) optimal Golden codes [28], as opposed to the block-by-block sphere-decodable (FDFR) STBC of [24]; and b) the optimal construction of the index vector set $\mathcal{K}$, given in method 1.


Two important facts can be learned from the results of FIG. 10. The first is that significant improvement in BER achieved by scaling occurs in spite of the fact that the number of receive antennas is rather large (nR=12). This indicates that the gains are not only due to an increase in diversity (since receive diversity is already large), but also due to the coding gain reaped from the utilization of optimal FDFR-STBCs employed in the OS-QSM design. And the second is that results shown are actually simulated, down to rather low BERs, using a usual computer (i.e. no particularly powerful machine was required), and for settings which are virtually impossible to simulate with ML- or SD-based receivers. The latter point is strengthened by the results of the right side of FIG. 10, which includes a curve for a system with nT=12, which serves to further highlight the true feasibility of the proposed GB-ISTA receiver.


One criticism that could be raised about the results of FIG. 10 is, however, that values of P adopted thereby are not the optimal ones for the corresponding T and nT, as per equation (14). We once again clarify that the parameterization used in FIG. 10 is such that all systems have the same SE, so as to allow their direct comparison under equivalent conditions, which is, incidentally also the reason why all curves are plotted against Eb/N0 as opposed to SNR.



FIG. 11 illustrates the Effect of scaling P on BER performance of GB-ISTA-detected OS-QSM schemes.


In any case, in order to dispel any doubts about the ability of the proposed OS-QSM design and of the corresponding GB-ISTA receiver to actually achieve feasible and optimized spectral efficiency combined with low BERs, we show in FIG. 11 additional results obtained by varying P up to the optimal value given by equation (14). We remark that, due to the floor operation in the expression of the achievable SE given by equation (8), values of P adjacent to that given by equation (14), i.e., P*, are also optimum, as they result in exactly the same SE. For instance, with nT=6, T=2 and M=4, as is the case of FIG. 10, equation (14) yields P*=8, but the values P={7,8,9} all result in ζ=16 in equation (8). Similarly, P={10,11,12,13} all yield the largest SE of ζ=17 with nT=6, T=3 and M=4, as is the case of FIG. 11.


With these remarks made, turning to the results obtained in FIG. 11, it can be seen that only a very mild degradation of BER is observed when up-scaling P, which is in fact smaller for larger T, as shown in FIG. 11, and which is a small and fair price to pay for almost doubling the spectral efficiency of the system. We clarify that the slight BER degradation observed when up-scaling the ratio P/T towards SE optimality is a consequence of the corresponding reduction of sparsity in the vectorized received signal, which tends to be less critical in systems with higher diversity and coding gains, as a result of up-scaling nT, T or both. That trend is in fact observable in FIG. 11, as the gap between BER curves narrows as T=2 is increased to T=3.



FIG. 12 is the proposed computer-implemented Optimal Index Vector Selection method for Spatial Modulation represented as a flowchart.


The method starts with the input of the number of symbols P, the number of symbol slots T and the number of transmit antennas nT. In a first step, an empty set of valid vectors is generated. In a second step, a random seed vector is added to the set. Afterwards, a routine is executed in order to find the least used indices in the set. In the next step, a vector with these indices is picked and added to the set. It is then checked whether the set has reached its maximum size. If this is not the case, the method returns to the step in which the least used indices in the set are found. If the set has reached its maximum size, the method comes to an end.
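The flow just described can be sketched in Python as follows. This is only one possible reading of the flowchart and not the method 1 of the application: the function name is hypothetical, the maximum set size is assumed to be the largest power of two not exceeding the number of P-element subsets of the Q=T·nT resources, and "picking a vector with the least used indices" is interpreted here as choosing the not-yet-selected candidate whose indices have the smallest total usage so far, with ties broken lexicographically. The exhaustive candidate scan is only practical for small Q.

```python
from itertools import combinations
from math import comb
import random


def select_index_vectors(p, t, n_t, seed=0):
    """Greedy construction of an index vector set with balanced index usage."""
    q = t * n_t                                         # spatial-temporal resources
    max_size = 1 << (comb(q, p).bit_length() - 1)       # assumed maximum set size
    rng = random.Random(seed)
    usage = [0] * (q + 1)                               # usage[i] for 1-based index i
    chosen = set()                                      # starts as an empty set

    def add(vector):
        chosen.add(vector)
        for index in vector:
            usage[index] += 1

    # Random seed vector
    add(tuple(sorted(rng.sample(range(1, q + 1), p))))

    while len(chosen) < max_size:
        # Candidate built from the least used indices that is not yet in the set
        best = min(
            (c for c in combinations(range(1, q + 1), p) if c not in chosen),
            key=lambda c: (sum(usage[i] for i in c), c),
        )
        add(best)
    return sorted(chosen), usage[1:]


vectors, usage = select_index_vectors(p=2, t=2, n_t=2)
print(vectors)
print(usage)
```

The second returned list counts how often each resource index appears across the selected vectors, which is the multiplicity that the method aims to equalize.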


Furthermore, the proposed decoder also does not require design restrictions such as block-diagonality or orthogonality in the transmission scheme, only the sparsity which is inherent with spatial modulation schemes, therefore providing larger freedom in transmitter development as well. In other words, the proposed decoder is capable of coping with many different encoding structures (flexibility).


The decoder can be used in any MIMO system where the number of antennas is expected to be large, such that ML approaches are infeasible. The scheme can be used to increase efficiency in cellular networks and V2X communications (eMBB).


This application proposes new transmitter and receiver designs for QSM schemes, focusing on their scalability in terms of the number of transmit antennas nT, number of transmit instances T and number of encoded M-ary symbols P, as well as on their performance optimization in terms of SE, diversity and coding gains. The contributions are motivated by the demonstrated fact that, in order for SE optimality to be achieved, QSM schemes must scale nT, T and P, which is not possible with SotA methods. At the transmitter, the newly proposed OS-QSM scheme differs from SotA alternatives in that its dispersion matrices are designed based on FDFR STBCs, and in that dispersion matrix index selection is performed via a new greedy algorithm, which ensures that all spatial-temporal resources of the transmitter are utilized evenly over multiple transmissions. In turn, at the receiver, the proposed art contributes a new ISTA-based receiver which, thanks to its reliance on the sparse structure of QSM signaling, eliminates the combinatorial nature of existing ML- or SD-based approaches, further enabling the scaling of the system from a feasibility perspective. In fact, a complexity analysis is offered, which shows that the proposed GB-ISTA receiver enjoys a complexity order that is cubic on T, quadratic on P, and only linear on nT, in contrast to the ML and SD detectors, which have geometric complexities on T and nT, with P as exponent, rendering them unfeasible in the scaled scenario. Simulation results for set-ups of scales never before shown in related literature corroborate both the high performance and feasibility of the proposed OS-QSM scheme and GB-ISTA receiver.

Claims
  • 1. A computer-implemented optimal index vector selection method for configuring a plurality of transmit antennas, the method comprising: configuring the plurality of transmit antennas to each represent an in-phase spatial constellation symbol within an in-phase spatial constellation, and a quadrature spatial constellation symbol within a quadrature spatial constellation, mapping source data to the in-phase spatial constellation symbols and the quadrature spatial constellation symbols represented by the plurality of transmit antennas, wherein the method constructs the set which has equal multiplicities of the transmit antenna activation, which ensures maximum possible transmit diversity.
  • 2. The method of claim 1, further comprising a modification to the iterative shrinkage-thresholding algorithm (ISTA) via boxing, range limiting, and hard-thresholding.
  • 3. The method of claim 1, further comprising: proceeding the iterative shrinkage-thresholding algorithm via boxing-hard (ISTA), a greedy selection of the positions of the antennas index and the symbol estimates, and their independent decoding of the corresponding antenna modulated and symbol modulated bits.
  • 4. The method of claim 3, wherein process working parallel to the greedy detections, to ensure valid estimates of the index vectors from the given finite set of index vectors are produced as an output and to apply interference cancellation with the confirmed values, while keeping track of which indices have been retrieved from the greedy selections, before every iteration check whether from the currently decoded indices, a final confirmation can be calculated; if it cannot be made, remove the interference by the previous greedy selection and make the next iteration.
  • 5. The method of claim 1, with the input of the number of symbols P, number of symbols slots T, and the number of transmit antennas nT, in a first step, an empty valid vectors set is generated, in a second step, a random seed vector is added to the set, afterwards, a routine, in order to find least used indices in the set, is proceeded, in the next step, a vector with indices is picked and added to the set, wherein it is checked, if the set has reached a maximum size, and, if this is not the case, the method proceeds to the step in which the least used indices in set has to be found, and, if the set has reached a maximum size, the method comes to an end.
  • 6. A receiver of a communication system having a processor, volatile and/or non-volatile memory, at least one interface adapted to receive a signal in a communication channel, wherein the non-volatile memory stores computer program instructions which, when executed by the microprocessor, configure the receiver to perform operations comprising: configuring the plurality of transmit antennas to each represent an in-phase spatial constellation symbol within an in-phase spatial constellation, and a quadrature spatial constellation symbol within a quadrature spatial constellation, mapping source data to the in-phase spatial constellation symbols and the quadrature spatial constellation symbols represented by the plurality of transmit antennas, wherein the method constructs the set which has equal multiplicities of the transmit antenna activation, which ensures maximum possible transmit diversity.
  • 7. (canceled)
  • 8. (canceled)
  • 9. (canceled)
  • 10. (canceled)
  • 11. The receiver of claim 6, further comprising a modification to the iterative shrinkage-thresholding algorithm (ISTA) via boxing, range limiting, and hard-thresholding.
  • 12. The receiver of claim 6, further comprising: proceeding the iterative shrinkage-thresholding algorithm via boxing-hard (ISTA), a greedy selection of the positions of the antennas index and the symbol estimates, and their independent decoding of the corresponding antenna modulated and symbol modulated bits.
  • 13. The receiver of claim 12, wherein process working parallel to the greedy detections, to ensure valid estimates of the index vectors from the given finite set of index vectors are produced as an output and to apply interference cancellation with the confirmed values, while keeping track of which indices have been retrieved from the greedy selections, before every iteration check whether from the currently decoded indices, a final confirmation can be calculated; if it cannot be made, remove the interference by the previous greedy selection and make the next iteration.
  • 14. The receiver of claim 6, with the input of the number of symbols P, number of symbols slots T, and the number of transmit antennas nT, in a first step, an empty valid vectors set is generated, in a second step, a random seed vector is added to the set, afterwards, a routine, in order to find least used indices in the set, is proceeded, in the next step, a vector with indices is picked and added to the set, wherein it is checked, if the set has reached a maximum size, and, if this is not the case, the method proceeds to the step in which the least used indices in set has to be found, and, if the set has reached a maximum size, the method comes to an end.
Priority Claims (1)
Number Date Country Kind
10 2021 207 918.0 Jul 2021 DE national
PCT Information
Filing Document Filing Date Country Kind
PCT/EP2022/070716 7/22/2022 WO