Today, Wi-Fi networks connect hundreds of millions of people worldwide. Wi-Fi is so ubiquitous that cellular operators are expected to offload 63% of their traffic to Wi-Fi by 2022. Attesting to the need for higher data rates, the IEEE is currently standardizing 802.11be (Wi-Fi 7), which will support throughput of up to 46 Gbps through wider signal bandwidths and the use of multi-user multiple-input and multiple-output (MU-MIMO) techniques. MU-MIMO will also become fundamental to decongesting the increasingly saturated unlicensed spectrum bands through spatial reuse. To correctly beamform transmissions, MU-MIMO requires access points (APs) to periodically collect channel state information (CSI) from each connected station. According to the IEEE 802.11 standard, the beamforming feedback (BF) is constructed by (i) measuring the CSI through pilot signals and (ii) computing the BF through singular value decomposition (SVD). The BF is then decomposed into Givens rotation (GR) angles that produce the beamforming matrix (BM).
Example embodiments include a wireless communications network. A transmitter may be configured to generate beamformed signals and omnidirectional signals. A receiver may be communicatively coupled to the transmitter via a wireless channel and may be configured to 1) generate channel state information (CSI) based on a received signal from the transmitter; 2) generate, via a first neural network (NN), a compressed representation of beamforming feedback as a function of the CSI; and 3) transmit the compressed representation to the transmitter via the wireless channel. The transmitter may then determine, via a second NN, a beamforming matrix as a function of the compressed representation, and generate a subsequent beamformed signal toward the receiver as a function of the beamforming matrix.
The first and second NNs may each operate a distinct subset of a common NN model, the common NN model being configured to output the beamforming matrix in response to an input comprising the CSI. The distinct subsets of the common NN model may include 1) a first subset configured to output the compressed representation in response to an input comprising the CSI, and 2) a second subset configured to output the beamforming matrix in response to an input comprising the compressed representation. The common NN model may include an intermediate layer between the distinct subsets, the intermediate layer outputting the compressed representation in response to an input comprising the CSI.
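As an illustrative sketch of this partitioning—assuming a PyTorch-style framework and hypothetical layer and bottleneck sizes not specified by the embodiments—a common NN model may be split into a head subset, run at the receiver and ending at the bottleneck layer, and a tail subset, run at the transmitter:

```python
import torch
import torch.nn as nn

# Hypothetical dimensions: flattened CSI input, beamforming-matrix output,
# and a bottleneck far smaller than either (here 256, i.e., 12.5%).
CSI_DIM, BM_DIM, BOTTLENECK_DIM = 2048, 2048, 256

# The common NN model: CSI in, beamforming matrix out. The intermediate
# bottleneck layer outputs the compressed representation.
common_model = nn.Sequential(
    nn.Linear(CSI_DIM, 1024), nn.ReLU(),
    nn.Linear(1024, BOTTLENECK_DIM),       # bottleneck (compressed repr.)
    nn.Linear(BOTTLENECK_DIM, 1024), nn.ReLU(),
    nn.Linear(1024, BM_DIM),
)

# Distinct subsets of the common model: the receiver runs the head
# (the first NN); the transmitter runs the tail (the second NN).
head = common_model[:3]    # up to and including the bottleneck layer
tail = common_model[3:]    # from the bottleneck output to the BM

csi = torch.randn(1, CSI_DIM)
compressed = head(csi)                  # fed back over the wireless channel
beamforming_matrix = tail(compressed)   # reconstructed at the transmitter
print(compressed.shape, beamforming_matrix.shape)
```

Because both subsets come from one common model trained end-to-end, the compressed representation the receiver transmits is exactly the intermediate-layer activation the tail expects as input.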
The transmitter may be further configured to update a beamforming configuration based on the beamforming matrix, and may generate the subsequent beamformed signal via an antenna array. The transmitter may generate the subsequent beamformed signal via a multiple-input and multiple-output (MIMO) process. The compressed representation may be less than 50% of the data size of the beamforming feedback. The received signal may be an omnidirectional signal or a beamformed signal.
Further embodiments include a method of wireless communication. At a receiver communicatively coupled to a transmitter via a wireless channel, channel state information (CSI) may be generated based on a received signal from the transmitter. Via a first neural network (NN), a compressed representation of beamforming feedback may be generated as a function of the CSI. The compressed representation may then be transmitted to the transmitter via the wireless channel. At the transmitter, a beamforming matrix may be determined, via a second NN, as a function of the compressed representation. A subsequent beamformed signal may then be generated toward the receiver as a function of the beamforming matrix.
Further embodiments include a network transmitter. A transceiver may be configured to generate a beamformed signal toward a receiver, and receive a compressed representation of beamforming feedback from the receiver, the compressed representation being a function of channel state information (CSI) determined by the receiver. A neural network (NN) may be configured to generate the beamforming matrix as a function of the compressed representation. The transceiver may be further configured to generate a subsequent beamformed signal toward the receiver as a function of the beamforming matrix.
The foregoing will be apparent from the following more particular description of example embodiments, as illustrated in the accompanying drawings in which like reference characters refer to the same parts throughout the different views. The drawings are not necessarily to scale, emphasis instead being placed upon illustrating embodiments.
A description of example embodiments follows. The teachings of all patents, published applications and references cited herein are incorporated by reference in their entirety.
A key challenge in MIMO systems is that the size of the beamforming feedback (BF) grows with the number of subcarriers and of transmitting and receiving antennas. For example, in an 8×8 network at 160 MHz of bandwidth, the BF in 802.11 will be of size (486 subcarriers × 56 angles/subcarrier × 16 bits/angle =) 435,456 bits ≃ 54.43 kB, if the maximum angle resolution is used. If BFs are sent back every 10 ms, the airtime overhead is 435,456/0.01 ≃ 43.55 Mbit/s. Moreover, the BF computation imposes a significant burden on the stations, which may become intolerable for low-power devices. Specifically, the complexities of SVD and GR are

O_SVD = O((4·N_t·N_r^2 + 22·N_t^3)·S) and O_GR = O(N_t^3·N_r^3·S),
wherein N_t, N_r, and S denote the numbers of transmitting antennas, receiving antennas, and subcarriers, respectively. Because Wi-Fi 7 will support more spatial streams (up to 16) and more bandwidth (up to 320 MHz), a thorough revision of how MIMO is performed in Wi-Fi is essential to keep this complexity under control.
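The feedback-size and airtime figures above can be reproduced with a short calculation; this sketch simply reuses the example's own parameters:

```python
# BF size and airtime overhead for the 8x8, 160 MHz example above.
subcarriers = 486
angles_per_subcarrier = 56
bits_per_angle = 16                    # maximum angle resolution

bf_bits = subcarriers * angles_per_subcarrier * bits_per_angle
print(f"BF size: {bf_bits} bits = {bf_bits / 8 / 1000:.2f} kB")   # ~54.43 kB

feedback_period_s = 0.010              # one BF every 10 ms
print(f"Airtime overhead: {bf_bits / feedback_period_s / 1e6:.2f} Mbit/s")  # ~43.55
```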
Existing approaches to reducing MIMO complexity come with excessive computation overhead and/or performance loss, and most are not compliant with the IEEE 802.11 standard. Example embodiments, described below, take a different approach, providing an IEEE 802.11 standard-compliant framework that leverages split computing to drastically decrease both the computational load and the BF size while maintaining adequate beamforming accuracy.
An advantage of example embodiments is that the complexity of the head model and the size of the BF representation can be adjusted by modifying the bottleneck placement and size. Indeed, the bottleneck can trade off computational load, feedback size, and beamforming accuracy, a trade-off not available in previous approaches. This is crucial for constrained Wi-Fi devices and for systems that will cater to heterogeneous devices with different processing capacities.
Example embodiments provide a novel framework for BF compression and station computation reduction in MU-MIMO Wi-Fi networks. A complexity analysis, described below, shows that example embodiments reduce the station computational load and the BF size by 92% and 91%, respectively, when compared to the standardized 802.11 algorithm.
A bottleneck optimization problem (BOP) is formulated below to determine the bottleneck placement and size with the goal of minimizing airtime and computation overhead, while ensuring that the bit error rate (BER) and end-to-end delay remain below the application's desired levels. Given the complexity of the BOP, a heuristic algorithm is introduced, and a customized training procedure for the resulting deep neural network (DNN) is described herein.
Example embodiments may leverage off-the-shelf Wi-Fi equipment to collect CSI data in two different propagation environments, and the performance of example embodiments compares favorably with the IEEE 802.11ac/ax CSI feedback algorithm (henceforth called 802.11) and the state-of-the-art DNN-based compression technique, LB-SciFi. Example embodiments have been demonstrated to reduce computational load and feedback size by up to 84% and 81%, respectively, with respect to 802.11. Also, at the same compression rate, the computational load is reduced by up to 89% compared to LB-SciFi.
Example embodiments may be synthesized in field-programmable gate array (FPGA) hardware by using a customized library to show the feasibility of such embodiments in real-world Wi-Fi systems. Our experimental results show that the maximum end-to-end latency incurred by example embodiments is less than 7 milliseconds (ms) in the case of 4×4 MIMO operating at 160 MHz and the lowest compression rate, which is well below the suggested threshold of 10 ms in MU-MIMO Wi-Fi systems.
Example embodiments are described below with the following notation for mathematical expressions. Boldface uppercase letters denote matrices. The superscripts T and † denote the transpose and the complex conjugate transpose (i.e., the Hermitian). The symbol ∠C denotes the matrix containing the phases of the complex-valued matrix C. Moreover, diag(c_1, . . . , c_j) indicates the diagonal matrix with elements (c_1, . . . , c_j) on the main diagonal. The (c_1, c_2) entry of matrix C is denoted by [C]_{c_1,c_2}, while I_c refers to an identity matrix of size c×c and I_{c×d} is a c×d generalized identity matrix. The notations R and C indicate the sets of real and complex numbers, respectively.
The received signal at station i may be written as

y_i = √ρ · H_i W_i x_i + √ρ · Σ_{j≠i} H_i W_j x_j + N_i,   (1)

where H_i is the channel matrix of station i, W_i and x_i are the beamforming matrix and transmitted symbol vector of station i, ρ denotes the signal-to-noise ratio (SNR) and is assumed equal for all users, and N_i is the complex additive white Gaussian noise (AWGN) for station i, distributed as CN(0, 1). To simplify notation, equation (1) is given in the frequency domain for a single subcarrier, and the subcarrier index s is omitted. We assume the number of transmit antennas equals the sum of all the used spatial streams:

N_t = Σ_i N_ss,i.
The first term in (1) is the desired signal and the second term is the inter-user interference, which can be eliminated through beamforming: ideally, H_i W_j = 0 when i ≠ j. The received signal then reduces to

y_i = √ρ · H_i W_i x_i + N_i.   (2)

The beamforming matrices W_i can be calculated using a multi-user channel sounding mechanism, as shown in the accompanying drawings.
First, the access point may begin the process by transmitting a null data packet (NDP) announcement frame, used to gain control of the channel and identify the stations. The access point follows the NDP announcement frame with an NDP for each spatial stream. Second, upon reception of the NDP, each station i analyzes the NDP training fields—for example, the VHT-LTF (Very High Throughput Long Training Field) in 802.11ac—and estimates the channel matrix H_i(s) for all subcarriers s, which is then decomposed by using SVD:

H_i(s) = U_i(s) S_i(s) Z_i(s)^†,
where U_i(s) ∈ C^{N_r,i × N_r,i} and Z_i(s) ∈ C^{N_t × N_t} are unitary matrices, while the singular values are collected in the N_r,i × N_t diagonal matrix S_i(s). With this notation, the complex-valued BM V_i(s) is defined by collecting the first N_ss,i columns of Z_i(s). To simplify the notation, the i subscript can be dropped in favor of a generic receiver. To reduce the channel overhead, V(s) is converted into polar coordinates as detailed in Algorithm 1, provided below. The output is the matrices D_{s,t} and G_{s,l,t}, defined as:

D_{s,t} = diag(I_{t−1}, e^{jφ_{t,t}(s)}, . . . , e^{jφ_{N_t−1,t}(s)}, 1),   (3)

G_{s,l,t} =
\begin{bmatrix}
I_{t−1} & 0 & 0 & 0 & 0 \\
0 & \cos ψ_{l,t}(s) & 0 & \sin ψ_{l,t}(s) & 0 \\
0 & 0 & I_{l−t−1} & 0 & 0 \\
0 & −\sin ψ_{l,t}(s) & 0 & \cos ψ_{l,t}(s) & 0 \\
0 & 0 & 0 & 0 & I_{N_t−l}
\end{bmatrix}.   (4)
The above equations (3) and (4) allow rewriting V(s) as:

V(s) = Ṽ(s) D̃(s), with Ṽ(s) = [ ∏_{t=1}^{min(N_ss, N_t−1)} ( D_{s,t} ∏_{l=t+1}^{N_t} G_{s,l,t}^T ) ] · I_{N_t × N_ss},   (5)

where D̃(s) is the diagonal matrix collecting the phases of the last row of V(s).
In the Ṽ(s) matrix, the last row (i.e., the feedback for the N_t-th transmitting antenna) consists of non-negative real numbers by construction. With this transformation, the station is only required to transmit the φ and ψ angles to the access point. Moreover, the beamforming performance is equivalent whether V(s) or Ṽ(s) is used; thus, D̃(s) is not fed back to the access point.
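For illustration, the following numpy sketch performs the access-point-side reconstruction of Ṽ(s) from fed-back φ and ψ angles according to equation (5). The angle ordering is one plausible convention and is not necessarily that of Algorithm 1:

```python
import numpy as np

def reconstruct_bm(phi, psi, Nt, Nss):
    """Rebuild V-tilde (Nt x Nss) from the fed-back phi/psi angle lists
    by accumulating the matrix product in equation (5)."""
    phi, psi = list(phi), list(psi)
    P = np.eye(Nt, dtype=complex)
    for t in range(min(Nss, Nt - 1)):
        # D_{s,t}: phases on rows t..Nt-2; the last diagonal entry stays 1.
        D = np.eye(Nt, dtype=complex)
        for r in range(t, Nt - 1):
            D[r, r] = np.exp(1j * phi.pop(0))
        P = P @ D
        # Givens rotations G_{s,l,t}^T for l = t+1, ..., Nt (0-indexed below).
        for l in range(t + 1, Nt):
            p = psi.pop(0)
            G = np.eye(Nt)
            G[t, t] = G[l, l] = np.cos(p)
            G[t, l], G[l, t] = np.sin(p), -np.sin(p)
            P = P @ G.T
    return P[:, :Nss]    # right-multiplication by I_{Nt x Nss}

# Example: a 4x2 BM is described by 5 phi and 5 psi angles.
Nt, Nss = 4, 2
n_angles = sum(Nt - 1 - t for t in range(min(Nss, Nt - 1)))
V_tilde = reconstruct_bm(np.random.uniform(0, 2 * np.pi, n_angles),
                         np.random.uniform(0, np.pi / 2, n_angles), Nt, Nss)
print(np.allclose(V_tilde.conj().T @ V_tilde, np.eye(Nss)))   # True
```

Because the rebuilt matrix is a product of unitary factors, its columns remain orthonormal, as the final check confirms.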
Third, the access point transmits a beamforming report poll (BRP) frame to retrieve the angles from each station. To further reduce the channel occupancy, the angles are quantized using b_φ ∈ {7, 9} bits for φ and b_ψ = b_φ − 2 bits for ψ. The quantized values—q_φ ∈ {0, . . . , 2^{b_φ} − 1} and q_ψ ∈ {0, . . . , 2^{b_ψ} − 1}—are packed into a compressed beamforming frame (CBF). Each CBF contains A angles for each of the S OFDM subcarriers, for a total of S·A angles. For comparison, without compression, a 16×16 system with 320 MHz channels requires 256 complex channel elements for each of the 996 subcarriers; at 8 bits for each real and imaginary component, as in the 802.11 standard, this results in roughly 510 kB.
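The quantization step and the size arithmetic may be sketched as follows. The uniform quantizer is illustrative only and may not reproduce the exact 802.11 codepoint mapping:

```python
import numpy as np

b_phi = 7                      # bits for phi, with b_phi in {7, 9}
b_psi = b_phi - 2              # bits for psi

def quantize(angle, bits, max_angle):
    """Map an angle in [0, max_angle) to q in {0, ..., 2**bits - 1}."""
    step = max_angle / (1 << bits)
    return int(np.clip(angle // step, 0, (1 << bits) - 1))

q_phi = quantize(3.9, b_phi, 2 * np.pi)   # phi lies in [0, 2*pi)
q_psi = quantize(0.7, b_psi, np.pi / 2)   # psi lies in [0, pi/2)
print(q_phi, q_psi)            # values packed into the CBF

# Size of the uncompressed 16x16, 320 MHz example: 256 complex elements
# per subcarrier, 996 subcarriers, 8 bits per real and imaginary part.
raw_bits = 996 * 256 * 2 * 8
print(f"{raw_bits / 8 / 1000:.0f} kB")    # ~510 kB
```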
The size of the beamforming feedback (BF) grows with the number of transmitting antennas, receiving antennas, and subcarriers, i.e., as O(N_t · N_r · S). This growth implies the following challenges: (i) an increasing airtime overhead for feeding the BF back to the access point, and (ii) an increasing computational load at the stations for computing the BF.
The placement and size of the bottleneck ultimately determine the head network architecture, and thus (i) the station computational load, (ii) the BF size, and (iii) the beamforming accuracy. Indeed, there is a trade-off between the complexity of the head model, the BF compression rate, and the accuracy of inference. While placing the bottleneck early, with a low number of nodes, reduces the station computational load and airtime overhead, it decreases beamforming accuracy, which ultimately increases the BER. Therefore, the bottleneck placement and size must be adjusted according to the application-specific requirements.
The original DNN can be modeled as a function M that maps the channel matrix H_i ∈ C^{N_r × N_t × S} to the BF V_i ∈ C^{N_r × N_t × S} as M(H; θ): C^H → C^V, through L layer transformations:

M(H; θ) = (f_θ^{(L)} ∘ f_θ^{(L−1)} ∘ · · · ∘ f_θ^{(1)})(H).
Let L_i^H(e, N) be the overhead of station i, which consists of three components: (i) the computational cost (i.e., the power consumption and memory required for executing the model), denoted by L_i^c(e, N); (ii) the execution time for BF compression through the head model, denoted by T_i^H(e, N); and (iii) the power consumption of transmitting the compressed BF to the access point, denoted by L_i^tx(e, N). Also, T_i^A(e, N) represents the compressed BF feedback airtime. Finally, T^T(e, N) denotes the time required for reconstructing the BF at the access point. Notice that the compression, decompression, and airtime overheads depend on the placement e and size N of the bottleneck. The BOP may be defined such that it minimizes the station computation overhead and feedback airtime as

minimize_{e, N}  Σ_i [ μ_i^H · L_i^H(e, N) + μ_i^A · T_i^A(e, N) ]   (7a)
subject to  e ∈ {1, . . . , L},  N ≥ 1,   (7b)
            BER_i ≤ γ  for each station i,   (7c)
            T_i^H(e, N) + T_i^A(e, N) + T^T(e, N) ≤ t  for each station i,   (7d)
where μ_i^H and μ_i^A parameterize the relative importance of reducing the station overhead versus the feedback airtime. In applications where stations are resource-constrained, it is crucial to reduce the stations' load, i.e., μ_i^H > μ_i^A. On the other hand, in dynamic propagation environments such as crowded rooms, where the channel coherence time is short, high feedback airtime cannot be tolerated; thus, reducing the feedback airtime must be prioritized, i.e., μ_i^H < μ_i^A. BER_i represents the bit error rate (BER) of station i. As described herein, the accuracy of the generated BF is measured at the access point in terms of the BER achievable by the stations; the BER is the number of erroneous bits divided by the total number of transferred bits. Condition (7c) guarantees that the BER experienced by each station does not exceed the maximum BER threshold γ. Condition (7d) indicates that the maximum end-to-end delay of the BF cannot exceed the maximum tolerable delay, denoted by t. In practice, these two conditions ensure that the bottleneck placement does not significantly impact the inference accuracy and latency. The maximum tolerable BER and delay can be specified according to the application requirements.
The BOP is a particular instance of the extremely complex neural architecture search (NAS) problem. Example embodiments implement a heuristic algorithm, specific to this context, to search for proper hyperparameters. Specifically, to limit the search space, candidate bottleneck placements and sizes may be enumerated and checked against the BER and delay constraints, as in the sketch following this paragraph.
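A minimal sketch of such a heuristic follows. The evaluator functions cost_head, airtime, ber, and delay are hypothetical placeholders—for example, profiling the head model and validating BER on held-out data—and are not defined by the embodiments:

```python
def solve_bop(placements, sizes, mu_h, mu_a, gamma, t_max,
              cost_head, airtime, ber, delay):
    """Grid search over bottleneck placement e and size N, minimizing
    the weighted objective (7a) subject to conditions (7c) and (7d)."""
    best, best_obj = None, float("inf")
    for e in placements:                  # candidate bottleneck layers
        for n in sizes:                   # candidate bottleneck widths
            if ber(e, n) > gamma or delay(e, n) > t_max:
                continue                  # violates (7c) or (7d)
            obj = mu_h * cost_head(e, n) + mu_a * airtime(e, n)
            if obj < best_obj:
                best, best_obj = (e, n), obj
    return best
```

Restricting the search to a small grid of placements and widths keeps the number of candidate architectures, and hence the training cost, manageable.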
Because H and V are complex matrices, their real and imaginary components can be decoupled so that the matrices are treated as double-sized real matrices. Each dataset can be split into training, validation, and test splits with an 8:1:1 ratio. Example embodiments may be trained offline for various network configurations and do not require retraining. The stations select the proper trained DNN according to the network configuration information acquired from the NDP preamble.
Loss Function: One goal is to deploy exactly the same model for each station without fine-tuning its parameters to each station's environment. The training process may be done offline (i.e., on a single computer). Given a channel matrix H_i, the DNN model M estimates the corresponding BF V_i, i.e., V_i = M(H_i, θ). We formulate the loss function L as follows:

L(θ) = (1/b) Σ_{j=1}^{b} || V_i^j − M(H_i^j; θ) ||_1,   (8)
where b indicates the training batch size and ||·||_1 represents the L1-norm. H_i^j and V_i^j indicate the j-th channel matrix and BF for station i, respectively. By minimizing the loss in equation (8), the parameters θ of our DNN model M can be optimized. Stochastic gradient descent (SGD) and Adam can be used to train on the synthetic and experimental datasets, respectively. Unless specified, the models in this example are trained for 40 epochs, using the training split of the dataset with a batch size of 16 and an initial learning rate of 10^−3. The learning rate is decreased by a factor of 10 after the 20th and 30th epochs. Using the validation split, the model can be assessed in terms of achieved BER at the end of every epoch, and the best parameters θ*—those achieving the lowest BER on the validation split—can be saved. The trained model is then assessed with the best parameters on the held-out test split, and the test BER is reported.
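The recipe above may be sketched in PyTorch as follows. Here evaluate_ber is a hypothetical helper that measures validation BER, the loaders are assumed to yield (channel matrix, ground-truth BF) pairs, and Adam is shown, per the experimental-dataset setting:

```python
import torch
import torch.nn as nn

def train(model, train_loader, val_loader, epochs=40):
    criterion = nn.L1Loss()    # mean L1 error, proportional to equation (8)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    # Decay the learning rate by 10x after the 20th and 30th epochs.
    scheduler = torch.optim.lr_scheduler.MultiStepLR(
        optimizer, milestones=[20, 30], gamma=0.1)
    best_ber, best_state = float("inf"), None
    for _ in range(epochs):
        model.train()
        for H, V in train_loader:          # batch size 16 assumed in loader
            optimizer.zero_grad()
            loss = criterion(model(H), V)
            loss.backward()
            optimizer.step()
        scheduler.step()
        ber = evaluate_ber(model, val_loader)   # hypothetical BER evaluator
        if ber < best_ber:                 # keep the best parameters theta*
            best_ber, best_state = ber, model.state_dict()
    return best_state
```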
Difference with Autoencoders: Although an autoencoder (AE) is similar in terms of model architecture, its training objective is different. AEs are trained to reconstruct their input in an unsupervised manner (e.g., to estimate Ṽ_i given V_i). Conversely, a task-specific model may be trained in a supervised fashion to estimate the BF V_i given a channel matrix H_i.
Computational Overhead: The complexity of the SVD operation for decomposing the BF V in 802.11 is O((4·N_t·N_r^2 + 22·N_t^3)·S). The BF is further transformed into a set of angles using the Givens rotation (GR) matrix multiplications, which have a complexity of O(N_t^3 · N_r^3 · S). Conversely, the complexity of an example embodiment is O(K · N_t^2 · N_r^2 · S^2), where K < 1 denotes the head model's compression level.
Airtime Overhead: In 802.11, the size of the compressed BF report is

B_MR = 8 × N_t + N_a × S × (b_φ + b_ψ)/2,

where N_a denotes the number of Givens angles, and b_φ and b_ψ are the numbers of bits required for the angle quantization. Therefore, the 802.11 compression ratio can be written as

η_802.11 = B_MR / (b · N_t · N_r · S),

where b = 16 is the number of bits required for transmitting each complex channel element over each subcarrier. Conversely, the compression rate of example embodiments is K, which is constant and does not grow with the size of the channel matrix.
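For concreteness, the ratio can be evaluated for the earlier 8×8, 160 MHz example using the expressions above; b_φ = 9 is one of the two allowed choices:

```python
# 802.11 compression ratio for the 8x8, 160 MHz example.
Nt, Nr, S = 8, 8, 486
Na = 56                        # Givens angles per subcarrier
b_phi, b_psi, b = 9, 7, 16     # angle quantization bits; b bits per element

bmr_bits = 8 * Nt + Na * S * (b_phi + b_psi) / 2
raw_bits = b * Nt * Nr * S
print(f"802.11 compression ratio: {bmr_bits / raw_bits:.2%}")   # ~44%

# The embodiments' compression rate K, by contrast, stays constant as
# Nt, Nr, and S grow.
```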
While example embodiments have been particularly shown and described, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the scope of the embodiments encompassed by the appended claims.
This application claims the benefit of U.S. Provisional Application No. 63/362,524, filed on Apr. 5, 2022. The entire teachings of the above application are incorporated herein by reference.
This invention was made with government support under Grant Numbers 2120447 and 2134567 awarded by the National Science Foundation. The government has certain rights in the invention.
| Filing Document | Filing Date | Country | Kind |
|---|---|---|---|
| PCT/US2023/065300 | 4/4/2023 | WO | |

| Number | Date | Country |
|---|---|---|
| 63362524 | Apr 2022 | US |