One or more aspects of embodiments according to the present disclosure relate to wireless communications, and more particularly to a learning-based system and method for interference whitening.
In a wireless communication system (e.g., a 5G mobile communications system), interference may be an important source of errors in transmitted data. Interference whitening may be used as part of a process for mitigating the effects of interference. An interference whitener may whiten the spatial-domain interference and noise by pre-multiplying the received signal vector and channel matrix with the inverse of the Cholesky factor of a covariance matrix, which may be selected from among a plurality of candidate covariance matrices.
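For example, letting R denote the selected covariance matrix, with Cholesky factorization R = L L^H, the whitening operation may take the form

    ỹ = L^{−1} y,   H̃ = L^{−1} H,

where y is the received signal vector and H is the channel matrix, so that the interference-plus-noise component of the whitened signal has approximately identity covariance.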
The covariance matrix corresponding to the lowest error rate may depend on the signal-to-interference ratio (SIR), the distribution of interference in the bandwidth part (BWP), the signal-to-noise ratio (SNR), and other factors. In practice, however, the SIR and the distribution of interference in the BWP may be unknown at the receiver.
Thus, there is a need for an improved system and method for interference whitening.
According to an embodiment of the present disclosure, there is provided a method, including: receiving a signal; extracting a first set of features from the signal; making a first selection, by a first neural network, based on the first set of features; and selecting a first covariance matrix, from a plurality of covariance matrices, based on the first selection.
In some embodiments, the making of the first selection by the first neural network includes making the first selection based on a plurality of initial covariance estimates, each corresponding to a respective resource block (RB) of a contiguous set of resource blocks.
In some embodiments, the contiguous set of resource blocks includes all of the resource blocks in a bandwidth part.
In some embodiments, the method further includes: extracting a second set of features from the signal; and making a second selection, by a second neural network, based on the second set of features, wherein the first set of features corresponds to a first resource block, and the second set of features corresponds to a second resource block.
In some embodiments: the first selection is an indication of estimated signal to interference ratio in the first resource block; the first selection corresponds to a signal to interference ratio less than a first threshold; and the selecting of the first covariance matrix includes selecting a covariance matrix based on a first initial covariance estimate, the first initial covariance estimate corresponding to the first resource block.
In some embodiments: the first selection is an indication of estimated signal to interference ratio in the first resource block; the first selection corresponds to a signal to interference ratio greater than a first threshold; the second selection is an indication of estimated signal to interference ratio in the second resource block; the second selection corresponds to a signal to interference ratio greater than the first threshold; the selecting of the first covariance matrix includes selecting a covariance matrix based on a first initial covariance estimate and on a second initial covariance estimate; the first initial covariance estimate corresponds to the first resource block; and the second initial covariance estimate corresponds to the second resource block.
In some embodiments, the method further includes calculating a first initial covariance estimate, wherein a first feature of the first set of features is based on the first initial covariance estimate.
In some embodiments, the first feature includes an eigenvalue of the first initial covariance estimate.
In some embodiments, the first feature includes a QR decomposition of the first initial covariance estimate.
In some embodiments, the first feature includes an element of the first initial covariance estimate.
According to an embodiment of the present disclosure, there is provided a device, including: a radio; and a processing circuit, the processing circuit being configured to: receive, through the radio, a signal; extract a first set of features from the signal; make a first selection, by a first neural network, based on the first set of features; and select a first covariance matrix, from a plurality of covariance matrices, based on the first selection.
In some embodiments, the making of the first selection by the first neural network includes making the first selection based on a plurality of initial covariance estimates, each corresponding to a respective resource block (RB) of a contiguous set of resource blocks.
In some embodiments, the contiguous set of resource blocks includes all of the resource blocks in a bandwidth part.
In some embodiments, the processing circuit is further configured to: extract a second set of features from the signal; and make a second selection, by a second neural network, based on the second set of features, wherein the first set of features corresponds to a first resource block, and the second set of features corresponds to a second resource block.
In some embodiments: the first selection is an indication of estimated signal to interference ratio in the first resource block; the first selection corresponds to a signal to interference ratio less than a first threshold; and the selecting of the first covariance matrix includes selecting a covariance matrix based on a first initial covariance estimate, the first initial covariance estimate corresponding to the first resource block.
In some embodiments: the first selection is an indication of estimated signal to interference ratio in the first resource block; the first selection corresponds to a signal to interference ratio greater than a first threshold; the second selection is an indication of estimated signal to interference ratio in the second resource block; the second selection corresponds to a signal to interference ratio greater than the first threshold; the selecting of the first covariance matrix includes selecting a covariance matrix based on a first initial covariance estimate and on a second initial covariance estimate; the first initial covariance estimate corresponds to the first resource block; and the second initial covariance estimate corresponds to the second resource block.
According to an embodiment of the present disclosure, there is provided a device, including: a radio; and means for processing, the means for processing being configured to: receive, through the radio, a signal; extract a first set of features from the signal; make a first selection, by a first neural network, based on the first set of features; and select a first covariance matrix, from a plurality of covariance matrices, based on the first selection.
In some embodiments, the making of the first selection by the first neural network includes making the first selection based on a plurality of initial covariance estimates, each corresponding to a respective resource block (RB) of a contiguous set of resource blocks.
In some embodiments, the contiguous set of resource blocks includes all of the resource blocks in a bandwidth part.
In some embodiments, the means for processing is further configured to: extract a second set of features from the signal; and make a second selection, by a second neural network, based on the second set of features, wherein the first set of features corresponds to a first resource block, and the second set of features corresponds to a second resource block.
These and other features and advantages of the present disclosure will be appreciated and understood with reference to the specification, claims, and appended drawings.
The detailed description set forth below in connection with the appended drawings is intended as a description of exemplary embodiments of a learning-based system and method for interference whitening provided in accordance with the present disclosure and is not intended to represent the only forms in which the present disclosure may be constructed or utilized. The description sets forth the features of the present disclosure in connection with the illustrated embodiments. It is to be understood, however, that the same or equivalent functions and structures may be accomplished by different embodiments that are also intended to be encompassed within the scope of the disclosure. As denoted elsewhere herein, like element numbers are intended to indicate like elements or features.
In a multiple input multiple output (MIMO) 5G receiver, (e.g., in a user equipment (UE)), interference whitening may be performed to improve performance in the presence of interference, which in most circumstances is not white. As used herein, the phrase “user equipment” is used as a countable noun even though the noun it contains (“equipment”) may not be countable in ordinary English.
In some embodiments, a machine-learning-based (or "learning-based") method is employed to select an interference whitener (IW) for each resource block (RB) within the bandwidth part (BWP) of the received signal in a MIMO receiver. In a 5G network, the interference whitener may be selected based on input features derived from reference symbols, for example, demodulation reference symbols (DMRSs). A covariance matrix for the interference whitening may then be selected, directly or indirectly (as discussed in further detail below), by a neural network (e.g., a multi-layer perceptron (MLP) network), from a predefined set of candidate covariance matrices. The interference whitener may whiten the spatial-domain interference and noise by pre-multiplying the received signal vector and channel matrix with the inverse of the Cholesky factor of the selected covariance matrix. The use of an appropriate covariance matrix for whitening may result in a reduced error rate.
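A minimal sketch of the whitening step, assuming a numpy environment (the function and variable names are illustrative):

    import numpy as np

    def whiten(y, H, R):
        # y: received signal vector (Nr,); H: channel matrix (Nr, Nt);
        # R: selected interference-plus-noise covariance matrix (Nr, Nr).
        L = np.linalg.cholesky(R)       # R = L @ L.conj().T, L lower-triangular
        y_w = np.linalg.solve(L, y)     # equivalent to inv(L) @ y
        H_w = np.linalg.solve(L, H)     # equivalent to inv(L) @ H
        return y_w, H_w

The whitened pair (y_w, H_w) may then be passed to the detector in place of (y, H).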
The system may operate in one of three modes, which may be referred to as the "per BWP" mode, the "per RB chunk" mode, and the "per RB" mode.
The feature extraction block 110 may take as input received signal vectors, pilots, and channel matrices obtained (i) in per BWP mode, from the DMRS locations in the BWP, (ii) in per RB chunk mode, from the DMRS locations in the RB chunk, and (iii) in per RB mode, from the DMRS locations in the RB.
Interference may not be present in every RB. Letting (i) B be the total number of resource blocks (RBs) in the bandwidth part (BWP), (ii) C be the number of RBs in one RB chunk when per RB chunk mode is employed, (iii) S_I be the set of RBs containing interference (such that the set S_I indicates the interference distribution in the BWP, e.g., which RBs within the BWP have interference), and (iv) S_b be the set of REs in the b-th RB, the received signal at the UE in the n-th RE may be written as

    y_n = H_n x_n + e_n + w_n,  for n ∈ S_b with b ∈ S_I,
    y_n = H_n x_n + w_n,        for n ∈ S_b with b ∉ S_I,    (1)

where y_n is the received signal vector, H_n is the channel matrix, x_n is the transmitted signal vector, e_n is the interference vector, and w_n is the noise vector.
The input feature extraction block 110 may first compute an initial covariance estimate based on the received signal vector y_n, the channel H_n, and the transmitted pilot x_n in the DMRS REs, as follows:

    R_D,b = (1 / |S_D,b|) Σ_{n ∈ S_D,b} (y_n − H_n x_n)(y_n − H_n x_n)^H,    (2)

where S_D,b is the set of DMRS resource elements (REs) in the b-th RB and |S_D,b| is the cardinality of that set. The input feature extraction block 110 may (as discussed in further detail below) generate one or more features f_1, f_2, f_3, . . . from R_D,b.
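A minimal sketch of this estimate, assuming numpy arrays (names are illustrative):

    import numpy as np

    def initial_covariance_estimate(y, H, x):
        # y: (K, Nr) received vectors at the K DMRS REs of one RB;
        # H: (K, Nr, Nt) channel matrices; x: (K, Nt) transmitted pilots.
        K = y.shape[0]
        r = y - np.einsum('kij,kj->ki', H, x)              # residuals y_n - H_n x_n
        return np.einsum('ki,kj->ij', r, r.conj()) / K     # averaged outer products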
In per BWP mode and in per RB chunk mode, the neural network may provide an output z ∈ {1, 2, . . . , N} to select, from among N methods (or "method options") referred to herein as Option-1, Option-2, . . . , Option-N, a method for computing the covariance matrix used for interference whitening, which may be referred to herein as R_IW,b. For N = 4, the options may include (without being limited to) the following:
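For example (the specific forms below being illustrative assumptions, chosen to be consistent with the complexity ordering and the averaging behavior described in the surrounding paragraphs; in per RB chunk mode, the averages may be taken over the C RBs of the chunk rather than over all B RBs of the BWP), the four options may take forms such as:

    Option-1:  R_IW,b = I (the identity matrix),
    Option-2:  R_IW,b = diag{(1/B) Σ_{b′=1}^{B} R_D,b′},
    Option-3:  R_IW,b = R_D,b,
    Option-4:  R_IW,b = (1/B) Σ_{b′=1}^{B} R_D,b′,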
where diag{A} is a diagonal matrix having the same diagonal elements as the matrix A. The options identified as Option-1 through Option-4 are arranged, above, in ascending order of complexity (when the calculation of the inverse of the Cholesky factor is taken into account); thus, Option-1 has the least complexity. For example, the complexity of Option-3 (which does not require any calculations to determine R_IW,b from R_D,b) may nevertheless be greater than the complexity of Option-2, because the calculation of the inverse of the Cholesky factor is inexpensive in Option-2, in which R_IW,b is diagonal.
In the per RB mode, each of the neural networks 115 may produce an output z_b indicating the estimated SIR in the corresponding (b-th) RB, and the covariance matrices R_IW,b may then be computed according to the following rule:
    IF any z_b = 1 for b = 1, 2, . . . , B
        Compute R_IW,b with option i for all b = 1, 2, . . . , B
    ELSE
        Compute R_IW,b with option j for all b = 1, 2, . . . , B
where options i and j (each of which is an index identifying a method, with e.g., the index value 1 identifying Option-1, the index value 2 identifying Option-2, and so forth) may be found using Equation (4) below. For example, if the SIR is low (e.g., if zb=1, or more generally, less than a first threshold) in one of the RBs, then a method option (e.g., Option-3) which does not average together the initial covariance estimates may be employed. Otherwise, a method option that combines multiple initial covariance estimates, such as Option-2 or Option-4, may be employed.
As such, in the per BWP mode and in the per RB chunk mode, the neural network 115 may make a first selection (the selection being an output value identifying a method option), and a first covariance matrix may then be calculated based on this first selection (or, equivalently, the first covariance matrix may be said to be selected (from the plurality of covariance matrices corresponding to the plurality of method options) based on the first selection). Similarly, in the per RB mode, a first neural network 115, corresponding to a first RB, may make a first selection (the selection being an output value corresponding to an estimated SIR for the first RB), and a first covariance matrix may be selected based on the first selection (and based also on the selections made by the other neural networks 115).
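A minimal sketch of this per RB selection logic, assuming numpy and assuming helper functions for the method options (all names are illustrative):

    import numpy as np

    def select_covariances(z, R_D, option_i, option_j, low_sir_label=1):
        # z: length-B array of per-RB neural network outputs z_b;
        # R_D: list of B initial covariance estimates R_D,b;
        # option_i, option_j: functions mapping (R_D, b) to a candidate R_IW,b.
        B = len(R_D)
        if np.any(np.asarray(z) == low_sir_label):   # at least one RB at low SIR
            return [option_i(R_D, b) for b in range(B)]
        return [option_j(R_D, b) for b in range(B)]

For instance, with the illustrative options above, option_i could be lambda R_D, b: R_D[b] (Option-3) and option_j could be lambda R_D, b: sum(R_D) / len(R_D) (Option-4).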
Offline training to determine the neural network parameters (θ) may be performed using a method including the following steps:
1. Generation of a labeled dataset.
2. Selection of input features.
3. Neural network training.
In order to train the neural network, first a labeled dataset is generated; this may be performed as follows. Two label generation methods may be employed. For the per BWP and per RB chunk modes, the labels may be generated based on the instantaneous CRC flag, e.g., based on the instantaneous decoding result. For the per RB mode, the labels may be generated based on the average of the CRC flag, e.g., based on the average of the decoding result.
The labeled dataset consists of tuples, each containing features f_1, f_2, . . . and a label z ∈ {1, 2, . . . , N}. Each tuple corresponds to one BWP in a specific scenario of SNR, SIR, interference distribution in the BWP, modulation order, and code rate. A simulation (using pseudorandom noise and interference) may be used to generate simulated received signals, and to simulate the processing of the signal after simulated application of an interference whitening filter. To generate the labeled data, each method option 1, 2, . . . , N is used for whitening for each BWP. After interference whitening, the signal is detected and decoded, and the label z may then be assigned based on the resulting CRC flags (e.g., by favoring, among the options for which decoding succeeds, the option of least complexity), where c_n ∈ {0, 1} is the cyclic redundancy check (CRC) pass/fail flag generated by the decoder when Option-n is used for whitening.
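A high-level sketch of the dataset generation loop; the helpers named here (scenarios, simulate_bwp, extract_bwp_features, whiten_with_option, detect_and_decode) are hypothetical placeholders for the simulation chain described above, and the label rule shown is likewise an assumption:

    # Hypothetical helpers stand in for the simulation chain described above.
    N = 4                                         # number of method options
    dataset = []
    for scenario in scenarios:                    # SNR, SIR, distribution, MCS, ...
        y, H, x = simulate_bwp(scenario)          # simulated received BWP
        features = extract_bwp_features(y, H, x)
        crc = [detect_and_decode(*whiten_with_option(y, H, x, n))
               for n in range(1, N + 1)]          # c_n for each Option-n
        passing = [n for n in range(1, N + 1) if crc[n - 1] == 1]
        z = min(passing) if passing else N        # assumed rule: least-complexity pass
        dataset.append((features, z))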
For the per RB mode, the labels may be generated based on the average of the CRC flag. For example, the label of the b-th RB may be set as

    z_b = i, where i is the method option that maximizes the average CRC pass rate when the b-th RB is at low SIR (e.g., at an SIR below a first threshold), and
    z_b = j, where j is the method option that maximizes the average CRC pass rate when the b-th RB is at relatively high SIR.    (4)
According to Equation (4), one rule may be employed (e.g., selecting method option Option-3) when at least one RB is at low SIR, and another rule (e.g., selecting method option Option-2 or method option Option-4) may be employed when all of the RBs are at relatively high SIR.
Feature generation may be performed as follows. Features may be generated to capture different scenarios specified by the interference distribution in the BWP, the SIR, and the SNR. The features may be extracted from the elements of RD,b and from the modulation and coding scheme. Examples of features that may be extracted from RD,b include (without being limited to) (i) eigenvalues of RD,b, (ii) the QR decomposition of RD,b, (iii) diagonal and non-diagonal elements of RD,b, and (iv) combinations thereof.
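A minimal sketch of such feature extraction, assuming numpy (the particular feature vector assembled here is illustrative):

    import numpy as np

    def extract_features(R):
        # R: initial covariance estimate R_D,b (Hermitian, Nr x Nr).
        eigenvalues = np.linalg.eigvalsh(R)                 # (i) eigenvalues
        Q, T = np.linalg.qr(R)                              # (ii) QR decomposition
        diagonal = np.abs(np.diag(R))                       # (iii) diagonal elements
        offdiag = np.abs(R[np.triu_indices_from(R, k=1)])   # (iii) off-diagonal elements
        return np.concatenate([eigenvalues, np.abs(np.diag(T)), diagonal, offdiag])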
Input feature selection may then be performed as follows. For the training of the neural network, a subset of all available features may be used, the subset being selected to contain the features that are most informative (or relevant) regarding the label z. The most informative features may be identified using the mutual information between the label z and the features f_i. The selected features may then be generated by the input feature extraction block 110.
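For example, a sketch of this selection using scikit-learn's mutual information estimator (assuming real-valued feature matrices; names are illustrative):

    import numpy as np
    from sklearn.feature_selection import mutual_info_classif

    def most_informative(X, z, num_selected):
        # X: (K, F) matrix of candidate features; z: (K,) vector of labels.
        mi = mutual_info_classif(X, z)                # MI between each feature and z
        return np.argsort(mi)[::-1][:num_selected]    # indices of top features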
Neural network training may be performed to obtain the network parameter θ. The network parameter θ includes (e.g., consists of) the weights w_ij^{(l)} and the biases v_i^{(l)}. A neural network with an input layer 305, one hidden layer 310 having P nodes, and an output layer 315 may be employed. The output of node i of the hidden layer (layer l = 2) may be written as

    a_i^{(2)} = σ(Σ_j w_ij^{(1)} f_j + v_i^{(1)}), i = 1, 2, . . . , P,    (5)

where

    σ(x) = 1 / (1 + e^{−x})

is the sigmoid activation function and v_i^{(1)} is the bias term. The output of node i in layer l = 3 is
    a_i^{(3)} = Σ_j w_ij^{(2)} a_j^{(2)} + v_i^{(2)}, i = 1, 2, . . . , N.    (6)
Finally, the output of the network may be computed by applying a softmax on {a_1^{(3)}, a_2^{(3)}, . . . , a_N^{(3)}} as follows:

    r_i = e^{a_i^{(3)}} / Σ_{k=1}^{N} e^{a_k^{(3)}}, i = 1, 2, . . . , N.    (7)
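A minimal numpy sketch of this forward pass (Equations (5) through (7)); the weight and bias arrays are assumed given:

    import numpy as np

    def mlp_forward(f, W1, v1, W2, v2):
        # f: (F,) input features; W1: (P, F), v1: (P,); W2: (N, P), v2: (N,).
        a2 = 1.0 / (1.0 + np.exp(-(W1 @ f + v1)))   # Equation (5): sigmoid hidden layer
        a3 = W2 @ a2 + v2                           # Equation (6): linear output layer
        e = np.exp(a3 - np.max(a3))                 # Equation (7): softmax, stabilized
        return e / e.sum()                          # outputs r_1, ..., r_N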
In order to obtain the weight and bias terms {w_ij^{(l)}, v_i^{(l)}}, and hence the network parameter θ, the neural network may be trained using a quasi-Newton algorithm to minimize the following cross-entropy cost function:

    C(θ) = C(w_ij^{(l)}, v_i^{(l)}) = −Σ_{k=1}^{K} Σ_{i=1}^{N} I(i = z_k) log(r_{i,k}),    (8)

where I(i = z_k) is an indicator function (equal to 1 if i = z_k, and 0 otherwise), z_k is the label in the k-th training tuple, r_{i,k} is the MLP output corresponding to the k-th training sample, and K is the total number of training samples. Once training has been performed and the network parameter θ has been obtained, learning-based selection of a covariance matrix for whitening may be performed as described above.
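For example, a training sketch using scikit-learn, whose 'lbfgs' solver is a quasi-Newton method and whose multi-class MLP minimizes a cross-entropy loss over a softmax output (the hidden-layer width and the stand-in arrays are illustrative):

    import numpy as np
    from sklearn.neural_network import MLPClassifier

    P = 16                                        # hidden-layer width (illustrative)
    # X: (K, F) selected features; z: (K,) labels in {1, ..., N}; stand-ins here.
    X = np.random.randn(1000, 8)
    z = np.random.randint(1, 5, size=1000)
    clf = MLPClassifier(hidden_layer_sizes=(P,), activation='logistic',
                        solver='lbfgs', max_iter=1000)
    clf.fit(X, z)
    option = clf.predict(X[:1])[0]                # method option selected at run time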
The block error rate achieved using learning-based interference whitener selection is illustrated in the appended drawings.
In some embodiments, a processing circuit or means for processing (discussed in further detail below) may perform some or all of the methods described herein. For example, in some embodiments, a UE includes a processing circuit and a radio, and the processing circuit performs the methods described above.
As used herein, “a portion of” something means “at least some of” the thing, and as such may mean less than all of, or all of, the thing. As such, “a portion of” a thing includes the entire thing as a special case, i.e., the entire thing is an example of a portion of the thing. As used herein, the term “or” should be interpreted as “and/or”, such that, for example, “A or B” means any one of “A” or “B” or “A and B”.
The terms “processing circuit” or “means for processing” are used herein to mean any combination of hardware, firmware, and software, employed to process data or digital signals. Processing circuit hardware may include, for example, application specific integrated circuits (ASICs), general purpose or special purpose central processing units (CPUs), digital signal processors (DSPs), graphics processing units (GPUs), and programmable logic devices such as field programmable gate arrays (FPGAs). In a processing circuit, as used herein, each function is performed either by hardware configured, i.e., hard-wired, to perform that function, or by more general-purpose hardware, such as a CPU, configured to execute instructions stored in a non-transitory storage medium. A processing circuit may be fabricated on a single printed circuit board (PCB) or distributed over several interconnected PCBs. A processing circuit may contain other processing circuits; for example, a processing circuit may include two processing circuits, an FPGA and a CPU, interconnected on a PCB.
As used herein, when a method (e.g., an adjustment) or a first quantity (e.g., a first variable) is referred to as being “based on” a second quantity (e.g., a second variable) it means that the second quantity is an input to the method or influences the first quantity, e.g., the second quantity may be an input (e.g., the only input, or one of several inputs) to a function that calculates the first quantity, or the first quantity may be equal to the second quantity, or the first quantity may be the same as (e.g., stored at the same location or locations in memory as) the second quantity.
It will be understood that, although the terms “first”, “second”, “third”, etc., may be used herein to describe various elements, components, regions, layers and/or sections, these elements, components, regions, layers and/or sections should not be limited by these terms. These terms are only used to distinguish one element, component, region, layer or section from another element, component, region, layer or section. Thus, a first element, component, region, layer or section discussed herein could be termed a second element, component, region, layer or section, without departing from the spirit and scope of the inventive concept.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the inventive concept. As used herein, the terms “substantially,” “about,” and similar terms are used as terms of approximation and not as terms of degree, and are intended to account for the inherent deviations in measured or calculated values that would be recognized by those of ordinary skill in the art.
As used herein, the singular forms “a” and “an” are intended to include the plural forms as well, unless the context clearly indicates otherwise. It will be further understood that the terms “comprises” and/or “comprising”, when used in this specification, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. As used herein, the term “and/or” includes any and all combinations of one or more of the associated listed items. Expressions such as “at least one of,” when preceding a list of elements, modify the entire list of elements and do not modify the individual elements of the list. Further, the use of “may” when describing embodiments of the inventive concept refers to “one or more embodiments of the present disclosure”. Also, the term “exemplary” is intended to refer to an example or illustration. As used herein, the terms “use,” “using,” and “used” may be considered synonymous with the terms “utilize,” “utilizing,” and “utilized,” respectively.
Any numerical range recited herein is intended to include all sub-ranges of the same numerical precision subsumed within the recited range. For example, a range of “1.0 to 10.0” or “between 1.0 and 10.0” is intended to include all subranges between (and including) the recited minimum value of 1.0 and the recited maximum value of 10.0, that is, having a minimum value equal to or greater than 1.0 and a maximum value equal to or less than 10.0, such as, for example, 2.4 to 7.6. Any maximum numerical limitation recited herein is intended to include all lower numerical limitations subsumed therein and any minimum numerical limitation recited in this specification is intended to include all higher numerical limitations subsumed therein.
Although exemplary embodiments of a learning-based system and method for interference whitening have been specifically described and illustrated herein, many modifications and variations will be apparent to those skilled in the art. Accordingly, it is to be understood that a learning-based system and method for interference whitening constructed according to principles of this disclosure may be embodied other than as specifically described herein. The invention is also defined in the following claims, and equivalents thereof.
The present application claims priority to and the benefit of U.S. Provisional Application No. 63/168,545, filed Mar. 31, 2021, entitled “MACHINE LEARNING BASED INTERFERENCE WHITENER SELECTION”, the entire content of which is incorporated herein by reference.