The present invention relates to communication networks, and in particular to a method, system and computer-readable medium for predicting transmission channel parameters, and for training a communication system to predict such transmission channel parameters.
Communication systems are used for the transmission of information and impact everyday life. Digital transmission over channels of communication systems is based on the ability to recover the transmitted message when the transmitted signal undergoes channel distortion and noise. Examples of communication networks that include digital transmission channels include: 1) mobile communication networks (radio links); 2) backbone networks of a mobile network (fiber optic); 3) submarine or long-distance communication networks; 4) space communication networks; and 5) interspace/satellite communication networks.
EP 0904649, which is hereby incorporated by reference herein, describes a maximum likelihood sequence estimation (MLSE) decoder with a neural network.
In an embodiment, the present invention provides a method for an end-to-end system for channel estimation. The method comprises: obtaining data associated with a communication system, wherein the communication system comprises a receiver, a transmitter, and a communication channel; training a neural network that models the communication channel of the communication system based on inputting the obtained data into the neural network and using a decoder, wherein the neural network produces an output indicating a probability of a signal from the communication channel; and using the trained neural network for decoding information from the communication channel.
Embodiments of the present invention will be described in even greater detail below based on the exemplary figures. The present invention is not limited to the exemplary embodiments. All features described and/or illustrated herein can be used alone or combined in different combinations in embodiments of the present invention. The features and advantages of various embodiments of the present invention will become apparent by reading the following detailed description with reference to the attached drawings which illustrate the following:
Traditional methods address channel estimation and symbol recovery separately. Shlezinger, Nir, et al., “ViterbiNet: A Deep Learning Based Viterbi Algorithm for Symbol Detection,” ArXiv:1905.10750 (Sep. 29, 2020), which is hereby incorporated by reference herein, discusses an approach that attempts to apply machine learning. However, a fully end-to-end approach is still lacking. Therefore, in contrast to existing approaches, embodiments of the present invention provide an end-to-end learnable system for learning channel(s) and decoding that provides a number of improvements to the communication system itself. The end-to-end system of the present invention may be trained such that the training is not done at each component separately; rather, the error at an output is propagated to all of the components within the system. To enable this, each component may be differentiable and/or have a “surrogate” gradient and/or an estimate of the gradient.
For example, in an embodiment, the present invention provides a method to train an end-to-end transmission system to estimate the channel parameters, which incorporates a decoder in the training phase. The method may be advantageously applied to reduce preamble (pilot signal) length and thereby improve communication efficiency and reduce error rate. The method also provides for more flexible architecture for the design of the transmitter and receiver system.
In other words, the present invention may improve the ability to model the transmission channels correctly, which may improve the performance and/or efficiency of communication systems, especially communication systems with time-varying channels. In some instances, the method provides an end-to-end training system for channel estimation, which may lead to better decoding, lower error rates, and more efficient transmission. For example, the present invention may allow for a shorter pilot signal that still provides the same and/or even improved performance (e.g., error rate) of the communication system. Additionally, and/or alternatively, the present invention may permit more flexible architectures to model the communication channel.
In other words, a communication channel may traditionally be estimated using pre-ambles of known symbols that are sent over the channel to estimate the channel response. In contrast, using the present invention, the time to identify the channel is reduced by pre-training or by using information from other devices (e.g., transmitters and/or receivers). Using a pre-trained network, the system of the present invention provides estimates of the parameters from the received signals without pre-ambles. In an embodiment, the online version may allow improvement of the parameter estimation based on successfully decoded messages. Additionally, and/or alternatively, by using the generative adversarial neural network (GAN), the probabilities of the symbols are used in the decoding process, even with few or no pre-amble symbols.
In embodiments with a varying channel, the network of the present invention may react quickly to known situations close to the training scenario. Since the network may be updated based on the local conditions, the network parameters may be shared to allow a fast response to the local environment. For example, a network model may be updated for a local transmitter in a room with various obstacles or in an urban scenario. Additional information provided as input features, such as the location of the receiver, may further improve performance. This information may be included in the training phase, giving a location-aware channel model. In instances when the receiver is inside a building or on an open road, the network may provide a more tailored response and thus a better channel model and a lower error rate.
In an embodiment, the present invention provides a method for an end-to-end system for channel estimation. The method comprises: obtaining data associated with a communication system, wherein the communication system comprises a receiver, a transmitter, and a communication channel; training a neural network that models the communication channel of the communication system based on inputting the obtained data into the neural network and using a decoder, wherein the neural network produces an output indicating a probability of a signal from the communication channel; and using the trained neural network for decoding information from the communication channel. The output can indicate the probability associated with each symbol and/or the probability of the signal given a symbol or a specific sequence of symbols (e.g., p(y|s)). In other words, the output can be the probability of the symbol rather than of the signal, or, better, the probability of the signal given the symbol. This can also depend on the decoding algorithm and will be described in further detail below.
In an embodiment, the obtained data comprises transmitted symbols, channel output, and/or starting values of the communication system.
In an embodiment, training of the neural network is based on using a gradient estimation.
In an embodiment, using the trained neural network comprises: deploying the trained neural network into the communication system to decode information from the communication channel using a minimization algorithm, wherein the minimization algorithm is based on a Viterbi method and/or other methods.
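As a concrete illustration, the Viterbi-style decoding can be sketched as below. The per-step log-likelihoods (standing in for log p(y_t|s_t) produced by the trained channel model) and the symbol transition scores are hypothetical inputs; only the dynamic-programming recursion itself is meant to be illustrative:

```python
import numpy as np

def viterbi_decode(log_likelihoods, log_trans):
    """Most likely symbol sequence given per-step log-likelihoods
    (e.g., from a trained channel model) and log transition scores.
    log_likelihoods: (T, S) array, log_trans: (S, S) array."""
    T, S = log_likelihoods.shape
    score = log_likelihoods[0].copy()
    back = np.zeros((T, S), dtype=int)
    for t in range(1, T):
        cand = score[:, None] + log_trans   # (previous symbol, next symbol)
        back[t] = cand.argmax(axis=0)
        score = cand.max(axis=0) + log_likelihoods[t]
    # Backtrack from the best final symbol.
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]
```

The recursion keeps, for each candidate symbol, only the best-scoring predecessor, which is what allows the decoder to avoid enumerating the exponential number of symbol sequences.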
In an embodiment, the method further comprises: building, by the receiver, an ensemble of decoders associated with a plurality of communication channels within the communication system, wherein the ensemble of decoders comprises a plurality of trained neural networks that models the plurality of communication channels, and wherein using the trained neural network comprises selecting a trained neural network from the plurality of trained neural networks based on checking error correcting symbols associated with a plurality of outputs from the plurality of trained neural networks.
In an embodiment, the method further comprises: obtaining new data associated with the communication system; re-training the neural network based on the new data; and using the re-trained neural network for the communication channel.
In an embodiment, the method further comprises: providing the trained neural network associated with the communication channel to a base station, wherein the base station shares the trained neural network with a plurality of other devices, and wherein the plurality of other devices uses the trained neural network for decoding information from the communication channel.
In an embodiment, training the neural network that models the communication channel of the communication system comprises: inputting the obtained data into the neural network to generate neural network outputs; determining decoded neural network outputs based on inputting the neural network outputs into the decoder; determining errors within the decoded neural network outputs based on a loss function; and updating the neural network based on the determined errors.
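The training steps above can be sketched with a deliberately small example. The single-parameter “network”, the identity decoder, and the pilot data below are all hypothetical stand-ins, used only to show the loss error being propagated back to update the channel model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical pilot data: BPSK-like symbols x and an observed channel
# output y = h * x + noise, where h_true stands in for the unknown channel.
h_true = 0.8
x = rng.choice([-1.0, 1.0], size=256)
y = h_true * x + 0.05 * rng.normal(size=256)

w = 0.0    # single learnable parameter standing in for the channel-model NN
lr = 0.1
for _ in range(200):
    z = w * x                            # channel-model (NN) output
    err = z - y                          # the "decoder" here is the identity
    loss_grad = 2.0 * np.mean(err * x)   # squared-error loss, backpropagated to w
    w -= lr * loss_grad                  # update the network from the error
```

After training, the parameter converges toward the true channel gain, which is the behavior the end-to-end loop is intended to produce.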
In an embodiment, using the trained neural network comprises: obtaining, by the receiver, information associated with an original message from the transmitter via the communication channel; inputting the information associated with the original message into the trained neural network to generate an output associated with the information; and decoding, using the decoder, the output associated with the information to determine a decoded message.
In an embodiment, the neural network is a standard convolutional neural network (CNN) or an auto-regressive CNN.
In an embodiment, training the neural network that models the communication channel of the communication system comprises: training a variational auto encoder (VAE) that comprises an encoder neural network and a decoder neural network, wherein the encoder neural network generates an output that is provided to the decoder neural network, and wherein an output of the decoder neural network is provided to the decoder.
In an embodiment, training the neural network that models the communication channel of the communication system comprises: training a generative adversarial neural network (GAN), wherein the GAN comprises a neural network that reconstructs a probability of symbols for channel signals, a generative network that is used to train the decoder, and a discriminator network that provides a probability that the channel signals are plausible.
In an embodiment, the method further comprises: prior to training the neural network based on the obtained data, pre-training the neural network using supervised learning.
In another embodiment, the present invention provides a system for an end-to-end system for channel estimation. The system comprises: a receiver configured to: obtain data associated with a communication system, wherein the communication system comprises the receiver, a transmitter, and a communication channel; train a neural network that models the communication channel of the communication system based on inputting the obtained data into the neural network and using a decoder, wherein the neural network produces an output indicating a probability of a signal from the communication channel; and use the trained neural network for decoding information from the communication channel.
In a further embodiment, the present invention provides a tangible, non-transitory computer-readable medium having instructions thereon which, upon being executed by one or more processors, alone or in combination, provide for execution of a method according to any embodiment of the present invention.
The channel 104 may be unknown and the present invention describes a method and system to estimate the model of the channel 104. Examples of channels 104 include, but are not limited to: radio-frequency channels used in point-to-point communication, mobile communication or satellite communication; water or elastic mediums for transmission of sound/vibration waves; and optical mediums for transmission of optical signals.
For example, the optimization solvers use the Viterbi method and/or other methods, and the gradient of the optimization problem is estimated using the one-step rule:
∇zL ≈ (1/τ)[x(z + τ∇xL) − x(z)]
or
∇zL ≈ −∇xL
Other gradients are otherwise computed using the chain rule and back propagated as a normal neural network system. In other words, this means that the method may be used to propagate the error using the chain rule of differentiation to compute the weights of the feed-forward network. For example, in
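The one-step rule can be checked numerically with a minimal sketch. A symmetric linear smoother stands in for the real optimization-based decoder x(z) (a hypothetical choice, made so the estimate can be compared against the exact chain-rule gradient):

```python
import numpy as np

# Hypothetical decoder: a symmetric linear map standing in for the
# optimization solver x(z); with a linear decode the one-step estimate
# should match the exact chain-rule gradient.
A = np.array([[0.6, 0.2], [0.2, 0.6]])

def decode(z):
    return A @ z

z = np.array([1.0, -0.5])
x = decode(z)
target = np.array([1.0, 0.0])
grad_x = 2.0 * (x - target)       # gradient of a squared-error loss w.r.t. x

tau = 1e-4
grad_z_est = (decode(z + tau * grad_x) - x) / tau   # one-step rule
grad_z_true = A.T @ grad_x                          # exact chain rule
```

For a symmetric linear decoder the two gradients coincide; for a general solver the one-step rule serves as a surrogate when the exact Jacobian is unavailable.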
As shown in
In other words, the channel 104 may be modeled using a NN. The NN may capture (e.g., automatically capture) short- and long-range relationships based on adjusting the weights of the NN. If/when additional information is available, the additional information may be incorporated without prior knowledge of its relationship to the final channel parameters. In some instances, the transmitted signal may cover different bandwidths/frequencies and the local interaction of these subsystems may be captured by the NN. In some variations, the device may introduce distortion in the received signal, and this may be incorporated into the trained NN by, for example, introducing specific parameters to model this interaction.
To put it another way,
In some instances,
The output (z) is then fed into a decoder 206, which then decodes the output of the channel model 204. For instance, as mentioned above, the decoder 206 may determine the decoded information (x) based on a linear optimization problem (e.g., by solving the linear optimization problem). Then, a loss function (L) 208 is used to determine errors (e.g., gradients such as ∇zL and/or ∇xL), and the errors (e.g., the gradient ∇zL that is determined based on ∇xL) are used to update and train the channel model/neural network. For instance, the weights of the neural network 204 may be updated based on the errors from the loss function 208.
An embodiment of the present invention provides for forward convolution (auto-regressive). For example, in some instances, the channel may have limited memory. As such, the encoder (e.g., the channel model/neural network 204) may be a convolutional neural network (CNN), which, in some examples, may be implemented using a looking-forward model (auto-regressive).
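The causal (looking-forward) structure can be sketched as below; the fixed kernel is a hypothetical stand-in for learned CNN weights, and the point is only that the output at time t depends on current and past inputs, mirroring a channel with limited memory:

```python
import numpy as np

def causal_conv1d(x, kernel):
    """Causal 1-D convolution: output at time t depends only on
    x[t], x[t-1], ..., x[t-k+1], never on future samples."""
    k = len(kernel)
    padded = np.concatenate([np.zeros(k - 1), x])   # left-pad with zeros
    return np.array([padded[t:t + k] @ kernel[::-1] for t in range(len(x))])
```

An impulse fed into the filter shows the causality: the response appears at the impulse time and after it, never before.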
Another embodiment of the present invention provides a variational auto-encoder (VAE).
Referring to
a. The input symbols (x)
b. The input channel signals (v)
For example, the VAE 502 includes the encoder and the decoder. The encoder and the decoder may each be one or more neural networks. The output (v) of the encoder of the VAE 502 may be provided to the decoder of the VAE 502. The VAE may be used to remove the noise from the received signal with the use of a proper loss during training. The feature vector (v) may include relevant information useful for the decoding algorithm. The VAE may be used to filter out information that is not relevant and to generate one or more parameters that are useful to the decoder. The VAE is used to train part of the encoder/decoder with the input data, and the VAE decoder is then trained to provide the signal to the decoding algorithm within the decoder 206.
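The encoder/decoder split of the VAE 502 can be sketched as follows. The linear encoder, the fixed log-variance, and the linear decoder are all hypothetical placeholders for trained networks; only the reparameterized sampling of the feature vector v (which is what keeps the sampling step differentiable during training) is meant to be illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(y):
    """Hypothetical encoder network: maps the received signal y to a
    latent mean and log-variance (here a linear map and a constant)."""
    return 0.5 * y, np.full_like(y, -2.0)

def reparameterize(mu, logvar):
    """Sample the feature vector v = mu + sigma * eps, so that gradients
    can flow through mu and logvar during training."""
    return mu + np.exp(0.5 * logvar) * rng.normal(size=mu.shape)

def decode_latent(v):
    """Hypothetical VAE decoder: reconstructs a denoised signal from v."""
    return 2.0 * v

mu, logvar = encode(np.zeros(10_000))
v = reparameterize(mu, logvar)
x_hat = decode_latent(v)
```

With a zero input, the sampled feature vector has mean near 0 and standard deviation exp(logvar/2), which is exactly what the reparameterization is supposed to produce.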
The probabilities p(s|y) and p(y|s) are used in the decoding algorithm (e.g., within the decoder 206). In particular, p(y|s) is used to evaluate whether the current symbol was transmitted, while p(s|y) is intermediate information and is the probability of a specific symbol given the current input y. This last information might not be enough to decode due to the exponential number of symbol sequences to consider (e.g., the size of the alphabet raised to the power of the length of the sequence, minus the number of invalid decoding sequences excluded by the error correcting code). p(s|y) and p(y|s) are called conditional probabilities. A class of decoders uses p(y|s) to decide which s has been transmitted. In other words, in operation, the present invention may seek to maximize p(y|s) over s.
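Maximizing p(y|s) over s can be sketched as below, assuming a Gaussian likelihood around each symbol's nominal channel output; the BPSK-style alphabet, the noise level, and the received value are hypothetical:

```python
import numpy as np

def log_p_y_given_s(y, s_mean, sigma=0.1):
    """Log-likelihood of the received signal y given symbol s, modeled
    here as Gaussian around the symbol's nominal channel output."""
    return -0.5 * np.sum(((y - s_mean) / sigma) ** 2)

symbols = {0: -1.0, 1: +1.0}    # hypothetical BPSK-style alphabet
y = np.array([0.9])             # hypothetical received channel signal

# Decide which s was transmitted by maximizing p(y|s) over s.
best = max(symbols, key=lambda s: log_p_y_given_s(y, symbols[s]))
```

Here the received value lies closest to the nominal output of symbol 1, so maximizing p(y|s) selects that symbol.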
During the deployment and referring to
A further embodiment of the present invention provides a generative adversarial neural network (GAN).
a. The input symbols
b. The input channel signals
In other words,
In an embodiment, the present invention provides for online learning. Here, the training is restarted locally once enough data is received and decoded. This may occur when the decoder is producing correct messages. Error correcting codes may be used to decide whether the received message has been correctly decoded and thus to start the re-training. In other words, the receiver 106 and/or another device may determine whether the received messages are correctly decoded. Based on the determination, the receiver 106/other device may re-train the neural network/channel model (e.g., re-train the GAN, VAE, CNN, and/or other machine learning models).
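The re-training trigger can be sketched as follows; a CRC-32 checksum is a hypothetical stand-in for the error correcting code, and the buffer threshold is an arbitrary illustrative choice:

```python
import zlib

def correctly_decoded(payload: bytes, crc: int) -> bool:
    """Decide whether a message was decoded correctly; CRC-32 stands in
    here for the real error correcting code check."""
    return zlib.crc32(payload) == crc

decoded_buffer = []

def collect_and_check(payload: bytes, crc: int, threshold: int = 3) -> bool:
    """Buffer verified messages; return True once enough have accumulated
    to restart (re-)training of the channel model."""
    if correctly_decoded(payload, crc):
        decoded_buffer.append(payload)
    return len(decoded_buffer) >= threshold
```

Messages failing the check are simply not buffered, so re-training only starts once the decoder has demonstrably been producing correct output.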
In an embodiment, the present invention provides for ensemble channel models. Here, the receiver 106 may use multiple channels and thus the receiver 106 may operate in an ensemble manner. The output of the ensemble may be optimized for the current channel before restarting training. These weights may then be shared with the base station or other mobile devices/receivers.
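Selection within such an ensemble can be sketched as below, using a toy even-parity check as a hypothetical stand-in for checking the real error correcting symbols of each candidate output:

```python
def parity_ok(bits):
    """Toy error check: even parity over the decoded bits; a stand-in
    for the real error correcting symbols."""
    return sum(bits) % 2 == 0

def select_from_ensemble(candidate_outputs):
    """Return (index, bits) of the first ensemble member whose decoded
    output passes the error check, or (None, None) if none does."""
    for i, bits in enumerate(candidate_outputs):
        if parity_ok(bits):
            return i, bits
    return None, None
```

The index of the selected member identifies which channel model currently fits the channel best, and its weights are the natural candidate to share with the base station or other receivers.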
According to an embodiment of the present invention, starting from existing traditional channel models, a model is used as a supervised learning signal to pre-train the network and then proceed with the normal training. In other words, a supervised learning method may first be used to pre-train a channel model. After, normal training (e.g., described above in
In an embodiment, the present invention provides for transfer learning. Here, pre-trained models are used to accelerate the convergence of the training phase. By using the pre-trained models, the receiver 106 may be able to more quickly adapt to new situations (dynamic channel or movement).
In an embodiment, the present invention provides for transmitter channel encoding and
In an embodiment, the present invention provides a protocol for transfer learned models. With the previous approach (e.g., the approach shown in
To put it another way,
Embodiments of the present invention provide for the following improvements:
In an embodiment, the present invention provides a method for predicting parameters of a transmission channel of a communication system, the method comprising the steps of:
Referring to the first step, the transmitted symbols may be messages that may be transmitted from the transmitter to the receiver. They may be the ground truth data. The channel output is the information that the receiver sees when in operation. This may include the frequency signal and auxiliary information (e.g., temperature, distance to the transmitter, location, and so on). The starting values of the network (e.g., the NN) are parameters that may be given from other forms of training (self-training, training in a simulated environment) or previous training sessions.
Referring to the second step, z may represent the probability of the symbols within p(y|s). This information may depend on the decoding algorithm. Referring to the third step, the learned parameters may be the parameters of the NN of the channel model 204 and the output z when y is received during transmission. In some instances, the learned parameters may be an output in
In some instances, the method further comprises one or more of the following steps:
Referring to the error correcting symbols, every time the decoder produces x (shown in
Embodiments of the present invention may be applied to receivers where the message specification considers a neural network and the use of a decoder that minimizes a linear cost. The design of the receiver decoder can be used in communication networks and associated protocols. In contrast to embodiments of the present invention, other solutions have disadvantages such as a longer pilot signal, lower throughput and a higher error rate.
In each of the embodiments described, the embodiments may include one or more computer entities (e.g., systems, user interfaces, computing apparatus, devices, servers, special-purpose computers, smartphones, tablets or computers configured to perform functions specified herein) comprising one or more processors and memory. The processors can include one or more distinct processors, each having one or more cores, and access to memory. Each of the distinct processors can have the same or different structure. The processors can include one or more central processing units (CPUs), one or more graphics processing units (GPUs), circuitry (e.g., application specific integrated circuits (ASICs)), digital signal processors (DSPs), and the like. The processors can be mounted to a common substrate or to multiple different substrates. Processors are configured to perform a certain function, method, or operation (e.g., are configured to provide for performance of a function, method, or operation) at least when one of the one or more of the distinct processors is capable of performing operations embodying the function, method, or operation. Processors can perform operations embodying the function, method, or operation by, for example, executing code (e.g., interpreting scripts) stored on memory and/or trafficking data through one or more ASICs. Processors can be configured to perform, automatically, any and all functions, methods, and operations disclosed herein. Therefore, processors can be configured to implement any of (e.g., all) the protocols, devices, mechanisms, systems, and methods described herein. For example, when the present disclosure states that a method or device performs operation or task “X” (or that task “X” is performed), such a statement should be understood to disclose that the processor is configured to perform task “X”.
While embodiments of the invention have been illustrated and described in detail in the drawings and foregoing description, such illustration and description are to be considered illustrative or exemplary and not restrictive. It will be understood that changes and modifications may be made by those of ordinary skill within the scope of the following claims. In particular, the present invention covers further embodiments with any combination of features from different embodiments described above and below. Additionally, statements made herein characterizing the invention refer to an embodiment of the invention and not necessarily all embodiments.
The terms used in the claims should be construed to have the broadest reasonable interpretation consistent with the foregoing description. For example, the use of the article “a” or “the” in introducing an element should not be interpreted as being exclusive of a plurality of elements. Likewise, the recitation of “or” should be interpreted as being inclusive, such that the recitation of “A or B” is not exclusive of “A and B,” unless it is clear from the context or the foregoing description that only one of A and B is intended. Further, the recitation of “at least one of A, B and C” should be interpreted as one or more of a group of elements consisting of A, B and C, and should not be interpreted as requiring at least one of each of the listed elements A, B and C, regardless of whether A, B and C are related as categories or otherwise. Moreover, the recitation of “A, B and/or C” or “at least one of A, B or C” should be interpreted as including any singular entity from the listed elements, e.g., A, any subset from the listed elements, e.g., A and B, or the entire list of elements A, B and C.
Priority is claimed to U.S. Provisional Application No. 63/163,121 filed on Mar. 19, 2021, the entire contents of which is hereby incorporated by reference herein.
| Number | Date | Country |
|---|---|---|
| 63163121 | Mar 2021 | US |