The present invention generally relates to transmission and receipt of wireless signals. More particularly, the present invention relates to systems and methods for concealing high-capacity covert wireless signals within an active, overt wireless signal, with the covert-signal receiver being trained with active learning techniques.
As coexistence of frequency-agile, heterogeneous and cognitive nodes becomes the norm in next generation (xG) wireless networks, the electromagnetic environment (EME) becomes congested, contested and competitive. Potential adversaries will seek opportunities to decipher critical information transmitted in open signals over the air. Although cryptography exists in the higher layers as an add-on feature, there exist several vulnerabilities in the physical layer that can expose critical information over the air. Jamming, spectrum poisoning, and signal spoofing are some of the attacks that can be launched in the physical layer to intercept data, impersonate a legitimate node, or deny service. Signal disruption is even more crucial in electronic warfare, where the mere existence of communication between two radios may raise concerns. Hence, in critical scenarios, one strategy to avoid such attacks is to obfuscate a wireless signal such that it reaches the intended receiver without being detected by a third party.
Military communications will often rely on signals with a low probability of intercept (LPI) or a low probability of detection (LPD) in hostile environments. However, these signals typically suffer from low capacity of the communication link, especially when the signal needs to be hidden under a “noise floor” to minimize the probability of detection. It is also known to hide the presence of a secret, covert signal within another overt cover signal, a technique widely known as “wireless steganography” or covert communication. Steganography is an early technique to hide secret information within other overt information, which can be an image, text, audio or video.
Major benefits of using wireless signals for steganographic, covert communication over other forms of information are that the covert signal can only be captured over the air in the vicinity of the covert transmitter, and that the covert signals are transitory and not stored. With attention paid solely to the overt carrier signal, there is no recordation or storage of the covert signal communication. Despite the advantage of placing covert signals in an overt signal, there are several disadvantages to using this method of communication.
Many of the steganographic wireless signal techniques suffer from low data capacity of the secret channel. There have been attempts to increase data capacity by using machine learning algorithms, which have enabled image steganography that can hide a secret image of the same size as the cover image, but this technique in the image domain has not been extrapolated to the wireless domain due to the fundamental difference in data type. Furthermore, wireless steganography is normally limited to modifying the signal within the limits of various wireless standards to hide the signal. When the signal is decoded according to the standards, the covert message would not be revealed. However, many of these techniques will still reveal the presence of an anomaly in the transmitted signal if steganalysis is performed on all data sent in the overt signal domain. It is to address the problems with the prior art of embedding covert signals in an overt wireless signal transmission that the present invention is primarily directed.
Briefly described, the present invention provides a system and method for transmitting, from an encoder to a decoder, one or more covert wireless signals within an overt wireless signal. The encoder receives a bitstream and encodes the received bitstream into an encoded noise signal that replicates a noise signal of a predetermined hardware device. The encoded noise signal is then combined with a cover modulated signal to form at least one covert wireless signal that is distinct from and conceals the received bitstream. The covert wireless signal is transmitted within an overt wireless signal to a decoder that receives the covert wireless signal, removes the cover modulated signal from the received covert wireless signal to isolate the encoded noise signal, and then converts the isolated encoded noise signal into a decoded bitstream. The decoder can, but does not have to, receive the overt wireless signal and/or act upon it.
The system and method can include a critic module operably coupled to the encoder that compares the encoded noise signal generated by the encoder and the noise signal of the predetermined hardware device, and determines statistical properties for each of the encoded noise signal and the noise signal of the predetermined hardware device. The predetermined hardware device can be any device that is known to introduce an amount of noise in a wireless signal transmission. The encoder can be further configured to adjust characteristics of the encoded noise signal in response to the critic module determining that the statistical properties for the encoded noise signal differ from the statistical properties of the noise signal of the predetermined hardware device.
In one embodiment, each of the encoder and the decoder includes a multi-node neural network. Other types of AI or other expert systems can be used for each of the encoder and decoder. In such embodiment, the encoder can be further configured to transmit the original bitstream to the decoder for a predetermined training session such that the decoder will receive the bitstream from the encoder during the training session, and compare the received bitstream with the decoded bitstream to thereby determine a decoding accuracy. The decoder can relay the decoding accuracy to the encoder and thus train the system to increase accuracy in receipt and reveal of the contents of the one or more covert wireless signals.
Furthermore, the covert wireless signal can be a plurality of carrier signals, optionally established through orthogonal frequency-division multiplexing (OFDM) or quadrature amplitude modulation (QAM). And the covert signal(s) can also be encrypted depending on the embodiment.
The encoded noise signal can be within a transmission bandwidth defined by at least one predetermined regulatory communication standard, such as those for mobile, wireless, or network communication bandwidths. For instance, most wireless standards provide an acceptable range of error for operation. Thus, the covert signal(s) containing the covert message (in bits) will appear to mimic the distribution of noise (a complex signal) generated by any wireless radio hardware, such that steganalysis of this covert signal will not be able to determine whether the source of the noise is the transmitter frontend or an underlying covert communication.
The present invention therefore provides an advantage in that it can provide high-bandwidth covert wireless signals in steganographic wireless signal techniques. Furthermore, the present invention allows wireless steganography that can exceed the data channel within various wireless standards because it can also utilize known channels of noise. The invention also has an industrial application in providing a novel encoder and decoder that can communicate over the one or more covert wireless channels.
With reference to the figures in which like numerals represent like elements throughout the several views,
In one embodiment, the technique uses a common, physical-layer protocol to mask the communication that takes advantage of the hardware imperfections present in commodity hardware, intrinsically noisy channel of wireless communication, as well as potentially receiver diversity. When embodied within software-defined radios, the system operates in the standard 2.4 GHz ISM band, but can also be easily extended to TV or other broadcast channel whitespaces. In one embodiment, the system 22 (
Generative Adversarial Networks (GANs) are existing intelligent networks that can generate realistic images, videos, speech, and handwritten text by efficiently transforming the domain of the input data to another desired domain. The present invention can leverage this property of GANs to transfer the domain of the secret message to hardware noise, which can be carried by any cover signal of choice.
In the system 12, one or more covert wireless signals (such as QAM or OFDM digital signals) can be transmitted within an overt wireless signal (such as Wi-Fi, LTE, LoRa, or other standard communication, regulatory band signals). The encoder 14 receives a bitstream M and encodes the received bitstream into an encoded noise signal (Cenc) that replicates a noise signal (N) of a predetermined hardware device (noise generator 20), such as a transmitting radio 10 or other transmitting device, such as a repeater, Wi-Fi router, etc. The encoded noise signal is then combined with a cover modulated signal to form at least one covert wireless signal (Cmod) that is distinct from and conceals the received bitstream. The covert wireless signal is transmitted within an overt wireless signal over a channel 16 to a decoder 18 that receives the covert wireless signal (Cmod), removes the cover modulated signal from the received covert wireless signal to isolate the encoded noise signal (Cenc), and then converts the isolated encoded noise signal into a decoded bitstream (M). The decoder 18 can, but does not have to, receive the overt wireless signal and/or act upon it.
The system 12 can include a critic module 22 operably coupled to the encoder 14 that compares the encoded noise signal (Cenc) generated by the encoder 14 and the noise signal (N) of the predetermined hardware device, and determines statistical properties for each of the encoded noise signal and the noise signal of the predetermined hardware device. The predetermined hardware device can be any device that is known to introduce an amount of noise in a wireless signal transmission. The encoder 14 can be further configured to adjust characteristics of the encoded noise signal in response to the critic module 22 determining that the statistical properties for the encoded noise signal (Cenc) differ from the statistical properties of the noise signal (N) of the predetermined hardware device.
In one embodiment, each of the encoder 14 and the decoder 18 includes a multi-node neural network. Other types of AI or other expert systems can be used for each of the encoder 14 and decoder 18. In such embodiment, the encoder 14 can be further configured to transmit the original bitstream (M) to the decoder 18, or other device, for a predetermined training session such that the decoder 18 will receive the bitstream from the encoder 14 during the training session, and compare the received bitstream with the decoded bitstream to thereby determine a decoding accuracy. Thus, the network can be “trained” to ensure it is correctly recreating the covert bitstream. The decoder 18 can relay the decoding accuracy to the encoder 14 and thus train the system 12 to increase accuracy in receipt and reveal of the contents of the one or more covert wireless signals.
As is further explained herein, the covert wireless signal can be a single signal within the overt channel, or can be a plurality of carrier signals, optionally established through orthogonal frequency-division multiplexing (OFDM) or quadrature amplitude modulation (QAM). The covert signal(s) can also themselves be encrypted depending on the embodiment, but the encryption/decryption of the covert signals does add overhead to the data transmission, both lowering data capacity and increasing the possibility of detection.
Several advantages of the present invention can be categorized as: 1) Cover-independent covert signal: The proposed method of generating a covert signal by domain transformation is independent of any properties of the cover signal, such as waveform or modulation order; 2) High capacity: As the covert signal is independent of the cover, one symbol of the cover signal can embed one symbol of the covert signal. Hence, in the domain of complex representation of signals, it can achieve up to 100% embedding capacity; 3) Hardware noise as an input to the NN: A flexible neural network architecture can be used where the variation of hardware noise is chosen as an input parameter; 4) Steganalyzer in a training session: Instead of performing steganalysis as a separate task, the steganalyzer can be integrated into the encoding process, in the form of a critic module 22. The critic module 22 helps in differentiating true hardware noise from the covert signal generated by the encoder 14, thus providing important feedback to the encoder 14 for optimizing the encoding process. As the steganalysis is performed in the signal domain, and not on decoded data, it is resilient to signal anomaly detection techniques; and 5) Operational over a wide range of SNR: Instead of modulating symbol-by-symbol as in a traditional communication system, the encoder 14 and decoder 18 are designed to operate on blocks of bits, which improves the performance of the covert link at different levels of induced hardware noise.
In one embodiment, system 12 can consist of three main nodes: an encoder 14, a decoder 18 and a critic as shown in
Similar to most steganographic schemes, a distorted cover signal Nmod consists of a modulated cover signal added to a noise signal N. The noise signal N is collected from a real transmitter to carry the statistical properties of the transmitter's hardware impairments. In other words, $N \sim \mathcal{CN}(0, \sigma_{HW}^{2})$. The main goal of the encoder 14 is to encode the confidential information M into a complex covert signal Cmod that looks statistically identical to a distorted cover signal Nmod, such that any receiver can demodulate Cmod as a standard modulated signal. However, an intended receiver with a decoder 18 neural network can decode it and extract the secret message M. An AWGN channel can exist between the encoder 14 and the decoder 18. If the encoder 14 transmits a complex modulated signal Cmod, the decoder 18 receives Ċmod, which is given by Ċmod = Cmod + W, where $W \sim \mathcal{CN}(0, \sigma_{ch}^{2})$ is the noise vector added by the channel, and $\sigma_{ch}^{2}$ depends on the SNR of the channel (SNRch).
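The channel relation Ċmod = Cmod + W can be illustrated with a minimal NumPy sketch. The function name `awgn_channel`, the unit-power QPSK block standing in for Cmod, and the random seed are assumptions made for illustration only; they are not taken from the disclosure.

```python
import numpy as np

def awgn_channel(c_mod, snr_db, rng):
    """Return c_mod + W, where W is complex Gaussian noise sized for snr_db."""
    signal_power = np.mean(np.abs(c_mod) ** 2)
    noise_var = signal_power / (10.0 ** (snr_db / 10.0))  # sigma_ch^2
    # Split the variance equally between the real and imaginary parts.
    w = np.sqrt(noise_var / 2.0) * (
        rng.standard_normal(c_mod.shape) + 1j * rng.standard_normal(c_mod.shape)
    )
    return c_mod + w

rng = np.random.default_rng(0)
# Hypothetical unit-power QPSK block standing in for C_mod.
bits = rng.integers(0, 2, size=(4096, 2))
c_mod = ((2 * bits[:, 0] - 1) + 1j * (2 * bits[:, 1] - 1)) / np.sqrt(2)

received = awgn_channel(c_mod, snr_db=17.0, rng=rng)
# Empirical SNR of the simulated link, in dB.
measured_snr_db = 10 * np.log10(
    np.mean(np.abs(c_mod) ** 2) / np.mean(np.abs(received - c_mod) ** 2)
)
```

With a few thousand samples, the measured SNR should land close to the requested value, which gives a quick check that the noise variance was derived from the SNR as intended.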
The decoder 18 first demodulates the received complex modulated covert signal Ċmod as a standard modulated cover signal, which is then subtracted from Ċmod to reveal the encoded noise vector Ċenc. Then, it decodes Ċenc to recover the original message M. The critic module 22 (steganalyzer) is required to distinguish between Cmod and Nmod. It accepts Cmod or Nmod and calculates the confidence probability (Pcon) for each sequence. The critic module 22 measures the statistical properties of both Cmod and Nmod. Cmod can be detected as an altered message if the two sequences Cmod and Nmod have different distributions. Thus, the encoder 14 has to modify Cmod so that it looks statistically similar to Nmod. If Pcon=0.5, the encoder 14 has performed well, since the critic module 22 cannot distinguish between Cmod and Nmod. At that point, Cmod cannot be detected as an altered message, and the encoder 14 and decoder 18 have been trained to generate an undetectable covert signal.
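The demodulate-then-subtract step can be sketched as follows for a QPSK cover. This is a sketch under stated assumptions: the helper names, the hard-decision demodulator, the small Gaussian stand-in for the encoder output, and the noiseless channel are all hypothetical choices, not the disclosed implementation.

```python
import numpy as np

def qpsk_modulate(bits):
    """Map bit pairs to unit-power QPSK symbols."""
    b = bits.reshape(-1, 2)
    return ((2 * b[:, 0] - 1) + 1j * (2 * b[:, 1] - 1)) / np.sqrt(2)

def qpsk_hard_decision(symbols):
    """Snap each received sample to the nearest ideal QPSK point."""
    return (np.sign(symbols.real) + 1j * np.sign(symbols.imag)) / np.sqrt(2)

def isolate_encoded_noise(received):
    """Demodulate as a standard cover signal, then subtract it out."""
    cover_estimate = qpsk_hard_decision(received)
    return received - cover_estimate

rng = np.random.default_rng(1)
cover = qpsk_modulate(rng.integers(0, 2, size=48))  # 24 cover symbols
# Stand-in for the encoder output C_enc: a small complex perturbation (assumption).
c_enc = 0.05 * (rng.standard_normal(24) + 1j * rng.standard_normal(24))
c_mod = cover + c_enc

# Noiseless channel for clarity: the residual is the encoded noise vector.
recovered = isolate_encoded_noise(c_mod)
```

The residual matches the embedded vector only while the perturbation is too small to push a sample into another quadrant, which is one way to see why the power of Cenc is constrained.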
The encoder 14, decoder 18 and critic module 22 can all be neural networks with parameters θE, θD, and θC, respectively. The encoder 14 network is designed to accept M of length k in bits (i.e., $M \in \mathbb{B}^{k \times 1}$, where $\mathbb{B} = \{0, 1\}$) and to output covert noise $C_{enc} \in \mathbb{C}^{(k/2) \times 1}$. For practical implementation, this is constrained to a variance less than or equal to $\sigma_{HW}^{2}$. Note that M and Cenc have the same length k when Cenc is counted in real values (k/2 complex samples, each with a real and an imaginary part). The decoder 18 accepts the demodulated encoded noise vector $\dot{C}_{enc} \in \mathbb{C}^{(k/2) \times 1}$ and outputs $\dot{M} \in \mathbb{R}^{k \times 1}$. $\dot{M}$ is restricted to the range (0,1). At the end of a successful training process, $\dot{M}$ should converge to values in $\mathbb{B}$. The critic network accepts either Cmod or Nmod, each in $\mathbb{C}^{(k/2) \times 1}$, and outputs Pcon, which is restricted to the range (0,1).
The encoder 14 network starts with a fully connected (FC) layer 24 without any activation function. The FC layer 24 performs an initial permutation of the input data and changes the domain of the input data from the bit domain to the real domain to increase the mapping space and avoid singularities. The rest of the network consists of multiple convolutional layers, which extract an optimal feature representation for M. Each convolutional layer is described as Conv(W, din, dout, s), where W is the feature window size, din is the input depth of the feature vector, dout is the depth of the output feature vector, and s is the stride. The last layer is a k-normalization layer to maintain Cenc's power constraint. Finally, we use a “real to complex” layer to merge the real output vector into a complex noise vector, $C_{enc} \in \mathbb{C}^{(k/2) \times 1}$.
The decoder 18 network starts with a “complex to real” layer to convert Cmod to a real data vector, followed by an FC layer 24, which acts as a denoising layer to compensate for the noise introduced by the channel between the transmitter and the receiver. The rest of the network consists of multiple convolutional layers to decode the encoded feature representation and obtain $\dot{M}$. The last layer has a sigmoid activation function to restrict the values of $\dot{M}$ to the range (0,1). After a successful training process, $\dot{M}$ should converge to the bit values. The critic network (module) 22 is similar to the decoder 18 network. However, it differs from the decoder 18 network in having an extra FC layer 26 followed by a sigmoid activation function to output Pcon.
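The “real to complex” and “complex to real” layers can be pictured as simple reshaping steps. The sketch below assumes that the first half of the real vector holds the real parts and the second half holds the imaginary parts; the text does not state the layout, so this ordering and the function names are illustrative assumptions.

```python
import numpy as np

def real_to_complex(x):
    """Merge a real vector of even length k into k/2 complex samples."""
    half = x.shape[-1] // 2
    return x[..., :half] + 1j * x[..., half:]

def complex_to_real(z):
    """Inverse mapping: k/2 complex samples back to a real vector of length k."""
    return np.concatenate([z.real, z.imag], axis=-1)

k = 48
x = np.arange(k, dtype=float)  # placeholder for the encoder's real output
z = real_to_complex(x)         # shape (k/2,), complex
x_back = complex_to_real(z)    # shape (k,), real
```

Because the two mappings are exact inverses, k real values and k/2 complex samples carry the same information, which is consistent with M and Cenc being described as having the same length.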
The k-normalization layer is designed to constrain Cenc's power level to mimic a given hardware impairment $\sigma_{HW}^{2}$. In this work, we provide a generic design for the k-normalization layer such that it accepts $\sigma_{HW}^{2}$ as an input and can generate different levels of hardware noise as required by the system. Thus, the k-normalization layer is formulated as:
where xi and yi are the elements of the input vector X and output vector Y respectively.
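The k-normalization expression itself does not survive in this text. As a hedged illustration only, one plausible form rescales the input vector X so that the complex samples built from the output Y have mean power equal to the requested hardware-noise variance. The function name `k_normalize`, the scaling rule, and the example variance are assumptions, not the formula of the disclosure.

```python
import numpy as np

def k_normalize(x, sigma2_hw):
    """Rescale x so the k/2 complex samples built from it have mean power sigma2_hw.

    Assumed form: y_i = sqrt((k/2) * sigma2_hw) * x_i / ||x||.
    """
    k = x.shape[-1]
    # Total energy of k/2 complex samples at mean power sigma2_hw.
    target_norm = np.sqrt((k / 2.0) * sigma2_hw)
    return target_norm * x / np.linalg.norm(x, axis=-1, keepdims=True)

rng = np.random.default_rng(2)
x = rng.standard_normal(48)         # placeholder for the last conv layer output
y = k_normalize(x, sigma2_hw=0.01)  # hypothetical hardware-noise variance

# Pair the reals into complex samples and measure the resulting power.
c_enc = y[:24] + 1j * y[24:]
power = np.mean(np.abs(c_enc) ** 2)
```

Passing the variance as an argument mirrors the stated design goal of a layer that can generate different levels of hardware noise on request.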
In this embodiment, the encoder 14 encodes a secret message M to produce a noise vector Cenc, which is modulated over a cover signal to produce a covert signal Cmod. The main goal of the encoder 14 is to create a Cmod that looks like a distorted modulated signal for a defined modulation order m. Moreover, Cenc should have the same statistical properties as the hardware noise impairments of the transmitter (e.g., transmitting radio 10). The decoder 18 knows the encoding process, so it can recover the message. On the other hand, the critic network (module 22) measures the statistical properties of either Cmod or Nmod to determine whether the input signal is altered. In a learning-based model, the encoder 14, decoder 18, and critic module 22 can all be configured as neural networks. The encoder 14 network is trained to encode a secret message M to generate covert signals Cmod such that only the decoder 18 can recover M, and the critic network (module 22) cannot do better than random guessing between Cmod and Nmod.
One can define E(θE, M), D(θD, Cmod), and C(θC, Cmod, Nmod) as the mapping functions of the encoder 14, decoder 18 and critic module 22, respectively. Moreover, we define $d(M, \dot{M})$ as the L2 norm between M and $\dot{M}$. Intuitively, the loss function of the decoder 18 can be formulated as:
where $E_M\{\cdot\}$ is the expected value over the M data set. As shown, both the encoder 14 and the decoder 18 try to optimize their parameters to achieve communication reliability by minimizing the mean error between the original message M and the predicted message $\dot{M}$. The critic network (module 22) has the same role as the discriminator in GANs [11]. Thus, the critic's loss function can be expressed as:
LC (θC, θE, M, Nmod) represents the binary cross entropy loss between the distorted cover Nmod and the covert Cmod, which depends on θC and θE. In this model, one can achieve the most adversarial case for the critic network (module 22) by optimizing θC using the above loss, which accepts the output of the encoder 14 network as an input. Thus, during the training process, one can freeze θE while updating θC to ensure that the critic network can make an informed decision when distinguishing between Nmod and Cmod. As mentioned in the above discussion, the encoder 14 should generate a covert message Cmod that has the same statistical properties as Nmod; however, the loss function presented is not normally sufficient to accomplish this. So, a joint objective function is defined between the encoder 14 and the decoder 18 so that they can defeat the critic network (module 22) by learning an optimal transmission scheme such that the critic reaches maximum uncertainty between Nmod and Cmod, and only the decoder 18 can recover the message. This loss function LE,D can be expressed as:
$L_{E,D}(\theta_E, \theta_D, \theta_C, M) = L_D(\theta_E, \theta_D, M) + L_C(\theta_C, \theta_E, M, N_{mod})$
Here, the first term maintains the communication reliability between the encoder 14 and the decoder 18, while the second term guarantees that the generated covert Cmod has the same statistical properties as the distorted cover signal Nmod. Similar to the critic network (module 22), both the encoder 14 and the decoder 18 update their parameters (i.e., θE and θD) based on LE,D (θE, θD, θC, M) while the parameters of the critic module 22 are frozen.
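The individual loss expressions are not reproduced in this text, so the following NumPy sketch only illustrates forms consistent with the surrounding definitions: a mean L2 distance for the decoder, a binary cross entropy for the critic, and their sum for the joint objective. The function names, the label assignment (1 for Nmod, 0 for Cmod), and the `eps` guard are assumptions; the disclosure may use a different sign or label convention for the encoder update.

```python
import numpy as np

def decoder_loss(m, m_hat):
    """Mean L2 distance between messages and decoder outputs (one message per row)."""
    return np.mean(np.linalg.norm(m - m_hat, axis=1))

def critic_loss(p_cover, p_covert, eps=1e-12):
    """Binary cross entropy, assuming label 1 for N_mod and label 0 for C_mod."""
    return -np.mean(np.log(p_cover + eps)) - np.mean(np.log(1.0 - p_covert + eps))

def joint_loss(m, m_hat, p_cover, p_covert):
    """Joint objective written as L_D + L_C, following the text."""
    return decoder_loss(m, m_hat) + critic_loss(p_cover, p_covert)

# Toy case: perfect decoding and a critic that outputs 0.5 for every input.
m = np.array([[0.0, 1.0, 1.0, 0.0], [1.0, 0.0, 0.0, 1.0]])
m_hat = m.copy()
p_half = np.full(2, 0.5)

loss = joint_loss(m, m_hat, p_half, p_half)
worst_decode = decoder_loss(m, 1.0 - m)  # every bit flipped
```

In the toy case the decoder term vanishes and the critic term equals log 4, the value a binary cross entropy takes when the critic is maximally uncertain.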
For the steganography system requirements, one can define I(X;Y) as the mutual information between X and Y. In addition, define DKL (P∥Q) as the KL divergence between P and Q distributions.
For a fixed cover distribution $P_N(n)$ and message distribution $P_M(m)$, a steganography system having encoding and decoding functions (ε) is perfectly secure if $I(M;\dot{M}) > 0$ and $D_{KL}(P_N(n)\,\|\,P_C(c)) = 0$.
The first condition ensures the communication reliability between the transmitter (such as transmitter radio 10) and the receiver (such as receiver radio 12) (i.e., a useful steganography system), while the second guarantees that the critic function (module 22) cannot distinguish between the cover and covert messages. From the previous definition, $I(M;\dot{M})$ is given by:
$I(M;\dot{M}) = H(M) - H(M \mid \dot{M})$
where $H(\cdot)$ is the binary entropy function. The first goal of the steganography system is maximizing $I(M;\dot{M})$. The conditional entropy $H(M \mid \dot{M})$ depends on the probability density function $P(M \mid \dot{M})$, which is given by:
Assuming M symbols are uniformly distributed, we can use the likelihood approximation (i.e., $P(M \mid \dot{M}) \simeq P(\dot{M} \mid M)$). Since $\dot{M} = D(\varepsilon(M))$, $P(\dot{M} \mid M)$ can be assumed to be a normal distribution with mean M and maximum acceptable variance (error) e (i.e., $P(\dot{M} \mid M) \sim \mathcal{N}(M, e)$).
Consequently
where L is the total number of symbols in the message set. Maximizing $I(M;\dot{M})$ is equivalent to minimizing the mean square error between M and $\dot{M}$. Accordingly, the learning model satisfies the secrecy condition (i.e., $D_{KL}(P_N(n)\,\|\,P_C(c)) = 0$).
As stated earlier, the encoder 14 and the critic network (module 22) act as the generator and the discriminator in a typical GAN architecture. Thus, an optimal critic network C* can be derived as:
Moreover, the optimal encoder ε* can be obtained from:
$\varepsilon^{*} = \min\{-\log 4 - 2\,JSD(P_C(c)\,\|\,P_N(n))\}$
where JSD(P∥Q) is the Jensen-Shannon divergence between the P and Q distributions. Thus, one obtains ε* if:
$JSD(P_C(c)\,\|\,P_N(n)) = 0$
Consequently:
Therefore, the steganography system is perfectly secure if the output of the critic network (module 22) equals ½, which means that the critic can not distinguish between the cover and covert messages, i.e.:
$P_N(n) \simeq P_C(c)$
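The link between a critic output of ½ and matching distributions can be checked numerically with the standard GAN result that the optimal discriminator is the ratio of the cover density to the sum of the cover and covert densities. The sketch below uses small discrete distributions as hypothetical stand-ins for the cover and covert signal statistics; the function names and the example probabilities are illustrative assumptions.

```python
import numpy as np

def kl(p, q):
    """KL divergence for discrete distributions, with q > 0 wherever p > 0."""
    mask = p > 0
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def jsd(p, q):
    """Jensen-Shannon divergence between two discrete distributions."""
    mid = 0.5 * (p + q)
    return 0.5 * kl(p, mid) + 0.5 * kl(q, mid)

def optimal_critic(p_cover, p_covert):
    """Standard GAN optimal discriminator: p_cover / (p_cover + p_covert)."""
    return p_cover / (p_cover + p_covert)

# Hypothetical 5-bin histograms for the cover and two candidate covert signals.
p_n = np.array([0.1, 0.2, 0.4, 0.2, 0.1])
p_c_matched = p_n.copy()
p_c_shifted = np.array([0.3, 0.3, 0.2, 0.1, 0.1])

critic_matched = optimal_critic(p_n, p_c_matched)  # 0.5 in every bin
critic_shifted = optimal_critic(p_n, p_c_shifted)  # deviates from 0.5
```

When the covert histogram matches the cover histogram, the divergence is zero and the best possible critic can only output ½; any mismatch gives the critic bins in which it can do better than guessing.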
Experimental verification of the above was performed with the TensorFlow framework. The input length is k=48, and the maximum relative constellation error Erms values are taken to be similar to those for the 64-point FFT in the OFDM PHY of the WiFi standard, such that the scheme can be used over an OFDM signal. Two training sets are constructed for the secrets M and distorted cover messages Nmod. Each training set consists of 20000 symbols and each symbol is of size k. The cover signal is embodied as a modulated QPSK signal (i.e., m=2). The batch size is 8000. An optimizer with a learning rate of 0.001 is used to optimize the three networks included in the learning model. The number of training epochs is 8000. The three networks are trained in each epoch in two alternating steps: first, the parameters of the critic network (module 22) are updated while the parameters of both the encoder 14 and the decoder 18 are frozen; then, the parameters of both the encoder 14 and the decoder 18 are updated jointly while the parameters of the critic network (module 22) are frozen. The channel's training signal-to-noise ratio (SNRt) equals 17 dB. For the testing phase, a testing set consisting of 1000 symbols for M and Nmod was used, and a range for SNRch was defined from 0 to 40 dB.
The corresponding structures, materials, acts, and equivalents of all means or step plus function elements in the claims below, if any, are intended to include any structure, material, or act for performing the function in combination with other claimed elements as specifically claimed. The description of the present invention has been presented for purposes of illustration and description, but is not intended to be exhaustive or limited to the invention in the form disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the invention. The embodiment was chosen and described in order to best explain the principles of one or more aspects of the invention and the practical application, and to enable others of ordinary skill in the art to understand one or more aspects of the invention for various embodiments with various modifications as are suited to the particular use contemplated.
This application claims the benefit of U.S. Provisional Patent Application No. 63/244,393, filed on Sep. 15, 2021, the entirety of which is hereby incorporated herein by this reference.
Number | Date | Country
---|---|---
63244393 | Sep 2021 | US