The present disclosure is concerned with spectrum analysis to enable improved dynamic spectrum access (DSA) to monitor communication on, and to make effective use of, available bandwidth.
With the introduction of new technologies and the increased requirement for communication between users, devices and technologies, spectral bandwidth is becoming scarce and its use has to be optimised to ensure that continuous communication demands are met. This is particularly important in safety-critical systems such as in the aerospace field, where the quality of communications must be guaranteed. This includes signals from the multiple sensors used in aircraft control and maintenance, actuator command signals, navigation instructions from the control tower etc. Whilst development is focusing on increasing bandwidth through e.g. 4G, 5G technologies and the like, there is also focus on making the best use of available bandwidth by analysing how that bandwidth is used and providing dynamic spectrum access (DSA) solutions where ‘spaces’ exist.
In addition, there may be the need, e.g. in military applications, to obtain information about existing communications in unknown environments.
While DSA solutions are known e.g. from H. Song, L. Liu, J. Ashdown and Y. Yi, ‘A Deep Reinforcement Learning Framework for Spectrum Management in Dynamic Spectrum Access,’ IEEE Internet of Things Journal, doi: 10.1109/JIOT.2021.3052691, Y. Xu, J. Yu, W. C. Headley and R. M. Buehrer, ‘Deep Reinforcement Learning for Dynamic Spectrum Access in Wireless Networks,’ MILCOM 2018-2018 IEEE Military Communications Conference, (MILCOM), Los Angeles, Calif., USA, 2018, pp. 207-212, doi: 10.1109/MILCO and O. Naparstek and K. Cohen, ‘Deep Multi-User Reinforcement Learning for Distributed Dynamic Spectrum Access,’ IEEE Transactions on Wireless Communications, vol. 18, no. 1, pp. 310-323, January 2019, doi: 10.1109/TWC.2018.2879433, these rely on predetermined communication channels and identifying which of those channels are currently available, and allowing communication on those. Conventionally, when a user wishes to communicate on a network, the user has to request access and obtain a response from the network, whereby the user is allocated a fixed bandwidth channel and time slot on which they may communicate. Such allocation is not particularly efficient and may not actually make use of available bandwidth in view of the fluctuating nature of the spectrum. Alternative solutions, such as described in G. Hong, J. Martin and J. Westall, ‘Adaptive bandwidth binning for bandwidth management,’ Elsevier, 2019, use a technique known as data binning, where metadata is obtained from the data packet to be transmitted, and the data is placed on the communication bus by means of a scheduler.
More recently, the evolution of embedded machine learning has allowed the development of new techniques for spectrum analysis and DSA. Many of these systems, however, also rely on detecting non-busy, pre-determined channels to which access is provided. Other solutions to obtaining access by analysing the spectrum, such as H. Chang, H. Song, Y. Yi, J. Zhang, H. He and L. Liu, ‘Distributive Dynamic Spectrum Access Through Deep Reinforcement Learning: A Reservoir Computing-Based Approach,’ IEEE Internet of Things Journal, vol. 6, no. 2, pp. 1938-1948, April 2019, doi: 10.1 are known. Whilst such methodologies provide an effective DSA solution, the fully automated nature, and the use of Recurrent Neural Networks (RNN), means that the information obtained from the analysis and the way in which it is used to make access decisions is not in a human-readable form and is not transparent. Some industries, e.g. the aerospace industry, require, for safety, that data used in such access decisions is available to be read and understood by humans. Certification of communications systems often relies on such information being made available during the process. RNNs also require high computational power which makes them difficult to integrate into some devices or systems.
There is, therefore, a need for an improved technique for effective DSA to a communication spectrum.
According to the disclosure, there is provided a method of allocating space on a spectrum on which information is transmitted, to information to be transmitted, the method comprising: identifying carrier frequencies and bandwidths of information being transmitted on the spectrum, determining an optimal carrier frequency and bandwidth for the information to be transmitted based on the carrier frequencies and bandwidths of information being transmitted on the spectrum; transmitting the information to be transmitted using modulation at the identified optimal carrier frequency and bandwidth.
Also provided is a system for performing the method.
Examples of techniques according to the disclosure will now be described by way of example only. The scope of the invention is not limited by the description and variations are possible within the scope of the invention as defined by the claims.
The dynamic spectrum allocation technique of the present disclosure allocates space on a spectrum to information to be transmitted using a machine learning based concept such that space on the spectrum can be allocated based on a real time determination of where space exists.
The spectrum allocation may also use the results of a spectrum analysis technique designed to recurrently identify characteristics of a spectrum based on identification of the information currently being transmitted on the spectrum, to allocate space on a spectrum to a user wishing to communicate on the network. Intermediate data can be provided during the analysis in a form that can be read and interpreted by a human, which means that the technique can be used in systems such as avionics, the certification of which relies on the system being understandable to humans.
The spectrum allocation technique of this disclosure uses inputs indicative of the current use of the spectrum and applies deep reinforcement learning to identify the types of modulation and the location in the spectrum, in terms of carrier frequency and bandwidth, where the information to be transmitted can be placed in the spectrum. This will be described, by way of example, in more detail below.
In one example of a system in which the spectrum allocation of this disclosure can be used, the spectrum is first analysed by fragmenting the signal into different types of modulation occurring on the spectrum. These may be identified in terms of the number of different modulations, their carrier frequencies and bandwidths and, in some examples, the name of the type of modulation.
Such a system is described below with reference to
In the system of
(Frxh)n=S{I,Fci,Bwi,namei}n
These characteristics can then be fed into a time analysis block 3 to obtain a time analysis using a stream reasoning engine as described further below, to detect specific patterns (Pj) in time, defined as a set of rules (Rj), such as frequency jumps over time or a particular periodicity, which can be used for the allocation of parts of the spectrum for signals to be communicated on the network, as described further below.
In one example, as shown, the detected patterns (Pj) are provided to a processor (here described as the SDR brain) 4. The SDR brain 4 determines, based on the time analysis, the proper modulation to use for communication of the next piece of information to be transmitted. In addition, the SDR brain 4 may update the set of rules (Rj) used by the time analysis block 3 based on a determination of which patterns are relevant to be recognised, based on the spectrum characteristics.
The brain 4 may select the appropriate modulation to use for the information to be transmitted in various ways. One example is by using a decision tree such as shown in
This is, of course, only one example of the methodology that could be used for selecting the appropriate modulation. Even if a similar decision tree is used, the rules and patterns are selected according to the application in question.
In the example shown, once the spectrum has been analysed, and the modulation type has been selected, the signal characteristics, the patterns from the time analysis and the selected modulation can be fed to a deep reinforcement learning (RL) block 5 with a signal allocation algorithm 6 which can determine the carrier frequency and bandwidth of the modulated signal. Using the determined carrier frequency Fc and bandwidth Bw, the SDR brain 4 allocates a part of the spectrum to the signal to be transmitted and introduces the signal to the spectrum at the best carrier frequency and bandwidth available.
With reference to
As described above, this block 2 identifies the number of modulations i present in a spectrum being observed, identifying their carrier frequency Fci and bandwidth Bwi to determine their location in the spectrum, and classifying them (name) into a set of predetermined classes.
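By way of illustration only, the output of the characterisation block 2 could be held in a small record such as the following Python sketch. The class names `Modulation` and `SpectrumSnapshot` and the helper `free_gaps` are hypothetical and are not part of the disclosure; the sketch only shows how the count i, the carrier frequencies Fci, the bandwidths Bwi and the names could be kept in a human-readable form and queried for unoccupied parts of the spectrum.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Modulation:
    """One identified modulation (hypothetical record, not from the source)."""
    carrier: float           # Fc_i, in any consistent frequency unit
    bandwidth: float         # Bw_i, same unit as carrier
    name: str = "unknown"    # class label; 'unknown' if not recognised

    @property
    def band(self) -> Tuple[float, float]:
        half = self.bandwidth / 2.0
        return (self.carrier - half, self.carrier + half)


@dataclass(frozen=True)
class SpectrumSnapshot:
    """Characterisation of the spectrum at observation index n."""
    n: int
    modulations: Tuple[Modulation, ...]

    @property
    def count(self) -> int:  # the number of modulations i
        return len(self.modulations)


def free_gaps(snapshot: SpectrumSnapshot, f_min: float, f_max: float) -> List[Tuple[float, float]]:
    """Return the (start, stop) intervals in [f_min, f_max] not covered by any band."""
    gaps = []
    cursor = f_min
    for lo, hi in sorted(m.band for m in snapshot.modulations):
        if lo > cursor:
            gaps.append((cursor, min(lo, f_max)))
        cursor = max(cursor, hi)
        if cursor >= f_max:
            break
    if cursor < f_max:
        gaps.append((cursor, f_max))
    return gaps


snapshot = SpectrumSnapshot(
    n=0,
    modulations=(Modulation(100.0, 10.0, "QPSK"), Modulation(130.0, 20.0)),
)
gaps = free_gaps(snapshot, 90.0, 150.0)
```

In this toy case the two bands occupy 95 to 105 and 120 to 140, so `gaps` lists the three unoccupied intervals between 90 and 150.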
One way of identifying or predicting the number of modulations i involves providing the frequency representation 10′ of the spectrum to be analysed to a machine learning based model. A simple regression calculation can be performed (block 20) preferably using any known type of predictor that is able to identify gradients in data, e.g., a one-dimensional CNN, a Gaussian Process, or an XGBoost regressor, although other simpler models e.g. polynomial regression or a Gaussian k Nearest Neighbours algorithm could also be used.
Next (block 30), the carrier frequency (Fc) is determined for each modulation. This may be done using similar predictors but multivariate regression is required because outputs need to be predicted for the multiple number of modulations i from the first block 20.
A similar procedure is then performed at block 40 to determine the spectral bandwidth Bw for each modulation.
In one example, each regressor block may be trained separately, to avoid errors being propagated through the system.
These processes result in the incoming signal being fragmented into the different modulations defined by their position and width in the spectrum, stored at block 50. While this provides extensive information as to the information populating the spectrum, the information can be further augmented by specifying the type of signals the identified modulations consist of, in a signal classification block 60. One way of identifying the types of modulation by name is by providing the different modulations that have been identified and located in the spectrum to a one dimensional CNN where they can be classified according to a list of known or expected modulations and allocated a corresponding name (name). Types of signal that are not recognised at this stage can be identified by a classification for ‘unknown’ signals.
In one example, the classification according to a list of known or expected modulations can be based on a list that is updated by machine learning as will be discussed below. In a simpler system, though, the list can be pre-established and fixed.
This spectrum characterisation, which fragments the spectrum and identifies the location (by carrier frequency and bandwidth) and, preferably, the type of the different modulations present, allows a user or system to understand the environment and how the spectrum is used on a real-time basis. This information can then be used for DSA and/or to understand how unknown environments are being used e.g. for military use. The fragmentation and characterisation allow information to be obtained from both known and unknown environments. Previous systems have relied on a prior study and understanding of the environment, and on isolation of modulations based on that understanding. The usefulness of these previous systems is therefore limited to static environments and their reliability is sensitive to noise fluctuations.
In the example shown in
In one example according to the disclosure, therefore, the above-described spectrum characterisation concept can be modified to extend to the microwave spectrum. An example of architecture for spectrum characterisation for microwave photonics sensing is shown in
The spectrum is converted to a signal that can be processed by regression blocks such as described above, in relation to
To further improve the usefulness of the information obtained about the spectrum, the output of the spectrum characterisation block 2 may be further subjected to a time analysis (block 3). The time analysis block 3 takes the signal classification information and analyses it in the time domain to identify temporal features of each type of modulation according to a set of rules. Such temporal features may be e.g. periodicity or frequency hopping. The time analysis block 3 outputs the time domain information in an interpretable manner, thus improving the explainability of the system (important for e.g. certification in avionics). Information provided in this manner can also be embedded in SDR devices which improves the speed of DSA.
One way of performing the time analysis uses stream reasoning—i.e. an incremental reasoning over streams of incrementally available information. Stream reasoning allows the analysis of rapidly changing information to detect patterns or trends in the data. Stream reasoning engines per se are known which can be used in the time analysis block 3. Rules can be set to e.g. analyse how many times a specific modulation has been detected, which modulations are hopping their carrier frequency and at what periodicity etc.
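As a minimal sketch only, and not a stream reasoning engine as such, the kind of rules mentioned above (how often a modulation has been detected, whether its carrier frequency hops, and with what periodicity) could be evaluated incrementally over a sliding window as follows. The class `TimeAnalysis`, its parameters `window` and `tol_hz`, and the exact form of the rules are assumptions made for illustration.

```python
from collections import defaultdict, deque


class TimeAnalysis:
    """Toy rule evaluation over a stream of (time, name, carrier) detections."""

    def __init__(self, window=8, tol_hz=1.0):
        self.tol_hz = tol_hz  # carrier change larger than this counts as a hop
        # Per modulation name: the most recent `window` (time, carrier) pairs.
        self.history = defaultdict(lambda: deque(maxlen=window))
        self.detections = defaultdict(int)

    def push(self, t, name, carrier_hz):
        """Add one detection coming from the characterisation stage."""
        self.detections[name] += 1
        self.history[name].append((t, carrier_hz))

    def hop_times(self, name):
        """Times at which the carrier differs from the previous observation."""
        obs = list(self.history[name])
        return [t1 for (_, f0), (t1, f1) in zip(obs, obs[1:])
                if abs(f1 - f0) > self.tol_hz]

    def hop_period(self, name):
        """Interval between hops if it is constant, otherwise None.

        Exact comparison is used for simplicity; real timestamps would
        need a tolerance.
        """
        hops = self.hop_times(name)
        if len(hops) < 3:
            return None
        intervals = {b - a for a, b in zip(hops, hops[1:])}
        return intervals.pop() if len(intervals) == 1 else None

    def patterns(self):
        """Human-readable summary of the rules for every modulation seen."""
        return {
            name: {
                "detections": self.detections[name],
                "hopping": bool(self.hop_times(name)),
                "hop_period": self.hop_period(name),
            }
            for name in list(self.history)
        }


engine = TimeAnalysis()
for t in range(8):
    # 'FSK' alternates between two carriers every two time steps; 'AM' is static.
    engine.push(t, "FSK", 100.0 if (t // 2) % 2 == 0 else 200.0)
    engine.push(t, "AM", 50.0)
report = engine.patterns()
```

The resulting `report` is a plain dictionary, which reflects the intent that the time-domain information remains interpretable by a human.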
As described above, using the time analysed spectrum information and the fragmented modulation information, a processor (SDR brain 4) is able to make a determination as to the most appropriate type of modulation to use for the information to be transmitted.
The information obtained from the spectrum characterisation, i.e. how the spectrum is currently being used, is processed to select a modulation for use by a signal to be transmitted.
The spectrum characterisation can be used by a learning block 5 to determine where in the spectrum available space exists or is likely to exist, so that signals to be transmitted can be allocated to such spaces and transmitted using a selected type of modulation.
The learning block is the focus of the present disclosure and, as mentioned above, may also be used with inputs from systems other than that described above.
The learning block 5 obtains frequency information—i.e. the selected modulation—of the signal to be transmitted and also obtains information, in the frequency domain, about the current use of the spectrum and makes a decision, using Reinforcement Learning (RL) techniques, to introduce the signal to be transmitted into the spectrum at the best carrier frequency and bandwidth available.
The RL learning block 5 trains on an artificial and controlled environment which uses information from signal characterisation, identifying the carrier frequencies and bandwidths being used in the spectrum, and, once trained, is able to identify the optimal carrier frequency and bandwidth for a new signal to be transmitted, that minimises inter-signal interference while maximising the bit rate transmission of the new signal. In one example, the RL block 5 uses the outputs from each of the blocks 20, 30, 40 of
The training of the block is illustrated in
The environment reacts to the received action, which results in a new state to be used for the next iteration, and also provides a ‘reward metric’ for the current iteration. The reward (ri) is a metric of the interference produced between the original spectrum and the new modulation at the frequency suggested by the action.
The agent receives the ‘reward’ and interprets this as an indication of how successful the suggested action was. Based on this, the agent is also able to train itself and update its current knowledge.
One way of providing the reward function uses the ‘Hadamard’ product as discussed in R. A. Horn, ‘The hadamard product,’ Proc. Symp. Appl. Math, 1990. The interference (ΔW) is calculated using the Hadamard product between the original spectrum and the newly allocated modulation in the selected carrier frequency. The latter assumes a Gaussian noise wavelength in the rest of the spectrum. To compute the reward based on the observed interference, the signal is swept through the whole spectrum, checking interference in every possible position, to obtain the maximum (ΔWmax) and minimum (ΔWmin) possible interferences. The observed interference (ΔWobserved) is used to obtain the reward function:
This technique allows a faster response time from the environment when measuring the interference, resulting in a shorter training time than in conventional approaches.
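The following Python sketch illustrates, under stated assumptions, how an interference measure based on the Hadamard (element-wise) product and a sweep over all positions could be turned into a reward. The rectangular occupancy mask for the new modulation, the summation of the product into a single number, and the min-max normalisation between ΔWmin and ΔWmax are all assumptions made for illustration; they are not the reward function of the disclosure. The function names are hypothetical.

```python
def occupancy_mask(freqs, carrier, bandwidth):
    """Assumed model of the new modulation: 1 inside its band, 0 elsewhere."""
    half = bandwidth / 2.0
    return [1.0 if abs(f - carrier) <= half else 0.0 for f in freqs]


def interference(spectrum, candidate):
    """Hadamard product of the two spectra, summed to a scalar (assumption)."""
    return sum(s * c for s, c in zip(spectrum, candidate))


def reward(freqs, spectrum, carrier, bandwidth):
    """Min-max normalised reward in [0, 1]; 1 means least interference.

    The normalisation is an illustrative assumption, not the source formula.
    """
    observed = interference(spectrum, occupancy_mask(freqs, carrier, bandwidth))
    # Sweep the new modulation through every position to find the extremes.
    sweep = [interference(spectrum, occupancy_mask(freqs, fc, bandwidth))
             for fc in freqs]
    w_min, w_max = min(sweep), max(sweep)
    if w_max == w_min:
        return 1.0  # every position interferes equally
    return (w_max - observed) / (w_max - w_min)


freqs = list(range(10))                       # ten frequency bins
spectrum = [0, 0, 5, 5, 5, 0, 0, 0, 0, 0]     # one existing signal in bins 2 to 4
r_on_signal = reward(freqs, spectrum, carrier=3, bandwidth=2)
r_in_gap = reward(freqs, spectrum, carrier=8, bandwidth=2)
```

Placing the new modulation directly on the existing signal gives the lowest reward, whereas placing it in the empty part of the spectrum gives the highest.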
In this example, the learning iterations are considered to be independent of each other, usually referred to in the literature as ‘episodic tasks’. Because this entails considering only immediate rewards, the value function that results from following a policy π is given by the following simplification of the Bellman equation, which associates the policy π with the probability of reaching the reward r following the action a. Note that this version considers only the immediate reward, discarding previous states, which allows the number of training cycles to be reduced compared with the Q-learning approach used in the state of the art.
$$V(s) = \sum_{a} \pi(a \mid s) \cdot P_{ra}^{a}\bigl(r(s,a)\bigr)$$
The action space for the proposed RL model is technically continuous since frequency is a continuous parameter from an analog point of view. Following this assumption, actor-critic methods might be the most successful model for the task, given their advantage over continuous action spaces. However, since the observed state is already not represented as a continuous shape (it includes information from the new modulation as well as the original spectrum), we can simply discretize the available RF range, making it more compatible with most RL models, but maintaining the continuous nature of the spectrum. Under these conditions, a contextual bandit model would be a more suitable model for the task, since it not only benefits from the discrete action space but also the episodic nature of the task. This allows the best carrier frequency and bandwidth to be selected without defining previous discrete channels as is known in the art.
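A contextual bandit over a discretised frequency range can be sketched as follows. This is a minimal tabular, epsilon-greedy illustration and not the model of the disclosure: the class `ContextualBandit`, the function `toy_reward`, the use of the raw occupancy tuple as the context, and the reward being simply the negative of the overlapped power are assumptions. Bit rate is not modelled here.

```python
import random


class ContextualBandit:
    """Tabular epsilon-greedy contextual bandit (illustrative assumption)."""

    def __init__(self, n_actions, epsilon=0.1, seed=0):
        self.n_actions = n_actions      # number of discretised carrier positions
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.value = {}                 # (context, action) -> mean reward
        self.count = {}                 # (context, action) -> number of pulls

    def act(self, context, explore=True):
        if explore:
            # Try every action once per context before exploiting.
            untried = [a for a in range(self.n_actions)
                       if (context, a) not in self.count]
            if untried:
                return self.rng.choice(untried)
            if self.rng.random() < self.epsilon:
                return self.rng.randrange(self.n_actions)
        return max(range(self.n_actions),
                   key=lambda a: self.value.get((context, a), 0.0))

    def update(self, context, action, reward):
        # Incremental mean; only the immediate reward is used (episodic task).
        key = (context, action)
        n = self.count.get(key, 0) + 1
        self.count[key] = n
        old = self.value.get(key, 0.0)
        self.value[key] = old + (reward - old) / n


def toy_reward(occupancy, action, width=1):
    """Negative power overlapped by a signal centred on bin `action`."""
    lo = max(0, action - width)
    hi = min(len(occupancy), action + width + 1)
    return -float(sum(occupancy[lo:hi]))


contexts = [
    (3, 3, 0, 0, 0, 0, 2, 2),   # busy at both edges, free in the middle
    (0, 0, 0, 4, 4, 4, 0, 0),   # busy in the middle, free at the edges
]
bandit = ContextualBandit(n_actions=8, epsilon=0.1, seed=0)
for step in range(200):
    ctx = contexts[step % len(contexts)]
    action = bandit.act(ctx)
    bandit.update(ctx, action, toy_reward(ctx, action))

best = [bandit.act(ctx, explore=False) for ctx in contexts]
```

After training, the greedy action for each context falls in a part of the toy spectrum where the new signal overlaps no existing power, without any channel having been defined in advance.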
Using the concepts of this disclosure, communication in busy spectrums can be improved in terms of reliability and speed, and interference can be minimised between devices sharing the same bandwidth. The system can adapt dynamically to changing environments and newly introduced modulations allowing for increased safety and reliability. In providing explainable or readable outputs throughout the system, the system is suitable for certified use in many applications e.g. avionics and other safety-critical applications. The concept allows for an optimal use of available spectrums thus enabling communication of increased amounts of data.
Variations of the examples described above are possible within the scope of the invention as defined by the claims.
Number | Date | Country | Kind |
---|---|---|---|
21194971.4 | Sep 2021 | EP | regional |