This invention relates to a method of signal processing, particularly but not exclusively for processing a Code Division Multiple Access (CDMA) signal.
Signal processing finds application in a wide variety of technical fields, such as in telecommunications, in neural networks and in data compression. When information is encoded into a signal, a common problem in signal processing is how to determine this information given some measured characteristics of the signal. This is typically performed by finding the solution which maximises the posterior probability (the probability of the information given the signal characteristics).
Pearl (Probabilistic Reasoning in Intelligent Systems, Morgan Kaufmann Publishers, San Francisco, Calif., 1988), Jensen (An Introduction to Bayesian Networks, UCL Press, London, 1996) and MacKay (Information Theory, Inference and Learning Algorithms, Cambridge University Press, 2003) describe graphical models for the statistical dependence between acquired data and an iterative method for inferring the data from a signal, known as Belief Propagation (BP). When the graphical model comprises loops, there is no guarantee that the method will converge to the original information, although Weiss (Neural Computation 12 1, 2000) provides some theory to show when this will occur in restricted cases. When the space of solutions is contiguous, BP typically provides good performance.
BP has been extended by Mézard, Parisi and Zecchina (Science 297 812, 2002) to the case where the space of solutions is fragmented and for problems that can be mapped onto sparse graphs.
Kabashima (J. Phys A 36 11111, 2003) describes a technique for inference of the information given a signal, based on passing condensed messages between variables, consisting of averages over grouped messages. This technique works well in cases where the solution space is contiguous. However, the technique does not work where there are many possible competing solutions, which is characteristic of a fragmented solution space; the emergence of competing solutions would typically prevent the iterative algorithm from converging. Problems in the area of signal processing often present such behaviour, for some values of certain key parameters which may be known or unknown.
The present invention seeks to provide an improved method of signal processing, against this background. The present invention provides a method of processing a signal to infer a first data set encoded therein, the method comprising the steps of measuring a plurality of characteristics of the signal; establishing a plurality of correlation matrices, each correlation matrix comprising a plurality of correlation values; generating second and third data sets; determining an update rule relating each datum of the second and third data sets to each other respective datum of the second and third data sets by way of the measured signal characteristics and the properties of the correlation matrix; applying the update rule to the second and third data sets to obtain updated second and third data sets; and generating an inferred data set representative of the encoded first data set from the updated second and third data sets.
Preferably, the method further comprises the steps of: determining a plurality of likelihoods, each likelihood comprising the probability of a signal characteristic given the first data set, with respect to a free parameter; and optimising the free parameter with respect to a predefined cost measure.
In a further aspect the invention provides an inference method for solving a physical problem mapped onto a densely connected graph, where the number of connections per variable is of the same order as the number of variables, comprising the steps of: (a) forming an aggregated system comprising a plurality of replicated systems, each of which is conditioned on a measurement obtained from a physical system, with a correlation matrix representing correlation among the replicated systems; (b) expanding the probability of the measurements given the solutions obtained by the replicated systems; (c) based on the expansion of the step (b), deriving a closed set of update rules, which are capable of being calculated iteratively on the basis of results obtained in a previous iteration, for a set of conditional probability messages given the measurements; (d) optimising free parameters which emerge from at least one of the steps (b) and (c) for the specific problem examined with respect to a predefined cost measure; (e) using the optimised parameters to derive an optimised set of update rules for the conditional probability messages given the measurements; (f) applying the update rules iteratively until they converge to a set of substantially fixed values; and (g) using the substantially fixed value to determine a most probable state of the variables.
Preferably, step (b) of the inference method comprises expanding the likelihood in the large number limit. Preferably, the inference method further comprises the further subsequent step of deriving from the optimised set a posterior estimate.
By the use of a correlation matrix, the method of the present invention permits the determination of a probability per datum, averaged over a plurality of correlated estimates. As a result of the optimisation with respect to a predefined cost, the value of an unknown, free parameter can be ascertained. This free parameter is an unknown characteristic of the signal, which in signal processing applications, may be any parameterised unknown introduced as a result of earlier processing of the signal, for instance, the introduction of noise and interference in a communication system, noisy inputs to a system in a neural network, or controlled distortion in a data compression system.
The invention finds application in various fields of signal processing. For example, in the field of Code Division Multiple Access (CDMA) it is possible to determine the probability of the original information (estimate) given the plurality of signal characteristics, such that the noise level which was previously unknown, can be ascertained. Estimation of noise is an important problem in signal detection for a communication system. This determination advantageously allows the detector itself to calculate a value for noise level and thereby reduces the probability of error in the detected information.
The present techniques may be applied to a broad range of applications, for example including inference in discrete systems and decoding in error-correction and compression schemes as described by Hosaka, Kabashima and Nishimori (Phys. Rev E 66 066126, 2002).
However, a specific example of an application to acquiring a data set from a Code Division Multiple Access (CDMA) signal will now be described by way of example only.
Multiple access communication refers to the transmission of multiple messages to a single receiver. In the system shown in
In the CDMA system of
A technique for detecting and decoding such messages is based on passing probabilistic messages between variables in a problem mapped onto a dense graph. Passing these messages directly, as separately suggested by Pearl, Jensen and MacKay, is infeasible due to the prohibitive computational costs. The technique disclosed in Kabashima based on passing condensed messages between variables, consisting of averages over grouped messages, works well in cases where the space of solutions is contiguous and iterative small changes will result in convergence to the most probable solution. However, this technique does not work where there are many possible competing solutions; the emergence of competing solutions would typically prevent the iterative algorithm from converging. This is the situation in signal detection in CDMA.
CDMA is based on spreading the signal by using K individual random binary spreading codes of spreading factor N. We consider the large-system limit, in which the number of users K is large (tends to infinity) while the system load β≡K/N is kept O(1) (of order 1). We focus on a CDMA system using binary phase shift keying (BPSK) symbols and will assume the power is completely controlled to unit energy. The received aggregated, modulated and corrupted signal is of the form

y_μ = (1/√N) Σ_{k=1..K} s_μk b_k + σ_0 n_μ
where b_k is the bit transmitted by user k, s_μk is the spreading chip value, n_μ is a Gaussian noise variable drawn from N(0, 1), and y_μ is the received message.
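For illustration, the quantities just defined can be instantiated numerically. The following sketch follows the standard large-system CDMA model; the dimensions, random seed and noise level are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(0)

K, N = 8, 32               # users and spreading factor; load beta = K/N
sigma0 = 0.5               # assumed true noise standard deviation

b = rng.choice([-1, 1], size=K)        # BPSK bits b_k, one per user
S = rng.choice([-1, 1], size=(N, K))   # random binary spreading chips s_mu_k
n = rng.normal(size=N)                 # Gaussian noise n_mu drawn from N(0, 1)

# Received message y_mu: superposition of all users' spread bits,
# normalised by sqrt(N), plus scaled Gaussian noise.
y = S @ b / np.sqrt(N) + sigma0 * n

beta = K / N               # system load, kept O(1) in the large-system limit
```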
The goal is to obtain an accurate estimate of the vector b for all users given the received message vector y by approximating the posterior P(b|y) (the probability of b given y). A method for obtaining a good estimate of the posterior probability in the case where the noise level is accurately known has been presented in Kabashima. However, the calculation is based on finding a single solution and is therefore bound to fail when the solution space becomes fragmented, for instance when the noise level is unknown, a case that is of high practical value.
The reason for the failure in this case can be qualitatively understood by the same arguments as in the case of sparse graphs; the existence of competing solutions results in inconsistent messages and prevents the algorithm from converging to an accurate estimate. An improved solution can therefore be obtained by averaging over the different solutions, inferred from the same data, in a manner reminiscent of the survey propagation (SP) approach, except that the messages in the current case are more complex.
Using Bayes' rule one obtains the BP equations (1):
where â_μk^(t+1) and a_μk^t are normalization constants. For calculating the posterior (2)
an expression representing the likelihood is required and is easily derived from the noise model (which is not necessarily identical to the true noise) (3)
where y_μ = y_μ u and u^T ≡ (1, 1, . . . , 1) (n-dimensional).
An explicit expression for inter-dependence between solutions is required for obtaining a closed set of update equations. We assume a dependence of the form (4)
where h_μk^t is a vector representing an external field and Q_μk^t is the matrix of cross-replica correlations. Furthermore, we assume the following symmetry between replica (5):
An expression for equation (4) immediately follows
where Z_μk^t is a normalization constant.
We expect the free energy obtained from the well behaved distribution P^t to be self-averaging, from which one deduces the following scaling laws: h ~ O(1) and p ~ O(n^(-1)). In the remainder of the application we will rescale the off-diagonal elements of Q_μk^t to g_μk^t/n, where g_μk^t ~ O(1).
To calculate correlations between replica we expand P(y_μ|B) (Eq. 3) in the large N limit, where N is much larger than 1 and inaccuracies occurring due to the approximation taken are negligible, as in Kabashima, to obtain (6):
where
σ is an estimate of the noise and C is a constant. Using the law of large numbers as outlined by Spiegel, Schiller and Srinivasan (Schaum's Outline of Probability and Statistics, Schaum, N.Y., 2000), we expect the variables Δ_μk to obey a Gaussian distribution.
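The Gaussian behaviour invoked here is easy to verify numerically: a sum of many independent ±1 chip variables, scaled by 1/√N, is close to N(0, 1). A small sketch, with illustrative sample sizes:

```python
import numpy as np

rng = np.random.default_rng(1)
samples, N = 20000, 200

# Each row sums N independent +/-1 variables; scaling by 1/sqrt(N)
# yields variables that approach a standard Gaussian for large N.
chips = rng.choice([-1, 1], size=(samples, N))
delta = chips.sum(axis=1) / np.sqrt(N)

# The empirical mean and variance should be close to 0 and 1 respectively.
```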
The mean value of b_k^a at time t+1 is then given by (7):
where (P_μ)_kl ≡ (1/K) s_μk s_μl and (I)_kl ≡ δ_kl, respectively. The quantities m_μk^t, Q_μk^t and Y_μk^t are given by (8), (9):
where n_μk^t are free parameters related to the location of dominant terms in the probability P(y_μ|B).
The main difference between Eq. (7) and the equivalent in Kabashima is the emergence of an extra term in the prefactor, βY_μk^t, reflecting correlations between different solution groups (replica). To determine this term we optimise the choice of Y_μk^t by minimising the bit error at each time step. Optimising the inference error probability P_b^t at any time with respect to Y_μk^t, one obtains straightforwardly that Y^t = (σ_0^2 − σ^2)/β, which is just a constant. However, it holds the key to obtaining accurate inference results. If our noise estimate is identical to the true noise, the term vanishes and one retrieves the expression of Kabashima; otherwise, an estimate of the difference between the two noise values is required for computing m̂_μk^(t+1).
As a byproduct of the optimisation of Y^t, we found that Equation (7) can be expressed as (10), (11):
where no estimate of σ_0 is required.
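Given the closed-form result Y^t = (σ_0^2 − σ^2)/β, the correction term itself is trivial to evaluate. A minimal sketch, using the parameter values of the experiments reported in this application:

```python
sigma0_sq = 0.25   # true noise variance sigma_0^2
sigma_sq = 0.01    # detector's noise estimate sigma^2
beta = 0.25        # system load K/N

# Correction term Y^t: a constant offset in the prefactor whenever the
# noise estimate differs from the true noise.
Y = (sigma0_sq - sigma_sq) / beta

# When the estimate matches the true noise, the term vanishes and
# Kabashima's original expression is recovered.
Y_matched = (sigma0_sq - sigma0_sq) / beta
```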
The estimate of the kth bit at the t-th iteration, b̂_k^t, is then approximated by (12):
The inference algorithm requires an iterative update of Equations (8), (9), (10), (11) and (12) and converges to a reliable estimate of the signal, with no need for accurate prior information about the noise level. The computational complexity of the algorithm is O(K^2).
To demonstrate the performance of our algorithm, we carried out a set of experiments on the CDMA signal detection problem under typical conditions. The error probability of the inferred signals has been calculated for a system of load β=0.25, where the true noise level is σ_0^2=0.25 and the estimated noise is σ^2=0.01, as shown in
The solid line represents the expected theoretical results (density evolution), knowing the exact values of σ_0^2 and σ^2, while circles represent simulation results obtained via the suggested practical algorithm, where no such knowledge is assumed. The results presented are based on 10^5 trials per point and a system size N=2000, and are superior to those obtained using the original algorithm (Kabashima).
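Since the update rules (8)-(12) are set out in the accompanying figures, they are not reproduced here; the sketch below only illustrates how a bit-error-rate experiment of this kind can be organised, using a conventional matched-filter detector as a stand-in baseline rather than the message-passing detector of the invention. All names and parameter values are illustrative.

```python
import numpy as np

def ber_matched_filter(K, N, sigma0, trials, seed=0):
    """Monte Carlo bit-error rate of a matched-filter detector on the
    CDMA model y = S b / sqrt(N) + sigma0 * n (baseline only)."""
    rng = np.random.default_rng(seed)
    errors = 0
    for _ in range(trials):
        b = rng.choice([-1, 1], size=K)           # BPSK bits
        S = rng.choice([-1, 1], size=(N, K))      # spreading chips
        y = S @ b / np.sqrt(N) + sigma0 * rng.normal(size=N)
        b_hat = np.sign(S.T @ y)                  # matched-filter estimate
        errors += np.count_nonzero(b_hat != b)
    return errors / (trials * K)
```

For example, `ber_matched_filter(8, 32, 0.5, 1000)` estimates the error probability at load β = 0.25 and true noise σ_0^2 = 0.25; a detector exploiting the optimised correction term would be expected to perform better than this baseline.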
Another performance measure one should consider is
This provides an indication of the stability of the solutions obtained. In the inset of
The CDMA signal detection problem is described by way of example only and without limiting the generality of the method. Similar inference methods could be obtained using the same principles for a variety of inference problems that can be mapped onto dense graphs. In a general method:
1. The generic inference approach is based on considering a large number of replicated solution systems (which is much larger than 1 and where inaccuracies occurring due to the approximation taken are negligible), each of which is conditioned on the same observations;
2. A correlation matrix of some form between replicated solutions is assumed;
3. The likelihood of observations given the replicated set of solutions is expanded using the large system size;
4. A closed set of update rules for a set of conditional probabilities of messages given data is then derived;
5. Free parameters that emerge from the calculations are optimised.
These are the main steps of a generic derivation of a method of using belief propagation in densely connected systems that enables one to obtain reliable solutions even when the solution space is fragmented. The update rules which are obtained are applied iteratively until they converge to a set of substantially fixed values. In this context, "substantially fixed" is intended to mean that the values fulfil one or more criteria for convergence. For example, such a set of criteria may be that the values change by less than respective threshold amounts for consecutive iterations. These values are then used to determine the most probable states of the variables.
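The iterate-until-substantially-fixed structure described above can be sketched generically; the update function and tolerance here are placeholders, not the update rules of the invention.

```python
import numpy as np

def iterate_to_fixed_point(update, x0, tol=1e-8, max_iter=1000):
    """Apply an update rule repeatedly until the values are
    'substantially fixed': every component changes by less than
    `tol` between consecutive iterations."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        x_new = update(x)
        if np.max(np.abs(x_new - x)) < tol:   # convergence criterion met
            return x_new
        x = x_new
    return x                                   # best available values
```

For instance, iterating the map x → cos(x) from zero converges to the fixed point near 0.739.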
The update rules are then used as illustrated in
Although one specific embodiment has been described to illustrate in detail the present invention, it is nevertheless to be understood that this is merely by way of example and that the invention is in fact generally applicable to the processing of signals.
For example in the area of neural networks a known problem is learning (parameter estimation) in the Linear Ising perceptron. In this problem, learning is equivalent to inferring a data set (weights, following the neural networks terminology) encoded in a signal, given a plurality of characteristics of a signal. The Linear Ising perceptron is initialised with a small number of characteristics of a signal and thereby estimates the data set with some probability of error. When additional information is added, the algorithm again estimates the data set, with a reduced probability of error. The learning performance of the perceptron is measured by the improvement in probability of error given the additional information. In this respect, the skilled person is able to formulate the problem in similar terms to the CDMA problem, as described in detail above.
Another example is in the area of lossy data compression. A signal comprises a plurality of characteristics corresponding to an original message. This signal is processed to generate a compressed data set. The size of the compressed data set is smaller than the number of characteristics of the signal. The problem is to infer the compressed data set given the signal and a fixed distortion limit. The original message defines the plurality of signal characteristics while the compressed data set represents the original information to be estimated. Again, an iterative method for estimating the compressed data set could be devised along the lines described for the CDMA signal detection by a skilled person.
Number | Date | Country | Kind
---|---|---|---
0505354.1 | Mar 2005 | GB | national
Filing Document | Filing Date | Country | Kind | 371c Date
---|---|---|---|---
PCT/GB2006/000976 | 3/16/2006 | WO | 00 | 9/14/2007