This disclosure relates to processing of noisy signals, and more particularly, to conversion of noisy signals to a probabilistic representation.
One approach to conversion of an analog value to a digital value based on a partition of the input range makes use of an analog-to-digital (A/D) converter (ADC). A converter whose input range is partitioned into $2^n$ input regions can output an n-bit digital value representing which region the input is found within. Among the fastest A/D converters are “flash” converters. A flash (or direct) converter can make use of $2^n-1$ comparators to perform an n-bit conversion in one stage, with each comparator being associated with a region quantized to the same digital value. The size and cost of all those comparators makes flash converters generally impractical for large numbers of output bits (e.g., much greater than eight bits or 255 comparators). Other forms of A/D converters use multiple iterations to reduce the circuitry at the expense of increased conversion time. For example, a successive approximation ADC can use n processing stages (that is, $\log_2$ of the $2^n$ regions) with a single comparator.
Another approach to converting an analog value based on a partition of an input range is one that outputs probabilities that the input is found within each of the regions based, for example, on a noisy version of the input. An approach based on use of a parallel arrangement of circuitry that maps the noisy input into a set of analog representations of probabilities is described in co-pending application publication US2010/0281089A1, “SIGNAL MAPPING,” published on Nov. 4, 2010. Such circuitry can be adapted or configured according to the characteristics of the degradation (e.g., according to the variance of an additive noise) and/or prior information about the distribution of the clean input (e.g., a distribution over a discrete set of exemplar values, uniformly distributed etc.).
In a general aspect, another approach to converting an analog value into a set of probabilities avoids use of a fully parallel arrangement of circuitry as described in US2010/0281089A1. In some examples, an iterative approach is used to determine the output probability values in a series of iterations. For instance, each iteration is associated with a different bit of an n-bit representation of $2^n$ regions of the input range, and output probability and/or intermediate values can be accumulated, for example, in analog form, during the iterations.
In another aspect, in general, a signal converter includes an input comparison module configured to accept an analog signal input, and to provide an analog comparison output characterizing a continuous value and a logical comparison output. A controller is coupled to the input comparison module and is configured to accept the logical comparison output from the input comparison module, and to control configuration of the input comparison module according to the logical comparison output. An analog accumulator is used for accumulating the analog comparison output from the input comparison module, and providing a plurality of analog values, each analog value being associated with a different domain of the input of the input comparison module.
Aspects may include one or more of the following features.
The analog accumulator includes a plurality of storage elements, each storage element being associated with a different one of the analog values provided.
The controller is configured to control an iterative conversion of the analog signal input, wherein at each iteration, the analog accumulator is updated with an analog comparison output.
The controller is configured to control a data dependent conversion, including terminating the conversion after a number of iterations that depends on the analog signal input.
The converter has a plurality of pipeline stages, wherein each of the stages includes an analog accumulator, an input comparator, and an analog input memory element, the analog accumulators being coupled to pass analog values between stages of the pipeline.
The controller is configured to control a pipelined conversion of a series of analog signal inputs, including configuring an input comparison at each stage according to a logic output of an input comparison at a prior stage.
A computing device is coupled to the analog accumulator and configured to perform a probabilistic computation using the analog values provided from the accumulator.
In another aspect, in general, a method is used for converting a noisy analog signal, which corresponds to a first signal, into a plurality of analog values, each of which characterizes a probability that the first signal is in a corresponding part of its range. In some examples, the input range is partitioned into $2^n$ parts, each of which is associated with an n-bit index, for example a base 2 (binary) number or a Gray code index. In these examples, each of the analog values characterizes a probability that a corresponding different one of the n bits of the index will take on a particular value. The method includes receiving the noisy signal that includes the first signal in combination with noise; identifying a set of values that correspond to a first state of the bit; determining a probability that the noisy signal, in the absence of noise, would have had a value within the set of values; comparing the probability with a probability threshold; and for at most one of the plurality of bits, based on at least the comparison, terminating the procedure.
In some examples, for at least one of the bits, the set of values includes a plurality of non-overlapping subsets, each of which has a lower boundary and an upper boundary. In such practices, determining the probability includes for each of the lower boundaries, determining a first probability, the first probability being a probability that the noisy signal, in the absence of the noise, would have had a value in excess of the lower boundary, for each of the upper boundaries, determining a second probability, the second probability being a probability that the noisy signal, in the absence of the noise, would have had a value in excess of the upper boundary, and determining a difference between a sum of each of the first probabilities and a sum of each of the second probabilities.
In another aspect, a method is directed to representing an estimate of a value of a noisy analog signal with a plurality of bits. Such a method includes receiving a noisy signal that includes a transmitted signal in combination with noise; identifying a set of values that correspond to a state of the bit; identifying non-overlapping subsets of the set of values, each of the non-overlapping subsets having a boundary; for one of the boundaries, determining a probability that a noisy signal, in the absence of noise, would have had a value in excess of the boundary; determining that the value is at least a threshold distance from a value indicative of certainty; on the basis of the boundary, selecting a set of additional boundaries; and for each of the boundaries in the selected set, setting the boundary to a value that indicates certainty.
In yet another aspect, a method is directed to representing an estimate of a value of a noisy analog signal with a plurality of bits, said method comprising: receiving a noisy signal that includes a transmitted signal in combination with noise; determining a probability that said noisy signal, in the absence of noise, would have had a value within a range of values; on the basis of said determined probability, making an inference concerning a most likely value of a bit probability for a second bit in said plurality of bits, wherein said second bit is less significant than said first bit; on the basis of said inference, assigning a bit probability to said second bit.
Other aspects of the invention include a manufacture that includes a computer-readable medium having encoded thereon software for implementing the foregoing methods, as well as an apparatus that includes an analog-to-probability converter with circuitry configured to implement any of the foregoing methods.
In some examples, a set of one or more soft slicers is combined with a decision tree to create an analog-to-probability converter. The output from the set of soft slicers is used to decide the probability associated with each successively less significant bit. This can be used as a basis for deciding what branches of the decision tree should be traversed for the next bit. For those bits that are deep within the noise, the information within those bits is essentially obscured by the noise. As a result, the probabilities that those bits take on a particular value are simply set to 0.5, without having to actually calculate them.
In other cases, where a noiseless version of a received signal has a slice probability at a particular bit that is close to zero or one, the apparatus sets the slice probability to either one or zero, as appropriate, for all corresponding slice probabilities for less significant bits that lie on one or the other branch of a decision tree below the corresponding slice of that bit. Again, this avoids the need to actually calculate those probabilities.
A number of advantages accrue to an apparatus as described herein. For example, analog or stochastic circuits for assigning probabilities can use less energy than digital slicers in a conventional A/D converter. As a result, the apparatus described herein can save energy.
The apparatus described herein also avoids having to compute output bits that contain little or no information. This saves time and energy.
The apparatus described herein is capable of outputting “soft information” in stochastic or analog format, which is ideal for input into an analog or stochastic probability computer.
The soft slicer compares a value of the input noisy signal with a boundary value, or slicer level, and outputs a probability that the input value would have been classified, in the absence of noise, as having a value greater than or equal to the slicer level. This probability is then provided to a window comparator, which determines if the probability is within one of two windows, each of which is bounded by a value indicating certainty (0 or 1), and each of which has a pre-selected width.
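The window test described above can be sketched in Python. This is a minimal illustration rather than the circuit itself: the function name `window_comparator`, the `width` parameter, and the rule that the hard bit is 1 when the slicer output is at least 0.5 are assumptions made for the example.

```python
def window_comparator(p, width):
    """Classify a soft slicer output p (a probability in [0, 1]).

    Returns (in_window, hard_bit):
      in_window -- True when p lies within `width` of 0 or within `width` of 1
      hard_bit  -- assumed hard decision: 1 if p >= 0.5, else 0
    """
    in_low_window = 0.0 <= p <= width
    in_high_window = 1.0 - width <= p <= 1.0
    hard_bit = 1 if p >= 0.5 else 0
    return (in_low_window or in_high_window), hard_bit
```

For instance, with a window width of 0.01, a slicer output of 0.995 falls inside the window next to 1, while an output of 0.4 falls in neither window.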
Aspects may have one or more of the following advantages. “Soft Slicers” (e.g., analog or stochastic demappers) may use less power than digital slicers in a conventional ADC, thereby saving energy in conversion of an analog input.
A controller of the signal conversion can avoid computing or outputting bits that contain little or no information thereby saving time and energy. For example, in a case in which successively less significant bits are effectively unknown due to the input noise level and/or the proximity of the input to boundaries corresponding to the bit values, the conversion can be terminated. Note that such termination may depend on the input value, and not solely on the signal to noise level.
Pipelined implementations of the signal converter can use approximately exponentially fewer demapper slicers at the cost of approximately linearly more latency through the circuit.
The system is capable of outputting “soft information” in stochastic or analog format, which is ideal for input into an analog or stochastic probability computer.
By computing the soft information using a soft slicer and directly outputting it, we eliminate some number of stages from the pipeline process. Additional stages are undesirable because they allow additional opportunities for noise, mismatch, nonlinearities, or other non-idealities to enter into the conversion process.
Other features and advantages of the invention are apparent from the following description, and from the claims.
Referring to
One example of overlapping classes corresponds to a bit position in a binary number. For example, suppose that the original range of the signal x is 0.0≦x<8.0. One choice is to define 8 classes, indexed from 0 to 7, such that class 0 = $000_2$ (the subscript indicating base 2) is associated with the range 0.0≦x<1.0, through class 7 = $111_2$ being associated with the range 7.0≦x<8.0. In such an example, n=3 analog probability outputs 145 may be provided, with each providing a probability that a corresponding one of the three bits is 1. This type of representation may be used in further probabilistic computation, for example, using the techniques described in co-pending application “ANALOG COMPUTATION USING NUMERICAL REPRESENTATIONS WITH UNCERTAINTY”, published as US2010/0223225A1 on Sep. 2, 2010. Note that each bit other than the most significant bit corresponds to a union of multiple ranges. For example, the second bit corresponds to the original signal range 2.0≦x<4.0 ∪ 6.0≦x<8.0.
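The mapping from a bit position to its union of ranges can be made concrete with a short Python sketch. The function name `bit_regions` and its parameters are hypothetical, chosen for this illustration; bits are numbered from 1 (most significant) to n.

```python
def bit_regions(k, n, lo=0.0, hi=8.0):
    """Return the [lower, upper) ranges of the input for which bit k of the
    n-bit class index is 1 (k = 1 is the most significant bit)."""
    width = (hi - lo) / 2 ** n          # width of one class
    regions = []
    for c in range(2 ** n):
        if (c >> (n - k)) & 1:          # bit k of class index c is set
            start, end = lo + c * width, lo + (c + 1) * width
            if regions and regions[-1][1] == start:
                # Adjacent classes merge into one contiguous range.
                regions[-1] = (regions[-1][0], end)
            else:
                regions.append((start, end))
    return regions
```

With the 3-bit example over 0.0≦x<8.0, `bit_regions(2, 3)` yields the two ranges 2.0 to 4.0 and 6.0 to 8.0 given in the text, and `bit_regions(1, 3)` yields the single range 4.0 to 8.0.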
In a situation in which the analog input is not degraded by noise, and the classes correspond to the bits of a binary index number of contiguous analog ranges, the output probabilities are either 0.0 or 1.0, and the signal conversion system effectively acts as a conventional Analog-to-Digital Converter (ADC). In the corresponding situation in which the input is degraded by a noise (e.g., an additive noise), the outputs represent the probabilities $P(x \in \mathcal{X}_k \mid y)$ for all k, where $\mathcal{X}_k$ is the set of all (noise-free) values at the input of an ADC that would produce a 1 in the kth bit.
For binary encoding from bit-values, most of the sets $\mathcal{X}_k$ contain multiple non-overlapping regions. We can decompose the problem of finding probabilities of bits into a problem of finding cumulative probabilities associated with the boundaries of each region, and then determining the bit probabilities from these. For one-dimensional signals, each region, i, can be identified by its lower and upper boundaries, $l_{k,i}$ and $u_{k,i}$, respectively. (The top-most region for all bits has an upper bound of +∞, since x→+∞ would map to the all-ones ADC output.)
For each boundary (upper or lower), $d_i$, we wish to determine the probability that the input, x, is greater than or equal to this value (this is one minus the cumulative distribution of x given y evaluated at $d_i$). We can write this in terms of the probability density function, $p_{x|y}(x \mid y)$, as:
$$P(x \ge d_i \mid y) = \int_{d_i}^{\infty} p_{x|y}(x \mid y)\,dx = \frac{\int_{d_i}^{\infty} p_{y|x}(y \mid x)\,p_x(x)\,dx}{\int_{-\infty}^{\infty} p_{y|x}(y \mid x)\,p_x(x)\,dx}$$
The denominator is a normalization constant and is independent of i. We refer to the function associated with each boundary, $d_i$, as a “soft slicer.”
The term $p_{y|x}(y \mid x)$ represents the noise through the distribution of y given a known value of x. For independent, identically distributed (i.i.d.) additive noise (where y=x+n), $p_{y|x}(y \mid x)=p_n(y-x)$. For i.i.d. additive Gaussian noise, $p_{y|x}(y \mid x)=\mathcal{N}_y(x,\sigma_n)$, the normal density in y with mean x and standard deviation $\sigma_n$.
The term $p_x(x)$ represents the prior distribution of the signal, that is, the distribution of the signal prior to the knowledge gained by measuring the value y. In case the actual signal distribution is known, this distribution should be used in the calculation. If, for example, our knowledge of x were limited to its mean, $\mu_x$, and average energy, $\sigma_x^2$, we could use $p_x(x)=\mathcal{N}_x(\mu_x,\sigma_x)$. If, instead, we knew that the signal was bounded by a certain lower and upper bound, L and U, respectively, we could use
$$p_x(x) = \begin{cases} \dfrac{1}{U-L}, & L \le x \le U \\ 0, & \text{otherwise} \end{cases}$$
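Given the set of boundary probabilities, we can compute the bit probabilities by summation:
$$P(x \in \mathcal{X}_k \mid y) = \sum_{i} P(x \ge l_{k,i} \mid y) - \sum_{j} P(x \ge u_{k,j} \mid y) \qquad (5)$$
where, for each k, the values of i index the set of all lower boundaries of the regions of $\mathcal{X}_k$ and the values of j index the set of all upper boundaries of the regions of $\mathcal{X}_k$.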
Continuing to refer to
Referring to
The functions implemented by the modules are introduced below in general terms, with further specific embodiments being described after this introduction.
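In the common case of Gaussian noise, where $p_{y|x}(y \mid x)=\mathcal{N}_y(x,\sigma_n)$, the computation of $P(x \ge d_i \mid y)$ can simplify.
If nothing is assumed about the prior distribution of x, then we can take $p_x(x)$ as uniform over a region from L to U, where $L<l_{k,i}$ for all k and i and $U>u_{k,j}$ for all k and j (except for the last j for each k since $u_{k,j}$ is ∞ in this case). In this case we can write:
$$P(x \ge d_i \mid y) = \frac{\Phi(U; y, \sigma_n) - \Phi(d_i; y, \sigma_n)}{\Phi(U; y, \sigma_n) - \Phi(L; y, \sigma_n)}$$
where $\Phi(z;\mu,\sigma)$ is the cumulative normal distribution of mean μ and standard deviation σ, evaluated at z, which is equal to
$$\Phi(z;\mu,\sigma) = \frac{1}{2}\left[1 + \operatorname{erf}\left(\frac{z-\mu}{\sigma\sqrt{2}}\right)\right]$$
Since $\Phi(z;\mu,\sigma)=1-\Phi(\mu;z,\sigma)$, we can equivalently write:
$$P(x \ge d_i \mid y) = \frac{\Phi(y; d_i, \sigma_n) - \Phi(y; U, \sigma_n)}{\Phi(y; L, \sigma_n) - \Phi(y; U, \sigma_n)}$$
For sufficiently small values of L, where $\Phi(y;L,\sigma_n)\approx 1$, and sufficiently large values of U, where $\Phi(y;U,\sigma_n)\approx 0$, this simplifies to:
$P(x \ge d_i \mid y) \approx \Phi(y; d_i, \sigma_n)$ (12)
or, equivalently:
$$P(x \ge d_i \mid y) \approx \frac{1}{2}\left[1 + \operatorname{erf}\left(\frac{y-d_i}{\sigma_n\sqrt{2}}\right)\right]$$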
For each $d_i$, these functions are identically shaped functions positioned at $d_i$. So, with this approximation, for a given value of $\sigma_n$, functions of identical shape can be reused for each $d_i$ simply by repositioning the location of the function. Under this assumption, the series of soft slicers can be implemented using a single function that can be repositioned using an additive offset.
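As a numerical illustration of the repositionable soft slicer, the following Python sketch evaluates approximation (12) with the error function. It also includes a bounded uniform-prior form, which is an assumption here: it is the expression that results from integrating a Gaussian noise density against a uniform prior on [lo, hi]. The function names are hypothetical.

```python
import math

def normal_cdf(z, mu, sigma):
    """Cumulative normal distribution with mean mu and standard deviation sigma."""
    return 0.5 * (1.0 + math.erf((z - mu) / (sigma * math.sqrt(2.0))))

def soft_slicer(y, d, sigma_n):
    """Approximation (12): P(x >= d | y) as one fixed curve offset to position d."""
    return normal_cdf(y, d, sigma_n)

def soft_slicer_uniform(y, d, sigma_n, lo, hi):
    """Assumed form for a uniform prior on [lo, hi] with additive Gaussian noise."""
    top = normal_cdf(y, d, sigma_n) - normal_cdf(y, hi, sigma_n)
    bottom = normal_cdf(y, lo, sigma_n) - normal_cdf(y, hi, sigma_n)
    return top / bottom
```

Because the slicer depends only on the difference between y and d, `soft_slicer(0.6, 0.5, 0.1)` and `soft_slicer(0.35, 0.25, 0.1)` agree to within rounding, which is the property that lets one circuit serve every boundary. When lo and hi lie many standard deviations away from y, the bounded form matches the approximation.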
As an example,
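In the case that the prior of the signal x is Gaussian, $p_x(x)=\mathcal{N}_x(\mu_x,\sigma_x)$, then we can write:
$$P(x \ge d_i \mid y) = \Phi(\hat{y}; d_i, \hat{\sigma}_n), \qquad \hat{y} = \frac{\sigma_x^2\,y + \sigma_n^2\,\mu_x}{\sigma_x^2 + \sigma_n^2}, \qquad \hat{\sigma}_n = \frac{\sigma_x\,\sigma_n}{\sqrt{\sigma_x^2 + \sigma_n^2}}$$
where Φ is the cumulative normal distribution introduced above. When $\sigma_x$ is large in comparison with $\sigma_n$, then $\hat{y}\approx y$ and $\hat{\sigma}_n\approx\sigma_n$ and therefore:
$P(x \ge d_i \mid y) \approx \Phi(y; d_i, \sigma_n)$ (20)
or, equivalently:
$$P(x \ge d_i \mid y) \approx \frac{1}{2}\left[1 + \operatorname{erf}\left(\frac{y-d_i}{\sigma_n\sqrt{2}}\right)\right]$$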
This approximation for a Gaussian prior is identical to the approximation for a uniform prior with sufficiently small values of L and sufficiently large values of U.
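The convergence of the Gaussian-prior case to the common approximation can be checked numerically. The sketch below assumes the standard posterior for a Gaussian prior with additive Gaussian noise (a precision-weighted mean and a reduced standard deviation); the function names are hypothetical.

```python
import math

def normal_cdf(z, mu, sigma):
    """Cumulative normal distribution with mean mu and standard deviation sigma."""
    return 0.5 * (1.0 + math.erf((z - mu) / (sigma * math.sqrt(2.0))))

def gaussian_prior_slicer(y, d, sigma_n, mu_x, sigma_x):
    """P(x >= d | y) assuming x ~ N(mu_x, sigma_x) and y = x + Gaussian noise."""
    var_x, var_n = sigma_x ** 2, sigma_n ** 2
    # Posterior mean: the measurement pulled toward the prior mean.
    y_hat = (var_x * y + var_n * mu_x) / (var_x + var_n)
    # Posterior standard deviation: never larger than sigma_n.
    sigma_hat = math.sqrt(var_x * var_n / (var_x + var_n))
    return normal_cdf(y_hat, d, sigma_hat)
```

With a very broad prior (for example `sigma_x = 1000` against `sigma_n = 0.1`), the result is numerically indistinguishable from `normal_cdf(y, d, sigma_n)`, in line with the statement that the two approximations coincide.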
The configuration of the input comparison module 120 (see
For binary encoded output bits, the boundaries associated with each bit form a hierarchy that can be used to determine the order of computation. In this case, going from most-significant-bit (MSB) to least-significant-bit (LSB), the region boundaries associated with each successive bit share all of the boundaries with all previous (more significant) bits.
For an n-bit ADC, we number the bits from 1, the most significant bit, to n, the least significant bit, so that the numbers correspond to levels of the hierarchy. To simplify notation, we normalize the input range of the ADC to span real values from 0.0 to 1.0.
The region $\mathcal{X}_1$, for the most significant bit, has a single lower boundary, $l_{1,1}=1/2$. For the next most significant bit, the region $\mathcal{X}_2$ adds two additional boundaries for a total of three. Specifically, $\mathcal{X}_2$ has two lower boundaries, $l_{2,1}=1/4$ and $l_{2,2}=3/4$, plus one upper boundary, $u_{2,1}=1/2=l_{1,1}$. The upper boundary is in common with the boundary from the previous level.
Generalizing this sequence, at each level of the hierarchy, k, $2^{k-1}$ boundaries are added, which are all of the lower boundaries of $\mathcal{X}_k$. The set of lower boundaries are positioned as
$$l_{k,i} = \frac{2i-1}{2^k}$$
for i=1 to i=$2^{k-1}$. The $2^{k-1}-1$ boundaries already computed are all of the finite upper boundaries of $\mathcal{X}_k$.
For each level of the hierarchy, the probabilities associated with the upper boundaries have already been computed for previous levels. Therefore, only probabilities associated with the $2^{k-1}$ lower boundaries need be computed. For each of these boundaries, we compute $P(x \ge l_{k,i} \mid y)$ as described in previous sections.
From Equation (5) we can see that calculating the bit probabilities is simply taking the sum of the lower boundary probabilities minus the sum of the upper boundary probabilities. But since the upper boundary probabilities are simply the probabilities for all of the boundaries already computed for previous levels, and the lower boundary probabilities are the probabilities for all of the boundaries added for the current level, we can sum these groups of probabilities separately and subtract them to form the probability of bit k.
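The sum of the probabilities of the previous levels can be kept as a running sum, $S_{k-1}$, where at each level we perform the following computations to generate both the bit probability and the running sum to be used at the next level:
$$P(x \in \mathcal{X}_k \mid y) = \sum_{i=1}^{2^{k-1}} P(x \ge l_{k,i} \mid y) - S_{k-1}$$
$$S_k = S_{k-1} + \sum_{i=1}^{2^{k-1}} P(x \ge l_{k,i} \mid y)$$
where $S_0=0$.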
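The level-by-level accumulation can be sketched in Python as follows. The sketch assumes the Gaussian approximation of Equation (12) for each soft slicer and places the lower boundaries of level k at the odd multiples of $1/2^k$, which continues the 1/2, then 1/4 and 3/4 pattern described above. The function names are hypothetical.

```python
import math

def soft_slicer(y, d, sigma_n):
    """Approximate P(x >= d | y) for additive Gaussian noise (Equation (12))."""
    return 0.5 * (1.0 + math.erf((y - d) / (sigma_n * math.sqrt(2.0))))

def bit_probabilities(y, n, sigma_n):
    """Return [P(bit 1 = 1 | y), ..., P(bit n = 1 | y)] for an input range 0..1."""
    probs = []
    running = 0.0                      # S_0 = 0
    for k in range(1, n + 1):
        # Lower boundaries newly added at level k.
        lower = [(2 * i - 1) / 2 ** k for i in range(1, 2 ** (k - 1) + 1)]
        level_sum = sum(soft_slicer(y, d, sigma_n) for d in lower)
        probs.append(level_sum - running)   # lower-boundary sum minus upper-boundary sum
        running += level_sum                # S_k, reused as the upper-boundary sum next level
    return probs
```

With very little noise the outputs approach hard bits: for y = 0.7 and n = 3 the result rounds to 1, 0, 1, matching the binary index 101 of the eighth-wide region that contains 0.7.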
If computing the full hierarchy described above, probabilities associated with a total of $2^k-1$ boundaries would need to be computed. But under typical conditions, most of these probabilities will be very nearly 0 or 1.
Since $P(x \ge d_i \mid y)$ is a monotonically increasing function of y and a monotonically decreasing function of $d_i$, probability values previously computed in the hierarchy can be used as upper or lower bounds on the probability value for a given boundary. Specifically, if we have already computed $P(x \ge d_a \mid y)$ and $P(x \ge d_b \mid y)$ where $d_a<d_i<d_b$, then $P(x \ge d_a \mid y)$ is an upper bound on the value of $P(x \ge d_i \mid y)$ and $P(x \ge d_b \mid y)$ is a lower bound.
Thus, if $P(x \ge d_a \mid y)<\varepsilon$ for a small positive value of ε, then $P(x \ge d_i \mid y)<\varepsilon$. For sufficiently small ε when this condition occurs, we can skip the computation of $P(x \ge d_i \mid y)$ entirely and assume any value from 0 to ε. To simplify implementation, we can choose the value 0, which will be approximately correct.
Similarly, if $P(x \ge d_b \mid y)>1-\varepsilon$ then $P(x \ge d_i \mid y)>1-\varepsilon$, and we can skip the computation of $P(x \ge d_i \mid y)$ entirely and assume any value from 1−ε to 1. To simplify implementation, we can choose the value 1, which will be approximately correct.
In choosing a and b for the above bounds, the tightest bounds are when $d_a$ is the largest already-computed boundary value less than $d_i$ and, similarly, when $d_b$ is the smallest already-computed boundary value greater than $d_i$. Note that for the smallest value of boundary being computed at a given level, there is no lower value that has already been computed, and thus no upper bound less than 1. Similarly, for the largest boundary value computed at a given level, there is no larger value that has already been computed, and thus no lower bound greater than 0.
The dependencies represented by these choices for a and b can be represented as a binary tree. For each soft slicer at level k, the corresponding node in the tree connects to the two nodes associated with the soft slicers at level k+1 with the nearest soft slicer positions (one above and one below).
As we proceed through each level of the hierarchy, the results determine which branches need be computed in the next levels. For each soft slicer, there are three possible outcomes that affect the conditional computation: (1) the computed probability is less than ε, (2) the computed probability is greater than 1−ε, or (3) the computed value falls somewhere between these values. Under condition (1), the entire right-hand branch of the tree below the current node (corresponding to larger values) need not be computed and the corresponding probabilities can be assumed equal to 0. Note that this applies not only to the next level, but to all subsequent levels in the same branch. Similarly, under condition (2), the entire left-hand branch of the tree below the current node (corresponding to smaller values) need not be computed and the corresponding probabilities can be assumed equal to 1. Under condition (3) both branches continue to be computed.
In one implementation, the three conditions above are determined directly, by comparing the output of the soft slicer to thresholds ε and 1−ε. If ε is very small, these comparison operations must be fairly precise. In an alternative implementation, since for a given noise and signal distribution we know the shape and position of each soft slicer curve, the two values of y that correspond to the values for which the soft slicer output would cross ε and 1−ε, respectively, are pre-calculated. When each soft slicer is computed, we also compute two hard thresholds on the value of y, at each of the pre-computed values. Conditions (1)-(3), above, are determined based on the outcome of these hard thresholds. While these are hard thresholds (comparators), the values need not be very precise since a large change in these threshold values would produce only a small change in the outcome of the soft slicer (since the soft-slicer curves are nearly flat in the regions where their values are near 0 or 1).
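The pre-calculated thresholds of the alternative implementation can be illustrated in Python. The sketch assumes the Gaussian soft slicer of Equation (12), so the crossing points follow from the inverse normal distribution function (`statistics.NormalDist.inv_cdf`, available in Python 3.8 and later). The function names and the integer return codes are hypothetical.

```python
from statistics import NormalDist

def hard_thresholds(d, sigma_n, eps):
    """Values of y at which a Gaussian soft slicer positioned at d crosses
    eps and 1 - eps (requires 0 < eps < 0.5)."""
    z = NormalDist().inv_cdf(1.0 - eps)   # by symmetry, inv_cdf(eps) == -z
    return d - sigma_n * z, d + sigma_n * z

def classify(y, d, sigma_n, eps):
    """Return 1, 2, or 3 for conditions (1)-(3), using only comparisons on y."""
    y_low, y_high = hard_thresholds(d, sigma_n, eps)
    if y < y_low:
        return 1      # slicer output would be below eps
    if y > y_high:
        return 2      # slicer output would be above 1 - eps
    return 3          # slicer output is uncertain; keep computing both branches
```

For a slicer at d = 0.5 with noise standard deviation 0.02 and ε = 1e-4, the thresholds sit roughly 3.7 standard deviations on either side of 0.5, so an input of 0.3 falls under condition (1) and an input of 0.7 under condition (2).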
Using the direct comparison of the soft slicer outputs to the thresholds, ε and 1−ε, the computation can be summarized according to pseudocode shown of
Using the alternative implementation described above, in which the y value is compared to pre-calculated thresholds in lieu of comparing the soft slicer outputs to ε and 1−ε, the conditional computation can be summarized according to pseudocode shown of
Equivalent variations of this algorithm would make use of commonly used data structures to perform operations on entire branches of the tree more simply.
By skipping computation in the manner described in this section, the total number of soft slicer computations needed can be much less than $2^k-1$. The amount of computation needed depends primarily on the values of $\sigma_n$, ε, and y. The specific form of the prior distribution also has an effect.
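The conditional computation can be sketched as a recursive walk of the boundary tree. This is an illustrative Python version using direct comparison against ε and 1−ε, not a transcription of the referenced pseudocode; it assumes the Gaussian soft slicer of Equation (12), an input range of 0 to 1, and hypothetical function names. It counts slicer evaluations so the savings can be observed.

```python
import math

def soft_slicer(y, d, sigma_n):
    """Approximate P(x >= d | y) for additive Gaussian noise (Equation (12))."""
    return 0.5 * (1.0 + math.erf((y - d) / (sigma_n * math.sqrt(2.0))))

def pruned_bit_probabilities(y, n, sigma_n, eps):
    """Return (bit probabilities for bits 1..n, number of soft slicer evaluations)."""
    level_sums = [0.0] * (n + 1)   # level_sums[k]: sum over boundaries added at level k
    evaluations = 0

    def visit(d, k, forced):
        nonlocal evaluations
        if k > n:
            return
        if forced is None:
            p = soft_slicer(y, d, sigma_n)
            evaluations += 1
            if p < eps:            # condition (1): larger boundaries assumed 0
                left, right = None, 0.0
            elif p > 1.0 - eps:    # condition (2): smaller boundaries assumed 1
                left, right = 1.0, None
            else:                  # condition (3): keep computing both branches
                left, right = None, None
        else:
            p = left = right = forced   # skipped: the whole branch inherits 0 or 1
        level_sums[k] += p
        step = 2.0 ** -(k + 1)     # distance to the two child boundaries
        visit(d - step, k + 1, left)
        visit(d + step, k + 1, right)

    visit(0.5, 1, None)
    probs, running = [], 0.0
    for k in range(1, n + 1):
        probs.append(level_sums[k] - running)
        running += level_sums[k]
    return probs, evaluations
```

Setting `eps` to 0 disables skipping and evaluates all 63 boundaries of a 6-bit conversion; with y = 0.3, a noise standard deviation of 0.02 and ε = 1e-4, far fewer slicers are evaluated while the bit probabilities change only slightly.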
Continuing the earlier example,
From
Specifically, as we process each level of the hierarchy, from k=1 to n, if $|P(x \in \mathcal{X}_k \mid y)-\tfrac{1}{2}|<\delta$, for some small positive value of δ, then set $P(x \in \mathcal{X}_m \mid y)=\tfrac{1}{2}$ for all m>k.
Note that the order of certainty with bit significance is not strictly true for all values of y, particularly for values near the ends of the ADC's range. When this assumption is not true and this rule is applied, there can be error in the probabilities assigned to the least significant bits greater than δ.
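The early stopping rule can be sketched in Python on top of the level-by-level running sum. The sketch assumes the Gaussian soft slicer of Equation (12) and an input range of 0 to 1; the function names and the default value of `delta` are hypothetical.

```python
import math

def soft_slicer(y, d, sigma_n):
    """Approximate P(x >= d | y) for additive Gaussian noise (Equation (12))."""
    return 0.5 * (1.0 + math.erf((y - d) / (sigma_n * math.sqrt(2.0))))

def early_stop_bit_probabilities(y, n, sigma_n, delta=0.05):
    """Bit probabilities for bits 1..n, stopping once a bit is within delta of 1/2."""
    probs = []
    running = 0.0
    for k in range(1, n + 1):
        lower = [(2 * i - 1) / 2 ** k for i in range(1, 2 ** (k - 1) + 1)]
        level_sum = sum(soft_slicer(y, d, sigma_n) for d in lower)
        p = level_sum - running
        probs.append(p)
        running += level_sum
        if abs(p - 0.5) < delta:
            # Bit k is effectively unknown: set all less significant bits to 1/2.
            probs.extend([0.5] * (n - k))
            break
    return probs
```

For y = 0.3 with a noise standard deviation of 0.05 and n = 6, the fourth bit comes out close to one half, so the fifth and sixth bits are set to 0.5 without evaluating their slicers.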
Continuing the earlier example,
Including early stopping, the computation can be summarized in the pseudocode shown in
The window comparator produces a “soft bit true,” which is high when the slicer output is within one of the two windows. The window comparator also outputs hard bit values. The “hard bit probability analog value” is an analog signal equivalent to the hard bit value of 0 or 1. Since the slicer output can represent a current or voltage value, this provides a way to represent hard digital values in the analog domain.
The implementation of
The additional signal “TERMINATE” in
The input signal, in the absence of noise, that is introduced into the circuitry need not arise from a uniform distribution, but can arise from other probability distributions, or from sets of permitted values, for example, from a constellation of signaling values. Furthermore, the input may have multiple dimensions, with the comparison modules forming comparisons for multidimensional regions.
In some alternatives or extensions, the decision tree we use in our example can be thought of as a generative model for the distribution of the signal. We use a binary tree with un-weighted decisions between choosing a 0 branch or a 1 branch. This process produces a uniform distribution, which is not a bad way to start, since it assumes no special structure in the signal.
However, it is very possible to extend this generative model to take into account greater structure in the system that generated the signal. Let us call this system the “transmitter.” It could be a human-engineered system such as a wireless transmitter that favors sending certain waveforms or signals over others, or could be a naturally occurring phenomenon such as the sound from a tree blowing and banging against the exterior of a house, or any other system that generates a signal we wish to analyze.
A slightly more specific model could still be a binary tree with weights for the decisions. These (weighted) binary decisions are called Bernoulli variables. Perhaps we know that the signal tends to be low amplitude, so the MSB is 60% likely to be a 0. This prior could be incorporated into the decision process as we branch on the tree, proceeding through the conversion process.
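A weighted binary tree of this kind induces a prior over the $2^n$ regions. The Python sketch below makes one simplifying assumption that the text does not require: each level of the tree uses a single weight, independent of the path taken to reach it. The function name `tree_prior` is hypothetical.

```python
def tree_prior(p_one_per_level):
    """Prior probability of each class index under a weighted binary tree.

    p_one_per_level[j] is the probability of taking the 1 branch at level j
    (level 0 is the MSB). Assumes the weight depends only on the level.
    """
    n = len(p_one_per_level)
    priors = []
    for c in range(2 ** n):
        p = 1.0
        for level, p_one in enumerate(p_one_per_level):
            bit = (c >> (n - 1 - level)) & 1
            p *= p_one if bit else 1.0 - p_one
        priors.append(p)
    return priors
```

With the MSB 60% likely to be 0 and the other two bits unweighted, `tree_prior([0.4, 0.5, 0.5])` assigns 0.15 to each of the four lower regions and 0.10 to each of the four upper regions.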
Such weights need not be determined by a human a priori and input into the system. They can be learned. One principled way to do this is to represent the decision variables with Beta distributions, the conjugate distribution of the Bernoulli distribution. We could then estimate the parameters of these Beta distributions using training data obtained from recording multiple samples of the signal over time (or space or the like). An estimation algorithm such as Markov Chain Monte Carlo, Gibbs sampling or a variational technique can be used to perform the estimation of these parameters from the data. Different values of these parameters would produce a family of distributions that could be used to model the data. We could also potentially use this process to model the noise in the channel in addition to or instead of only modeling the signal.
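For a single decision variable whose bit values are observed directly, conjugacy makes the update closed-form. The Python sketch below shows that simplest case only; it is not the Markov Chain Monte Carlo, Gibbs sampling, or variational estimation mentioned above, which would be needed when the bits are observed only through noise. The function names are hypothetical.

```python
def beta_update(alpha, beta, bits):
    """Posterior Beta parameters after observing a sequence of 0/1 bit values.

    alpha counts evidence for a 1, beta counts evidence for a 0.
    """
    ones = sum(bits)
    zeros = len(bits) - ones
    return alpha + ones, beta + zeros

def beta_mean(alpha, beta):
    """Posterior mean estimate of the probability that the bit is 1."""
    return alpha / (alpha + beta)
```

Starting from a flat Beta(1, 1) prior and observing the MSB values 0, 0, 0, 1, 1 gives Beta(3, 4), whose mean of 3/7 could serve as the learned weight for the 1 branch.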
So far we have retained the binary tree model and only varied the weights of the decisions. But if we do not like the limited family of distributions made available by simply weighting the binary tree process, then we can generalize further. We can actually change the form of the tree. For example, we could use what we herein call the “Boston Science Museum Process.” This is derived from the exhibit in many science exploratorium museums that involves dropping ping pong balls through a diagonal lattice of pegs attached to a peg board. The result is a pile of balls with a Gaussian distribution. This is similar to a binary process, but the initial decisions do not move your position as far on the number line, which clumps together the final result into a Gaussian distribution. This process is an efficient way to model a Gaussian distribution. Furthermore, just as we did in the binary tree process, we could weight the variables in the Boston Science Museum Process with priors and/or parametrize the variables as Beta distributions and learn their weights. We could also potentially use this process to model the noise in the channel in addition to or instead of only modeling the signal.
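The peg-board process is easy to simulate. In the Python sketch below each ball makes an unweighted left or right decision at every row of pegs, and its final bin is the number of right moves; the function name, the seed, and the default parameters are assumptions for the illustration.

```python
import random

def galton_board(rows, balls, seed=0):
    """Drop `balls` balls through `rows` rows of pegs and return the bin counts.

    Bin j holds the balls that moved right exactly j times, so the counts
    follow a binomial distribution that approaches a Gaussian shape.
    """
    rng = random.Random(seed)
    counts = [0] * (rows + 1)
    for _ in range(balls):
        rights = sum(rng.random() < 0.5 for _ in range(rows))
        counts[rights] += 1
    return counts
```

Calling `galton_board(10, 20000)` produces eleven bins whose counts peak at the center bin and fall off symmetrically toward the edges. Replacing the fixed 0.5 with per-row weights would give the weighted variant described above.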
Even further generalizing, we could design or learn an infinite number of more complex parametric or non-parametric generative models or factor graphs to model the “transmitter” system.
The computation described above (either with or without early stopping) can be implemented in hardware to directly convert analog input signals to analog probability values.
In one embodiment, the input signal is sampled via a sample-and-hold circuit. The output of the sample-and-hold is used successively as input to the soft-slicer, each time using a distinct soft-slicer position (the additive offset of the soft-slicer based on which boundary, di is being computed) based on the procedure described above to determine the next soft-slicer position. Each output of the soft-slicer is accumulated as described above to form the bit probabilities.
In another embodiment, not only the position of the soft-slicer is varied on each use, but also the shape of the slicer function. This would be appropriate when the approximations that allow a common slicer function are not appropriate. In this case, the slicer circuit is parameterized by both the position (as an additive offset) as well as other parameters that allow its shape to change appropriately. Depending on the specific slice, $d_i$, being computed, the control circuitry would set the parameters to approximate the ideal shape of that slicer function.
In another embodiment, the computation is pipelined, such that as each bit layer is computed, the input value (from the sample-and-hold) and the intermediate results are passed to a subsequent stage to operate on the next bit level. While the next stage operates on the next bit level associated with one input value, the current stage receives the input (and intermediate results, in the case of stages beyond the first stage) associated with the next input value. In this way, the time between inputs can be significantly shorter than the time to perform the entire APC computation for a given input value.
It is to be understood that the foregoing description is intended to illustrate and not to limit the scope of the invention, which is defined by the scope of the appended claims. Other embodiments are within the scope of the following claims.
This application claims the benefit of U.S. Provisional Application No. 61/328,551 filed Apr. 27, 2010, the contents of which are incorporated herein by reference. This application is related to, but does not claim the benefit of the filing date of the following applications: U.S. application Ser. No. 12/716,113, titled “SIGNAL MAPPING,” filed Mar. 2, 2010; and U.S. application Ser. No. 12/716,155, titled “ANALOG COMPUTATION USING NUMERICAL REPRESENTATIONS WITH UNCERTAINTY,” filed on Mar. 2, 2010. The contents of these applications are incorporated herein by reference.
This invention was made with government support under FA8750-07-C-0231 awarded by the Defense Advanced Research Projects Agency (DARPA). The government may have certain rights in this invention.
Number | Name | Date | Kind |
---|---|---|---|
6542105 | Sakuragi | Apr 2003 | B2 |
20100289681 | Kamikisaki | Nov 2010 | A1 |
Number | Date | Country | |
---|---|---|---|
20120194375 A1 | Aug 2012 | US |
Number | Date | Country | |
---|---|---|---|
61328551 | Apr 2010 | US |