The present invention relates to signal decoding, and in particular to analog decoders.
In the mid-1990s, a new class of error correcting codes called “Turbo codes” was introduced [3]. Turbo codes achieve performance, typically measured in BER (bit error rate), very close to a long sought-after limit known as the Shannon bound. The decoding algorithms used for Turbo codes have since been generalized into a class of decoding algorithms based on factor graphs [17, 6]. These algorithms are commonly referred to as message-passing algorithms, and decoders for Turbo-like codes which employ message-passing algorithms are generally referred to as iterative decoders.
An important general form of message-passing algorithm is the sum-product algorithm [9], which implements probability propagation on a graph. The sum-product algorithm is a very general algorithm which describes decoders for many codes, including trellis codes, LDPC (low-density parity check) codes, block product codes, and Turbo codes, and it encompasses the BCJR (Bahl, Cocke, Jelinek, Raviv) algorithm [2]. The sum-product algorithm also describes algorithms used in artificial intelligence.
There has been great interest in implementing iterative decoders, but conventional digital implementations are often complex and demand expensive resources. Analog circuits for iterative decoding have been proposed and demonstrated by various researchers in recent years [10, 7, 12, 13, 14, 8, 15, 18]. CMOS (Complementary Metal Oxide Semiconductor) circuits are of particular interest for some applications because they can be implemented in ordinary so-called “plain vanilla” CMOS processes, which are low-cost compared to high-end alternatives such as BiCMOS and SiGe. CMOS analog decoders can also be more easily integrated with other CMOS components for single-chip receiver solutions.
In many cases, analog decoders offer significant advantages over digital designs. For example, the operations required for implementing the sum-product or BCJR algorithms can be implemented in analog decoders with fewer transistors. Analog circuits also require significantly fewer wire connections between components. Such efficient use of space allows fully parallel circuit implementations, in which decoding operations occur simultaneously, thereby providing high data throughput.
Analog decoders are also intrinsically low-power, and eliminate the need for high speed analog-to-digital (A/D) conversion in a receiver front-end. A typical A/D converter by itself consumes a significant amount of power and silicon real-estate. An analog decoder may thus be thought of as an information A/D converter, specially designed to convert coded analog channel information into decoded bits.
In one aspect, the invention provides an electronic circuit comprising a plurality of multiplier modules for receiving a plurality of first input signals and respective ones of a plurality of second input signals, each multiplier module configured to generate as output signals products of the plurality of first input signals and its respective one of the plurality of second input signals, and a plurality of dummy multiplier modules for receiving the plurality of second input signals, each dummy multiplier module corresponding to a respective one of the plurality of multiplier modules and configured to form products of the respective one of the plurality of second input signals of its corresponding multiplier module and the second input signals other than the respective one of the plurality of second input signals.
Each of the multiplier modules preferably comprises a plurality of transistors for receiving respective ones of the plurality of first input signals and the respective one of the plurality of second input signals. Each of the plurality of transistors preferably comprises a control terminal for receiving the respective one of the plurality of first input signals, a first switched terminal for receiving the respective one of the plurality of second input signals, and a second switched terminal on which the output signals are generated.
The dummy multiplier modules preferably have a similar structure, comprising a plurality of transistors for receiving the respective one of the plurality of second input signals and respective ones of the other second input signals, with each transistor comprising a first switched terminal for receiving the respective one of the plurality of second input signals, a control terminal for receiving the respective one of the other second input signals, and a second switched terminal on which the products are formed.
In one embodiment, a connectivity module receives the output signals from the plurality of multiplier modules and generates as output signals sums of predetermined ones of the output signals from the multiplier modules. The connectivity module preferably comprises inputs for receiving the output signals from the plurality of multiplier modules and outputs for outputting the output signals of the connectivity module. Connections between the inputs and the outputs determine the predetermined ones of the output signals from the multiplier modules.
A renormalization module is also provided in some embodiments for receiving the output signals from the connectivity module and normalizing the output signals from the connectivity module to thereby generate output signals that sum to a predetermined unit value.
The invention also provides, in another aspect, a sum-product circuit comprising a plurality of first inputs for receiving respective first input signals, a plurality of second inputs for receiving second input signals, a plurality of multiplier modules connected to the first inputs and a respective one of the second inputs, each multiplier module configured to generate as intermediate output signals products of the first input signals and the second input signal on its respective one of the second inputs; a plurality of dummy multiplier modules connected to the second inputs, each dummy multiplier module corresponding to a respective one of the plurality of multiplier modules and configured to form products of the second input signal of its corresponding multiplier module and the second input signals other than the second input signal of its corresponding multiplier module; a connectivity module for receiving the intermediate output signals and configured to generate as output signals sums of predetermined ones of the intermediate output signals; and a plurality of output terminals for outputting the output signals.
The sum-product circuit may also include a plurality of reset switches connected across the first inputs, the second inputs, and the outputs and configured to short the first inputs, to short the second inputs, and to short the outputs in response to a reset signal.
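As a behavioral illustration only, the following sketch models the dataflow just summarized: multiplier modules forming pairwise products of the first and second input signals, a connectivity module summing predetermined products, and a renormalization stage scaling the outputs to a unit sum. This is not a circuit simulation; all names (connectivity_map, unit_value) are illustrative assumptions.

```python
# Behavioral sketch of the sum-product circuit summarized above.
# Names and the 2x2 example constraint are illustrative assumptions.

def sum_product_module(first_inputs, second_inputs, connectivity_map, unit_value=1.0):
    # Multiplier modules: module j multiplies every first input by second input j.
    products = {}
    for j, y_j in enumerate(second_inputs):
        for i, x_i in enumerate(first_inputs):
            products[(i, j)] = x_i * y_j

    # Connectivity module: each output is the sum of a predetermined
    # subset of the intermediate products (the subsets encode the constraint).
    raw_outputs = [sum(products[idx] for idx in subset) for subset in connectivity_map]

    # Renormalization module: scale the outputs so they sum to a unit value.
    total = sum(raw_outputs)
    return [unit_value * o / total for o in raw_outputs] if total > 0 else raw_outputs

# Example: two-valued inputs with outputs mimicking an XOR-type constraint.
px = [0.8, 0.2]          # first input signals (e.g. P(x=0), P(x=1))
py = [0.6, 0.4]          # second input signals (e.g. P(y=0), P(y=1))
connectivity = [
    [(0, 0), (1, 1)],    # output 0 sums products where x XOR y = 0
    [(0, 1), (1, 0)],    # output 1 sums products where x XOR y = 1
]
print(sum_product_module(px, py, connectivity))
```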
Other aspects and features of the present invention will become apparent, to those ordinarily skilled in the art, upon review of the following description of the specific embodiments of the invention.
The invention will now be described in greater detail with reference to the accompanying diagrams, in which:
a is a schematic diagram of an array of source-connected transistors as an example implementation of the blocks 16, 18, and 20 of
b is a schematic diagram of diode-connected transistors for converting voltages to currents;
a shows an example trellis stage;
b is a block diagram of an example low-voltage sum-product architecture according to an embodiment of the invention for computing on trellis graphs;
a shows an example Tanner Graph;
b shows an example normalized Tanner Graph;
The sum-product algorithm [9, 17] is a general framework for expressing probability propagation on factor graphs. The purpose of the algorithm is to compute global conditional probabilities using only local constraints. Constraints are often expressed by factor graphs [5, 9, 17]. Special cases of factor graphs include trellis graphs and Tanner graphs, which are also referred to as constraint graphs. According to one embodiment of the invention, the sum-product algorithm is implemented on graphs which express boolean functions on discrete-type variables.
For example, consider the implementation of graphs which can be simplified to local boolean constraints on three variables.
The constraint ƒ and the variables (x,y,z) have, up to this point, been defined as deterministic. When (x,y,z) are random variables, denoted herein with boldface type, they are characterized by probability masses instead of exact values. The goal of probability propagation is to determine the probability mass of z, written ρz, based on known masses ρx and ρy. Normal typeface for these variables indicates particular samples or values of the random variables.
The local operations of the sum-product algorithm are described as follows. The constraint ƒ is mapped to a processing node which receives probability masses for variables x and y. These variables are assumed to be independent of each other. The processing node then computes the probability mass of z based on the constraint ƒ and the masses of x and y. Let Sƒ be the set of combinations of (x,y,z) for which ƒ(x,y,z) is satisfied. Let Sƒ(j) be the subset of Sƒ for which z=j, where j∈Az. We then compute, for each j, the function
where k∈Ax and l∈Ay, η is any non-zero constant real number, and x, y, and z are random variables. The constant η is typically chosen so that ΣjP(z=j)=1. In principle, though, the accuracy of the algorithm is insensitive to the choice of η.
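A minimal numerical sketch of the local computation (1) follows: for each value j of z, the products of the input masses are summed over all (x, y) pairs that satisfy the constraint together with z=j, then scaled by η. The constraint function and alphabets below are illustrative.

```python
# Sketch of the local sum-product computation (1).
from itertools import product

def local_sum_product(f, alphabet_x, alphabet_y, alphabet_z, rho_x, rho_y):
    unnormalized = {}
    for j in alphabet_z:
        unnormalized[j] = sum(
            rho_x[k] * rho_y[l]
            for k, l in product(alphabet_x, alphabet_y)
            if f(k, l, j)                      # (k, l, j) lies in S_f(j)
        )
    eta = 1.0 / sum(unnormalized.values())     # chosen so the masses sum to 1
    return {j: eta * v for j, v in unnormalized.items()}

# Example: f is the XOR constraint z = x XOR y on binary variables.
rho_z = local_sum_product(
    f=lambda x, y, z: z == (x ^ y),
    alphabet_x=(0, 1), alphabet_y=(0, 1), alphabet_z=(0, 1),
    rho_x={0: 0.9, 1: 0.1}, rho_y={0: 0.7, 1: 0.3},
)
print(rho_z)   # {0: 0.66, 1: 0.34}
```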
The local computation (1) is the heart of the sum-product algorithm. A complete sum-product decoder consists of many interconnected instances of (1).
One approach, elaborated in [10], has emerged as a popular topology for CMOS analog decoder designs. This topology is based on a generalized Gilbert multiplier.
In typical implementations of a Gilbert circuit, each multiplier module 16, 18, 20 includes an array of source-connected transistors, such as the array 17 shown in
For the purposes of simplifying circuit analysis, assume that Ixi∝ρx(i) and that Iyj∝ρy(j), so that the current-mode column and row inputs represent probability masses.
Intermediate outputs emerge from the top of the multiplier modules 16, 18, and 20 in
The Gilbert multiplier consists of MOS transistors biased in the subthreshold region, meaning vgs<Vth for each transistor, where vgs refers to the voltage between the gate and source of an MOS device, and Vth refers to the threshold voltage. In digital design, subthreshold transistors are usually regarded as “off.” A more precise model of their behaviour is given by
where I0 is a technology constant with units of amperes, W and L are the width and length of the transistor device, respectively, K≈0.7 is a unitless technology constant, and UT≈25 mV is the well-known thermal voltage. In the subthreshold region, ID is usually below 1 μA, resulting in very low power consumption. The subthreshold region is also commonly known as weak inversion, because the mobile charge density in the transistor's channel is very small. Circuits based on this subthreshold model were popularized by Vittoz et al. [16].
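The equation referred to as (2) is not reproduced in the text above; the sketch below assumes the standard weak-inversion form ID = I0·(W/L)·exp(K·vgs/UT)·(1 − exp(−vds/UT)), which matches the parameters defined here and the behaviour described next (exponential in vgs, with vds negligible once it exceeds a few UT).

```python
# Numerical sketch of the weak-inversion (subthreshold) drain-current model.
# The exact form of (2) is assumed, not quoted from the specification.
import math

def subthreshold_id(vgs, vds, W=1.0, L=1.0, I0=1e-15, K=0.7, UT=0.025):
    """Drain current (A) of an NMOS device in weak inversion (assumed model)."""
    return I0 * (W / L) * math.exp(K * vgs / UT) * (1.0 - math.exp(-vds / UT))

# With vds around 150 mV the factor (1 - exp(-vds/UT)) is already ~0.998,
# i.e. the device is effectively saturated and vds has little effect on ID.
print(subthreshold_id(vgs=0.35, vds=0.15))   # well below 1 uA
print(subthreshold_id(vgs=0.35, vds=0.60))   # nearly the same current
```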
If vds is sufficiently large (around 150 mV), then it has little effect on ID in (2). When vds is large enough to be neglected, the device is said to be in saturation. In the canonical approach, all transistors are assumed to be in saturation. To ensure this, the reference voltage Vref≈0.3V is used at the source of transistors 21, 22 and other row input current to voltage conversion devices. This maintains a sufficiently high voltage at the drain of each column input transistor 15, 17, 19 for the column input transistors to remain in saturation.
The Gilbert multiplier is based on the translinear principle. According to this principle, in a closed (Kirchhoff) loop of devices in which the current (ID) is an exponential function of the voltage (Vgs), a sum of voltages is equivalent to a product of currents. Because the sum of forward voltage drops equals the sum of backward voltage drops around the loop, the product of forward currents equals the product of backward currents.
By taking a closed loop beginning and ending at Vref and traversing the Vgs of four devices 25, 26, 27, 28 as shown in
If the column input transistors 15, 17, 19 are saturated, their drains simply replicate the current inputs at corresponding current to voltage conversion devices 23, 24. Because the sources of their constituent transistors are all connected together, the sum of intermediate outputs from the jth multiplier module is equal to Iyj. Thus
Because the algorithm specifies probability masses as the input and output of processing nodes, it may be assumed that the denominator of (6) is equal to one, in probability terms, and thus may be neglected.
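Equation (6) is not reproduced above; the sketch below assumes the canonical Gilbert vector-multiplier output Iz[i][j] = Ix[i]·Iy[j] / Σk Ix[k], which is consistent with the two properties stated in the text: the outputs of the jth multiplier module sum to Iyj, and the denominator equals one when the column inputs represent a normalized probability mass.

```python
# Sketch of the ideal (saturated) Gilbert vector-multiplier outputs (assumed form of (6)).

def gilbert_outputs(Ix, Iy):
    """Intermediate output currents Iz[i][j] of the multiplier array."""
    col_sum = sum(Ix)
    return [[xi * yj / col_sum for yj in Iy] for xi in Ix]

Ix = [0.8, 0.2]            # column inputs, proportional to rho_x
Iy = [0.3, 0.7]            # row inputs, proportional to rho_y
Iz = gilbert_outputs(Ix, Iy)

# Property from the text: the outputs of the j-th multiplier module sum to Iy[j].
for j in range(len(Iy)):
    print(sum(Iz[i][j] for i in range(len(Ix))), Iy[j])
```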
The above requirement for saturation imposes a minimum allowed supply voltage on Gilbert-based circuits. It is common to use one Gilbert-based circuit made of NMOS devices, of which the outputs are “folded” into a second Gilbert-based circuit made of PMOS devices. A “slice” of this topology is illustrated in
This result is based on the saturation assumption. In practice, saturation is only an approximate condition, and the output of the folded circuit always depends slightly on the other inputs. As Vref(P) is increased and Vref(N) is decreased, the devices become less saturated, and the output begins to change dramatically. In the extreme case where Vref(P)=Vdd and Vref(N)=0, the full model expressed in (2) must be used, without neglecting Vds.
When a transistor is not in saturation, the translinear principle still applies. The complete non-saturation MOS device model is illustrated in
It is clear from the foregoing that the voltage needs of the Gilbert multiplier can be reduced if Vref=0. This results in the column input transistors 15, 17, 19 becoming unsaturated. This situation can be analyzed using the translinear principle. The circuit consists of the translinear loops shown in
Ixi·Izkj=Izij·Ixk, (7)
such that in FIG. 7
Iƒ=Iyj, (8)
Idj≡Iƒ−Ir. (9)
The role of the source-connected transistors, from (4), may be expressed as:
To solve for all currents in the circuit, we need one more equation, which is provided by the third loop, through the devices 43, 39, 44 shown in
Although the devices 40, 41, 42 are shown in
Combining (7) through (12), we arrive at
The result (16) is almost the same as the normal Gilbert multiplier output (6), except there is an additional term in the denominator. In probability terms, the denominator of (16) can no longer be neglected. To solve this problem, in accordance with an embodiment of the invention, additional transistors are provided with their sources connected to Idj. If these transistors represent a current Iε≡Σl≠jIyl, then the output becomes
In probability terms, the denominator of (19) is a constant and can be neglected. The addition of redundant transistors therefore corrects the probability calculation of the Gilbert multiplier when Vref=0. Because these new transistors do not provide useful outputs, they are referred to herein as dummy transistors or dummy inputs. The drains of these transistors are preferably connected to Vdd.
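Equations (16) and (19) are not reproduced above, so the sketch below assumes forms consistent with the surrounding discussion: without dummy transistors the denominator of (16) contains an input-dependent extra term (taken here as Iyj), and with dummy transistors carrying Iε = Σl≠j Iyl the denominator becomes the constant Σ Ix + Σ Iy, which in probability terms is the constant ‘2’ noted later in the text.

```python
# Sketch of the low-voltage (Vref = 0) output with and without dummy transistors.
# The exact denominators are assumptions inferred from the surrounding prose.

def low_voltage_output(Ix, Iy, with_dummies=True):
    outputs = [[0.0] * len(Iy) for _ in Ix]
    for j, yj in enumerate(Iy):
        extra = sum(Iy) if with_dummies else yj   # dummies add sum_{l != j} Iy[l]
        denom = sum(Ix) + extra
        for i, xi in enumerate(Ix):
            outputs[i][j] = xi * yj / denom
    return outputs

Ix, Iy = [0.8, 0.2], [0.3, 0.7]                        # normalized probability masses
print(low_voltage_output(Ix, Iy, with_dummies=False))  # denominator varies with j
print(low_voltage_output(Ix, Iy, with_dummies=True))   # constant denominator of 2
```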
All transistor devices used to implement the sum-product circuit in
In a practical setting, the sum-product algorithm is carried out repeatedly in a chain of computations. The output of one computation provides input for the next. In an analog implementation, each computation is performed by a separate sum-product module. A complete analog error control decoder consists of cascades of sum-product modules.
For a low-voltage decoder, renormalization of currents between modules is preferred. In a canonical Gilbert-based sum-product circuit with outputs expressed by (6), the denominator is equivalent to ‘1’ and can be truly ignored. In the low-voltage output expressed by (19), however, the denominator is equivalent to ‘2’, which results in substantial attenuation at the output of each circuit. While, in principle, linear attenuation will not change the result of decoding, consistent attenuation in a large network will cause the outputs to approach zero, making it impossible to extract any results, and causing the circuit to slow to a halt.
By inserting the renormalization module 52 between sum-product circuits, the current outputs are prevented from approaching zero. A renormalization circuit that may be implemented in the sum-product circuit of
The circuit of
which is a generalization of (16). As demonstrated by (20), the renormalization circuit of
Because sum-product circuits will normally be situated in a network, the iterated behaviour of (20) should also be considered. This can be simplified to a one-dimensional problem by using a summary variable k for each set of probabilities, defined as
This allows the treatment of (20) as a simple one-dimensional transfer function,
Iteration of (22) is illustrated in
k0=0 and (23)
k1=n−n/m. (24)
It is well known that a fixed point is stable and non-oscillating if and only if the slope of the transfer function, ƒ′n(k), satisfies 0≦ƒ′n(kƒ)<1 at the fixed point kƒ. Also, a fixed point kƒ is unstable (i.e. it is a repeller) if and only if ƒ′n(kƒ)>1. It can also be shown that
ƒ′n(0)=m and (25)
ƒ′n(k1)=1/m. (26)
Equations (25) and (26) show that there is always a stable fixed point above zero when m>1. Known renormalizers use m=1, and therefore cause all currents to approach zero in a low-voltage network. By using m>1, this can be avoided.
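The transfer function (22) is not reproduced above; the numerical sketch below assumes a form consistent with the stated fixed points and slopes, fn(k) = m·n·k/(n + m·k), which satisfies fn(0)=0, fn′(0)=m, and has a second fixed point at k1 = n − n/m with fn′(k1)=1/m. It illustrates the point of the analysis: with m>1 the currents settle at a nonzero level, while with m=1 they decay toward zero.

```python
# Numerical check of the fixed-point behaviour described by (23)-(26).
# The transfer function below is an assumption, not the specification's (22).

def f(k, m, n):
    return m * n * k / (n + m * k)      # assumed one-dimensional transfer function

def iterate(k0, m, n, steps=50):
    k = k0
    for _ in range(steps):
        k = f(k, m, n)
    return k

m, n = 1.5, 1.0
print(iterate(0.01, m, n))              # converges to k1 = n - n/m = 1/3 when m > 1
print(iterate(0.01, 1.0, n))            # with m = 1 the currents decay toward zero
```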
In a low-voltage design, use of a reset circuit is also preferred. An example of such a circuit is shown in
Reset circuits are well known in Gilbert-based designs, having been studied, for example, in the PhD theses of Felix Lichtenberger [11] and Jie Dai [4]. Their value seems debatable in canonical Gilbert circuits. For low-voltage circuits as in
One very common class of decoders employs the BCJR algorithm. This algorithm is used with concatenated convolutional codes, such as Turbo codes and Turbo equalizers. The BCJR algorithm [2] is a special case of the sum-product algorithm (1), as shown in [9]. In the BCJR algorithm, variables are often not binary.
A trellis stage is a portion of a trellis graph which describes a boolean constraint function. The graph of a trellis stage consists of two columns of states, connected by branches. An example trellis stage is shown in
An example of the low-voltage sum-product architecture for computing on trellis graphs is shown in
The particular trellis function is determined by the connectivity module 124.
The sum-product circuit is shown with the state probabilities as row inputs and the branch probabilities as column inputs, but these roles can be reversed without affecting the results. Every stage of the BCJR algorithm consists of a matrix multiplication of the form (27). Low-voltage sum-product circuits can therefore be easily produced to implement a complete BCJR decoder.
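Equation (27) is not reproduced above; the sketch below assumes the usual view of one BCJR trellis stage as a matrix-vector multiplication of the state probabilities by a matrix of branch probabilities, with renormalization between stages. The 2-state trellis values are purely illustrative.

```python
# Sketch of one trellis stage treated as a matrix multiplication (assumed form of (27)).

def trellis_stage(state_probs, gamma):
    # gamma[s_next][s] is the branch probability from state s to state s_next.
    out = [sum(gamma[s_next][s] * state_probs[s] for s in range(len(state_probs)))
           for s_next in range(len(gamma))]
    total = sum(out)                     # renormalize between stages
    return [p / total for p in out]

alpha = [0.5, 0.5]                       # prior state probabilities
gamma = [[0.6, 0.1],                     # branch probabilities into state 0
         [0.4, 0.9]]                     # branch probabilities into state 1
print(trellis_stage(alpha, gamma))       # updated state probabilities
```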
Decoders for binary LDPC codes are mapped from the code's normalized Tanner Graph, which is a direct visualization of the code's binary parity check matrix. The Tanner Graph contains two types of nodes, check nodes and variable nodes. A variable node denotes a particular bit in a codeword. A check node represents a parity check equation, which is a single row in the parity check matrix. All variables in the graph are binary. For implementations, this means that all probability masses will have only two components.
In a normalized Tanner Graph, also known as a Forney Graph, equality nodes are inserted between variable nodes and check nodes [5]. This is illustrated in
The purpose of equality node insertion is to provide explicit representation for the constraint which occurs at variable nodes: all connected edges convey the same variable. Let Ui be the set of check nodes connected to variable i, and let uj be the binary value from check node j. The equality constraint is satisfied if and only if
uj=uk, ∀j, k∈Ui. (28)
The check node is only slightly more complicated. Let Vj be the set of variables which are connected to check node j, and let vi be the value of variable i. Parity check j is then satisfied if and only if
where ‘⊕’ denotes modulo-2 summation.
The constraints in (28) and (29) can be conveniently broken down into recursive operations, allowing the construction of nodes with many edges by connecting 3-edge nodes in a chain. This is illustrated in
For a 3-edge check node, the constraint function is simply a logical XOR operation. Mapping this to a sum-product implementation, and labeling the three edges X, Y, and Z, we obtain:
P(Z=0)=P(X=0)·P(Y=0)+P(X=1)·P(Y=1) (30)
P(Z=1)=P(X=1)·P(Y=0)+P(X=0)·P(Y=1) (31)
For a 3-edge equality node, we obtain
P(Z=0)∝P(X=0)·P(Y=0) (32)
P(Z=1)∝P(X=1)·P(Y=1). (33)
The proportionality symbol is used to indicate that the algorithm is invariant to multiplication by any non-zero normalizing constant.
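The following is a minimal sketch of the 3-edge node computations (30) through (33), with each message represented as a two-component probability mass (P(0), P(1)); larger nodes are built by chaining these operations as described above.

```python
# Sketch of the 3-edge check-node and equality-node updates (30)-(33).

def check_node(px, py):
    # (30)-(31): Z = X XOR Y
    p0 = px[0] * py[0] + px[1] * py[1]
    p1 = px[1] * py[0] + px[0] * py[1]
    return (p0, p1)

def equality_node(px, py):
    # (32)-(33): componentwise product, normalized (the proportionality in the text)
    p0, p1 = px[0] * py[0], px[1] * py[1]
    s = p0 + p1
    return (p0 / s, p1 / s)

print(check_node((0.9, 0.1), (0.8, 0.2)))     # (0.74, 0.26)
print(equality_node((0.9, 0.1), (0.8, 0.2)))  # (0.9729..., 0.0270...)
```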
Applying the general circuit of
What has been described is merely illustrative of the application of the principles of the invention. Other arrangements and methods can be implemented by those skilled in the art without departing from the spirit and scope of the present invention.
For example, the invention is in no way limited to the particular implementations shown as illustrative examples in the drawings. Alternate implementations, using different elements and/or different types of components, will be apparent to those skilled in the art.
The invention may also be applied to other types of decoding than the trellis and LDPC decoding described above.
This application claims the benefit of U.S. Provisional Patent Application Ser. No. 60/544,191, which was filed on Feb. 13, 2004.