Error Correction Circuit for Data Communication Providing Parallelizable Linear Programming Decoding

Information

  • Patent Application
  • Publication Number
    20140089759
  • Date Filed
    September 26, 2012
  • Date Published
    March 27, 2014
Abstract
An error detection/correction system provides an electronic circuit detecting and correcting transmission errors using linear programming. Linear programming techniques are made practical for real-time error correction and decoding by dividing the linear programming problem into independent parallelizable problems so that separate independent portions of the electronic circuit may simultaneously address solutions related to individual bits and/or parity rules. Linear programming is believed to avoid error floors inherent in conventional belief propagation error detection and correction techniques, providing a decoding system suitable for high reliability applications.
Description
BACKGROUND OF THE INVENTION

The present invention relates to error correction circuits for detecting and correcting errors in transmitted or stored digital data and in particular to an error correction circuit providing high-speed linear programming detection and error correction.


The reliable transmission and storage of digital data, for example binary symbols transmitted as electrical impulses over data communication networks or stored on digital storage media, may employ error detection and/or correction circuitry and protocols to guard the transmitted or stored data against corruption.


Generally, error detection and correction is obtained by providing redundant bits in the transmitted or stored data. A naïve redundancy scheme may simply duplicate the transmission or storage of the data; however, sophisticated redundancy systems provide a limited number of detection and correction bits (henceforth "check bits"), each of which serves to detect and/or correct errors in multiple other data bits. For example, an 8-bit message might have a single ninth check bit termed a parity bit. This parity bit is set or reset so as to make the total number of set bits in the message and parity bit an even number. It will be understood that corruption of any one of the message or parity bits caused by a crossover (i.e., changing of a bit during transmission or during storage) will be readily detected by checking whether the total number of set bits is even. If not, an error in transmission or storage can be assumed.
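The single parity bit scheme described above can be sketched in a few lines (a minimal illustration in Python, not taken from the patent; the message shown is arbitrary):

```python
def add_parity(bits):
    """Append a parity bit making the total number of set bits even."""
    return bits + [sum(bits) % 2]

def check_parity(word):
    """Return True if the word (message plus parity bit) has even parity."""
    return sum(word) % 2 == 0

msg = [1, 0, 1, 1, 0, 0, 1, 0]        # 8-bit message
word = add_parity(msg)                 # 9 bits: message plus parity bit
assert check_parity(word)              # no corruption: parity holds

word[3] ^= 1                           # flip one bit (a "crossover")
assert not check_parity(word)          # the single-bit error is detected
```

Note that the single parity bit detects any odd number of flipped bits but cannot identify which bit flipped, which is why correction requires the additional check bits discussed next.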


More generally, multiple check bits can be added to any stored or transmitted data word allowing both detection and correction of errors in the message bits. These extra bits increase the numeric spacing or Hamming distance between legitimate message symbols represented by the data and check bits taken together. Errors are detected if the received symbol is positioned between legitimate symbols and error correction is obtained by choosing the closest legitimate symbol. In this respect it will be understood that error correction simply chooses the most likely correct symbol.
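Choosing the closest legitimate symbol can be illustrated with a toy codebook (illustrative only; the codebook below is hypothetical and chosen so that legitimate symbols sit at Hamming distance 3 apart, allowing any single-bit error to be corrected):

```python
def hamming(a, b):
    """Hamming distance: number of positions where the symbols differ."""
    return sum(x != y for x, y in zip(a, b))

# Toy codebook of legitimate symbols with minimum Hamming distance 3.
codebook = ["000000", "111000", "000111", "111111"]

def correct(received):
    """Error correction by choosing the closest legitimate symbol."""
    return min(codebook, key=lambda c: hamming(c, received))

print(correct("101000"))  # a single-bit corruption of "111000"
```

A received symbol such as "101000" lies at distance 1 from "111000" and at least distance 2 from every other codeword, so nearest-symbol decoding recovers the original.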


Sophisticated error detection and correction systems may employ “low density parity check codes” in which overlapping subsets of data bits and check bits are subject to independent constraints (e.g. even parity), the constraints thus each applying to only a small subset of the bits. Low-density parity check codes allow transmission of data very close to the Shannon limit that relates the amount of information that can be transmitted over a channel of a given bandwidth in the presence of given noise interference.
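A low-density parity check can be expressed as a sparse binary matrix H, each row constraining a small, overlapping subset of bits to even parity (a schematic sketch; the 3x6 matrix shown is hypothetical, far smaller and denser than a practical code):

```python
import numpy as np

# Each row of H is one parity rule; a 1 marks a participating bit.
# Row 0 constrains bits {0, 1, 3}; row 1 bits {1, 2, 4}; row 2 bits {0, 2, 5}.
H = np.array([[1, 1, 0, 1, 0, 0],
              [0, 1, 1, 0, 1, 0],
              [1, 0, 1, 0, 0, 1]])

def satisfies_all_checks(x):
    """A word is valid iff every parity subset sums to an even number."""
    return bool(np.all(H.dot(x) % 2 == 0))

assert satisfies_all_checks(np.array([0, 0, 0, 0, 0, 0]))
assert satisfies_all_checks(np.array([1, 1, 0, 0, 1, 1]))
assert not satisfies_all_checks(np.array([1, 0, 0, 0, 0, 0]))
```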


Decoding information that has been encoded using low density parity check codes is computationally demanding, involving the determination of a most likely transmitted string in the face of errors and subject to the overlapping constraints of the originally transmitted message. One method of performing this decoding, termed "belief propagation", iteratively communicates values of the received bits in each subset (as maintained in a buffer) to a circuit that attempts to reconcile the received values according to their parity constraint and retransmits updated reconciled values to the buffer. Each bit of the buffer receives updates from multiple reconciling circuits and a mechanism is provided to integrate the different and often conflicting updates. This cycle is performed iteratively.


Belief propagation is parallelizable to the extent that the calculations associated with each reconciliation step and each integration step may be implemented simultaneously and independently by separate computing elements. This is important for high-speed message processing.


Unfortunately, although belief propagation is often empirically successful, there is no guarantee that it will converge to a set of bits that meets the constraint of the parity rules. Further, belief propagation is subject to an “error floor” representing a limit to its ability to detect and correct errors.


SUMMARY OF THE INVENTION

The present invention provides a decoder for low density parity check codes and other similar coding systems that both demonstrably converges and may be parallelized for execution of different portions of the decoding simultaneously for high-speed processing.


Specifically, the present invention provides an error correction circuit including a buffer memory for holding a received string of bits derived from a transmitted string of bits, the latter subject to a probability of transmission error. “Transmitted” shall be understood in this context to include the process of data storage as well as the process of data transmission. A parity rule memory holds a set of parity rules for the transmitted string of bits, each parity rule describing a predefined intended relationship between a subset of the bits as originally transmitted. The buffer memory and parity rule memory communicate with a linear programming optimizer, the latter which generates a corrected string of bits from the received string of bits using a linear programming process configured to maximize a probability that the corrected string of bits represents the transmitted string of bits, subject to the parity rules for the bits.


It is thus a feature of at least one embodiment of the invention to provide a robust error correction and detection decoding that can be proven to converge. Current belief propagation is not subject to this rigorous understanding. It is further a feature of at least one embodiment of the invention to provide an error correction and detection system that does not appear to be subject to an error floor, making it suitable for extremely high reliability error detection and correction.


The linear programming optimizer may iteratively repeat two steps, a first step adjusting values of the corrected string of bits based on iteratively changing replica parity subvectors (henceforth “replicas”) and a second step updating the iteratively changing replicas based upon their deviation from the actual parity rules.


It is thus a feature of at least one embodiment of the invention to provide a decoding system that reduces data flow paths by modification of replicas, thus offering potential for simpler hardware implementation.


The first step of adjusting values of the corrected string of bits may adjust each bit of the corrected string of bits as a function of the iteratively changing replicas independent of the value of the other bits of the corrected string of bits.


It is thus a feature of at least one embodiment of the invention to permit parallel processing of the modifications of the replicas for improved scalability with long data words and improved execution speed.


The electronic circuit may provide independently executing multiple computational elements associated with different replicas to substantially simultaneously adjust the different replicas.


It is thus a feature of at least one embodiment of the invention to provide a circuit exploiting the parallelism of the replica iterations.


The electronic circuit may provide multiple independently executing computational elements associated with different values of the corrected string of bits to substantially simultaneously adjust the different values of the corrected string of bits.


It is thus a feature of at least one embodiment of the invention to provide a circuit exploiting the parallelism of the correction of the received string of bits.


The second step of updating the iteratively changing replicas may define a projection of the iteratively changing replicas to a parity polytope being a convex hull whose vertices are defined by the parity rules.


It is thus a feature of at least one embodiment of the invention to provide a mechanism for efficiently projecting solutions to the parity polytope, the latter representing a relaxation to the decoding process permitting linear programming to be applied.


The first and second steps may implement an alternating direction method of multipliers.


It is thus a feature of at least one embodiment of the invention to exploit a robust and well-understood mathematical technique in a novel way to provide parallelism in the linear programming solution to error detection and correction.


The maximized probability may model a binary symmetric channel.


It is thus a feature of at least one embodiment of the invention to provide a simple and widely applicable model for the transmission link.


The parity rules may provide a low-density parity check taking as arguments less than 20 percent of the bits of the transmitted string of bits. Alternatively or in addition, the parity rules may provide for even parity for a subset of bits of the transmitted string of bits.


It is thus a feature of at least one embodiment of the invention to provide a system suitable for a common class of error correction protocols.


These particular features and advantages may apply to only some embodiments falling within the claims and thus do not define the scope of the invention.





BRIEF DESCRIPTION OF THE FIGURES


FIG. 1 is a simplified block diagram of a first application of the present invention involving data communication of binary data over a noisy channel as may incorporate an error decoder of the present invention in different inter-communicating components;



FIG. 2 is a simplified block diagram of a second application of the present invention involving a data storage system that may incorporate the error decoder of the present invention in the different inter-communicating components;



FIG. 3 is a function diagram of a circuit constructed according to the present invention as may receive a binary string of data for detecting and correcting errors according to predetermined stored parity rules;



FIG. 4 is a simplified factor graph showing a relationship between parity rules and bits of the received binary string according to one protocol executable by the present invention;



FIG. 5 is a flowchart of the principal steps of the iterative decoding process of the present invention;



FIG. 6 is a fragmentary representation of FIG. 3 showing data flow during initialization steps of the decoding process;



FIG. 7 is a fragmentary representation of FIG. 3 showing data flow during an updating of the bit string in a first step of an iterative decoding process;



FIG. 8 is a fragmentary representation of FIG. 3 showing data flow during an updating of replica for a second step of the iterative decoding process;



FIG. 9 is a fragmentary representation of FIG. 3 showing data flow during an updating of a Lagrangian multiplier value used in the iterative decoding process; and



FIG. 10 is a representation of a simplified polytope showing a projection process used in iterative adjustment of the replica in the present invention.





DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

Referring now to FIG. 1, a data communication system 10 may provide for a first terminal device 12, for example a computer terminal or server, communicating with network media 14, the latter being, for example, copper conductors, optical media, radio links, or the like. The network media 14 communicates through one or more network devices 16, such as routers or other switches, with a second terminal device 18 similar to the first terminal device 12.


Data communicated on the data communication system 10 may comprise a string of binary bits, termed a "packet" 20, including a content portion 22 and an error correcting portion 24, which may be segregated as shown or interleaved arbitrarily. The error correcting portion 24 provides message redundancy to improve one or both of error detection and error correction in the event that bits of the packet 20 are corrupted or flipped, that is, a bit value is changed from 1 to 0 or 0 to 1.


Referring now also to FIG. 2, alternatively, a data storage system 30 may provide for data storage media 32, for example, including disk drives or solid-state memory or the like that may hold stored data words. The stored data words also comprise a packet 20 having a content portion 22 and an error-correcting portion 24 as described above. The packet 20 may be communicated to a storage media control device 34 and then to a computer 36 using data of the packet 20. For convenience of explanation, the data transmitted or stored in these examples will be henceforth termed “transmitted data”.


The present invention provides an error decoder circuit 38 that may be located to receive the packet 20 after transmission at any of the devices 12, 16, 34 and 36, and as so located operate to detect any errors caused by transmission or storage of the packet 20 and, to the extent possible, correct those errors based on an analysis of the transmitted bits of the content portion 22 and the error correcting portion 24.


Referring now to FIG. 3, the error decoder circuit 38 may include a set of memory structures for storing and accessing of digital data including: a packet buffer 40 for storing the bits 19 of the packet 20 (the bits 19 herein designated xi where i is an index for the bit number), a parity rule store 42 for storing parity rules 43 (each parity rule 43 herein designated as Pj), a replica store 44 for storing replicas 45 (each replica 45 herein designated zj, where j is an index for the number of different parity rules), a Lagrangian multiplier store 46 for storing Lagrangian multiplier values 47 (each multiplier designated λj), and a likelihood store 48 storing likelihood values 49 (each likelihood value 49 herein designated γi). Each of these will be discussed in greater detail below.


Generally, each bit 19 will have one of two states generally referred to as zero or one, each parity rule 43 will be represented by a matrix, each replica 45 as a vector, and each Lagrangian multiplier value 47 and each likelihood value 49 will be represented as a binary word of predetermined length. The error decoder circuit 38 will generally receive, through an input channel 39, packets 20 transmitted from another device and, using a process that will be described below and with access to other data of the memory structures 42, 44, 46, and 48, will detect and correct errors in the received packet 41 for later use by a downstream device.


The error decoder circuit 38 may also provide multiple computational units 50 each being, for example, a limited instruction set processor or discrete processing circuitry including but not limited to a gate array or the like. These computational units 50 may execute substantially independently and simultaneously according to a stored program, implemented in software, firmware or as is implicit in the circuit organization and connection, to read and write to each of the memory structures 40, 42, 44, 46, and 48 to permit parallel execution of the error correcting process as will be described. Generally an arbitrary number of computational units 50 may be used and allocated appropriately.


In the embodiment discussed herein, the computational units 50 may include message bit computational units 52 and replica computational units 54. The message bit computational units 52 may be provided in number equal to the number of bits in the binary packet 20 (and hence the number of bits i in the buffer 40) and replica computational units 54 may be provided equal in number to the number j of the replicas 45 of the replica store 44. The invention is not limited to this particular embodiment and it will be understood from the discussion below that the staging of the processing of data by the error decoder circuit 38 may permit a sharing of functions between the message bit computational units 52 and the replica computational units 54 when those computational units are general-purpose processors and that some sequential processing may be permitted with a concomitant loss in speed.


Referring now to FIGS. 3 and 4, each of the parity rules 43 may be associated with a particular subset of the bits 19 received in the buffer 40 as indicated by connecting lines 56 to implement a parity check only on the connected bits 19 of that subset. In one embodiment, the parity rules 43 (and, as will be discussed below, the replicas 45) may be associated with a relatively small number of the total bits 19 of the packet 20 (for example, in the classic "(3, 6)-regular" family of low-density parity-check codes introduced by Robert Gallager in the early 1960s, each parity rule is associated with six bits even if the code length is in the thousands) and thus represent a low density parity check. The parity check may, for example, enforce an even parity on the associated bits 19 of the subset indicating that the number of one bits must be even if the bits 19 of the subset were transmitted without error. The invention may nevertheless be useful for high-density parity checks and other constraints between bits 19 of packet 20 implemented by objective rules of a type similar to the parity checks described.
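The connections of FIG. 4 amount to two index sets per rule and per bit: the bits touched by each parity rule, and the rules touching each bit. A sketch of how these neighborhoods could be derived from a parity-check matrix (illustrative code with a hypothetical 3x6 matrix, not the patent's circuit wiring):

```python
import numpy as np

H = np.array([[1, 1, 0, 1, 0, 0],     # hypothetical parity-check matrix:
              [0, 1, 1, 0, 1, 0],     # row 1 looks at bits 1, 2 and 4,
              [1, 0, 1, 0, 0, 1]])    # i.e. the second, third and fifth bits

# Nc(j): indices of bits participating in check j (one subset per parity rule).
Nc = {j: [i for i in range(H.shape[1]) if H[j, i]] for j in range(H.shape[0])}
# Nv(i): indices of checks in which bit i participates.
Nv = {i: [j for j in range(H.shape[0]) if H[j, i]] for i in range(H.shape[1])}

assert Nc[1] == [1, 2, 4]    # check 1 looks at bits 1, 2 and 4
assert Nv[0] == [0, 2]       # bit 0 feeds checks 0 and 2
```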


In the example of FIG. 4, the second replica 45 may look at the second, third and fifth bits 19 to ensure that there is even parity. In this example, any one of these bits 19 may be the bit violating the even parity constraint of the replica 45. An error in the transmission of the packet 20 is thereby indicated, and the invention, as will be described below, accordingly operates to determine, to the extent possible, a corrected set of bits 19 that has a maximum likelihood of representing the packet 20 as originally transmitted. Generally this process will attempt to reconcile each of the rules implemented by different replicas 45 operating on overlapping subsets of the bits 19, and the present invention performs this reconciliation by casting it as a linear programming problem.


Linear programming will be understood to be a mathematical technique for problems that can be expressed as maximization (or minimization) of a linear function of a vector of variables subject to linear constraints that define a convex polytope over which the function must be optimized. In this case the linear programming problem may be expressed as:









minimize Σi γixi  (1)



subject to each xi meeting the parity rules


where γi is a negative log likelihood ratio defined as:










γi=log [p(x̃i|0)/p(x̃i|1)].  (2)







and where the numerator and denominator express the probability that x̃i was received given that the value to the right of the vertical line (0 or 1, respectively) was transmitted. This definition assumes a binary symmetric transmission channel where the bits are equally likely to flip from 0 to 1 as from 1 to 0. A more complete explanation of this formulation is provided in the attached Appendix.
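For a binary symmetric channel with crossover probability p, the ratio of equation (2) takes only two values, so the likelihood store could be populated as follows (a sketch assuming p is known to the decoder; the function name is illustrative):

```python
import math

def llr(received_bit, p):
    """Negative log-likelihood ratio of eq. (2) for a binary symmetric
    channel with crossover probability p: positive when a 0 was received,
    negative when a 1 was received."""
    if received_bit == 0:
        return math.log((1 - p) / p)    # p(0 received | 0 sent) / p(0 | 1)
    return math.log(p / (1 - p))        # p(1 received | 0 sent) / p(1 | 1)

received = [0, 1, 1, 0, 0, 1]
gamma = [llr(b, p=0.1) for b in received]
# The sign pattern mirrors the received bits: positive for 0, negative for 1.
assert all((g > 0) == (b == 0) for g, b in zip(gamma, received))
```

This sign pattern is what the initialization at process block 70 below exploits when it seeds the likelihood store from the received bits.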


Referring now to FIG. 5 and FIG. 6, in a first step of the error detecting and correcting process implemented by the computational units 50 according to their stored programs, a likelihood vector is generated, as indicated by process block 70, and used to populate the likelihood store 48. This initialization may be chosen arbitrarily but may, for example, initialize each value γi to one when its corresponding bit xi in the received packet 20 is zero and to negative one when that bit is one, reflecting the sign of γi under its definition above. In this process a value of γi is generated for each value of xi.


As indicated by process block 72, the parity rule store 42 next may be loaded with the known parity rules for the communication system and, at process block 74, the replica store 44 may be initialized to match the parity rules 43 of the parity rule store 42. Finally the Lagrangian multiplier store 46 may be initialized, for example, to all zero values.


Referring now to FIGS. 5 and 7, the computational units 50 then loop through the following three steps. At a first step indicated by process block 76, each of the values xi of the packet 20 is updated using the mechanism of the alternating direction method of multipliers (ADMM) according to the current values of the replicas 45 in the replica store 44, the current values of the Lagrangian multiplier store 46, and the current values of the likelihood store 48 according to the following equation:










xi=Π[0,1]((1/|Nv(i)|)(Σj∈Nv(i)(zj(i)−(1/μ)λj(i))−(1/μ)γi))  (3)

where Π[0,1] denotes projection (clipping) onto the interval [0, 1], Nv(i) is the set of parity rules involving bit i, and zj(i) and λj(i) denote the components of zj and λj corresponding to bit i.







The terms of this equation are more fully defined and developed in the attached Appendix.


The above first step, which updates the values of the bits 19, is limited to changing the bits 19 to values between zero and one. This process may be performed in parallel by multiple message bit computational units 52, each operating on a different bit 19, simultaneously for multiple or all bits 19.
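One way to realize the per-bit update of equation (3) in software is sketched below, assuming each replica zj and multiplier λj is stored as a length-d vector ordered like the one-entries of row j of the parity-check matrix (a hedged software sketch, not the patent's hardware; the toy matrix is hypothetical):

```python
import numpy as np

def update_x(H, z, lam, gamma, mu):
    """Per-bit x-update of eq. (3). H is the parity-check matrix; z[j] and
    lam[j] are length-d vectors ordered like the 1-entries of row j of H.
    Each bit's update depends only on the current replicas and multipliers,
    so the loop over i could run on parallel hardware."""
    m, n = H.shape
    x = np.zeros(n)
    for i in range(n):
        acc = 0.0
        for j in range(m):
            if H[j, i]:
                k = int(np.sum(H[j, :i]))   # position of bit i within check j
                acc += z[j][k] - lam[j][k] / mu
        deg = int(np.sum(H[:, i]))          # |Nv(i)|, number of checks on bit i
        x[i] = np.clip((acc - gamma[i] / mu) / deg, 0.0, 1.0)
    return x

# Toy example: two checks over three bits, replicas and multipliers at zero,
# so the update is driven entirely by the clipped likelihoods.
H = np.array([[1, 1, 0], [0, 1, 1]])
z = [np.zeros(2), np.zeros(2)]
lam = [np.zeros(2), np.zeros(2)]
print(update_x(H, z, lam, np.array([1.0, -1.0, 1.0]), mu=1.0))
```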


Referring now to FIGS. 5, 6 and 8, at a second step per process block 78, the values of the replica store 44 are updated based on the current parity rules 43 and the Lagrangian multiplier value 47 in a first update substep creating intermediate vectors vj according to the following equation:










vj=Pjx+λj/μ  (4)







In a second substep, the vector vj is then projected onto a closest surface of the convex hull (parity polytope) whose vertices are defined by the parity rules 43.


Referring momentarily also to FIG. 10, in the present example of parity rules having even parity, the convex hull will be the surface defined by vertices of even parity (even Hamming weight). A simplified example of this projection may be provided with respect to a vector vj of length three wherein the parity polytope has vertices aligned with vertices (000), (011), (101) and (110) of a cube representing the even parity solutions for a vector of length three. The projection 80 finds the closest point 82 on the surface of the convex hull 84 defined by these vertices. An efficient method for computing this projection is described in the attached appendix.


The coordinates of the point 82 of the projection define the new replica vector zj. This updating of the replica vector zj for each of the replica vectors j may also be accomplished in parallel by multiple replica computational units 54 which do not require the solutions to the other updated replica vectors zj to complete their updating.
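For the length-3 example of FIG. 10, the projection can be checked by brute force: write the nearest point as a convex combination of the even-parity vertices and solve the small least-squares problem by projected gradient (an illustrative method workable only for tiny d; the efficient projection algorithm referenced in the appendix is not reproduced here):

```python
import itertools
import numpy as np

def project_simplex(y):
    """Euclidean projection of y onto the probability simplex."""
    u = np.sort(y)[::-1]
    css = np.cumsum(u)
    rho = np.nonzero(u - (css - 1) / np.arange(1, len(y) + 1) > 0)[0][-1]
    tau = (css[rho] - 1) / (rho + 1)
    return np.maximum(y - tau, 0)

def project_parity_polytope(v):
    """Project v onto conv{e in {0,1}^d : sum(e) even} by enumerating the
    even-weight vertices (exponential in d; for illustration only)."""
    d = len(v)
    E = np.array([e for e in itertools.product([0, 1], repeat=d)
                  if sum(e) % 2 == 0], dtype=float).T     # d x K vertex matrix
    K = E.shape[1]
    theta = np.full(K, 1.0 / K)                           # convex weights
    step = 1.0 / np.sum(E * E)                            # safe step size
    for _ in range(5000):                                 # projected gradient
        grad = E.T.dot(E.dot(theta) - v)
        theta = project_simplex(theta - step * grad)
    return E.dot(theta)

# FIG. 10 example: projecting the odd-parity corner (1,1,1) lands on the
# facet through (011), (101), (110), at the point (2/3, 2/3, 2/3).
print(project_parity_polytope(np.array([1.0, 1.0, 1.0])))
```

Points already inside the hull, such as (0.5, 0.5, 0), are returned essentially unchanged, which is what makes the replica update a projection.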


Referring now to FIGS. 5 and 9, in a third substep indicated by process block 86, the Lagrangian multipliers are then updated according to the equation:





λjj+μ(Pjx−zj).  (5)


Again, this process may be executed in parallel by multiple replica computational units 54.
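Equation (5) is a per-check operation: with each parity rule represented as a selection matrix Pj, each check's multiplier moves by μ times its own residual, independently of the other checks (a sketch with hypothetical selection matrices; variable names are illustrative):

```python
import numpy as np

def update_multipliers(P, x, z, lam, mu):
    """Eq. (5): each check's multiplier moves by mu times the residual
    between its selected bits Pj x and its replica zj; the loop over j
    is embarrassingly parallel."""
    return [lam[j] + mu * (P[j].dot(x) - z[j]) for j in range(len(P))]

# Two hypothetical checks over a 3-bit word.
P = [np.array([[1, 0, 0], [0, 1, 0]]),      # check 0 selects bits 0, 1
     np.array([[0, 1, 0], [0, 0, 1]])]      # check 1 selects bits 1, 2
x = np.array([1.0, 0.5, 0.0])
z = [np.array([1.0, 0.5]), np.array([0.5, 0.0])]
lam = [np.zeros(2), np.zeros(2)]
new_lam = update_multipliers(P, x, z, lam, mu=2.0)
# Residuals are zero here, so the multipliers stay at zero.
assert all(np.allclose(l, 0) for l in new_lam)
```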


At decision block 88, convergence is checked by determining whether the greatest difference between a parity rule 43 and its replica 45 (among all the parity rules) is less than a predetermined value ε according to the inequality:











maxj∥Pjx−zj∥∞<ε.  (6)







If this condition is satisfied, the current values of xi are returned as a solution to the correction problem as indicated by process block 90. Otherwise an additional iterated loop is performed beginning again at process block 76.
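The stopping rule of equation (6) compares each check's selected bits against its replica and takes the worst case over all checks (a minimal sketch with each parity rule again represented as a selection matrix Pj; ε is a design parameter, and the matrices below are hypothetical):

```python
import numpy as np

def converged(P, x, z, eps):
    """Eq. (6): stop when the largest component-wise deviation between
    Pj x and zj, over all parity rules j, falls below eps."""
    return max(np.max(np.abs(P[j].dot(x) - z[j])) for j in range(len(P))) < eps

P = [np.array([[1, 0, 0], [0, 1, 0]]),      # check 0 selects bits 0, 1
     np.array([[0, 1, 0], [0, 0, 1]])]      # check 1 selects bits 1, 2
x = np.array([1.0, 1.0, 0.0])
z = [np.array([1.0, 1.0]), np.array([1.0, 0.001])]
assert converged(P, x, z, eps=0.01)         # worst residual is 0.001
assert not converged(P, x, z, eps=0.0001)   # tighter tolerance: keep looping
```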


The applicants believe that the solution provided by this linear programming technique avoids the error floor inherent in belief propagation techniques. Accordingly the present invention both provides certainty of convergence and may present a preferred decoding technique for high reliability applications where frame error rates (also known as word error rates) of 10^-10 or lower are required; linear programming decoders have not been observed to display error floors.


When introducing elements or features of the present disclosure and the exemplary embodiments, the articles “a”, “an”, “the” and “said” are intended to mean that there are one or more of such elements or features. The terms “comprising”, “including” and “having” are intended to be inclusive and mean that there may be additional elements or features other than those specifically noted. It is further to be understood that the method steps, processes, and operations described herein are not to be construed as necessarily requiring their performance in the particular order discussed or illustrated, unless specifically identified as an order of performance. It is also to be understood that additional or alternative steps may be employed.


References to “an electronic circuit”, “a controller”, a “computer” and “a processor” and the like can be understood to include one or more circuits, controllers, computers or processors that can communicate in a stand-alone and/or a distributed environment(s), and can thus be configured to communicate via wired or wireless communications with other processors, where such one or more processor can be configured to operate on one or more processor-controlled devices that can be similar or different devices. Furthermore, references to memory, unless otherwise specified, can include one or more processor-readable and accessible memory elements and/or components that can be internal to the processor-controlled device, external to the processor-controlled device, and can be accessed via a wired or wireless network. Similarly the term program or the like may refer to conventional computer programming in programming languages stored in memory or implicit programming contained in the interconnection of electronic devices, for example, with programmable field gate arrays. The term electronic circuit should be broadly construed to cover programmable and other circuits including computers, field programmable gate arrays, as well as application-specific integrated circuits and should generally include circuits including both optical and electronic features used for implementing circuit type functions.


It is specifically intended that the present invention not be limited to the embodiments and illustrations contained herein and the claims should be understood to include modified forms of those embodiments including portions of the embodiments and combinations of elements of different embodiments as come within the scope of the following claims. All of the publications described herein, including patents and non-patent publications, are hereby incorporated herein by reference in their entireties.


APPENDIX

In this paper we consider a binary linear LDPC code C of length N defined by an M×N parity-check matrix H. Each of the M parity checks, indexed by J={1, 2, . . . , M}, corresponds to a row in the parity check matrix H. Codeword symbols are indexed by the set I={1, 2, . . . , N}. The neighborhood of a check j, denoted by Nc(j), is the set of indices i∈I that participate in the jth parity check, i.e., Nc(j)={i|Hj,i=1}. Similarly for a component i∈I, Nv(i)={j|Hj,i=1}. Given a vector χ∈{0,1}N, the jth parity-check is said to be satisfied if Σi∈Nc(j)χi is even. In other words, the set of values assigned to the χi for i∈Nc(j) have even parity. We say that a length-N binary vector χ is a codeword, χ∈C, if and only if (iff) all parity checks are satisfied. In a regular LDPC code there is a fixed constant d such that for all checks j∈J, |Nc(j)|=d. Also for all components i∈I, |Nv(i)| is a fixed constant. For simplicity of exposition we focus our discussion on regular LDPC codes. Our techniques and results extend immediately to general LDPC codes and to high density parity check codes as well.


To denote compactly the subset of coordinates of χ that participate in the jth check we introduce the matrix Pj. The matrix Pj is the binary d×N matrix that selects out the d components of χ that participate in the jth check. For example, say the neighborhood of the jth check is Nc(j)={i1, i2, . . . , id}, where i1<i2< . . . <id. Then, for all k∈[d] the (k, ik)th entry of Pj is one and the remaining entries are zero. For any codeword χ∈C and for any j, Pjχ is an even parity vector of dimension d. In other words we say that Pjχ∈Ed for all j∈J (a "local codeword" constraint) where Ed is defined as






Ed={e∈{0,1}d|∥e∥1 is even}.  (1)


Thus, Ed is the set of codewords (the codebook) of the length-d single parity-check code.


We begin by describing maximum likelihood (ML) decoding and the LP relaxation proposed by Feldman et al. Say vector χ̃ is received over a discrete memoryless channel described by channel law (conditional probability) W: X×X̃→R≥0, with Σχ̃W(χ̃|χ)=1 for all χ∈X. Since the development is for binary codes, |X|=2. There is no restriction on X̃. Maximum likelihood decoding selects a codeword χ∈C that maximizes W(χ̃|χ), the probability that χ̃ was received given that χ was sent. For a discrete memoryless channel W, W(χ̃|χ)=Πi∈IW(χ̃i|χi). Equivalently, we select a codeword that maximizes Σi∈I log W(χ̃i|χi). Let γi be the negative log-likelihood ratio, γi:=log [W(χ̃i|0)/W(χ̃i|1)]. Since log W(χ̃i|χi)=−γiχi+log W(χ̃i|0), ML decoding reduces to determining a χ∈C that minimizes γTχ=Σi∈Iγiχi. Thus, ML decoding requires minimizing a linear function over the set of codewords.
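This reduction of ML decoding to minimizing γTχ over the codebook can be checked by exhaustive search on a tiny code (feasible only for very short codes; the parity-check matrix below is hypothetical):

```python
import itertools
import numpy as np

H = np.array([[1, 1, 0, 1, 0, 0],       # hypothetical 3x6 parity-check matrix
              [0, 1, 1, 0, 1, 0],
              [1, 0, 1, 0, 0, 1]])

def ml_decode(gamma):
    """Exhaustive ML decoding: among all words satisfying every parity
    check, return one minimizing the linear cost gamma . x."""
    codewords = [np.array(x) for x in itertools.product([0, 1], repeat=H.shape[1])
                 if np.all(H.dot(x) % 2 == 0)]
    return min(codewords, key=lambda x: float(np.dot(gamma, x)))

# Strongly positive LLRs on every bit favor the all-zeros codeword.
assert np.array_equal(ml_decode(np.ones(6)), np.zeros(6, dtype=int))
```

The LP relaxation described next replaces this exponential enumeration with optimization over a tractable polytope.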


Feldman et al. [3] show that ML decoding is equivalent to minimizing a linear cost over the convex hull of all codewords. In other words, minimize γTχ subject to χ∈conv(C). The feasible region of this program is termed the "codeword" polytope. However, this polytope cannot be described tractably. Feldman's approach is first to relax each local codeword constraint Pjχ∈Ed to Pjχ∈ℙd where






ℙd=conv(Ed)=conv({e∈{0,1}d|∥e∥1 is even}).  (2)


The object ℙd is called the "parity polytope". It is the codeword polytope of the single parity-check code (of dimension d). Thus, for any codeword χ∈C, Pjχ is a vertex of ℙd for all j. When the constraints Pjχ∈ℙd are intersected for all j∈J the resulting feasible space is termed the "fundamental" polytope. Putting these ingredients together yields the LP relaxation that we study:





minimize γTχ s.t. Pjχ∈ℙd ∀j∈J.  (3)


The statement of the optimization problem in (3) makes it apparent that compact representation of the parity polytope ℙd is crucial for efficient solution of the LP. Study of this polytope dates back some decades. In [36] Jeroslow gives an explicit representation of the parity polytope and shows that it has an exponential number of vertices and facets in d. Later, in [37], Yannakakis shows that the parity polytope has small lift, meaning that it is the projection of a polynomially faceted polytope in a dimension polynomial in d. Indeed, Yannakakis' representation requires a quadratic number of variables and inequalities. This is one of the descriptions discussed in [3] to state the LP decoding problem.


Yannakakis' representation of a vector u∈ℙ_d consists of variables μ_s∈[0,1] for all even s≦d. Variable μ_s indicates the contribution of binary (zero/one) vectors of Hamming weight s to u. Since u is a convex combination of even-weight binary vectors, Σ_{even s≦d} μ_s = 1. In addition, variables z_{i,s} are used to indicate the contribution to u_i, the ith coordinate of u, made by binary vectors of Hamming weight s. Overall, the following set of inequalities over O(d²) variables characterizes the parity polytope (see [37] and [3] for a proof).






0 ≦ u_i ≦ 1  ∀i∈[d]

0 ≦ z_{i,s} ≦ μ_s  ∀i∈[d], even s≦d

Σ_{even s≦d} μ_s = 1

u_i = Σ_{even s≦d} z_{i,s}  ∀i∈[d]

Σ_{i=1}^d z_{i,s} = s·μ_s  ∀s even, s≦d.






This LP can be solved with standard solvers in polynomial time. However, the quadratic size of the LP prohibits its solution with standard solvers in real-time or embedded decoding applications. In Section IV-B we show that any vector u∈ℙ_d can always be expressed as a convex combination of binary vectors of Hamming weight r and r+2 for some even integer r. Based on this observation we develop a new formulation for the parity polytope that consists of O(d) variables and constraints. This is a key step towards the development of an efficient decoding algorithm. Its smaller description complexity also makes our formulation particularly well suited for high-density codes, whose study we leave for future work.


Decoupled Relaxation and Optimization Algorithms

In this section we present the ADMM formulation of the LP decoding problem and summarize our contributions. In Section III-A we introduce the general ADMM template. We specialize the template to our problem in Section III-B. We state the algorithm in Section III-C and frame it in the language of message-passing in Section III-D.


A. ADMM Formulation

To make the LP (3) fit into the ADMM template we relax χ to lie in the hypercube, χ∈[0,1]^N, and add the auxiliary "replica" variables z_j∈ℝ^d for all j∈𝒥. We work with a decoupled parameterization of the decoding LP.





minimize γ^Tχ

subject to P_jχ = z_j ∀j∈𝒥

z_j∈ℙ_d ∀j∈𝒥

χ∈[0,1]^N.  (4)


The alternating direction method of multipliers works with an augmented Lagrangian which, for this problem, is











L_μ(χ, z, λ) := γ^Tχ + Σ_j λ_j^T(P_jχ − z_j) + (μ/2) Σ_j ‖P_jχ − z_j‖₂².  (5)







Here λ_j∈ℝ^d for j∈𝒥 are the Lagrange multipliers and μ>0 is a fixed penalty parameter. We use λ and z to succinctly represent the collections of the λ_j and z_j respectively. Note that the augmented Lagrangian is obtained by adding the two-norm term of the residual to the ordinary Lagrangian. The Lagrangian without the augmentation can be optimized via a dual subgradient ascent method [38], but our experiments with this approach required far too many message-passing iterations for practical implementation. The augmented Lagrangian smoothes the dual problem, leading to much faster convergence rates in practice [39]. For the interested reader, we provide a discussion of the standard dual ascent method in the appendix.
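For concreteness, the augmented Lagrangian of (5) can be evaluated numerically as below. This is an illustrative sketch only; the sparse list-per-check layout for z_j and λ_j is our assumption, not the patent's data structure:

```python
def aug_lagrangian(x, z, lam, gamma, checks, mu):
    """Evaluate L_mu(x, z, lambda) of (5). checks[j] lists the variable indices
    in check j; z[j] and lam[j] hold the replica and multiplier entries for
    that check, in the same order as checks[j]."""
    val = sum(g * xi for g, xi in zip(gamma, x))         # gamma^T x
    for j, c in enumerate(checks):
        res = [x[i] - z[j][k] for k, i in enumerate(c)]  # residual P_j x - z_j
        val += sum(l * r for l, r in zip(lam[j], res))   # lambda_j^T residual
        val += 0.5 * mu * sum(r * r for r in res)        # (mu/2) ||residual||_2^2
    return val
```

At a feasible point (P_jχ = z_j for all j) the last two terms vanish and the value reduces to the LP cost γ^Tχ.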


Let 𝒳 and 𝒵 denote the feasible regions for the variables χ and z respectively: 𝒳=[0,1]^N, and we use z∈𝒵 to mean that (z_1, z_2, . . . , z_{|𝒥|}) lies in ℙ_d×ℙ_d× . . . ×ℙ_d, the |𝒥|-fold product of ℙ_d. Then we can succinctly write the iterations of ADMM as





χ^{k+1} := argmin_{χ∈𝒳} L_μ(χ, z^k, λ^k)

z^{k+1} := argmin_{z∈𝒵} L_μ(χ^{k+1}, z, λ^k)

λ_j^{k+1} := λ_j^k + μ(P_jχ^{k+1} − z_j^{k+1}).


The ADMM update steps involve fixing one block of variables and minimizing over the other. In particular, χ^k and z^k are the kth iterates, and the updates to the χ and z variables are performed in an alternating fashion. We use this framework to solve the LP relaxation proposed by Feldman et al. and hence develop a distributed decoding algorithm.


B. ADMM Update Steps

The χ-update corresponds to fixing z and λ (obtained from the previous iteration or initialization) and minimizing L_μ(χ, z, λ) subject to χ∈[0,1]^N. Taking the gradient of (5), setting the result to zero, and limiting the result to the hypercube 𝒳=[0,1]^N, the χ-update simplifies to







χ = Π_{[0,1]^N}( P^{−1}( Σ_j P_j^T(z_j − (1/μ)λ_j) − (1/μ)γ ) ),




where P = Σ_j P_j^T P_j and Π_{[0,1]^N}(•) corresponds to projecting onto the hypercube [0,1]^N. The latter can easily be accomplished by independently projecting the components onto [0,1]: setting the components that are greater than 1 equal to 1, the components less than 0 equal to 0, and leaving the remaining coordinates unchanged. Note that for any j, P_j^T P_j is an N×N diagonal binary matrix with non-zero entries at (i,i) if and only if i participates in the jth parity check (i∈N_c(j)). This implies that Σ_j P_j^T P_j is a diagonal matrix with the (i,i)th entry equal to |N_v(i)|. Hence P^{−1} = (Σ_j P_j^T P_j)^{−1} is a diagonal matrix with 1/|N_v(i)| as the ith diagonal entry.


Component-wise, the update rule corresponds to taking the average of the corresponding replica values, z_j, adjusted by the scaled dual variable, λ_j/μ, and taking a step in the negative log-likelihood direction. For any j∈N_v(i) let z_j(i) denote the component of z_j that corresponds to the ith component of χ, in other words the ith component of P_j^T z_j. Similarly let λ_j(i) be the ith component of P_j^T λ_j. With this notation the update rule for the ith component of χ is







χ_i = Π_{[0,1]}( (1/|N_v(i)|)( Σ_{j∈N_v(i)} (z_j(i) − (1/μ)λ_j(i)) − (1/μ)γ_i ) ).






Each variable update can be done in parallel.
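A minimal sketch of this component-wise χ-update in Python (ours, for illustration; the dict-per-check layout for the replicas z_j(i) and multipliers λ_j(i) is an assumed data structure, not the patent's circuit):

```python
def x_update(z, lam, gamma, neighbors, mu):
    """Component-wise ADMM x-update: for each variable i, average the adjusted
    replicas z_j(i) - lam_j(i)/mu over the checks in N_v(i), step along
    -gamma_i/mu, then clip to [0, 1]. z[j][i], lam[j][i] hold the entry of
    check j corresponding to variable i."""
    x = []
    for i, nv in enumerate(neighbors):          # neighbors[i] = N_v(i)
        t = sum(z[j][i] - lam[j][i] / mu for j in nv) - gamma[i] / mu
        x.append(min(1.0, max(0.0, t / len(nv))))
    return x
```

Since each x_i depends only on its own neighborhood, the loop body maps directly onto independent parallel hardware units.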


The z-update corresponds to fixing χ and λ and minimizing L_μ(χ, z, λ) subject to z_j∈ℙ_d for all j∈𝒥. The relevant observation here is that the augmented Lagrangian is separable with respect to the z_j and hence the minimization step can be decomposed (or "factored") into |𝒥| separate problems, each of which can be solved independently. This decouples the overall problem, making the approach scalable.


We start from (5) and concentrate on the terms that involve z_j. For each j∈𝒥 the update is to find the z_j that minimizes









(μ/2)‖P_jχ − z_j‖₂² − λ_j^T z_j  s.t.  z_j∈ℙ_d.





Since the values of χ and λ are fixed, so are P_jχ and λ_j/μ. Setting v = P_jχ + λ_j/μ and completing the square, we get that the desired update z*_j is






z*_j = argmin_{z̃∈ℙ_d} ‖v − z̃‖₂².


The z-update thus corresponds to |𝒥| projections onto the parity polytope.


C. ADMM Decoding Algorithm

The complete ADMM-based algorithm is specified in the Algorithm 1 box. We declare convergence when the replicas differ from the χ variables by less than some tolerance ε>0.


D. ADMM Decoding as Message Passing Algorithm

We now present a message-passing interpretation of the ADMM decoding algorithm, Algorithm 1. We establish this interpretation using the “normal” factor graph representation [41]












Algorithm 1

Given a binary N-dimensional vector x ∈ {0, 1}^N, parity check matrix H, and parameters μ and ε, solve the decoding LP specified in (4)

 1: Construct the negative log-likelihood vector γ based on received word x.
 2: Construct the d × N matrix P_j for all j ∈ 𝒥.
 3: Initialize z_j and λ_j as the all-zeros vector for all j ∈ 𝒥.
 4: repeat
 5:  Update x_i ← Π_{[0,1]}( (1/|N_v(i)|)( Σ_{j∈N_v(i)} (z_j(i) − (1/μ)λ_j(i)) − (1/μ)γ_i ) ) for all i.
 6:  for all j ∈ 𝒥 do
 7:   Set v_j = P_j x + λ_j/μ.
 8:   Update z_j ← Π_{ℙ_d}(v_j), where Π_{ℙ_d}(·) means project onto the parity polytope.
 9:   Update λ_j ← λ_j + μ(P_j x − z_j).
10:  end for
11: until max_j ‖P_j x − z_j‖ < ε; return x.










(sometimes also called "Forney-style" factor graphs). One key difference between normal factor graphs and ordinary factor graphs is that the variables in the normal factor graph representation are associated with the edges of a regular factor graph [42], and the constraints of the normal graph representation are associated with both factor and variable nodes of the regular representation. See [41], [43] for details. In representing the ADMM algorithm as a message-passing algorithm, the χ and the replicas z are the variables in the normal graph.


We denote by χ_{ij}(k) the replica associated with the edge joining node i∈ℐ and node j∈𝒥, where k indicates the kth iteration. Note that χ_{ij₁}(k)=χ_{ij₂}(k)=χ_i^k for all j₁, j₂∈N_v(i), where χ_i^k is the value of χ_i at the kth iteration in Algorithm 1. The "message" m_{i→j}(k):=χ_{ij}(k) is passed from node i to node j at the beginning of the kth iteration. Incoming messages to check node j are denoted m_{→j}(k):={m_{i→j}(k): i∈N_c(j)}. The z_j can also be interpreted as the messages passed from check node j to the variable nodes in N_c(j), denoted m_{j→}(k):={m_{j→i}(k): i∈N_c(j)}.




Let λ′_j := λ_j/μ and λ′_{j,i} := λ_j(i)/μ. Then, for all j∈N_v(i),

m_{i→j}(k+1) = Π_{[0,1]}( (1/|N_v(i)|)( Σ_{j′∈N_v(i)} [m_{j′→i}(k) − λ′_{j′,i}(k)] − γ_i/μ ) ).






The z-update can be rewritten as






m_{j→}(k+1) = Π_{ℙ_d}(m_{→j}(k) + λ′_j(k)).


The λ′_j update is





λ′j(k+1)=λ′j(k)+(m→j(k)−mj→(k)).


With this interpretation, it is clear that the ADMM algorithm decouples the decoding problem and can be performed in a distributed manner.
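The decoupled structure described above can be sketched end-to-end as follows. This is an illustrative software rendering of Algorithm 1 (ours, not the patent's hardware realization); the sparse check layout and the generic `project` callback are assumptions:

```python
def admm_decode(gamma, checks, project, mu=1.0, eps=1e-5, max_iter=200):
    """Sketch of Algorithm 1 (ADMM LP decoding). checks[j] lists the variable
    indices of check j (the support of row j of H); project(v) is any routine
    projecting a length-d vector onto the parity polytope PP_d."""
    N = len(gamma)
    z = [[0.0] * len(c) for c in checks]
    lam = [[0.0] * len(c) for c in checks]
    nv = [[j for j, c in enumerate(checks) if i in c] for i in range(N)]
    x = [0.0] * N
    for _ in range(max_iter):
        # x-update: average adjusted replicas, step along -gamma/mu, clip to [0, 1].
        for i in range(N):
            t = sum(z[j][checks[j].index(i)] - lam[j][checks[j].index(i)] / mu
                    for j in nv[i]) - gamma[i] / mu
            x[i] = min(1.0, max(0.0, t / len(nv[i])))
        # z- and lambda-updates: one independent parity-polytope projection per check.
        res = 0.0
        for j, c in enumerate(checks):
            v = [x[i] + lam[j][k] / mu for k, i in enumerate(c)]
            z[j] = project(v)
            for k, i in enumerate(c):
                lam[j][k] += mu * (x[i] - z[j][k])
                res = max(res, abs(x[i] - z[j][k]))
        if res < eps:
            return x
    return x
```

Both inner loops are embarrassingly parallel across variables and checks, which is exactly the property the parallelizable circuit exploits.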


The Geometric Structure of ℙ_d, and Efficient Projection onto ℙ_d

In this section we develop our efficient projection algorithm. Recall that 𝒫_d = {e∈{0,1}^d | ‖e‖₁ is even} and that ℙ_d = conv(𝒫_d). Generically, we say that a point v∈ℙ_d if and only if there exists a set of e_i∈𝒫_d such that v = Σ_i α_i e_i where Σ_i α_i = 1 and α_i≧0. In contrast to this generic representation, the initial objective of this section is to develop a novel "two-slice" representation of any point v∈ℙ_d: namely, that any such vector can be written as a convex combination of vectors with Hamming weight r and r+2 for some even integer r. We will then use this representation to construct an efficient projection.


We open the section in Section IV-A by describing the structured geometry of ℙ_d that we leverage, and laying out the results that will follow in ensuing sections. In Section IV-B, we prove a few necessary lemmas illustrating some of the symmetry structure of the parity polytope. In Section IV-C we develop the two-slice representation and connect the l1-norm of the projection of any v∈ℝ^d onto ℙ_d to the (easily computed) "constituent parity" of the projection of v onto the unit hypercube. In Section IV-D we present the projection algorithm.


A. Introduction to the Geometry of ℙ_d

In this section we discuss the geometry of ℙ_d. We develop intuition and foreshadow the results to come. We start by making a few observations about ℙ_d.


First, we can classify the vertices of ℙ_d by their weight. We do this by defining 𝒫_d^r, the constant-weight analog of 𝒫_d, to be the set of weight-r vertices of ℙ_d:






𝒫_d^r = {e∈{0,1}^d | ‖e‖₁ = r},  (6)


i.e., the constant-weight-r subcode of 𝒫_d. Since every element of 𝒫_d is in 𝒫_d^r for some even r, 𝒫_d = ∪_{0≦r≦d: r even} 𝒫_d^r. This gives us a new way to think about characterizing the parity polytope,






ℙ_d = conv(∪_{0≦r≦d: r even} 𝒫_d^r).


Second, we define ℙ_d^r to be the convex hull of 𝒫_d^r,






ℙ_d^r = conv(𝒫_d^r) = conv({e∈{0,1}^d | ‖e‖₁ = r}).  (7)


This object is a “permutahedron”, so termed because it is the convex hull of all permutations of a single vector; in this case a length-d binary vector with r ones. Of course,






ℙ_d = conv(∪_{0≦r≦d: r even} ℙ_d^r).


Third, define the affine hyper-plane consisting of all vectors whose components sum to r as






𝒜_d^r = {x∈ℝ^d | 1^T x = r},


where 1 is the length-d all-ones vector. We can visualize ℙ_d^r as a "slice" through the parity polytope defined as the intersection of 𝒜_d^r with ℙ_d. In other words, a definition of ℙ_d^r equivalent to (7) is






ℙ_d^r = ℙ_d ∩ 𝒜_d^r,


for r an even integer.


Finally, we note that the ℙ_d^r are all parallel. This follows since the difference of any two vectors lying in one of these permutahedra is orthogonal to 1. We can think of the line segment that connects the origin to 1 as the major axis of the parity polytope, with each "slice" orthogonal to the axis.


The above observations regarding the geometry of ℙ_d are illustrated in FIG. 10. Our development will be as follows. First, in Sec. IV-B we draw on a theorem from [44] about the geometry of permutahedra to assert that a point v∈ℝ^d is in ℙ_d^r if and only if a sorted version of v is majorized (see Definition 1) by the length-d vector consisting of r ones followed by d−r zeros (the sorted version of any vertex of ℙ_d^r). This allows us to characterize the ℙ_d^r easily. Second, we rewrite any point u∈ℙ_d as, per our second observation above, a convex combination of points in slices of different weights r. In other words, u = Σ_{0≦r≦d: r even} α_r u_r where u_r∈ℙ_d^r and the α_r are the convex weightings. We develop a useful characterization of ℙ_d, the "two-slice" Lemma 2, that shows that two slices always suffice. In other words, we can always write u = αu_r + (1−α)u_{r+2} where u_r∈ℙ_d^r, u_{r+2}∈ℙ_d^{r+2}, 0≦α≦1, and r = ⌊‖u‖₁⌋_even, where ⌊a⌋_even is the largest even integer less than or equal to a. We term the lower weight, r, the "constituent" parity of the vector.


Third, in Sec. IV-C we show that, given a point v∈ℝ^d that we wish to project onto ℙ_d, it is easy to identify the constituent parity of the projection. To express this formally, let Π_{ℙ_d}(v) be the projection of v onto ℙ_d. Then, our statement is that we can easily find the even integer r such that Π_{ℙ_d}(v) can be expressed as a convex combination of vectors in 𝒫_d^r and 𝒫_d^{r+2}.


Finally, in Sec. IV-D we develop our projection algorithm. Roughly, our approach is as follows. Given a vector v∈ℝ^d we first compute r, the constituent parity of its projection. Given the two-slice representation, projecting onto ℙ_d is equivalent to determining an α∈[0,1], a vector a∈ℙ_d^r, and a vector b∈ℙ_d^{r+2} such that the l2 norm of v−αa−(1−α)b is minimized.


In [45] we showed that, given α, this projection can be accomplished in two steps. We first project v onto αℙ_d^r = {x∈ℝ^d | 0≦x_i≦α, Σ_{i=1}^d x_i = αr}, a scaled version of ℙ_d^r, scaled by the convex weighting parameter. Then we project the residual onto (1−α)ℙ_d^{r+2}. The object αℙ_d^r is an l1 ball with box constraints. Projection onto αℙ_d^r can be done efficiently using a type of waterfilling. Since the function min_{a∈ℙ_d^r, b∈ℙ_d^{r+2}} ‖v−αa−(1−α)b‖₂² is convex in α, we can perform a one-dimensional line search (using, for example, the secant method) to determine the optimal value of α and thence the desired projection.


In contrast to the original approach, in Section IV-D we develop a far more efficient algorithm that avoids the pair of projections and the search for α. In particular, taking advantage of the convexity in α, we use majorization to characterize the convex hull of 𝒫_d^r and 𝒫_d^{r+2} in terms of a few linear constraints (inequalities). As projecting onto the parity polytope is equivalent to projecting onto the convex hull of the two slices, we use the characterization to express the projection problem as a quadratic program, and develop an efficient method that directly solves the quadratic program. Avoiding the search over α yields a considerable speed-up over the original approach taken in [45].


B. Permutation Invariance of the Parity Polytope and its Consequences

Let us first describe some of the essential features of the parity polytope that are critical to the development of our efficient projection algorithm. First, note the following


Proposition 1: u∈ℙ_d if and only if Σu is in the parity polytope for every permutation matrix Σ.


This proposition follows immediately because the vertex set 𝒫_d is invariant under permutations of the coordinate axes.


Since we will be primarily concerned with projections onto the parity polytope, let us consider the optimization problem





minimize_z ‖v−z‖₂ subject to z∈ℙ_d.  (8)


The optimal z* of this problem is the Euclidean projection of v onto ℙ_d, which we denote by z* = Π_{ℙ_d}(v). Again using the symmetric nature of ℙ_d, we can show the useful fact that if v is sorted in descending order, then so is Π_{ℙ_d}(v).


Proposition 2: Given a vector v∈ℝ^d, the component-wise ordering of Π_{ℙ_d}(v) is the same as that of v.


Proof: We prove the claim by contradiction. Write z* = Π_{ℙ_d}(v) and suppose that for indices i and j we have v_i>v_j but z*_i<z*_j. Since all permutations of z* are in the parity polytope, we can swap components i and j of z* to obtain another vector in ℙ_d. Under the assumptions z*_j>z*_i and v_i−v_j>0 we have z*_j(v_i−v_j)>z*_i(v_i−v_j). This inequality implies that (v_i−z*_i)²+(v_j−z*_j)²>(v_i−z*_j)²+(v_j−z*_i)², and hence the Euclidean distance between v and z* is greater than the Euclidean distance between v and the vector obtained by swapping the components, a contradiction.


These two propositions allow us to assume, throughout the remainder of this section, that our vectors are presented sorted in descending order unless explicitly stated otherwise.


The permutation invariance of the parity polytope also lets us employ powerful tools from the theory of majorization to simplify membership testing and projection. The fundamental theorem we exploit is based on the following definition.


Definition 1: Let u and w be d-vectors sorted in decreasing order. The vector w majorizes u if











Σ_{k=1}^q u_k ≦ Σ_{k=1}^q w_k,  1≦q<d,

Σ_{k=1}^d u_k = Σ_{k=1}^d w_k.







Our results rely on the following Theorem, which states that a vector lies in the convex hull of all permutations of another vector if and only if the former is majorized by the latter (see [44] and references therein).


Theorem 1: Suppose u and w are d-vectors sorted in decreasing order. Then u is in the convex hull of all permutations of w if and only if w majorizes u.


To gain intuition for why this theorem might hold, suppose that u is in the convex hull of all of the permutations of w. Then u = Σ_{i=1}^n p_i Σ_i w with the Σ_i being permutation matrices, p_i≧0, and 1^T p = 1. The matrix D = Σ_{i=1}^n p_i Σ_i is doubly stochastic, and one can immediately check that if u = Dw and D is doubly stochastic, then w majorizes u.
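As an illustrative aside (ours, not from the disclosure), Definition 1 translates directly into a few lines of code; both vectors are assumed pre-sorted in decreasing order:

```python
def majorizes(w, u, tol=1e-9):
    """Definition 1: sorted w majorizes sorted u when every proper prefix sum
    of u is at most the matching prefix sum of w, and the total sums agree."""
    pw = pu = 0.0
    for wk, uk in zip(w[:-1], u[:-1]):   # prefixes q = 1, ..., d-1
        pw += wk
        pu += uk
        if pu > pw + tol:
            return False
    return abs(sum(w) - sum(u)) <= tol
```

By Theorem 1, for example, majorizes([1, 1, 0, 0], u) tests membership of a sorted u in the weight-2 permutahedron of dimension 4.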


To apply majorization theory to the parity polytope, begin with one of the permutahedra ℙ_d^s. We recall that ℙ_d^s is equal to the convex hull of all binary vectors of weight s, equivalently the convex hull of all permutations of the vector consisting of s ones followed by d−s zeros. Thus, by Theorem 1, u∈[0,1]^d is in ℙ_d^s if and only if














Σ_{k=1}^q u_k ≦ min(q, s),  1≦q<d,  (9)

Σ_{k=1}^d u_k = s.  (10)







The parity polytope ℙ_d is simply the convex hull of all of the ℙ_d^s with s even. Thus, we can use majorization to provide an alternative characterization of the parity polytope to that of Yannakakis or Jeroslow.


Lemma 1: A sorted vector u∈ℙ_d if and only if there exist non-negative coefficients {μ_s}_{even s≦d} such that














Σ_{even s≦d} μ_s = 1,  μ_s≧0  (11)

Σ_{k=1}^q u_k ≦ Σ_{even s≦d} μ_s min(q, s),  1≦q<d  (12)

Σ_{k=1}^d u_k = Σ_{even s≦d} μ_s s.  (13)







Proof: First, note that every vertex of ℙ_d of weight s satisfies these inequalities with μ_s=1 and μ_{s′}=0 for s′≠s. Thus every u∈ℙ_d must satisfy (11)-(13). Conversely, if u satisfies (11)-(13), then u is majorized by the vector






w = Σ_{even s≦d} μ_s b_s,

where b_s is a vector consisting of s ones followed by d−s zeros. The vector w is contained in ℙ_d, as are all of its permutations. Thus, we conclude that u is also contained in ℙ_d.


While Lemma 1 characterizes the containment of a vector in ℙ_d, the relationship is not one-to-one; for a particular u∈ℙ_d there can be many sets {μ_s} that satisfy the lemma. We will next show that there is always one assignment with only two non-zero μ_s.


C. Constituent Parity of the Projection

For a∈ℝ, let ⌊a⌋_even denote the "even-floor" of a, i.e., the largest even integer r such that r≦a. Define the "even-ceiling" ⌈a⌉_even similarly. For a vector u we term ⌊‖u‖₁⌋_even the constituent parity of u. In this section we will show that if u∈ℙ_d has constituent parity r, then it can be written as a convex combination of binary vectors with weight equal to r and r+2. This result is summarized by the following


Lemma 2: ("Two-slice" lemma) A vector u∈ℙ_d if and only if u can be expressed as a convex combination of vectors in 𝒫_d^r and 𝒫_d^{r+2}, where r = ⌊‖u‖₁⌋_even.


Proof: Consider any (sorted) u∈ℙ_d. Lemma 1 tells us that there is always (at least one) set {μ_s} that satisfies (11)-(13). Letting r be defined as in the lemma statement, we define α to be the unique scalar between zero and one that satisfies the relation ‖u‖₁ = αr + (1−α)(r+2):









α = (2 + r − ‖u‖₁)/2.  (14)







Then, we choose the following candidate assignment: μ_r=α, μ_{r+2}=1−α, and all other μ_s=0. We show that this choice satisfies (11)-(13), which will in turn imply that there is a u_r∈ℙ_d^r and a u_{r+2}∈ℙ_d^{r+2} such that u = αu_r + (1−α)u_{r+2}.


First, by the definition of α, (11) and (13) are both satisfied. Further, for the candidate set the relations (12) and (13) simplify to














Σ_{k=1}^q u_k ≦ α min(q, r) + (1−α) min(q, r+2),  1≦q<d,  (15)

Σ_{k=1}^d u_k = αr + (1−α)(r+2).  (16)







To show that (15) is satisfied is straightforward for the cases q≦r and q≧r+2. First consider any q≦r. Since min(q, r)=min(q, r+2)=q, u_k≦1 for all k, and there are only q terms, (15) must hold. Second, consider any q≧r+2. We use (16) to write Σ_{k=1}^q u_k = αr + (1−α)(r+2) − Σ_{k=q+1}^d u_k. Since u_k≧0, this is upper bounded by αr + (1−α)(r+2), which we recognize as the right-hand side of (15) since r=min(q, r) and r+2=min(q, r+2).


It remains to verify only one more inequality in (15), namely the case q=r+1, which is











Σ_{k=1}^{r+1} u_k ≦ α r + (1−α)(r+1) = r + 1 − α.






To show that the above inequality holds, we maximize the right-hand side of (12) across all valid choices of {μ_s} and show that the resulting maximum is exactly r+1−α. Since any valid {μ_s} satisfies (12), this bounds Σ_{k=1}^{r+1} u_k by r+1−α, which is the required inequality.


The logic is as follows. Since u∈ℙ_d, any valid choice for {μ_s} must satisfy (12) which, for q=r+1, is













Σ_{k=1}^{r+1} u_k ≦ Σ_{even s≦d} μ_s min(s, r+1).  (17)







To see that, across all valid choices of {μ_s}, the largest value attainable for the right-hand side is precisely r+1−α, consider the linear program





maximize Σ_{s even} μ_s min(s, r+1)

subject to Σ_{s even} μ_s = 1

Σ_{s even} μ_s s = αr + (1−α)(r+2)

μ_s ≧ 0.


The first two constraints are simply (11) and (13). Recognizing αr + (1−α)(r+2) = r+2−2α, the dual program is





minimize (r+2−2α)λ₁ + λ₂

subject to λ₁ s + λ₂ ≧ min(s, r+1)  ∀s even.


Setting μ_r=α, μ_{r+2}=1−α, the other primal variables to zero, λ₁=½, and λ₂=r/2 satisfies the Karush-Kuhn-Tucker (KKT) conditions for this primal/dual pair of LPs. The associated optimal cost is r+1−α. Thus, the right-hand side of (17) is at most r+1−α.


We have proved that if u∈ℙ_d then the choice of r=⌊‖u‖₁⌋_even and α as in (14) satisfies the requirements of Lemma 1, and so we can express u as u = αu_r + (1−α)u_{r+2}. The converse—that a vector u which is a convex combination of vectors in 𝒫_d^r and 𝒫_d^{r+2} is in ℙ_d—holds because conv(𝒫_d^r ∪ 𝒫_d^{r+2}) ⊂ ℙ_d.
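To make the two-slice parameters concrete, here is a small helper (our illustration, not part of the disclosure) computing r and the weight α of (14) for a vector u with entries in [0, 1]:

```python
def two_slice_params(u):
    """Constituent parity r = even_floor(||u||_1) and the convex weight alpha
    of (14), so that u = alpha * u_r + (1 - alpha) * u_{r+2} for some
    u_r in PP_d^r and u_{r+2} in PP_d^{r+2} (Lemma 2)."""
    s = sum(u)                 # l1 norm, since all entries are non-negative
    r = 2 * int(s // 2)        # even floor of the l1 norm
    return r, (2 + r - s) / 2
```

Note that α interpolates between the two slices: α=1 when ‖u‖₁=r and α=0 when ‖u‖₁=r+2.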


A useful consequence of Theorem 1 is the following corollary.


Corollary 1: Let u be a vector in [0,1]^d. If Σ_{k=1}^d u_k is an even integer then u∈ℙ_d.


Proof: Let Σ_{k=1}^d u_k = s. Since u is majorized by a sorted binary vector of weight s then, by Theorem 1, u∈ℙ_d^s which, in turn, implies u∈ℙ_d.


We conclude this section by showing that we can easily compute the constituent parity of Π_{ℙ_d}(v) without explicitly computing the projection of v.


Lemma 3: For any vector v∈ℝ^d, let z = Π_{[0,1]^d}(v) be the projection of v onto [0,1]^d, and denote by Π_{ℙ_d}(v) the projection of v onto the parity polytope. Then





└∥z∥1even≦∥custom-characterd(v)∥1≦┌∥z∥1even.


That is, we can compute the constituent parity of the projection of v by projecting v onto [0,1]^d and computing the even floor.
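Lemma 3's recipe can be sketched directly (our illustration; the clip-then-even-floor computation is exactly the statement above):

```python
def projection_parity(v):
    """Constituent parity of Proj_PP_d(v) per Lemma 3, computed without the
    projection itself: clip v onto [0, 1]^d, then take the even floor of the
    l1 norm of the result."""
    s = sum(min(1.0, max(0.0, vi)) for vi in v)
    return 2 * int(s // 2)
```

This is the quantity r that seeds the quadratic program of Section IV-D.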


Proof: Let ρ_U = ⌈‖z‖₁⌉_even and ρ_L = ⌊‖z‖₁⌋_even. We prove the following fact: given any y′∈ℙ_d with ‖y′‖₁>ρ_U, there exists a vector y∈[0,1]^d such that ‖y‖₁=ρ_U, y∈ℙ_d, and ‖v−y‖₂²<‖v−y′‖₂². The implication of this fact is that any vector in the parity polytope with l1 norm strictly greater than ρ_U cannot be the projection of v. Similarly we can show that any vector with l1 norm strictly less than ρ_L cannot be the projection onto the parity polytope.


First we construct the vector y based on y′ and z. Define the set of "high" values to be the coordinates on which y′_i is greater than z_i, i.e., ℋ := {i∈[d] | y′_i>z_i}. Since by assumption ‖y′‖₁>ρ_U≧‖z‖₁, we know that |ℋ|≧1. Consider the test vector t defined component-wise as







t_i = z_i if i∈ℋ, and t_i = y′_i otherwise.









Note that ‖t‖₁≦‖z‖₁≦ρ_U<‖y′‖₁. The vector t differs from y′ only on ℋ. Thus, by changing (reducing) components of y′ in the set ℋ we can obtain a vector y such that ‖y‖₁=ρ_U. In particular, there exists a vector y with ‖y‖₁=ρ_U such that y′_i≧y_i≧z_i for i∈ℋ and y_i=y′_i for i∉ℋ. Since the l1 norm of y is even and y∈[0,1]^d, we have by Corollary 1 that y∈ℙ_d.


We next show that for all i∈ℋ, |v_i−y_i|≦|v_i−y′_i|. The inequality is strict for at least one i, yielding ‖v−y‖₂²<‖v−y′‖₂² and thereby proving the claim.


We start by noting that y′∈ℙ_d, so y′_i∈[0,1] for all i. Hence, if z_i<y′_i for some i we must also have z_i<1, in which case v_i≦z_i since z_i is the projection of v_i onto [0,1]. In summary, z_i<1 iff v_i<1, and when z_i<1 then v_i≦z_i. Therefore, if y′_i>z_i then z_i≧v_i. Thus for all i∈ℋ we get y′_i≧y_i≧z_i≧v_i, where the first inequality is strict for at least one i. Since y_i=y′_i for i∉ℋ, this means that |v_i−y_i|≦|v_i−y′_i| for all i, where the inequality is strict for at least one value of i. Overall, ‖v−y‖₂²<‖v−y′‖₂², and both y∈ℙ_d (by construction) and y′∈ℙ_d (by assumption). Thus, y′ cannot be the projection of v onto ℙ_d, and so the l1 norm of the projection satisfies ‖Π_{ℙ_d}(v)‖₁≦ρ_U. A similar argument shows that ‖Π_{ℙ_d}(v)‖₁≧ρ_L, and so ‖Π_{ℙ_d}(v)‖₁ must lie in [ρ_L, ρ_U].


D. Projection Algorithm

In this section we formulate a quadratic program (Problem PQP) for the projection problem and then develop an algorithm (Algorithm 2) that efficiently solves the quadratic program.


Given a vector v∈ℝ^d, set r = ⌊‖Π_{[0,1]^d}(v)‖₁⌋_even. From Lemma 3 we know that the constituent parity of z* := Π_{ℙ_d}(v) is r. We also know that z* is sorted in descending order if v is. Let S be a (d−1)×d matrix with diagonal entries set to 1, S_{i,i+1}=−1 for 1≦i≦d−1, and zero everywhere else:






S = [ 1 −1  0  0 …  0  0
      0  1 −1  0 …  0  0
      0  0  1 −1 …  0  0
      ⋮              ⋱
      0  0  0  0 …  1 −1 ].





The constraint that z* has to be sorted in decreasing order can be stated as Sz*≧0, where 0 is the all-zeros vector.


In addition, Lemma 2 implies that z* is a convex combination of vectors of Hamming weight r and r+2. Using inequality (15), we get that a d-vector z∈[0,1]^d, with














Σ_{i=1}^d z_i = αr + (1−α)(r+2),  (18)







is a convex combination of vectors of weight r and r+2 if it satisfies the following bounds:














Σ_{k=1}^q z_{(k)} ≦ α min(q, r) + (1−α) min(q, r+2),  1≦q<d,  (19)







where z_{(k)} denotes the kth largest component of z. As we saw in the proof of Lemma 2, the fact that the components of z are no more than one implies that the inequalities (19) are satisfied for all q≦r. Also, (18) enforces the inequalities for q≧r+2. Therefore, the inequalities in (19) for q≦r and q≧r+2 are redundant. Note that in addition we can eliminate the variable α by solving (18), giving






α = 1 + (r − Σ_{k=1}^d z_k)/2






(see also (14)). Therefore, for a sorted vector v, we can write the projection onto ℙ_d as the optimization problem









minimize ½‖v−z‖₂²

subject to 0 ≦ z_i ≦ 1

Sz ≧ 0

0 ≦ 1 + (r − Σ_{k=1}^d z_k)/2 ≦ 1  (20)

Σ_{k=1}^{r+1} z_k ≦ r − (r − Σ_{k=1}^d z_k)/2.  (21)







The last two constraints can be simplified as follows. First, constraint (20) simplifies to r ≦ Σ_{k=1}^d z_k ≦ r+2. Next, defining the vector










f_r = (1, 1, . . . , 1, −1, −1, . . . , −1)^T,  (22)

with r+1 ones followed by d−r−1 minus-ones,







we can rewrite inequality (21) as f_r^T z ≦ r. Using these simplifications yields the final form of our quadratic program:


Problem PQP:









minimize ½‖v−z‖₂²

subject to 0 ≦ z_i ≦ 1 ∀i  (23)

Sz ≧ 0  (24)

r ≦ 1^T z ≦ r+2  (25)

f_r^T z ≦ r.  (26)







The projection algorithm we develop efficiently solves the KKT conditions of PQP. The objective function is strongly convex and the constraints are linear. Hence, the KKT conditions are not only necessary but also sufficient for optimality. To formulate the KKT conditions, we first construct the Lagrangian with dual variables β, μ, γ, ξ, θ, and ζ:







L = ½ ‖v − z‖₂² − β(r − f_r^T z) − μ^T(1 − z) − γ^T z − ξ(r + 2 − 1^T z) − ζ(1^T z − r) − θ^T Sz.







The KKT conditions are then given by stationarity of the Lagrangian, complementary slackness, and feasibility.






z = v − βf_r − μ + γ − (ξ − ζ)1 + S^T θ

0 ≤ β ⊥ f_r^T z − r ≤ 0

0 ≤ μ ⊥ z ≤ 1

0 ≤ γ ⊥ z ≥ 0

0 ≤ θ ⊥ Sz ≥ 0

0 ≤ ξ ⊥ 1^T z − r − 2 ≤ 0

0 ≤ ζ ⊥ 1^T z − r ≥ 0.   (27)


A vector z that satisfies the stationarity, complementarity, and feasibility conditions in (27) is equal to the projection of v onto the parity polytope ℙ_d.


To proceed, set







β_max = ½ (v_{r+1} − v_{r+2})






and define the parameterized vector






z(β) := Π_{[0,1]^d}(v − βf_r).   (28)
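Concretely, z(β) is just a componentwise clip of v − βf_r to [0, 1]. A small illustrative sketch (our own Python, assuming v already sorted in decreasing order):

```python
def z_of_beta(v, r, beta):
    """z(beta) = Pi_[0,1]^d (v - beta * f_r), cf. (28); v sorted in decreasing order."""
    d = len(v)
    f = [1.0] * (r + 1) + [-1.0] * (d - r - 1)   # f_r from (22)
    return [min(1.0, max(0.0, vi - beta * fi)) for vi, fi in zip(v, f)]
```

At β = 0 this reduces to the plain box projection Π_{[0,1]^d}(v).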


The following lemma implies that the optimizer of PQP, i.e., z* = Π_{ℙ_d}(v), is z(β_opt) for some β_opt ∈ [0, β_max].


Lemma 4: There exists a β_opt ∈ [0, β_max] such that z(β_opt) satisfies the KKT conditions of the quadratic program PQP.


Proof: Note that when β > β_max we have z_{r+1}(β) < z_{r+2}(β), so z(β) is ordered differently from v and f_r^T z(β) < r. Consequently z(β) cannot be the projection onto ℙ_d for β > β_max. At the other boundary of the interval, when β = 0 we have z(0) = Π_{[0,1]^d}(v). If f_r^T z(0) = r, then z(0) ∈ ℙ_d by Corollary 1. But since z(0) is the closest point in [0, 1]^d to v, it must also be the closest point in ℙ_d.


Assume now that f_r^T z(0) > r. Taking the directional derivative with respect to increasing β, we obtain the following:

















∂(f_r^T z(β))/∂β = f_r^T ∂z(β)/∂β = Σ_{k : 0 < z_k(β) < 1} (−f_{r,k}²) = −|{k | 1 ≤ k ≤ d, 0 < z_k(β) < 1}| < 0,   (29)







proving that f_r^T z(β) is a decreasing function of β. Therefore, by the intermediate value theorem, there exists a β_opt ∈ [0, β_max] such that f_r^T z(β_opt) = r.


First note that z(β_opt) is feasible for Problem PQP. We need only verify (25). Recalling that r is defined as r = ⌊‖Π_{[0,1]^d}(v)‖₁⌋_even, we get the lower bound:





1^T z(β_opt) ≥ f_r^T z(β_opt) = r.


The components of z(β_opt) are all less than one, so Σ_{k=1}^{r+1} z_k(β_opt) ≤ r + 1. Combining this with the equality f_r^T z(β_opt) = r tells us that Σ_{k=r+2}^{d} z_k(β_opt) ≤ 1. We therefore find that 1^T z(β_opt) is no more than r + 2.


To complete the proof, we need only find dual variables to certify the optimality. Setting ξ, ζ, and θ to zero, and μ and γ to the values required to satisfy (27) provides the necessary assignments to satisfy the KKT conditions.


Lemma 4 thus certifies that all we need to do to compute the projection is to compute the optimal β. To do so, we use the fact that the function f_r^T z(β) is a piecewise linear function of β. For a fixed β, define the active set to be the indices where z(β) is strictly between 0 and 1:






A(β) := {k | 1 ≤ k ≤ d, 0 < z_k(β) < 1}.   (30)


Let the clipped set be the indices where z(β) is equal to 1.






C(β) := {k | 1 ≤ k ≤ d, z_k(β) = 1}.   (31)


Let the zero set be the indices where z(β) is equal to zero






Z(β) := {k | 1 ≤ k ≤ d, z_k(β) = 0}.   (32)


Note that with these definitions, we have














f_r^T z(β) = |C(β)| + Σ_{j∈A(β)} (f_{r,j} v_j − β) = |C(β)| − β|A(β)| + Σ_{j∈A(β)} f_{r,j} v_j.   (33)







Our algorithm simply increases β until the active set changes, keeping track of the sets A(β), C(β), and Z(β). We break the interval [0, β_max] into the locations where the active set changes, and compute the value of f_r^T z(β) at each of these breakpoints until f_r^T z(β) < r. At this point, we have located the appropriate active set for optimality and can find β_opt by solving the linear equation (33).


The breakpoints themselves are easy to find: they are the values of β at which some component of z(β) becomes equal to one or equal to zero. First, define the following sets






E₁ := {v_i − 1 | 1 ≤ i ≤ r + 1},

E₂ := {v_i | 1 ≤ i ≤ r + 1},

E₃ := {−v_i | r + 2 ≤ i ≤ d},

E₄ := {−v_i + 1 | r + 2 ≤ i ≤ d}.


The sets E₁ and E₂ concern the r + 1 largest components of v; E₃ and E₄ concern the smallest components. The set of breakpoints is






B := {β ∈ ∪_{j=1}^{4} E_j | 0 ≤ β ≤ β_max} ∪ {0, β_max}.






There are thus at most 2d+2 breakpoints.
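As a sketch (illustrative Python of our own, not the patent's circuit description), the breakpoint set B can be collected directly from the four definitions above:

```python
def breakpoints(v, r, beta_max):
    """Collect the breakpoint set B of candidate beta values (sketch).
    v is assumed sorted in decreasing order; |B| is at most 2d + 2."""
    d = len(v)
    cands = ([v[i] - 1 for i in range(r + 1)]          # E1: a top entry unclips from 1
             + [v[i] for i in range(r + 1)]            # E2: a top entry reaches 0
             + [-v[i] for i in range(r + 1, d)]        # E3: a bottom entry leaves 0
             + [-v[i] + 1 for i in range(r + 1, d)])   # E4: a bottom entry reaches 1
    # keep only candidates in [0, beta_max] and always include the endpoints
    return sorted({b for b in cands if 0 <= b <= beta_max} | {0.0, beta_max})
```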


To summarize, our Algorithm 2 sorts the input vector, computes the set of breakpoints, and then marches through the breakpoints until it finds a value β_i ∈ B with f_r^T z(β_i) ≤ r. Since we will also have f_r^T z(β_{i−1}) > r, the optimal β lies in [β_{i−1}, β_i] and can be found by solving (33). In the algorithm box for Algorithm 2, b is the largest and a is the smallest index in the active set. We use V to denote the sum of the elements in the active set and Λ the total sum of the vector at the current breakpoint. Some of the awkward if statements in the main for loop take care of the cases when the input vector has many repeated entries.


Algorithm 2 requires two sorts (sorting the input vector and sorting the breakpoints), and then an inspection of at most 2d breakpoints. Thus, the total complexity of the algorithm is linear plus the time for the two sorts.












Algorithm 2


Given u ∈ ℝ^d, determine its projection z* onto ℙ_d















1: Permute u to produce a vector v whose components are sorted in


 decreasing order, i.e., v1 ≧ v2 ≧ . . . ≧ vd. Let Q be the corresponding


 permutation matrix, i.e., v = Qu.


2: Compute ẑ ← Π_{[0,1]^d}(v).


3: Assign r = ⌊‖ẑ‖₁⌋_even and β_max = ½ (ẑ_{r+1} − ẑ_{r+2}).


4: Define fr as in (22).


5: if f_r^T ẑ ≤ r then


6:  Return z* = {circumflex over (z)}.


7: end if


8: Assign E₁ = {v_i − 1 | 1 ≤ i ≤ r + 1},
  E₂ = {v_i | 1 ≤ i ≤ r + 1},
  E₃ = {−v_i | r + 2 ≤ i ≤ d},
  E₄ = {−v_i + 1 | r + 2 ≤ i ≤ d}.

9: Assign the set of breakpoints:
  B := {β ∈ ∪_{j=1}^{4} E_j | 0 ≤ β ≤ β_max} ∪ {0, β_max}.

10: Index the breakpoints in B in sorted order to get {β_i}_i where
  β₁ ≤ β₂ ≤ . . . ≤ β_{|B|}.

11: Initialize a as the smallest index such that 0 < ẑ_a < 1.

12: Initialize b as the largest index such that 0 < ẑ_b < 1.

13: Initialize the sum V = f_r^T ẑ.

14: for i = 1 to |B| do

15: Set β₀ ← β_i.

16: if β_i ∈ E₁ ∪ E₃ then


17:   Update a ← a − 1.


18:   Update V ← V + va.


19:  else


20:   Update b ← b + 1


21:   Update V ← V − vb.


22:  end if


23:  if i < d and βi ≠ βi+1 then


24:   Λ ← (a − 1) + V − β0(b − a + 1)


25:   if Λ ≦ r then break


26:  else if i = d then


27:   Λ← (a − 1) + V − β0(b − a + 1)


28:  end if


29: end for
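As an illustrative alternative to the breakpoint march of Algorithm 2 (a Python sketch of our own, not the patent's circuitry): since (29) shows that f_r^T z(β) is decreasing in β, β_opt can also be located numerically by bisection on [0, β_max], yielding an approximate projection without sorting the breakpoints:

```python
def project_parity_polytope(u, iters=60):
    """Approximate projection of u onto the parity polytope via bisection on beta
    (sketch; assumes r + 2 <= d whenever the bisection branch is taken)."""
    d = len(u)
    order = sorted(range(d), key=lambda i: -u[i])     # sort u in decreasing order
    v = [u[i] for i in order]
    clip = lambda t: min(1.0, max(0.0, t))
    zhat = [clip(x) for x in v]                       # zhat = Pi_[0,1]^d (v)
    r = int(sum(zhat)) // 2 * 2                       # largest even integer <= ||zhat||_1
    f = [1.0] * (r + 1) + [-1.0] * (d - r - 1)        # f_r from (22)
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    if dot(f, zhat) <= r:
        z = zhat                                      # zhat already lies in the polytope
    else:
        lo, hi = 0.0, 0.5 * (zhat[r] - zhat[r + 1])   # the interval [0, beta_max]
        for _ in range(iters):                        # f_r^T z(beta) decreases in beta
            mid = 0.5 * (lo + hi)
            if dot(f, [clip(vi - mid * fi) for vi, fi in zip(v, f)]) > r:
                lo = mid
            else:
                hi = mid
        z = [clip(vi - hi * fi) for vi, fi in zip(v, f)]
    out = [0.0] * d                                   # undo the sorting permutation
    for pos, i in enumerate(order):
        out[i] = z[pos]
    return out
```

The bisection trades the exact linear-time breakpoint search for a simple approximate root-finder; both rely on the same monotonicity of f_r^T z(β).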









If we work with an (un-augmented) Lagrangian








L₀(x, z, λ) := γ^T x + Σ_j λ_j^T (P_j x − z_j),








the dual subgradient ascent method consists of the iterations:





x^{k+1} := argmin_{x∈X} L₀(x, z^k, λ^k)

z^{k+1} := argmin_{z∈Z} L₀(x^k, z, λ^k)

λ_j^{k+1} := λ_j^k + μ(P_j x^{k+1} − z_j^{k+1}).


Note here that the x and z updates are run with respect to the k-th iterates of the other variables, and can be run completely in parallel.
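For instance, the λ-update decomposes across checks j. A minimal sketch (our own Python, not from the patent; `participations[j]` plays the role of the selection matrix P_j by listing the variable indices participating in check j):

```python
def lambda_update(lam, x, z, participations, mu):
    """One dual ascent step: lam_j <- lam_j + mu * (P_j x - z_j), independently per j."""
    return [[lj_i + mu * (x[i] - zj_i)
             for lj_i, i, zj_i in zip(lj, idx, zj)]
            for lj, idx, zj in zip(lam, participations, z)]
```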


The x-update corresponds to solving the very simple LP:






minimize  (γ + Σ_j P_j^T λ_j^k)^T x
subject to  x ∈ [0, 1]^N.





This results in the assignment:







x^{k+1} = Θ(−γ − Σ_j P_j^T λ_j^k)








where






Θ(t) = { 1 if t > 0; 0 if t ≤ 0 }









is the Heaviside function.
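In code, the x-update is a componentwise threshold. A sketch (our own Python; again `participations[j]` stands in for P_j, so P_j^T λ_j scatters λ_j back onto the variables of check j):

```python
def x_update(gamma, lam, participations):
    """x = Theta(-gamma - sum_j P_j^T lam_j), applied componentwise."""
    t = [-g for g in gamma]
    for lj, idx in zip(lam, participations):
        for lj_i, i in zip(lj, idx):
            t[i] -= lj_i                  # subtract the scattered P_j^T lam_j
    return [1 if ti > 0 else 0 for ti in t]
```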


For the z-update, we have to solve the following LP for each check index j:





maximize  (λ_j^k)^T z_j
subject to  z_j ∈ ℙ_d.   (34)


Maximizing a linear function over the parity polytope can be performed in linear time. First, note that an optimal solution necessarily occurs at a vertex, which is a binary vector with even Hamming weight. Let r be the number of positive components in the cost vector λ_j^k. If r is even, the vector v ∈ ℙ_d that is equal to 1 where λ_j^k is positive and zero elsewhere is a solution of (34), since making any additional component nonzero, or shrinking any component currently equal to 1, only decreases the cost. If r is odd, we only need to compare the cost of the vector equal to 1 in the r − 1 largest components and zero elsewhere to the cost of the vector equal to 1 in the r + 1 largest components and zero elsewhere.


The procedure to solve (34) is summarized in Algorithm 3. Note that finding the smallest positive element and the largest non-positive element can be done in linear time. Hence, the complexity of Algorithm 3 is O(d).


While this subgradient ascent method is quite simple, it requires vastly more iterations than the ADMM method, and thus we did not pursue it further.












Algorithm 3 Given a d-dimensional vector c,

maximize c^T z subject to z ∈ ℙ_d.
















1: Let r be the number of positive elements in c.

2: if r is even then

3:  Return z* where z_i* = 1 if c_i > 0 and z_i* = 0 otherwise.

4: else

5:  Find the index i_p of the smallest positive element of c.

6:  Find the index i_n of the largest non-positive element of c.

7:  if c_{i_p} + c_{i_n} > 0 then

8:   Return z* where z_i* = 1 if c_i > 0, z_{i_n}* = 1, and z_i* = 0 otherwise.

9:  else

10:   Return z* where z_i* = 1 if c_i > 0 and i ≠ i_p, z_{i_p}* = 0, and z_i* = 0 for all other i.

11:  end if

12: end if
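A runnable rendition of Algorithm 3 (our own Python sketch; the odd-weight tie-break is implemented as the cost comparison described in the text, i.e., add the largest non-positive entry i_n when c_{i_p} + c_{i_n} > 0, otherwise drop the smallest positive entry i_p):

```python
def max_over_parity_polytope(c):
    """Maximize c^T z over the parity polytope (Algorithm 3 sketch).
    An optimum is attained at an even-weight binary vertex."""
    d = len(c)
    z = [1 if ci > 0 else 0 for ci in c]          # best unconstrained 0/1 vertex
    if sum(z) % 2 == 0:
        return z
    # Odd weight: either drop the smallest positive entry of c or add the
    # largest non-positive entry, whichever loses less objective value.
    i_p = min((i for i in range(d) if c[i] > 0), key=lambda i: c[i])
    non_pos = [i for i in range(d) if c[i] <= 0]
    if non_pos:
        i_n = max(non_pos, key=lambda i: c[i])
        if c[i_p] + c[i_n] > 0:                   # adding i_n beats dropping i_p
            z[i_n] = 1
            return z
    z[i_p] = 0
    return z
```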









REFERENCES

These references are incorporated in their entireties by reference.

  • [1] R. W. Hamming, “Error detecting and error correcting codes,” Bell Syst. Tech. J., vol. 29, no. 2, pp. 147-160, 1950.
  • [2] J. Feldman, Decoding Error-Correcting Codes via Linear Programming. PhD thesis, Mass. Instit. of Tech., 2003.
  • [3] J. Feldman, M. J. Wainwright, and D. Karger, “Using linear programming to decode binary linear codes,” IEEE Trans. Inform. Theory, vol. 51, pp. 954-972, March 2005.
  • [4] P. O. Vontobel and R. Koetter, “On the relationship between linear programming decoding and min-sum algorithm decoding,” in IEEE Int. Symp. Inform. Theory and Apps., (Parma, Italy), October 2004.
  • [5] P. O. Vontobel and R. Koetter, “On low-complexity linear-programming decoding of LDPC codes,” European Transactions on Telecommunications, vol. 18, no. 5, pp. 509-517, 2007.
  • [6] M.-H. N. Taghavi and P. H. Siegel, “Adaptive methods for linear programming decoding,” IEEE Trans. Inform. Theory, vol. 54, pp. 5396-5410, December 2008.
  • [7] Y. Wang, J. S. Yedidia, and S. C. Draper, “Multi-stage decoding of LDPC codes,” in Proc. Int. Symp. Inform. Theory, (Seoul, South Korea), July 2009.
  • [8] J. Feldman, T. Malkin, R. A. Servedio, C. Stein, and M. J. Wainwright, “LP decoding corrects a constant fraction of errors,” in Proc. Int. Symp. Inform. Theory, (Chicago, Ill.), June 2004.
  • [9] C. Daskalakis, A. G. Dimakis, R. M. Karp, and M. J. Wainwright, “Probabilistic analysis of linear programming decoding,” IEEE Trans. Inform. Theory, vol. 54, pp. 3365-3578, August 2008.
  • [10] S. Arora, C. Daskalakis, and D. Steurer, “Message-passing algorithms and improved LP decoding,” in ACM Symposium on Theory of Computing (STOC), May 2009.
  • [11]B. J. Frey, R. Koetter, and A. Vardy, “Signal-space characterization of iterative decoding,” IEEE Trans. Inform. Theory, vol. 47, pp. 766-781, February 2001.
  • [12] R. Koetter and P. O. Vontobel, “Graph-covers and iterative decoding of finite length codes,” in Proc. Int. Symp. Turbo Codes and Related Topics, (Brest, France), 2003.
  • [13] D. J. C. MacKay and M. S. Postol, “Weaknesses of Margulis and Ramanujan-Margulis low-density parity-check codes,” Electronic Notes in Theoretical Computer Science, vol. 74, no. 0, pp. 97-104, 2003.
  • [14] T. Richardson, “Error floors of LDPC codes,” in Proc. Allerton Conf. on Communication, Control and Computing, (Monticello, Ill.), October 2003.
  • [15] L. Dolecek, P. Lee, Z. Zhang, V. Anatharam, B. Nikolic, and M. J. Wainwright, “Predicting error floors of structured LDPC codes: deterministic bounds and estimates,” IEEE J. Select. Areas Commun., vol. 27, pp. 908-917, August 2009.
  • [16] X.-Y. Hu, E. Eleftheriou, and D. M. Arnold, “Regular and irregular progressive edge-growth Tanner graphs,” IEEE Trans. Inform. Theory, pp. 386-398, January 2005.
  • [17] T. Tian, C. Jones, J. D. Villasenor, and R. D. Wesel, “Construction of irregular LDPC codes with low error floors,” in Proc. Int. Conf. Commun., (Anchorage, Ak.), pp. 3125-3129, May 2003.
  • [18] Y. Wang, S. C. Draper, and J. S. Yedidia, “Hierarchical and high-girth QC LDPC codes,” arXiv preprint 1111.0711, 2011, submitted to IEEE Trans. Inform. Theory.
  • [19]J. Zhang, J. S. Yedidia, and M. P. C. Fossorier, “Low-latency decoding of EG LDPC codes,” Journal of Lightwave Technology, vol. 25, pp. 2879-2886, September 2007.
  • [20]R. M. Tanner, “A recursive approach to low complexity codes,” IEEE Trans. Inform. Theory, vol. 27, pp. 533-547, September 1981.
  • [21] Y. Wang and M. Fossorier, “Doubly generalized LDPC codes,” in Proc. Int. Symp. Inform. Theory, (Seattle, Wash.), pp. 669-673, July 2006.
  • [22] J. S. Yedidia, Y. Wang, and S. C. Draper, “Divide and concur and difference-map BP decoders for LDPC codes,” IEEE Trans. Inform. Theory, vol. 57, no. 2, pp. 786-802, 2011
  • [23] K. Yang, J. Feldman, and X. Wang, “Nonlinear programming approaches to decoding low-density parity-check codes,” IEEE J. Select. Areas Commun., vol. 24, pp. 1603-1613, August 2006.
  • [24] S. C. Draper, J. S. Yedidia, and Y. Wang, “ML decoding via mixed-integer adaptive linear programming decoding,” in Proc. Int. Symp. Inform. Theory, (Nice, France), July 2007.
  • [25] S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, “Distributed optimization and statistical learning via the alternating direction method of multipliers,” Foundations and Trends in Machine Learning, vol. 3, no. 1, pp. 1-123, 2010.
  • [26] M. V. Afonso, J. M. Bioucas-Dias, and M. A. T. Figueiredo, “An augmented Lagrangian approach to the constrained optimization formulation of imaging inverse problems,” IEEE Transactions on Image Processing, vol. 20, no. 3, pp. 681-695, 2011.
  • [27] A. F. T. Martins, M. A. T. Figueiredo, P. M. Q. Aguiar, N. A. Smith, and E. P. Xing, “An augmented Lagrangian approach to constrained MAP inference,” in Proceedings of the International Conference on Machine Learning, 2011.
  • [28] P. O. Vontobel and R. Koetter, “Towards low-complexity linear-programming decoding,” in Proc. Int. Symp. Turbo Codes and Related Topics, (Munich, Germany), April 2006.
  • [29] D. Burshtein, “Linear complexity approximate LP decoding of LDPC codes: generalizations and improvements,” in Proc. Int. Symp. Turbo Codes and Related Topics, (Lausanne, Switzerland), September 2008.
  • [30] D. Burshtein, “Iterative approximate linear programming decoding of LDPC codes with linear complexity,” IEEE Trans. Inform. Theory, vol. 55, no. 11, pp. 4835-4859, 2009.
  • [31]P. Vontobel, “Interior-point algorithms for linear-programming decoding,” in UCSD Workshop Inform. Theory Apps., (San Diego, Calif.), January 2008.
  • [32] T. Wadayama, “Interior point decoding for linear vector channels based on convex optimization,” in Proc. Int. Symp. Inform. Theory, (Toronto, CA), pp. 1493-1497, July 2008.
  • [33] T. Wadayama, “An LP decoding algorithm based on primal path-following interior point method,” in Proc. Int. Symp. Inform. Theory, (Seoul, Korea), pp. 389-393, July 2009.
  • [34] M.-H. N. Taghavi, A. Shokrollahi, and P. H. Siegel, “Efficient implementation of linear programming decoding,” IEEE Trans. Inform. Theory, vol. 55, pp. 5960-5982, September 2010.
  • [35]H. Liu, W. Qu, B. Liu, and J. Chen, “On the decomposition method for linear programming decoding of LDPC codes,” IEEE Trans. Commun., vol. 58, pp. 3448-3458, December 2010.
  • [36] R. G. Jeroslow, “On defining sets of vertices of the hypercube by linear inequalities,” Discrete Mathematics, vol. 11, no. 2, pp. 119-124, 1975.
  • [37] M. Yannakakis, “Expressing combinatorial optimization problems by linear programs,” Journal of Computer and System Sciences, vol. 43, no. 3, pp. 441-466, 1991.
  • [38]D. P. Bertsekas, A. Nedic, and A. E. Ozdaglar, Convex Analysis and Optimization. Belmont, Mass.: Athena Scientific, 2003.
  • [39]J. Nocedal and S. J. Wright, Numerical Optimization. Springer, 2nd ed., 2006.
  • [40] J. S. Yedidia, “The alternating direction method of multipliers as a message-passing algorithm.” Talk delivered at the Princeton Workshop on Counting, Inference and Optimization, October 2011.
  • [41] G. D. Forney, Jr., “Codes on graphs: normal realizations,” IEEE Trans. Inform. Theory, vol. 47, pp. 520-548, February 2001.
  • [42] F. R. Kschischang, B. J. Frey, and H. Loeliger, “Factor graphs and the sum-product algorithm,” IEEE Trans. Inform. Theory, vol. 47, pp. 498-519, February 2001.
  • [43]H.-A. Loeliger, “An introduction to factor graphs,” IEEE Signal Proc. Mag., vol. 21, pp. 28-41, January 2004.
  • [44] A. W. Marshall, I. Olkin, and B. C. Arnold, Inequalities: Theory of Majorization and Its Applications. Springer, 2009.
  • [45] S. Barman, X. Liu, S. C. Draper, and B. H. Recht, “Decomposition methods for large-scale linear-programming decoding,” in Proc. Allerton Conf. on Communication, Control and Computing, (Monticello, Ill.), September 2011.

Claims
  • 1. An error correction electronic circuit comprising: a buffer memory for holding a received string of bits from a transmitted string of bits subject to a probability of transmission error;a parity rule memory holding a set of parity rules for the transmitted string of bits, each parity rule describing a relationship between a subset of the bits; anda linear programming optimizer communicating with the buffer memory and parity rule memory to generate a corrected string of bits from the received string of bits using a linear programming process configured to maximize a probability that the corrected string of bits represents the transmitted string of bits, subject to the parity rules for the bits.
  • 2. The error correction circuit of claim 1 wherein the linear programming optimizer iteratively repeats two steps, a first step adjusting values of the corrected string of bits based on iteratively changing replicas and a second step updating the iteratively changing replicas based upon their deviation from actual parity rules.
  • 3. The error correction circuit of claim 2 wherein the first step of adjusting values of the corrected string of bits adjusts each bit of the corrected string of bits as a function of the iteratively changing replicas independent of the value of the other bits of the corrected string of bits.
  • 4. The error correction circuit of claim 2 wherein the error correction electronic circuit provides multiple independently executing computational elements associated with different replicas to substantially simultaneously adjust the different replicas.
  • 5. The error correction circuit of claim 2 wherein the error correction electronic circuit provides multiple independently executing computational elements associated with different values of the corrected string of bits to substantially simultaneously adjust the different values of the corrected string of bits.
  • 6. The error correction circuit of claim 2 wherein the second step of updating the iteratively changing replicas defines a projection of the iteratively changing replicas to a parity polytope being a convex hull whose vertices are defined by the parity rules.
  • 7. The error correction circuit of claim 2 wherein the first and second steps implement an alternating direction method of multipliers.
  • 8. The error correction circuit of claim 1 wherein the maximized probability models a binary symmetric channel.
  • 9. The error correction circuit of claim 1 wherein the parity rules provide a low density parity check taking as arguments a small subset of the bits of the transmitted string of bits.
  • 10. The error correction electronic circuit of claim 1 wherein the parity rules provide for even parity for a subset of bits of the transmitted string of bits.
  • 11. A method of performing error correction on a string of binary bits subject to a probability of transmission error using an electronic circuit comprising: receiving into a memory the string of bits;storing a set of parity rules for the received string of bits, each parity rule describing a relationship between a subset of the bits; andgenerating a corrected string of bits from the received string of bits using linear programming minimizing a probability that the corrected string of bits erroneously represents the received string of bits, subject to the parity rules for the bits, the linear programming divided into individually parallelizable tasks executed simultaneously on multiple circuit components.
  • 12. The method of claim 11 wherein the linear programming iteratively repeats two steps, a first step adjusting values of the corrected string of bits based on iteratively changing replicas and a second step updating the iteratively changing replicas based upon their deviation from actual parity rules.
  • 13. The method of claim 12 wherein the first step of adjusting values of the corrected string of bits adjusts each bit of the corrected string of bits as a function of the iteratively changing replicas independent of the value of the other bits of the corrected string of bits.
  • 14. The method of claim 12 wherein the electronic circuit substantially simultaneously adjusts different replicas.
  • 15. The method of claim 12 wherein the electronic circuit substantially simultaneously adjusts different values of the corrected string of bits.
  • 16. The method of claim 13 wherein the second step of updating the iteratively changing replicas finds a projection of the iteratively changing replicas to a parity polytope being a convex hull whose vertices are defined by the parity rules.
  • 17. The method of claim 12 wherein the first and second steps implement an alternating direction method of multipliers.
  • 18. The method of claim 11 wherein the linear programming models a binary symmetric channel.
  • 19. The method of claim 11 wherein the parity rules provide a low density parity check taking as arguments a small subset of the bits of the transmitted string of bits.
  • 20. The method of claim 11 wherein the parity rules provide for even parity for a subset of bits of the transmitted string of bits.