[23470] in Athena Bugs
[louay@MIT.EDU: Broken latex file]
daemon@ATHENA.MIT.EDU (Richard Tibbetts)
Tue Aug 5 20:31:45 2003
Date: Tue, 5 Aug 2003 20:31:42 -0400
From: Richard Tibbetts <tibbetts@MIT.EDU>
To: bugs@mit.edu
Message-ID: <20030806003142.GQ5558@innocuous.mit.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
Dear bugs,
I dealt with this user in the SIPB office. According to him, this
document latexed properly 2 weeks ago, on the previous Athena version.
It currently fails with a lot of error messages. If I change it from
\documentstyle to \documentclass, it works. This is the user's current
solution, and a fine one. But on zephyr, Greg Hudson agreed that this
might indicate a bug in latex compatibility that would be worth
investigating.
Richard Tibbetts
----- Forwarded message from louay@MIT.EDU -----
Date: Tue, 5 Aug 2003 20:24:49 -0400
From: louay@MIT.EDU
To: tibbetts@MIT.EDU
Subject: Broken latex file
\documentstyle[11pt, fullpage]{article}
%\documentstyle[11pt,doublespace]{article}
\input amssym.def \input amssym
\include{deff}
\title{\bf Encoding Complexity Versus Minimum Distance
\footnote{
This research was supported by ARO grant
DAAL03-92-G-0115 and MURI grant DAAD19-00-1-0466.}
}
\author{
Louay M.J. Bazzi
\footnote{
louay@mit.edu ,
Department of Electrical Engineering and Computer Science, MIT,
Cambridge, MA 02139.}
, Sanjoy K. Mitter
\footnote{
mitter@mit.edu,
Department of Electrical Engineering and Computer Science, MIT,
Cambridge, MA 02139.
}
}
\date{}
\begin{document}
\maketitle
\abstract{
We establish a bound on the minimum distance of a
binary error correcting code
given constraints on the computational time-space complexity
of its encoder in the general binary branching program model.
The bound we obtain
asserts that if the encoder
uses linear time and sublinear memory
in the most general sense,
then the minimum distance of the code cannot grow linearly with the
block length when the rate is nonvanishing, i.e., the
minimum relative distance of the code tends to zero in such a setting.
The setting is general enough to include non-serially concatenated
Turbo-like codes and various
generalizations.
Our argument is based on branching program techniques
introduced by Ajtai \cite{ajt}.
We also consider the case of constant-depth AND-OR circuit encoders
with unbounded fan-in.
}
\section{Introduction}
A binary error correcting code is specified by an injective
map $C:\{0,1\}^n \rightarrow \{0,1\}^m$
from the binary strings of length $n$ to binary strings of length $m$,
where $C$ is called the {\em encoder}
or the {\em code} when there
is no possibility of confusion, $n$ is called the {\em message length},
$m$ is called the {\em block length},
the strings in the image of
$C$ are called the {\em codewords}, and the ratio $n/m$ is called
the {\em rate}.
A fundamental parameter that characterizes the worst-case
error-correction capabilities of the code is its {\em minimum distance},
defined as the minimum Hamming distance between two distinct
codewords.
One of the main goals in combinatorial coding theory
is to find codes where a good tradeoff is achieved
between the two opposing parameters rate and minimum distance, as the
message length tends to infinity. A code (meaning a family of codes
indexed by the block length) is called in the coding theory literature
{\em an asymptotically good code}
if the block length grows linearly
with the message
length and the minimum distance
grows linearly with the block length. The other codes are called
{\em asymptotically bad codes}.
In this paper, we consider the following question:
\begin{quote}
What can we say about the growth of the minimum distance
of a binary error correcting code
given constraints on the computational complexity
of its encoder?
\end{quote}
We concentrate mainly on the
time-space complexity of the encoder.
In this setting, the question is a natural tradeoffs question
between the parameters: code minimum distance,
code rate, encoding time, and encoding space.
From a practical perspective, this question is
important since there are popular error correcting
codes that have low time-space encoding
complexity.
We are referring here to Turbo codes,
or more precisely to parallel concatenated
Turbo codes introduced by
Berrou, Glavieux, and Thitimajshima in \cite{ctc},
and repeat-convolute codes
introduced by Divsalar, Jin, and McEliece in \cite{rac1}.
This low time-space encoding
complexity is crucial for the corresponding iterative
decoding algorithms because these algorithms process the
state space representation of the encoder.
Sharp bounds on minimum distance of Turbo codes were first
obtained by Kahale and Urbanke \cite{ku} in the setting of random
interleavers and constant memory convolutional codes.
In a recent joint work
with Mahdian and Spielman \cite{tlcd},
we derived strong bounds on the minimum distance of Turbo like codes
in a variety of cases. One of these cases
is the well structured setting of
generalized repeat-convolute codes where
the convolutional code is substituted
by an arbitrary automaton.
We argued that such codes are asymptotically bad when
the memory of the automaton is sublinear and the number
of repetitions is constant.
In this paper, we extend this particular result
to the much more general setting where
the encoder is a binary branching program, or equivalently a nonuniform
random-access machine with binary input registers.
We establish a general theorem that asserts that
if the encoder is a binary branching program
that uses linear time and sublinear space,
then the minimum distance of the code cannot grow linearly
with the block length when the rate is nonvanishing,
which is a rather surprising
result. In general we derive a bound relating
the involved parameters.
Our proof is based on the branching
program techniques introduced in the recent breakthrough
of Ajtai \cite{ajt}.
We also consider the case of constant-depth AND-OR circuit encoders
with unbounded fan-in.
We conclude with a conjecture about the strongest possible
time-space tradeoffs for encoding asymptotically good codes.
\subsection{Branching program encoders}\label{bpe}
Consider a code specified as above by
an encoding map $C:\{0,1\}^n\rightarrow \{0,1\}^{m}$.
By a {\em branching program encoder}
(binary by default, i.e. $2$-way)
$B$ computing $C$ we mean
a connected
directed acyclic graph
with a single source
and multiple sinks,
together with a set of $n$ binary {\em input
variables} and a set of $m$ binary {\em output variables}
satisfying the following.
There are exactly two arrows leaving each non-sink node, the
first labeled with a one and the second with a zero.
Every non-sink node
is associated with an input variable.
Some of the
nodes are associated with output variables (possibly
more than one variable per node), in which case
the nodes are labeled by zeros or ones. The nodes of the
graph are called states, the source is called
the {\em start state}, and the sinks are called the {\em end states}.
The branching program $B$ computes $C$ as follows.
The computation starts by reading the
value of the variable associated with the start state
and moving according to its value to the next state
and so on by reading more bits, while outputting
when an output node is reached, until an end state
is reached.
We may want to assume that
on any input each output variable will be set
at least once, or we can assume that the output
variables are arbitrarily preset.
The {\em computation of the branching program on
an input}
is the corresponding sequence
of states starting with the start
state and ending with an end state.
The length of a computation is its
number of states, and its time
is the number of states plus the
total number of times output variables are set
along the way.
The {\em length} of the branching program
is the maximum length of a computation.
The {\em time} $t$ of the branching program
is the maximum time of a computation.
The {\em size} $S$ of the branching program
is the total number of states.
The {\em memory} or the {\em space} $M$ of a branching program
is $M = \log{S}$.
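For concreteness, the following is a minimal sketch, in Python, of how such
a branching program computes and how its time is accounted for under the
above conventions; the data layout ({\tt Node}, {\tt run}) is our
illustrative choice and not part of the model.
\begin{verbatim}
# Minimal sketch of the 2-way branching program model above.
class Node:
    def __init__(self, in_var=None, zero=None, one=None, outputs=()):
        self.in_var = in_var    # input variable read here (None at end states)
        self.zero = zero        # successor state when the variable is 0
        self.one = one          # successor state when the variable is 1
        self.outputs = outputs  # pairs (output index, bit) set at this state

def run(start, x, m):
    """Return (codeword, time) for the message bits x."""
    y = [0] * m                 # output variables arbitrarily preset
    time, state = 0, start
    while True:
        time += 1               # one unit per state visited ...
        for j, bit in state.outputs:
            y[j] = bit
            time += 1           # ... plus one per output bit set
        if state.in_var is None:
            break               # reached an end state (sink)
        state = state.one if x[state.in_var] else state.zero
    return y, time
\end{verbatim}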
\begin{remark}\label{rem1}{\em
The codes encodable by such general
branching programs correspond to
those encodable by {\em random-access machines}
with
bounded read-write space and binary input registers,
where the time of the branching program is the worst case running time,
and its size is $\Theta(2^M)$, $M$ being the number of
read-write bits.
Note that the machine has two types
of read-only bits, those corresponding to the
input message, and those corresponding to the
code description. Complexity is measured in
terms of the number of bits of the first type.
Note that we are not restricting the number
of read-only bits corresponding to
the code description.
This is not a problem
since we are deriving lower bounds.
See Example \ref{ex4} for some consequences of
this unlimited read-only space.}
\end{remark}
\subsubsection{Some special types of branching programs} \label{sptb}
The branching program is called {\em leveled}
if the states are divided into an
ordered collection of sets each called
a {\em level} where edges are between consecutive
levels only. In such a case, the {\em width}
of the
branching program is the maximum number of
states per level.
The branching program is called {\em oblivious}
if the input
variables (and the output variables)
are read (respectively, set)
in the same order
regardless of the input under consideration.
Thus an oblivious branching program is naturally leveled
in such a way that all the nodes in the same level read the
same input variables, and set the same output variables.
The branching program is called a {\em read-$k$-times}
branching program if
each input variable is read at most $k$ times
on any input.
The branching program is called {\em write-$w$-times} if at most
$w$ output variables are set per state.
\subsubsection{Examples}
\begin{example}\label{ex1}{\em
The trellis of a convolutional code is an
oblivious, read-once, and write-once branching program.
The trellis of a systematic convolutional code is an
oblivious, read-once, and write-2-times branching program.
}
\end{example}
\begin{example}\label{ex2}{\em
Parallel concatenated Turbo codes
are encodable by low complexity oblivious
branching programs as follows.
A {\em parallel concatenated Turbo code} \cite{ctc}
$C$ with a constant number $k$ of branches, message length $n$,
and memory $M_0$
is specified by $k$ permutations $\pi_1, \ldots, \pi_k$ each
on $n$ bits and a rate-$1$
convolutional code $Q$ (the {\em component code}) of memory $M_0$.
For $x$ in $\{0,1\}^n$,
$C$ encodes $x$ as $C(x) = (x, Q(\pi_1(x)), \ldots, Q(\pi_k(x)))$,
where $\pi_i(x)$ is
the string obtained by permuting the bits of $x$ according to the
permutation
$\pi_i$, and
$Q(y)$ is the output of the convolutional encoder $Q$ on the input string $y$.
Thus $C$ is naturally encodable by an oblivious
read-$k$-times write-2-times
branching program $B$.
The states of $B$ are copies of
those of the automaton, and $B$
is naturally leveled.
$B$ has length $kn$ and time $\Theta(n)$.
The width of $B$ is $2^{M_0}$, and its size
is at most $kn2^{M_0}$.
Note that the same holds if we follow the technicality
of appending a terminating sequence to the input.
}
\end{example}
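To make the encoding map itself concrete (this illustrates the map, not the
branching program), here is a short Python sketch of
$C(x) = (x, Q(\pi_1(x)), \ldots, Q(\pi_k(x)))$; the memory-$1$ accumulator
standing in for the component code $Q$, and all names, are our illustrative
choices.
\begin{verbatim}
import random

def Q(y):
    # Toy rate-1 component encoder: a memory-1 accumulator.
    state, out = 0, []
    for bit in y:
        state ^= bit
        out.append(state)
    return out

def turbo_encode(x, perms):
    # C(x) = (x, Q(pi_1(x)), ..., Q(pi_k(x)))
    word = list(x)
    for pi in perms:
        word.extend(Q([x[i] for i in pi]))
    return word

n, k = 8, 3
perms = [random.sample(range(n), n) for _ in range(k)]
x = [random.randint(0, 1) for _ in range(n)]
codeword = turbo_encode(x, perms)   # length (k+1)*n
\end{verbatim}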
\begin{example}\label{ex3}{\em
Repeat-convolute codes fit in the same picture.
A {\em repeat-convolute code} \cite{rac1}
consists of a
repeat-$k$-times code, a convolutional code, and a permutation.
More precisely, a repeat-convolute
code $C$ of message length $n$ and memory $M_0$ is specified by a constant
integer $k$,
a permutation $\pi$ on $kn$ bits, and a convolutional encoder $Q$ of
memory $M_0$.
For $x$ in $\{0,1\}^n$, $C$ encodes $x$ as $C(x) = (x,Q(\pi(r(x))))$, where
$r$ is the repeat-$k$-times map, i.e. $r(x)$ is the concatenation of $k$
copies of $x$.
As in Example \ref{ex2}, $C$ is naturally encodable by a leveled, oblivious,
read-$k$-times, write-$2$-times, length-$kn$, time-$\Theta(n)$, and
width-$2^{M_0}$ branching program whose size is
at most $kn2^{M_0}$.
}
\end{example}
\begin{example}\label{ex4}{\em
Any code can be trivially encoded
in the binary branching program model
in linear time and linear space by a tree branching
program that, on any input, outputs the whole codeword
at the corresponding leaf.
This makes sense in the random-access machine with binary
input registers encoding model because we are not
counting the read-only
space needed to store the code
description (See Remark \ref{rem1}).
It is worth noting here that
when this read-only space is taken into consideration,
we know from the work of Spielman \cite{Spi96} that there
exists an asymptotically good code encodable in
linear time and linear space.
}
\end{example}
\begin{example}\label{ex5}{\em
Any linear code is naturally
encodable by a leveled, oblivious,
width-$2$, quadratic-time, and write-once
branching program.
}\end{example}
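To see why Example \ref{ex5} holds, write each output bit as a parity
$y_j = \sum_i G_{ji}x_i \bmod 2$ of the message bits: the branching program
can compute the $m$ parities one after the other while remembering only the
running parity, giving width $2$ and time $O(nm)$. A Python sketch of this
schedule (the names are ours):
\begin{verbatim}
def encode_linear(G, x):
    # Encode x with generator matrix G (m rows of n bits) in the
    # width-2 schedule: one running parity bit per output.
    codeword = []
    for row in G:              # one pass of at most n levels per output
        parity = 0             # the only state carried: width 2
        for g, bit in zip(row, x):
            if g:
                parity ^= bit  # read the input bit, update the parity
        codeword.append(parity)  # write-once output of this pass
    return codeword

# Example: the [3,2] parity-check code.
G = [[1, 0], [0, 1], [1, 1]]
assert encode_linear(G, [1, 0]) == [1, 0, 1]
\end{verbatim}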
\subsection{Main result}
\begin{theorem} \label{th1}
Let $C:\{0,1\}^{n}\rightarrow \{0,1\}^{m}$ be
a code $($i.e. an injective function$)$ encodable $($i.e. computable$)$
by a branching program $B$ of size $S = S(n)$, time $t = t(n)$,
and length $l= l(n)$ $($so $l\leq t)$.
\noindent
If $t(n) = \Theta(n)$, then the minimum distance of $C$
is
\[
O\left(\left({\log{S}\over{n}}\right)^{1\over{\lfloor l/n \rfloor}} n\right).\]
Therefore, $C$ is asymptotically bad when
$S(n) = 2^{o(n)}$ and $t(n) = O(n)$.
\noindent
More generally, if $t(n) = \Omega(n)$,
then the minimum distance of $C$ is
\[
O\left(\left({t\over{n}}\right)^3 \left({\log{S}\over{n}}\right)^{n\over{2t}}n\right).
\]
Thus, $C$ is asymptotically bad also when
$S(n)=2^{O(n^{1-\e_1})}$ and $t(n)=O(n\log^{1-\e_2}{n})$,
for all $\e_1,\e_2>0$.
\end{theorem}
In other words, linear time and sublinear space
for encoding imply that
the code is asymptotically bad, i.e. the minimum
distance cannot grow linearly with the block
length when the rate is nonvanishing.
Note that when the time is linear,
the sublinear memory requirement
is asymptotically tight for encoding asymptotically
good codes (See Example \ref{ex4}).
\subsubsection{Application to Turbo-like codes}
By applying Theorem \ref{th1} to
parallel concatenated Turbo codes and
repeat-convolute codes (See Examples \ref{ex2} and \ref{ex3}), we can recover
the corresponding bound in \cite{tlcd}
as follows.
The minimum distance of a parallel concatenated Turbo code
with a constant number $k$ of branches, message length $n$, and memory $M_0$
is $O(n^{1-1/k}M_0^{1/k})$ because the size of the
corresponding branching program
is at most $kn2^{M_0}$. Similarly,
the minimum distance of a repeat-convolute code with
$k$ repetitions, message length $n$, and memory $M_0$
is $O(n^{1-1/k}M_0^{1/k})$.
So both types of codes will be asymptotically bad in
any reasonable setting, i.e. as long as $M_0$ is
sublinear in $n$. Note that the situation when $M_0$
is sublinear in $n$ corresponds to the case when
the underlying trellis has subexponential size, i.e.
when the corresponding
iterative Turbo decoding algorithm has
subexponential running time.
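The intermediate step is the substitution $\log{S} \leq M_0 + \log{(kn)}$
in Theorem \ref{th1}; since $l \leq kn$, the exponent $1/\lfloor l/n\rfloor$
is at least $1/k$, so (for $\log{S}\leq n$) the theorem's bound is at most
$n(\log{S}/n)^{1/k}$. Assuming $M_0 = \Omega(\log{n})$ (otherwise the
$\log{(kn)}$ term dominates), this gives
\[
n\left(\frac{\log{S}}{n}\right)^{1/k}
\leq n\left(\frac{M_0 + \log{(kn)}}{n}\right)^{1/k}
= O\left(n^{1-1/k}M_0^{1/k}\right).
\]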
\section{Proof of Theorem \ref{th1}}
\subsection{Ajtai's proof techniques for the Hamming distance problem}
We use branching program techniques introduced by Ajtai in
\cite{ajt}.
More specifically, we are referring to the branching program
techniques that Ajtai introduced to show that there
is no $O(n)$-time and $o(n\log{n})$-space $R$-way branching
program, $R = n^c$ ($c$ some absolute constant),
that decides the Hamming distance problem:
given $n$ strings in $\{0,1\}^{\log{R}}$, decide whether any
two (distinct) strings among them are at Hamming
distance $\lambda \log{R}$ apart ($\lambda$ another absolute constant
related to $c$).
Ajtai's is a decision problem
in the setting of $R$-way branching programs, while
ours is not a decision problem and is in the setting of $2$-way
branching programs; still, the techniques introduced by
Ajtai are behind the proof we describe below.
We refer the reader to Ajtai's paper \cite{ajt}.
\subsection{Objects under consideration and terminologies}
We will start by
making the branching program leveled.
Recall from Section \ref{sptb} that this means that
the states are partitioned into $l$ consecutive
sets of states each called a level in such a way
that edges, i.e. transitions,
occur only between consecutive levels.
We will divide (meaning partition)
the branching program into {\em blocks}, meaning
sets of consecutive levels.
For a given block, we will be looking at states
in the {\em lower boundary level of the block}.
By the lower boundary level of a block
we simply mean the last (w.r.t.\ the level ordering)
level (which is a set of states)
in the block (which is a set of levels).
Given an input, we will be looking at the computation
of the branching program on this input, which as we explained
in Section \ref{bpe} is defined to be the corresponding
sequence of states starting with the start
state and ending with an end state.
So, in the leveled case,
each computation takes exactly $l$ steps, i.e. it contains
exactly $l$ states.
Fix an input $x$, and consider the
corresponding computation of the branching program
$B$ on $x$. Fix also a set $L$ of levels
or a set $T$ of blocks.
By an input bit or
variable (respectively, output bit or variable)
accessed or read (respectively, set)
in $L$ or $T$ {\em during the computation} of $B$ on $x$
(or equivalently, by $L$ reading the input bit during the computation,
and so on),
we mean that there is a state
in the computation that belongs to a level
in $L$ or to a level in a block in $T$
where the value of the input variable
is read in order to move to another state
(respectively, where the value of the output variable is
set).
Finally, by a computation containing
a sequence of states, we mean that each state
in this sequence appears in the computation.
Note that here the order does not matter since
the states in a computation are distinct
because the branching program is acyclic.
\subsection{The oblivious case argument}
Recall that an oblivious branching program is naturally leveled
in such a way that all the nodes in the same level read the
same input variables, and set the same output variables.
Since the proof of Theorem \ref{th1} is
relatively long,
it is instructive to look first at the very special case
when $B$ is oblivious.
This case is very restrictive compared to a general
branching program.
To restrict the setting further, assume that $B$
is read-$k$-times and write-$w$-times,
where $k = O(1)$ and $w = O(1)$.
The argument we used in \cite{tlcd} to bound the
minimum distance of repeat-convolute codes
was in the setting of automata. More specifically,
we studied the case of a repeat-convolute code
where the convolutional code is replaced by
an arbitrary automaton.
Even though the automata setting is less general than the
case we are considering in this section,
the argument naturally extends as follows.
So assume that $B$ is a read-$k$-times, write-$w$-times, and
oblivious branching program, where $k = O(1)$ and $w = O(1)$.
Thus $n\leq l \leq kn$ and $m \leq wl \leq wkn$. We want to argue
that the minimum distance of $C$ is
$O(n(\frac{\log{S}}{n})^{1/k})$.
Let $W$ be the width of $B$, thus $W \leq S$.
We will exhibit two distinct input strings that map
to two codewords at distance
$O(n ( {\log{W}\over{n}} )^{1/k})$.
We will do this by finding
a nonempty set of input variables $U$,
a subset $J$ of levels,
and two distinct strings $x_1$ and $x_2$ in $\{0,1\}^n$
such that $x_1$ and $x_2$ agree outside $U$,
and the computations of $B$ on $x_1$ and $x_2$
agree outside $J$. This will give us the desired bound on
the minimum distance. $J$ will be constructed as a union
of blocks from a partition of $B$ that we define next.
Partition $B$ into $b$ consecutive blocks, each consisting
of $s_1$ or $s_2$ levels, where
$s_1 = \lfloor kn /b \rfloor$ and
$s_2 = \lceil kn /b \rceil$.
Assume for now that $b$ is arbitrary as long as $s_1\geq 1$.
We will optimize on the integer $b$ later.
Each of the $n$ input variables is read by $B$ in at most
$k$ blocks. Recall that $B$ is oblivious. Thus,
for any specific variable,
these blocks will be the same irrespective
of the setting of the
input variables.
There are at most $b^k$ $k$-sets
of blocks. Here, by a $k$-set of blocks, we
mean a set of blocks of cardinality
at most $k$. So there
are at least $n/b^k$ input variables that
are read by $B$ in the same $k$-set of blocks.
Let $U$ be such a set of input variables with
$|U|=\lceil n/b^k \rceil$, and let $T$ be such a $k$-set of blocks;
thus $1\leq |T|\leq k$. The set $J$ we mentioned above is the union
of the blocks in $T$.
Consider the lower boundary levels $L_1, \ldots, L_{|T|}$
of the blocks in $T$ ordered by the level index, and let $Q$
be the set of strings in $\{0,1\}^n$
that are zero outside $U$, thus $|Q| = 2^{|U|}$.
There are at most $W^k$ state sequences in
$L_1\times \ldots \times L_{|T|}$,
and for each $x$ in $Q$ the computation
of $B$ on $x$ contains such a sequence. So
if we can guarantee that $2^{|U|} > W^k$,
we get that there should be a sequence of states
$\{\s_i\}_{i=1}^{|T|}$ in
$L_1\times \ldots \times L_{|T|}$
and two different strings $x_1$ and $x_2$ in $Q$
such that the computation of $B$ on both $x_1$
and $x_2$ contains $\{\s_i\}_{i=1}^{|T|}$.
Since $x_1$ and $x_2$ agree outside $U$,
the computations of $B$ on $x_1$ and $x_2$
are exactly the same outside the
blocks in $T$.
The reason is that both computations
are in the same state each time a block
in $T$ is left.
Thus $C(x_1)$ and $C(x_2)$ can only
differ in the blocks in $T$.
This means that the distance
between $C(x_1)$ and $C(x_2)$ is at most
$|T|s_2w \leq k\lceil kn /b \rceil w$,
since $|T|\leq k$, and $s_2 = \lceil kn /b \rceil$.
This bound holds under the assumption that $2^{|U|} > W^k$,
which can be guaranteed if
$2^{n/b^k} > W^k$. So, choose
$
b = \lceil ({n\over{k\log{W}}})^{1/k}\rceil - 1
$.
Note that the only other constraints we have on $b$ are $1\leq b \leq kn$,
and the selected value satisfies these constraints when $W$ is not
exponential in $n$. Note also that if $W$ is exponential in $n$,
the statement of the theorem is trivial.
By replacing this value of $b$ in the upper bound $k\lceil kn /b \rceil w$
on the distance between $C(x_1)$ and $C(x_2)$, and using $S$ as
an upper bound on $W$,
we get that
the minimum distance of $C$ is
$O(n(\frac{\log{S}}{n})^{1/k})$.
Note that we also used here that
$k= O(1)$, $w = O(1)$, and
$C(x_1)\neq C(x_2)$ because $x_1\neq x_2$ and
$C$ is injective.
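For completeness, the arithmetic of the substitution runs as follows (a
sketch, assuming as above that $W$ is not exponential in $n$, so that
$b\geq 1$):
\[
k\left\lceil \frac{kn}{b} \right\rceil w
= O\left(\frac{n}{b}\right)
= O\left(n\left(\frac{k\log{W}}{n}\right)^{1/k}\right)
= O\left(n\left(\frac{\log{S}}{n}\right)^{1/k}\right),
\]
using $k,w = O(1)$ and $W \leq S$.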
This proof is short and simple. But when $B$ is not oblivious,
the proof does not go through. The main reason
is that we cannot construct $U$
and $T$
regardless of the setting
of the input variables as we did
above. When the branching program is
oblivious, the read-$k$-times and the write-$w$-times
restrictions are not fundamental. When it is not oblivious,
they become restrictive.
For example, in the general branching program model,
depending on the input,
a very large number
of the output variables
may be set in a particular state,
or a particular input variable
may be read a very large number of times.
The point to keep in mind is that the oblivious
assumption is very restrictive.
We will sketch in the next section how to handle the
general situation.
The proof is longer and more sophisticated.
This is not strange since the statement we are proving
is much more general.
The reader is encouraged to go carefully over the above argument before
proceeding to the general case.
\subsection{Proof technique}
We follow the techniques
introduced by Ajtai \cite{ajt} in the setting of
the Hamming distance problem.
We want to find two input
strings $x_1$ and $x_2$ such that $C(x_1)$ and $C(x_2)$
are close to each other.
The first step is to make
the branching program leveled without affecting its
input-output behavior. Next, we divide
the branching program into blocks, each consisting
of consecutive levels; the number of blocks will be suitably
selected later, and the block sizes are as uniform as
possible.
To exhibit $x_1$ and $x_2$, we will
find a set $T$ of blocks such that:
\begin{itemize}
\item the size of $T$ is small,
\item the computations of $B$ on $x_1$ and $x_2$ are
exactly the same in the blocks outside $T$, and
\item
not too many output bits of $C(x_1)$
(respectively $C(x_2)$)
are set in any of the blocks in $T$ during
the computation of $B$ on $x_1$
(respectively $x_2$).
\end{itemize}
Thus $C(x_1)$
and $C(x_2)$ can only disagree on
the few output bits that are set in $T$.
To find such $x_1$,
$x_2$, and $T$, we find first $T$
together with a set $Q'$ of input strings in
$\{0,1\}^n$ and a sequence $\{\s_i\}_i$
of states in the lower boundary levels of the blocks in $T$ in such
a way that for each $x$ in $Q'$:
\begin{itemize}
\item the computation of $B$ on $x$
contains $\{\s_i\}_i$,
\item
not too many output bits of $C(x)$
are set in any of the blocks in $T$ during
the computation of $B$ on $x$, and
\item the
number of variables in $x$
that are accessed only in the blocks in $T$
during the computation of $B$ on $x$
is large.
\end{itemize}
We will eventually find the desired
$x_1$ and $x_2$ inside $Q'$ as follows.
We modify the branching program $B$ again
so that $B$ is forced
to pass through a state in the sequence $\{\s_i\}_i$
each time it attempts
to leave a lower boundary level of a block in $T$, but
without affecting its input-output behavior on $Q'$.
Using $T$, define an equivalence relation
on $\{0,1\}^n$ by relating two strings
if:
\begin{itemize}
\item they share the same
set of input variables that are not read
during the
computation of $B$ in blocks outside $T$,
and
\item
they agree on the values of their bits
outside this set.
\end{itemize}
Thus each
equivalence class $[x]$ is determined
by a set $I_{[x]}$ of input variables and
a setting of the variables outside
$I_{[x]}$.
We forced the computation of $B$
to contain the states $\{\s_i\}_i$
on all inputs so that
we get $|[x]|=2^{|I_{[x]}|}$, and
hence the size of each
equivalence class $[x]$ can be guaranteed to
be large when $I_{[x]}$ is large.
Since for each input string in $Q'$, the
number of variables
that are accessed only in the blocks in $T$
during the computation of $B$
is large, we get that
the equivalence class of each
input in $Q'$ is large.
By
considering the set $\Omega$ of sufficiently large
equivalence classes so that the equivalence
classes of all the elements of $Q'$ are guaranteed to be elements
of $\Omega$, our problem reduces to selecting the
number of blocks so that $|Q'|$ is strictly larger than
$|\Omega|$, and hence there are distinct $x_1$ and $x_2$
in $Q'$ that have the same equivalence class.
The fact that $[x_1] = [x_2]$
means that the computations
of $B$ on $x_1$ and $x_2$ are exactly the same outside the
blocks in $T$, and hence $C(x_1)$ and $C(x_2)$ can
only disagree on the output bits that are set inside the
blocks in $T$.
By construction the number of those output
bits will be small. Moreover, and since $C$ is injective,
$C(x_1)$ and $C(x_2)$ are distinct.
The distance between $C(x_1)$ and
$C(x_2)$ will be the desired bound on the minimum distance
of $C$.
\subsection{Proof outline}
Assume for the moment that $t = \Theta(n)$. We will deal with
the more general case when we are done, by working
more carefully with the constants. So say
that
\begin{equation}\label{ac}
\mbox{$t = an$ and $l = cn$},
\end{equation}
where $a,c\geq 1$ are constants
($a,c\geq 1$ because $C$ is injective).
\begin{itemize}
\item[A)] We modify the branching program so that it is leveled
\footnote{
This can be done by a classical procedure
(a code sketch appears at the end of this outline).
Construct a leveled directed
graph of $l$ levels where each level consists of a copy of all the
nodes of the original branching program together with the related output labels.
Connect the nodes in each two
consecutive levels according to the graph of the original
branching program. Associate the end states not in the last level with
arbitrary input variables.
Connect the end states in any two consecutive levels by two arrows
labeled respectively by one and zero. Finally,
remove all the nodes (together with the related edges)
that are not accessible from the start state in the first level
or cannot reach an end state in the last level.
The start state
of the new branching program is the remaining state
in the first level, and its end states are those remaining
in the last level.
}. The modified
branching program computes the same function, i.e. $B$ computes
$C$. The length of the resulting branching program
$B$ is $l$, its time is $t$, its
size is at most $Sl$, and its width is at most $S$.
The difference is that now edges are only between consecutive levels, and
each computation takes exactly $l$ steps.
\item[B)] Partition $B$ into $b$ consecutive blocks, each consisting
of $s_1$ or $s_2$ levels, where
\begin{equation}\label{s1s2}
s_1 = \left\lfloor \frac{l}{b} \right\rfloor = \left\lfloor \frac{cn}{b} \right\rfloor \mbox{ and }
s_2 = \left\lceil \frac{l}{b} \right\rceil = \left\lceil \frac{cn}{b} \right\rceil.
\end{equation}
Assume for now that in general $1 \leq b \leq l$ so that
$1 \leq s_1, s_2 \leq l$.
We will
optimize on the integer $b$ later.
\item[C)]
\begin{lemma}\label{lemmasi} There exist:
\begin{itemize}
\item[a)] absolute constants
$h,\alpha> 0 $,
\item[b)] $Q'\subset \{0,1\}^n$ such that
\begin{equation}\label{qprime}
|Q'| \geq \frac{2^n}{(Sb)^k},
\end{equation}
where
\begin{equation}\label{kk}
k = \lfloor c \rfloor,
\end{equation}
\item[c)] a set of blocks $T$,
\begin{equation}\label{tt}
1\leq |T|\leq k,
\end{equation}
\item[d)] and a sequence $\{\s_i\}_{i=1}^{|T|}$
of states in the lower boundary levels of the blocks in
$T$,
\end{itemize}
such that for each $x$ in $Q'$:
\begin{itemize}
\item[1)] the computation of $B$ on $x$
contains $\{\s_i\}_i$,
\item[2)] at most
\begin{equation}\label{ww}
w = \frac{hs_1 t}{l}
\end{equation}
output bits
of $C(x)$ are set in each block in $T$
during the computation of $B$ on $x$, and
\item[3)] the number of variables in $x$
that are accessed only in the blocks in $T$
during the computation of $B$ on $x$
is at least
\[
\frac{\alpha n}{b^k}.
\]
\end{itemize}
\end{lemma}
{\bf Proof.} See Section \ref{prfc}. \finito
\item[D)]
Now we will modify the branching program $B$ again
so that $B$ is forced
to pass through a state in $\{\s_i\}_i$ each time it attempts
to leave a lower boundary level of a block in $T$,
while guaranteeing that
$B$ behaves exactly
like the old $B$ on the inputs in $Q'$, i.e. it computes $C(x)$
for each $x$ in $Q'$.
We can do this
by simply connecting (on both inputs) all the states
in the level above that of $\s_i$ to $\s_i$, for each $i$.
Note that $B$ need not compute an injective
function anymore, so it may not read all the input
variables on some inputs.
It may also leave some of the output
variables unset, but this is not a problem since
we can assume that the output
variables were arbitrarily preset.
\item[E)] Finally, we bound in Section \ref{dtle}
the minimum distance of $C$
by exhibiting distinct $x_1$ and $x_2$ in $Q'$ such that
the distance between
$C(x_1)$ and $C(x_2)$ is
$O(n({\log{S}\over{n}})^{1\over{k}})$.
\item[F)] In Section \ref{suplin},
we explain
how to
drop the assumption $t = \Theta(n)$.
\end{itemize}
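The leveling of step (A) is classical; the following Python sketch, with
our data layout ({\tt edges[v]} giving the two successors of a non-end
state {\tt v}), mirrors the footnote's construction, omitting only the
final pruning of unreachable and dead copies.
\begin{verbatim}
def level(edges, start, ends, l):
    # Level a branching program: l+1 levels, each holding a copy of
    # every node.  edges[v] = (succ_on_0, succ_on_1) for non-end states.
    def succ(v):
        # End states not in the last level get the footnote's two
        # self-arrows (an arbitrary input variable is read there).
        return edges.get(v, (v, v))

    nodes = set(edges) | set(ends)
    leveled = {}
    for d in range(l):               # connect consecutive levels
        for v in nodes:
            s0, s1 = succ(v)
            leveled[(d, v)] = ((d + 1, s0), (d + 1, s1))
    # Start state: (0, start); end states: (l, v) for v in ends.
    return leveled
\end{verbatim}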
\subsection{Proof of Lemma \ref{lemmasi}}\label{prfc}
Consider any input $x$ in $\{0,1\}^n$.
\begin{itemize}
\item
Let $k\geq 1$ be an integer, and let $h>0$.
We will set $k$ then $h$ as we go on.
\item
Let $R_x$ be the set
of blocks each of which sets at most
\[
w \equiv \frac{hs_1t}{l}
\]
bits of $C(x)$
during the computation of $B$ on $x$.
\item
Let $D_x$ be the
set of input variables that are read in at most $k$ states during
the computation of $B$ on $x$.
\item
And let $D'_x$ be the set of input variables in $D_x$
that are read only in blocks in $R_x$ during
the computation of $B$ on $x$.
\end{itemize}
First recall from (\ref{ac})
that $a,c\geq 1$ are the constants satisfying
\[
\mbox{$t \equiv an$ and $l \equiv cn$.}
\]
Recall also from (\ref{s1s2}) that
\[
s_1 \equiv \left\lfloor \frac{l}{b} \right\rfloor = \left\lfloor \frac{cn}{b} \right\rfloor \geq 1
\mbox{ and } s_2 \equiv \left\lceil \frac{l}{b} \right\rceil = \left\lceil \frac{cn}{b} \right\rceil.
\]
Some bounds:
\begin{itemize}
\item From the definition of $R_x$, we must have
$w(b - |R_x|) \leq t$, thus
\begin{equation}\label{Rxb}
|R_x| \geq b - \frac{l}{hs_1} \geq b\left( 1 - \frac{2}{h}\right),
\end{equation}
where the first inequality follows from
$w = hs_1t/l$, and the
second from the bound $s_1 = \lfloor l /b \rfloor \geq l/(2b)$.
\item Since $C$ is injective, each
input variable must be read at least once, so from the definition
of $D_x$, we must have
\[
|D_x| + (k+1)(n-|D_x|)\leq l,
\]
i.e.
$|D_x| \geq n(1 - \frac{c-1}{k})$ because $l=cn$.
Thus if we set
\[
k \equiv \lfloor c \rfloor,
\]
we get
\begin{equation}\label{Dpx}
|D_x| \geq n(1 - \e),
\mbox{ where }
\e \equiv \frac{c-1}{k} <1.
\end{equation}
\item
The number of
input variables read in blocks outside $R_x$ is at most
\[
(b - |R_x|)s_2 \leq \frac{2bs_2}{h} \leq n\frac{4c}{h},
\]
where the first inequality follows from (\ref{Rxb}),
and the second from the bound
$s_2 = \lceil cn /b \rceil \leq 2 cn/b$.
Thus, by the definition of $D'_x$, we must have
\[
|D'_x| \geq |D_x| - \frac{4cn}{h} \geq
n\left( 1-\e - \frac{4c}{h}\right),
\]
where the second inequality follows from (\ref{Dpx}).
Let $h$ be sufficiently large such that
\begin{equation}\label{alpha}
\alpha \equiv 1-\e - \frac{4c}{h}>0.
\end{equation}
Note
that this implies also that $1 - \frac{2}{h} > 0$ since $c\geq 1$,
i.e.,
\[
|R_x|> 0,
\]
by (\ref{Rxb}).
\end{itemize}
To sum up, we have fixed some constants $h,\alpha > 0$,
and specified $k \equiv \lfloor c \rfloor$, so that
\begin{equation}\label{rxdx}
1\leq |R_x| \leq b \mbox{ and }
|D'_x| \geq \alpha n.
\end{equation}
Now,
keep the definition of $R_x$ in mind,
ignore $D_x$, and recall that
$D_x'$ is a set of input variables
such that:
\begin{itemize}
\item each
input variable in $D'_x$
is read in at most $k$ levels during
the computation of $B$ on $x$, and
\item each of those
levels belongs to a block in $R_x$.
\end{itemize}
Recall also that so far we are fixing an input $x$ in $\{0,1\}^n$.
Consider
all the $k$-sets in $R_x$, i.e. the
subsets of $R_x$ of size at most $k$.
Each input variable in $D'_x$ is read in such
a $k$-set during the computation on $x$,
and there are at most $|R_x|^k$
such $k$-sets, so there are at least
\[
\frac{|D'_x|}{|R_x|^k} \geq \frac{\alpha n}{b^k}
\]
variables in
$D'_x$ read in the same $k$-set of blocks,
where we have used (\ref{rxdx}) to obtain the estimate.
Let
$U_x$ be such a set of variables in $D'_x$, and
let $T_x$ be such a $k$-set of blocks in $R_x$.
So
\[
1 \leq |T_x|\leq k \mbox{ and } |U_x| \geq \frac{\alpha n}{b^k}.
\]
Note that $T_x$ is nonempty since $C$ is injective.
For each $x$ in $\{0,1\}^n$, there is such a $U_x$
and $T_x$. Fix any such correspondence.
There are at most $b^k$ such $T_x$,
so there is a subset $Q\subset \{0,1\}^n$ and a $k$-set
of blocks $T$ such that
\[
|Q|\geq \frac{2^n}{b^k},
\]
and
$T = T_x$ for each $x$ in $Q$.
Now consider the lower boundary levels $L_1, \ldots, L_{|T|}$
of the blocks in $T$ ordered by the level index.
There are at most $S^k$ state sequences in
$L_1\times \ldots \times L_{|T|}$,
and for each $x$ in $Q$, the computation
of $B$ on $x$ contains such a sequence, so there
is a sequence $\{\s_i\}_{i=1}^{|T|}$
of states in $L_1\times \ldots \times L_{|T|}$ and
a subset $Q'\subset Q$ such that
\[
|Q'|\geq \frac{|Q|}{S^k} \geq \frac{2^n}{(Sb)^k},
\]
and
the computation of $B$ on $x$ contains $\{\s_i\}_i$,
for each $x$ in $Q'$.
\subsection{Bounding the minimum distance}\label{dtle}
Now we are ready to find
the two distinct messages $x_1$ and $x_2$
that $C$ maps to close codewords.
Using $T$, for each $x$ in $\{0,1\}^n$, let
$I_x$ be
the set of input variables that are not read
during the
computation of $B$ on $x$ in blocks outside $T$.
Note that we need a double negation since some of the
input variables may not be read at all because
we modified the branching program in (D).
So, by (C.3), for each $x$ in $Q'$,
\begin{equation}\label{Ixb}
|I_x| \geq \frac{\alpha n}{b^k}.
\end{equation}
Using $T$, define the equivalence relation $\sim$ on $\{0,1\}^n$
by $x\sim y$ if:
\begin{itemize}
\item $I_x = I_y$, and
\item $x$ agrees with
$y$ on the bits outside $I_x$.
\end{itemize}
In other words, $x$ and $y$ agree on the bits outside
$I_{[x]} = I_{[y]}$, where $[x]$
denotes the equivalence class of $x$.
Given any $x$ in $\{0,1\}^n$, each $y\sim x$ can only disagree
with $x$ on $I_x$. Conversely, if $y$
disagrees with $x$ only inside $I_x$, it must be the case that
$y\sim x$. To see why this is true, note
that we forced in (D) all the computations of $B$ to leave the
blocks in $T$ in the same states: the $\{\s_i\}_i$ that we exhibited
in (C.1). So the computations
of $B$ on $x$ and $y$ are exactly the same outside the
blocks in $T$, and hence any bit accessed on $x$
outside $T$ will be accessed on $y$ outside $T$
and none of the bits in $I_x$ will be accessed
on $y$ outside $T$. It follows that
\[
|[x]| = 2^{|I_x|}.
\]
Thus, by (\ref{Ixb}), for each $x$ in $Q'$,
\[
|[x]|\geq 2^{\alpha n/b^k}.
\]
Let $\Omega$ be the set of equivalence classes each of size
at least $2^{\alpha n/b^k}$. So, $[x]$ is
in $\Omega$ for each $x$ in $Q'$. Besides, since
the equivalence classes are disjoint, we must have
$|\Omega|2^{\alpha n/b^k} \leq 2^n$, i.e.
\begin{equation}\label{omegab}
|\Omega| \leq \frac{2^n}{2^{\alpha n/b^k}}.
\end{equation}
If we can guarantee that
\begin{equation}\label{cond2}
|Q'|> |\Omega|,
\end{equation}
we get that there should be $x_1\neq x_2$ in
$Q'$ such that $[x_1] = [x_2]$. The fact that $[x_1] = [x_2]$
means that the computations
of $B$ on $x_1$ and $x_2$ are exactly the same outside the
blocks in $T$, and hence $C(x_1)$ and $C(x_2)$ can
only disagree on the output bits that are set inside the
blocks in $T$. But, by (C.2), we constructed $Q'$
in such a way
that the computation of $B$ on any $x$ in $Q'$
can set at most $w$ bits of $C(x)$
in each block in $T$. Thus $C(x_1)$ and $C(x_2)$
can disagree on at most
\begin{equation}\label{mdb}
2|T|w \leq 2k \frac{hs_1a}{c} \leq \frac{2khcna}{bc} = \frac{2khan}{b}
\end{equation}
bits, where the first inequality follows from
$|T|\leq k$ (by (\ref{tt})) and $w = hs_1t/l = hs_1a/c$ (by (\ref{ww}) and
(\ref{ac})), and the
second follows from
$s_1 = \lfloor l /b \rfloor = \lfloor cn /b \rfloor
\leq cn /b$ (by (\ref{s1s2})).
Moreover, $C(x_1)$ and $C(x_2)$
must disagree on at least one bit
since $x_1\neq x_2$, and $C$ is injective.
Using (\ref{qprime}) and (\ref{omegab}),
condition (\ref{cond2}) can be guaranteed to hold if
\[
2^{\alpha n/b^k} > (Sb)^k,
\]
which is fulfilled when
\[
{1\over{b}} > \left({k\log{(Scn)}\over{\alpha n}}\right)^{1/k},
\]
since $b\leq l = cn$.
If we can select $b$
so that this holds, we get that the minimum distance
of $C$ is at most $2khan/b$. The only restriction
we have on $b$ is $1\leq b \leq l$, so use
\begin{equation}\label{setb}
b \equiv \left\lceil \left({\alpha n \over{k\log{(Scn)}}}\right)^{1/k} \right\rceil - 1.
\end{equation}
This is always below $l$, and it cannot go below $1$ unless
$S \geq 2^{\alpha n/(k 2^k) - \log{(nc)}}$, in which case
the statement of the theorem is trivial. Thus, via
(\ref{mdb}),
the minimum distance of $C$ is at most
\[
{2khan\over{\left\lceil\left({\alpha n \over{k\log{(Scn)}}}\right)^{1/k} \right\rceil - 1}}
= O\left(n\left({\log{S}\over{n}}\right)^{1\over{k}} \right).
\]
\subsection{Dropping the linear time assumption}\label{suplin}
Now we drop the assumption that $t = \Theta(n)$, thus $a$ and $c$
need not be constants. Since $l\leq t$, we use $a$ as an upper bound on
$c$. Assume that $a$ grows with $n$ and assume also that
it is $O(\log{n})$ since otherwise the statement of the
theorem is trivial.
This time, we do not set $k = \lfloor c \rfloor $.
Going back to (\ref{Dpx}) and (\ref{alpha}),
we have
\[
\alpha = 1 - \frac{c-1}{k} - \frac{4c}{h} \geq 1 - \frac{a-1}{k} - \frac{4a}{h},
\]
a value that we need to
keep bounded away from zero by a
positive constant. Set $k = \lceil 2(a-1) \rceil$ and $h = \lceil 16a \rceil $,
thus $\alpha > 1/4$.
By using the same choice of $b$ in (\ref{setb})
and the same bound on the
minimum distance of $C$ in (\ref{mdb}), but
with the new values of $k$ and $h$ and the bound $c\leq a$,
we get that the minimum distance of $C$ is at most
\[
{2 \lceil 2(a-1) \rceil \lceil 16a \rceil an
\over{\left\lceil\left({n/4 \over{\lceil 2(a-1) \rceil \log{(San)}}}\right)^{1/\lceil 2(a-1)\rceil} \right\rceil - 1}}
= O\left(a^3 n\left({\log{S}\over{n}}\right)^{1\over{2a}}\right).
\]
\section{When the encoder is a constant-depth AND-OR circuit}
To outline the boundaries of the picture, we consider the same
problem but from the perspective of
the circuit complexity of the encoder.
Here we note that not much can be said
other than what is essentially expected.
Since we know from \cite{Spi96} that there are asymptotically
good codes that are encodable by linear-size and
logarithmic-depth circuits, we are left with constant-depth
circuit encoders with unbounded fan-in.
First, if $C:\{0,1\}^n \rightarrow \{0,1\}^m$ is a code
(i.e. an injective map in general),
we say that $C$ is encodable by a depth-$d$
AND-OR circuit with unbounded fan-in if each of the $m$ output
bits is computable by a depth-$d$ circuit where:
1) the only allowed gates are AND/OR
gates with possibly negated inputs, and 2) the number of
inputs per gate is unbounded. The size of the circuit
is the total number of gates.
We argue by a direct
application of Hastad's switching lemma
that a polynomial-size constant-depth circuit cannot encode
an asymptotically good code (in fact, as long as
the circuit size is subexponential in a
power of the block length inversely proportional to the circuit depth).
This is not surprising since, in the special case of linear codes,
a small-depth circuit encoder corresponds to a code with a low-density
generator matrix.
\begin{lemma}{\em (Hastad's switching lemma \cite{has})}:
Let $f:\{0,1\}^n\rightarrow \{0,1\}$ be computable
by an unbounded-fanins depth-$d$ AND-OR circuit of size $M$.
Consider a random restriction $\rho$ that independently
keeps each input bit unset with a probability
$p = 1/(20k)^d$, sets it to $1$ with a probability $1 - p/2$, and to $0$
with a probability $1 - p/2$. Then the probability, over the choice of
$\rho$, that $f$, when restricted to the values set by $\rho$, cannot
be evaluated by a decision tree of depth $k$ is at
most $M2^{-2k}$.
\end{lemma}
Note that a decision tree computing a binary function
$b: \{0,1\}^n\rightarrow \{0,1\}$ on $n$ variables
is a binary tree where each node is associated with one
of the input variables, and each leaf is associated with
a $0$ or $1$ setting of the single output variable.
This implies that if we fix any setting of the input variables,
there are at most $k$ variables that, when negated, will
affect the value of $b$, where $k$ is the depth of the tree.
In other words, when $k$ is small, $b$ has low sensitivity.
Thus if a code $\{0,1\}^n\rightarrow \{0,1\}^m$
(an injective map)
is encodable by $m$ decision trees each of depth $k$,
a direct counting argument shows that its minimum
distance can be at most $km/n$: on any fixed input, some
input variable is read by at most $km/n$ of the trees, and
flipping it can change at most that many output bits.
Hastad's switching lemma essentially reduces the circuit
case to this situation.
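For intuition, a sketch of sampling such a restriction (our code; note that
each fixed value indeed has probability $(1-p)/2$):
\begin{verbatim}
import random

def random_restriction(n, p):
    # Each variable stays unset (None) with probability p, and is
    # otherwise fixed to 0 or 1 with probability (1-p)/2 each.
    return [None if random.random() < p
            else random.randint(0, 1) for _ in range(n)]

# Toy parameters: depth d, decision-tree depth k, p = 1/(20k)^d.
n, d, k = 10**5, 2, 4
p = 1.0 / (20 * k) ** d
rho = random_restriction(n, p)
unset = sum(v is None for v in rho)  # close to n*p with high probability
\end{verbatim}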
\begin{theorem} Let
$C:\{0,1\}^n \rightarrow \{0,1\}^{m}$, $m = \Theta(n)$, be a code
(i.e. in general an injective map)
encodable by an unbounded-fan-in
AND-OR circuit of size $S$ and depth $l$,
then the minimum distance of $C$ is
\[
O((20)^l \log^{l+1}{mS}).
\]
Thus, $C$ is asymptotically bad when $l = O(1)$ and $S = 2^{o(m^{1/l})}$.
\end{theorem}
{\bf Proof.}
Let $x_1, \ldots, x_n$ be the input variables,
$A_1, \ldots, A_m$ the circuits
that compute the output variables $y_1, \ldots, y_m$,
and $a = m/n$. Thus
$a\geq 1$ is a constant, the size of each $A_i$ is at most $S$, and the
depth of each $A_i$ is at most $l$.
Hit the $x_i$'s with a random
restriction $\rho$ that keeps each $x_i$ unset with probability
$p = 1/(20k)^l$, sets $x_i$ to $1$ with probability $(1-p)/2$, and to $0$
with probability $(1-p)/2$.
Then, for each $A_i$,
from Hastad's switching lemma, the probability that $A_i$ does not
collapse to a decision tree of depth $k$ is at most $S2^{-2k}$.
Thus the probability that one of the $A_i$'s does not collapse, or the
number of remaining (unset) variables
is below $np/2$
is at most
\[
P = mS2^{-2k}+\frac{4(1-p)}{n p},
\]
where the latter
term comes from Chebyshev's inequality.
Fix $k = \log{Sm}$ so that $P \leq 1/(Sm) + 4(20\log{Sm})^l/n < 1$
when $n$ is large enough and
$S$ is subexponential in $n^{1/l}$. Note that when $S$
is exponential in $n^{1/l}$, the statement of the theorem is trivial.
So, fix any restriction $\rho$ with the property that:
\begin{itemize}
\item
the set $I$ of
input variables left unset by $\rho$ has size at least
$np/2$, and
\item
each of the $A_i$'s collapses
under $\rho$ to a decision tree $T_i$ of depth $k$, where
$k =\log{Sm}$ and $p = 1/(20k)^l$.
\end{itemize}
Consider any setting of the
variables in $I$, and let $I_i$ be the set of variables in $I$
read by $T_i$ on this setting.
Each $I_i$ contains at most $k$ variables, and the output
of $T_i$ can only be affected when we change some
of the variables in $I_i$. So there should be a variable
in $I$ that appears in at most
\[
\frac{\sum_j |I_j|}{|I|} \leq \frac{km}{np/2} = \frac{2ka}{p}
\]
of the
$I_i$'s. By flipping this variable, we can affect
at most $2ka/p$ output bits, and at least one output bit
since $C$ is injective. Hence the minimum distance of $C$
is at most
\[
\frac{2ka}{p} = O( (20)^l\log^{l+1}{Sm} ).
\]
\finito
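For instance, for polynomial-size circuits, say $S = m^{O(1)}$ with
$l = O(1)$, the bound specializes to
\[
O\left((20)^l \log^{l+1}{(mS)}\right) = O\left(\log^{l+1}{m}\right) = o(m),
\]
so the relative minimum distance vanishes.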
\section{Open questions}
Using branching program techniques
introduced by Ajtai \cite{ajt},
we argued in Theorem \ref{th1} that there are no asymptotically
good codes that are encodable in linear time
and sublinear space in the most general
sense.
When the time is linear,
the sublinear memory requirement
is asymptotically tight for encoding asymptotically
good codes (See Example \ref{ex4}).
On the other extreme,
quadratic encoding time is achievable by random
linear codes while requiring minimal encoding memory
in the branching program model
(See Example \ref{ex5}).
We conjecture that in general
\begin{conjecture}
If $C: \{0,1\}^n \rightarrow \{0,1\}^m$, $m = O(n)$,
is a code (an injective map), that is computable
by a branching program of memory $M$ and time $T$,
where $MT = o(n^2)$, then the minimum distance
of $C$ must be $o(n)$.
\end{conjecture}
Proving the conjecture or finding the correct time-space
tradeoffs for encoding asymptotically good codes
when the encoding time is superlinear and subquadratic
is very desirable.
\section{Acknowledgment}
We would like to thank
Daniel Spielman and Ruediger Urbanke
for stimulating discussions of this material.
\begin{thebibliography}{10}
\bibitem{ajt}
Miklos Ajtai, {\em
``Determinism versus Non-Determinism for Linear Time RAMs''},
Thirty-First Annual ACM Symposium on Theory of Computing (STOC), 1999, pp. 632--641.
\bibitem{tlcd}
L. Bazzi, M. Mahdian, D. Spielman.
{\em ``The Minimum Distance of Turbo-Like Codes''}, Submitted to IEEE
Transactions on Information Theory, 2003.
\bibitem{ctc} C. Berrou, A. Glavieux, and P. Thitimajshima,
{\em ``Near Shannon Limit error-correcting coding and decoding: Turbo-codes''},
in Proc., IEEE Int. Conf. on Commun., 1993, pp. 1064--1070.
\bibitem{rac1} D. Divsalar, H. Jin, R. J. McEliece,
{\em ``Coding Theorems for Turbo-Like Codes''},
Proc. of 36th Annual Allerton Conference on
Communication, Control and Computing, 1998, pp. 201--210.
\bibitem{has} Johan Hastad,
{\em ``Computational Limitations for Small Depth Circuits''}, PhD dissertation,
MIT, Cambridge, MA, 1986.
\bibitem{ku}
N.~Kahale and R.~Urbanke.
{\em ``On the Minimum Distance of Parallel and Serially Concatenated
Codes''},
to appear in IEEE Transactions on Information Theory, 1997.
\bibitem{Spi96} Daniel A. Spielman,
{\em ``Linear-time encodable and decodable error-correcting codes''},
IEEE
Transactions on Information Theory, 1996, Vol. 42, No. 6, pp. 1723--1732.
\end{document}
----- End forwarded message -----