[23470] in Athena Bugs


[louay@MIT.EDU: Broken latex file]

daemon@ATHENA.MIT.EDU (Richard Tibbetts)
Tue Aug 5 20:31:45 2003

Date: Tue, 5 Aug 2003 20:31:42 -0400
From: Richard Tibbetts <tibbetts@MIT.EDU>
To: bugs@mit.edu
Message-ID: <20030806003142.GQ5558@innocuous.mit.edu>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

Dear bugs,

I dealt with this user in the SIPB office. According to him, this
document latexed properly 2 weeks ago, on the previous Athena version.
It currently fails with a lot of error messages. If I change it from
\documentstyle to \documentclass, it works. This is the user's current
solution, and a fine one. But on zephyr, Greg Hudson agreed that this
might indicate a bug in latex compatibility that would be worth
investigating.

Richard Tibbetts

----- Forwarded message from louay@MIT.EDU -----

Date: Tue, 5 Aug 2003 20:24:49 -0400
From: louay@MIT.EDU
To: tibbetts@MIT.EDU
Subject: Broken latex file

\documentstyle[11pt, fullpage]{article}

%\documentstyle[11pt,doublespace]{article}
  

\input amssym.def \input amssym


\include{deff}









\title{\bf Encoding Complexity Versus Minimum Distance
\footnote{
This research was supported by ARO grant  
DAA L03-92-G-0115, and  MURI grant DAAD19-00-1-0466.}
} 
\author{
Louay M.J. Bazzi
\footnote{
louay@mit.edu, 
Department of Electrical Engineering and Computer Science, MIT,  
Cambridge, MA 02139.} 
, Sanjoy K. Mitter
\footnote{  
mitter@mit.edu, 
Department of Electrical Engineering and Computer Science, MIT, 
Cambridge, MA 02139.
}    
}
\date{}
\begin{document}
\maketitle



\abstract{ 
We establish  a bound on the  minimum distance  of a 
binary error correcting  code 
given constraints on the computational time-space complexity 
of its encoder in the general binary branching program model. 
The bound we obtain 
asserts that if the encoder 
uses linear time and sublinear memory 
in the most general sense,  
then the minimum distance of the code cannot  grow linearly with the 
block length when the rate is nonvanishing, i.e., 
the minimum relative distance of the code tends to zero in such a setting. 
The setting is general enough to include nonserially concatenated 
Turbo-like codes and various 
generalizations.
Our argument is based on  branching program techniques
introduced by Ajtai \cite{ajt}.
We also consider the case of constant-depth AND-OR circuit encoders
with unbounded fanin.
}




\section{Introduction}

A binary error correcting code is  specified by an injective 
map $C:\{0,1\}^n \rightarrow \{0,1\}^m$ 
from the binary strings of length $n$ to binary strings of length $m$.
Here $C$ is called the {\em encoder}, 
or the {\em code} when there 
is no possibility of confusion, $n$ is called the {\em message length},
$m$ is called the {\em block length}, 
the strings in the image of 
$C$ are called the {\em codewords}, 
and the ratio $n/m$ is called 
the {\em rate}. 
A fundamental parameter that characterizes the worst case 
error correction capabilities of the code is its {\em minimum distance}, 
which is defined as the minimum Hamming distance between two distinct 
codewords. 
One of the main goals in combinatorial coding theory 
is to find codes where a good tradeoff is achieved 
between  the two opposing parameters rate and minimum distance, as the 
message length tends to infinity.  A code (meaning a family of codes
indexed by the block length) is called 
{\em an asymptotically good code} in the coding theory literature 
if the block length grows linearly 
with the message 
length and the minimum distance 
grows linearly with the block length. The other codes are called 
{\em asymptotically bad codes}. 


In this paper, we consider the following question: 
\begin{quote}
What can we say about the growth of the minimum distance 
of a binary error correcting  code 
given constraints on the computational complexity 
of its encoder?  
\end{quote}

We concentrate mainly on  the 
time-space complexity  of the encoder.
In this setting, the question is a natural tradeoff question 
between the parameters: code minimum distance, 
code rate, encoding time, and encoding space.


From a practical perspective, this question is 
important since there are popular error correcting
codes that have low time-space encoding
complexity. 
 We are referring here to Turbo codes, 
or more precisely to parallel concatenated 
Turbo codes introduced by 
Berrou, Glavieux, and Thitimajshima in \cite{ctc},  
 and repeat-convolute codes  
introduced by Divsalar,  Jin,  and  McEliece in \cite{rac1}. 
This low time-space encoding 
complexity is  crucial for the corresponding iterative 
decoding algorithms because these algorithms  process the 
state space representation of the encoder.
Sharp bounds on the minimum distance of Turbo codes were first 
obtained by Kahale and Urbanke \cite{ku} in the setting of random 
interleavers and constant memory convolutional codes.
In a recent joint work   
with  Mahdian and Spielman \cite{tlcd}, 
we derived strong bounds on the minimum distance of Turbo like codes 
in a variety of cases.  One of these cases 
is the well structured setting of  
generalized repeat-convolute codes where  
the convolutional code is substituted 
by an arbitrary automaton. 
We argued that such codes are  asymptotically bad when 
the memory of the automaton is sublinear and the number 
of repetitions is constant.


In this paper, we extend this particular result
to the much more general setting  where 
the encoder is  a binary branching program, or equivalently a nonuniform 
random-access machine with binary input registers.


We establish a general theorem that asserts that 
if the encoder is a binary branching program 
that uses linear time and sublinear space,
then the minimum distance of the code cannot grow linearly 
with the block length when the rate is nonvanishing, 
which is a rather surprising 
result. In general, we derive a bound relating 
the involved parameters.

Our proof is based on the branching 
program techniques introduced in the recent breakthrough 
of Ajtai \cite{ajt}. 


We also consider the case of constant-depth AND-OR circuit encoders
with unbounded fanin.
We conclude with a conjecture about the strongest possible
time-space tradeoffs for encoding asymptotically good codes.














\subsection{Branching program encoders}\label{bpe}


Consider a code specified as above by 
an encoding map $C:\{0,1\}^n\rightarrow \{0,1\}^{m}$.
By a {\em branching program encoder} 
(binary by default, i.e. $2$-way) 
$B$ computing $C$ we mean 
a connected 
directed acyclic graph 
with a single source 
and multiple sinks, 
together with a set of $n$ binary {\em input 
variables} and a set of $m$ binary {\em output variables}  
satisfying the following. 
There are exactly two arrows leaving each non-sink node, one 
labeled with a one and the other with a zero.
Every non-sink node 
is associated with an  input variable.
Some of the 
nodes are associated with  output variables (possibly 
more than one variable per node), in which case 
the associations are labeled by zeros or ones. The nodes of the 
graph are called states, the source is called  
the {\em start state}, and the sinks are called the {\em end states}. 
The  branching program $B$ computes  $C$  as follows. 
The computation starts by reading the 
value of the input variable associated with the start state 
and moving according to this value to the next state, 
and so on, reading more bits and setting output 
variables whenever an output node is reached, until an end state
is reached.
We may either assume that 
on any input each output variable is set
at least once, or that the output 
variables are arbitrarily preset. 




The {\em computation  of the branching program on 
an input}
 is the corresponding sequence 
of states starting with the start 
state and ending with an end state.
The length of a computation is its 
number of states, and its time 
is the number of states plus the 
total number of times the output variables are set
along the way.


The {\em length} of the branching program
is the maximum length of a computation. 
The {\em time} $t$ of the branching program 
is the maximum time of a computation. 
The {\em size} $S$ of the branching program 
is the total number of states.
The {\em memory} or the {\em space} $M$ of a branching program 
is $M = \log{S}$.  
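
As an illustration only, the following Python sketch evaluates such a 
branching program encoder on an input; the \verb|Node| representation 
and its field names are our own assumptions, not part of the model.

\begin{verbatim}
# Sketch: evaluating a branching program encoder B on an input x.
# The Node representation below is an assumption made for illustration.
class Node:
    def __init__(self, var=None, succ=None, outputs=()):
        self.var = var          # input variable read here (None at end states)
        self.succ = succ        # (next state on a 0, next state on a 1)
        self.outputs = outputs  # (output index, bit) pairs set at this state

def encode(start, x, m):
    """Follow the program from the start state; return the m output bits."""
    y = [0] * m                           # output variables arbitrarily preset
    state = start
    while state.succ is not None:         # until an end state is reached
        for j, bit in state.outputs:      # set the outputs attached here
            y[j] = bit
        state = state.succ[x[state.var]]  # read one input bit and branch
    for j, bit in state.outputs:          # outputs attached to the end state
        y[j] = bit
    return y
\end{verbatim}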



\begin{remark}\label{rem1}{\em 
The codes encodable by such general 
branching programs  correspond to  
those encodable by {\em random-access machines} 
with 
bounded read-write space and binary input registers,
where the time of the branching program is the worst case running time, 
and its size is $\Theta(2^M)$, $M$ being the number of 
read-write bits. 
Note that the machine has two types 
of read-only bits, those corresponding to the 
input message, and those corresponding to the 
code description. Complexity is measured in 
terms of the number of bits of the first type.
Note that we are not restricting the number 
of read-only bits corresponding to  
the code description. 
This is not a problem 
since we are deriving lower bounds. 
See Example \ref{ex4} for  some consequences of
this unlimited read-only space.}
\end{remark}



\subsubsection{Some special types of branching programs} \label{sptb}
The branching program is called {\em leveled}  
if the states are divided into an 
ordered collection of sets  each called 
a {\em level} where edges are between consecutive
levels only. In such a case, the {\em width} 
of the  
branching program is the maximum number of 
states per level.


The branching program is called {\em oblivious} 
if the input 
variables  (and the output variables) 
are read (respectively, set) 
in the same order 
regardless of the input under consideration.
Thus an oblivious branching program is naturally leveled 
in such a way that all the nodes in the same level read the 
same input variables, and set the same output variables.


The branching program is called a {\em read-$k$-times} 
branching program if 
each  input variable is read at most $k$ times  
on any input.


The branching program is called {\em write-$w$-times} if at most 
$w$ output variables are set per state. 



\subsubsection{Examples}
\begin{example}\label{ex1}{\em 
The trellis of  a convolutional code is an 
oblivious, read-once, and  write-once branching program.
The trellis of a systematic convolutional code is an 
oblivious, read-once,  and write-2-times  branching program.
}
\end{example}
\begin{example}\label{ex2}{\em 
Parallel concatenated Turbo codes 
are encodable by low complexity oblivious 
branching programs as follows.
A {\em parallel concatenated Turbo code} \cite{ctc} 
$C$ with a constant number $k$ of branches, message length $n$, 
and memory $M_0$
is specified by $k$ permutations $\pi_1, \ldots, \pi_k$ each 
on $n$ bits and a rate $1$ 
convolutional code $Q$ (the {\em component code}) of memory $M_0$. 
For $x$ in $\{0,1\}^n$, 
$C$ encodes $x$ as $C(x) =  (x, Q(\pi_1(x)), \ldots, Q(\pi_k(x)))$, 
where $\pi_i(x)$ is 
the string obtained by permuting the bits of $x$ according to the 
permutation 
$\pi_i$, and  
$Q(y)$ is the output of the convolutional encoder $Q$ on the input string $y$. 
Thus $C$  is naturally encodable by an oblivious
read-$k$-times  write-2-times 
branching program $B$.
The states of $B$ are copies of 
those of the automaton, and $B$ 
is  naturally leveled.
$B$ has length $kn$ and  time $\Theta(n)$.
The  width of $B$ is $2^{M_0}$, and its size 
is at most $kn2^{M_0}$. 
Note that the same holds if we follow the technicality 
of appending a terminating sequence to the input.
}
\end{example}
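
As a toy illustration of the map 
$C(x) =  (x, Q(\pi_1(x)), \ldots, Q(\pi_k(x)))$, consider the following 
Python sketch; the particular feedback taps of the component encoder 
are our own assumption and play no role in the construction.

\begin{verbatim}
# Sketch of parallel concatenated Turbo encoding (Example 2).
def permute(x, pi):
    # pi is a permutation of range(len(x)), given as a list of indices
    return [x[pi[i]] for i in range(len(x))]

def conv_encode(y, taps=(1, 0, 1)):
    # Toy rate-1 convolutional encoder Q with memory len(taps);
    # the taps are illustrative only.
    state = [0] * len(taps)
    out = []
    for bit in y:
        v = (bit + sum(t * s for t, s in zip(taps, state))) % 2
        out.append(v)
        state = [v] + state[:-1]   # shift-register update
    return out

def turbo_encode(x, perms):
    # C(x) = (x, Q(pi_1(x)), ..., Q(pi_k(x)))
    return x + [b for pi in perms for b in conv_encode(permute(x, pi))]
\end{verbatim}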
\begin{example}\label{ex3}{\em 
Repeat-convolute codes fit in the same picture.
A {\em repeat-convolute code}  \cite{rac1} 
consists of a  
repeat-$k$-times code, a convolutional code, and a permutation.
More precisely, a repeat-convolute 
code $C$ of message length $n$ and memory $M_0$ is specified by a constant
 integer $k$, 
a permutation $\pi$ on $kn$ bits, and a convolutional encoder $Q$ of 
memory $M_0$. 
For $x$ in $\{0,1\}^n$, $C$ encodes $x$ as $C(x) = (x,Q(\pi(r(x))))$, where
$r$ is the repeat-$k$-times map, i.e. $r(x)$ is the concatenation of $k$ 
copies of $x$.
As in Example \ref{ex2}, $C$  is naturally encodable by a leveled, oblivious,
read-$k$-times,  write-$2$-times, length-$kn$, time-$\Theta(n)$, and 
width-$2^{M_0}$  branching program  whose size is 
at most $kn2^{M_0}$. 
}
\end{example}
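
Under the same assumptions, the repeat-convolute map 
$C(x) = (x,Q(\pi(r(x))))$ can be sketched by reusing 
\verb|permute| and \verb|conv_encode| from the previous sketch:

\begin{verbatim}
# Sketch of repeat-convolute encoding (Example 3).
def repeat_convolute_encode(x, k, pi):
    r = x * k                             # r(x): k concatenated copies of x
    return x + conv_encode(permute(r, pi))
\end{verbatim}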
\begin{example}\label{ex4}{\em 
   Any code can be trivially encoded  
   in the  binary branching  program model 
   in linear time and linear space by a tree branching 
   program that, on any input, outputs the whole  codeword 
   at the corresponding leaf.
   This makes sense in the random-access machine with binary   
   input registers encoding model because we are not 
    counting the  read-only 
    space needed to  store the code 
   description (See Remark \ref{rem1}).
   It is worth noting here that 
   when this read-only space is taken into consideration,  
   we know from the work of Spielman \cite{Spi96} that there 
   exists an asymptotically good code encodable in 
   linear time and linear space.
}
\end{example}
\begin{example}\label{ex5}{\em 
   Any linear code is naturally 
   encodable by a leveled, oblivious,  
    width-$2$, quadratic-time, and write-once 
    branching program.
}\end{example}
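
To make this concrete, the following sketch (which assumes the linear 
code is given by an $m\times n$ generator matrix over $GF(2)$, a 
representation of our own choosing) encodes one output bit per 
left-to-right pass over the input; the single running parity bit plays 
the role of the two states per level, and the $nm$ bit reads account 
for the quadratic time when $m = \Theta(n)$.

\begin{verbatim}
# Sketch: the width-2 branching program view of a linear code encoder.
def linear_encode(G, x):
    # G: m x n generator matrix over GF(2), as a list of rows of bits.
    y = []
    for row in G:          # one pass over x per output bit
        parity = 0         # the two states per level: parity 0 or 1
        for g, b in zip(row, x):
            parity ^= g & b
        y.append(parity)   # write-once: set the output bit at the end
    return y
\end{verbatim}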






\subsection{Main result} 

 
\begin{theorem} \label{th1}
Let $C:\{0,1\}^{n}\rightarrow \{0,1\}^{m}$ be 
a code $($i.e. an injective function$)$ encodable $($i.e. computable$)$ 
by a branching program $B$  of size $S = S(n)$, time  $t = t(n)$,  
and length $l= l(n)$ $($so $l\leq t$$)$. 

\noindent 
If   $t(n) = \Theta(n)$, then the  minimum distance of $C$ 
is 
\[
O\left(\left({\log{S}\over{n}}\right)^{1\over{\lfloor l/n \rfloor}} n\right).\]
Therefore,  $C$ is asymptotically bad when
$S(n) = 2^{o(n)}$  and $t(n) = O(n)$.

\noindent 
More generally, if $t(n) = \Omega(n)$, 
then the   minimum distance of $C$ is 
\[
O\left(\left({t\over{n}}\right)^3 \left({\log{S}\over{n}}\right)^{n\over{2t}}n\right).
\]
Thus, $C$ is asymptotically bad also when 
$S(n)=2^{O(n^{1-\e_1})}$ and $t(n)=O(n\log^{1-\e_2}{n})$, 
for all $\e_1,\e_2>0$.
\end{theorem}
 
In other words, linear time and sublinear space 
for encoding  imply that 
the  code is asymptotically bad, i.e. the minimum 
distance cannot  grow linearly with the  block
length when the rate is nonvanishing. 
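
For a concrete instance of the first bound (with parameters of our own 
choosing), take $l = 2n$, $t = \Theta(n)$, and $S = 2^{\sqrt{n}}$. Then 
$\lfloor l/n \rfloor = 2$, and the minimum distance is 
\[
O\left(\left({\log{S}\over{n}}\right)^{1/2} n\right) 
= O\left(n^{-1/4}\,n\right) = O(n^{3/4}), 
\]
so the minimum relative distance is $O(n^{-1/4})$, which tends to zero.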



Note that when the time is linear,  
  the sublinear memory  requirement  
  is asymptotically  tight  for encoding asymptotically 
  good codes (See Example \ref{ex4}).





\subsubsection{Application to Turbo-like codes} 

By applying Theorem \ref{th1} to 
parallel concatenated  Turbo codes  and 
repeat-convolute  codes (See Examples \ref{ex2} and \ref{ex3}), we can recover 
the corresponding bound in \cite{tlcd} 
as follows.

The minimum distance of a parallel concatenated Turbo code
with a constant number $k$ of branches, message length $n$, and memory $M_0$ 
is $O(n^{1-1/k}M_0^{1/k})$ because the size of the 
corresponding branching program 
is at most $kn2^{M_0}$. Similarly,
the minimum distance of a repeat-convolute  code  with 
$k$ repetitions,   message length $n$,  and memory $M_0$  
is $O(n^{1-1/k}M_0^{1/k})$.  
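
For concreteness, this follows by substituting $\log{S} \leq M_0 + \log{(kn)}$ 
into the first bound of Theorem \ref{th1} (with $\lfloor l/n\rfloor = k$), 
assuming $M_0 = \Omega(\log{n})$ so that the $M_0$ term dominates:
\[
O\left(n\left({\log{S}\over{n}}\right)^{1/k}\right) 
= O\left(n\left({M_0 + \log{(kn)}\over{n}}\right)^{1/k}\right) 
= O\left(n^{1-1/k}M_0^{1/k}\right).
\]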


So both types of codes will be  asymptotically bad  in 
any reasonable setting, i.e. as long as $M_0$ is 
sublinear in $n$. Note that the situation when $M_0$ 
is sublinear in $n$  corresponds to the case when 
the underlying trellis has subexponential size, i.e. 
when the corresponding 
iterative Turbo decoding  algorithm has 
subexponential running time.








\section{Proof of Theorem \ref{th1}} 


\subsection{Ajtai's proof techniques for the Hamming distance problem}
We use branching program techniques introduced by Ajtai in 
\cite{ajt}. 
More specifically, we are referring to the branching program 
techniques that Ajtai introduced to show that there 
is no $O(n)$-time and $o(n\log{n})$-space $R$-way branching 
program, $R = n^c$ ($c$ some absolute constant),  
that decides the Hamming distance problem: 
given $n$ strings in $\{0,1\}^{\log{R}}$, decide whether any 
two (distinct) of them are at $\lambda \log{R}$ Hamming
distance apart ($\lambda$ another absolute constant 
related to $c$). 


This is a decision problem 
in the setting of $R$-way branching programs, while 
ours is not a decision problem and is in the setting of $2$-way 
branching programs; even so, the techniques introduced by 
Ajtai are behind the proof we describe below. 

We refer the reader to Ajtai's paper \cite{ajt}.



\subsection{Objects under consideration and terminologies} 
We will start by 
making the branching program leveled.
Recall from Section \ref{bpe} that this means that
the states are partitioned into $l$ consecutive 
sets of states each called a level in such a way
that edges, i.e. transitions,  
occur only between consecutive levels.


We will divide (meaning partition) 
the branching program into {\em blocks}, meaning 
sets of consecutive levels.
For a given block, we will be looking at states 
in the {\em lower boundary level of the block}.
The lower boundary level of a block 
is simply the last (w.r.t.\ the ordering of the levels) 
level (which is a set of states) 
in the block (which is a set of levels).



Given an input, we will be looking at the computation 
of the branching program on this input, which as we explained 
in Section \ref{bpe} is defined to be the corresponding 
sequence of states starting with the start 
state and ending with an end state.
So, in the leveled case, 
each computation takes exactly $l$ steps, i.e. it contains 
exactly $l$ states.


 Fix an  input $x$, and consider the 
corresponding computation of the branching program 
$B$ on $x$.  Fix also a set $L$ of levels 
or a set $T$ of blocks.
By an input bit or
variable (respectively, output bit or variable)
being accessed or read (respectively, set)
in $L$ or $T$ {\em during the computation} of $B$ on $x$
(or, equivalently, by $L$ reading the input bit during the computation, 
and so on $\ldots$), 
we mean that there is a state 
in the computation that belongs to a level 
in $L$ or to a level in a block in $T$ 
where the value of the input variable
is read in order to move to another state 
(respectively, where the value of the output variable is 
set). 



Finally, by a computation containing 
a sequence of states, we mean that each  state
in this sequence appears in the computation.
Note that here the order  does not matter since 
the states  in a computation are distinct 
because the branching program is acyclic.



\subsection{The oblivious case argument}



Recall that an oblivious branching program is naturally leveled 
in such a way that all the nodes in the same level read the 
same input variables, and set the same output variables.



Since  the proof of Theorem \ref{th1} is 
relatively long, 
it is instructive to look first at the very special case 
when $B$ is oblivious. 
This case is very restrictive compared to a general
branching program.
To restrict the setting further, assume that $B$ 
is read-$k$-times and  write-$w$-times, 
where $k = O(1)$ and $w = O(1)$.


The argument we used in  \cite{tlcd}  to bound the 
minimum distance of repeat-convolute codes
was  in the setting  of  automata. More specifically,
we studied the case of a  repeat-convolute code
where  the convolutional code  is replaced by 
an arbitrary  automaton.
Even though  the automata  setting is less general than the 
case we are considering in this section,  
the argument naturally extends as follows.

  
  
Assume that $B$ is a read-$k$-times,  write-$w$-times, and 
oblivious branching program, where $k = O(1)$ and $w = O(1)$.
Thus $n\leq  l \leq  kn$ and  $m \leq wn$. We want to argue 
that the minimum distance of $C$ is 
$O(n(\frac{\log{S}}{n})^{1/k})$.


Let $W$ be the width of $B$, thus $W \leq S$.
We will exhibit two distinct input strings that map 
to  two  codewords at distance 
$O(n ( {\log{W}\over{n}} )^{1/k})$.
We will do this by finding 
a nonempty set of input variables $U$,  
a subset $J$ of levels,  
and two distinct strings $x_1$ and $x_2$ in $\{0,1\}^n$
such that $x_1$ and $x_2$ agree outside $U$, 
and the computations of $B$ on $x_1$ and $x_2$ 
agree outside $J$. This will give us the desired bound on 
the minimum distance. $J$ will be constructed as a union 
of blocks from a partition of $B$ that we define next.


Partition $B$  into $b$ consecutive blocks, each consisting 
of $s_1$ or $s_2$ levels, where  
$s_1 = \lfloor kn /b \rfloor$ and 
$s_2 = \lceil kn /b \rceil$.
Assume for now that $b$ is arbitrary as long as  $s_1\geq 1$. 
We will optimize over the integer $b$ later. 



Each of the $n$ input variables is read by $B$ in at most 
$k$ blocks. Recall that $B$ is oblivious. Thus, 
for any specific variable, 
 these blocks will be the same irrespective 
of the setting of the 
input variables.
There are at most $b^k$ $k$-sets 
of blocks. Here, by a $k$-set of blocks, we 
mean a set of blocks of cardinality 
at most  $k$.  So there 
are at least $n/b^k$ input variables that
are read by $B$ in the same  $k$-set of blocks.
Let $U$ be such a set of input variables with  
$|U|=\lceil n/b^k \rceil$, and let $T$ be such a $k$-set of blocks, 
thus $1\leq |T|\leq k$. The set $J$ we mentioned above is the union 
of the blocks in $T$.




Consider the lower boundary levels  $L_1, \ldots, L_{|T|}$
of the blocks in $T$ ordered by the level index, and let $Q$
be the set of strings in $\{0,1\}^n$ 
that are zero outside $U$, thus $|Q| = 2^{|U|}$.
There are at most $W^k$ state sequences  in 
$L_1\times \ldots \times  L_{|T|}$, 
and for each $x$ in $Q$ the computation 
of $B$ on $x$ contains such a sequence. So 
if  we can guarantee that  $2^{|U|} > W^k$, 
we get that there should be a sequence of states 
$\{\s_i\}_{i=1}^{|T|}$  in 
$L_1\times \ldots \times  L_{|T|}$ 
and two different strings $x_1$ and $x_2$ in $Q$
such that  the computation of $B$ on both $x_1$
and $x_2$ contains $\{\s_i\}_{i=1}^{|T|}$.
Since $x_1$ and $x_2$ agree outside $U$, 
the computations of $B$ on $x_1$ and $x_2$ 
are exactly the same outside the 
blocks in $T$.
The reason is that both computations 
are in the same state each time a block 
in $T$ is left. 
Thus  $C(x_1)$ and $C(x_2)$  can only 
differ in the blocks in $T$.
This means that the distance 
between $C(x_1)$ and $C(x_2)$ is  at most 
$|T|s_2w \leq k\lceil kn /b \rceil w$,
since $|T|\leq k$, and $s_2 = \lceil kn /b \rceil$. 



This bound holds under the assumption that  $2^{|U|} > W^k$, 
which can be guaranteed if 
$2^{n/b^k} > W^k$.  So,  choose
$
b = \lceil ({n\over{k\log{W}}})^{1/k}\rceil - 1 
$. 
Note that the only other constraints we have on $b$ are $1\leq b \leq kn$, 
and the selected value satisfies these constraints when $W$ is not 
exponential in $n$. Note also that  if $W$ is exponential in $n$, 
the statement of the theorem is trivial. 
By replacing this value of $b$ in the upper bound $k\lceil kn /b \rceil w$
on the distance between $C(x_1)$ and $C(x_2)$, and using $S$ as 
an upper bound on $W$, 
we get that 
the minimum distance of $C$ is
$O(n(\frac{\log{S}}{n})^{1/k})$.
Note that here we also used that
$k= O(1)$, $w = O(1)$, and  
$C(x_1)\neq C(x_2)$ because $x_1\neq x_2$ and 
$C$ is injective.
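
Spelling out the final substitution: since $k = O(1)$, this choice gives 
$b = \Theta\left(\left({n\over{\log{W}}}\right)^{1/k}\right)$, so 
\[
k\left\lceil {kn\over{b}} \right\rceil w 
= O\left({n\over{b}}\right) 
= O\left(n\left({\log{W}\over{n}}\right)^{1/k}\right) 
= O\left(n\left({\log{S}\over{n}}\right)^{1/k}\right).
\]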




This proof is short and simple. But when $B$ is not oblivious, 
the proof does not go through. The main reason
is that we cannot construct $U$ 
and $T$ 
regardless of the setting 
of the input variables as we did 
 above. When the branching program is 
oblivious, the read-$k$-times and the write-$w$-times 
restrictions are not fundamental. When it is not oblivious,  
they  become restrictive. 
For example,  in the general  branching program  model,
depending on the input, 
a very large number 
of the output variables 
may be set in a particular state, 
or a particular  input variable 
may be read a very large number of times.
The point to keep in mind is that the oblivious 
assumption is very restrictive. 



We will sketch in the next section how to handle the 
general situation.
The proof is longer and more sophisticated. 
This is not strange since the statement we are proving
is much more general. 
The reader is encouraged to go  carefully over the above argument before 
proceeding to the general case.







\subsection{Proof technique}
We follow the techniques   
introduced by Ajtai \cite{ajt} in the setting  of 
the Hamming distance problem.


We want to find two input 
strings $x_1$ and $x_2$ such that $C(x_1)$ and $C(x_2)$ 
are close to each other. 



The first step is to make 
the branching program leveled without affecting its 
input-output behavior. Next,  we divide 
the branching program into blocks each  consisting
of consecutive levels whose number will be suitably 
selected later and whose sizes are as uniform as 
possible. 


To exhibit $x_1$ and $x_2$, we will
find a set $T$ of blocks such that:
\begin{itemize}
\item the size of  $T$ is small,
\item the computations of $B$ on $x_1$ and $x_2$ are 
exactly the same in the blocks outside $T$, and  
\item
 not too many output bits of $C(x_1)$ 
(respectively $C(x_2)$) 
are set in any of the blocks in $T$ during
the computation of $B$ on $x_1$ 
(respectively $x_2$).
\end{itemize}
 Thus $C(x_1)$ 
and $C(x_2)$ can only disagree on 
the few output bits that are set in $T$.  


To find such $x_1$, 
$x_2$, and $T$, we find first $T$ 
together with a set $Q'$ of input strings in 
$\{0,1\}^n$ and a sequence $\{\s_i\}_i$ 
of states in the lower boundary levels of the blocks in $T$ in such 
a way that for each $x$ in $Q'$:
\begin{itemize}
\item the computation of $B$ on $x$ 
contains $\{\s_i\}_i$, 
\item 
not too many output bits of $C(x)$ 
are set in any of the blocks in $T$ during
the computation of $B$ on $x$, and 
\item  the 
number of variables in $x$ 
that are accessed only in the blocks in $T$ 
during the computation of $B$ on $x$ 
is large. 
\end{itemize}
We will eventually find the desired 
$x_1$ and $x_2$ inside $Q'$ as follows.


We modify the branching program $B$ again 
so that  $B$  is forced 
to pass through  a state in the sequence $\{\s_i\}_i$ 
each time it attempts 
to leave a lower boundary level of a block in $T$, but
without  affecting its input-output behavior on $Q'$.



Using $T$, define an equivalence relation 
on $\{0,1\}^n$  by relating two  strings 
if:
\begin{itemize} 
\item they share the same 
set of input variables that are not read 
during the 
computation of $B$ in blocks outside $T$,
and 
\item 
 they agree on the values of their bits 
outside this set. 
\end{itemize}
Thus each 
equivalence class $[x]$ is determined 
by a set $I_{[x]}$ of input variables and 
a setting  of the variables outside 
$I_{[x]}$.

We forced the computation of $B$ 
to contain the states $\{\s_i\}_i$ 
on all inputs so that 
we get $|[x]|=2^{|I_{[x]}|}$, and 
hence the size of each 
equivalence class $[x]$ can be guaranteed to 
be large when $I_{[x]}$ is large.

Since for each input string in $Q'$, the 
number of variables  
that are accessed only in the blocks in $T$ 
during the computation of $B$  
is large, we get that 
the equivalence class of each 
input in $Q'$ is large. 

By 
considering the set $\Omega$ of sufficiently  large 
equivalence classes  so that the equivalence 
classes of all the elements of $Q'$ are guaranteed to be elements 
of $\Omega$, our problem reduces to selecting the 
number of blocks so that $|Q'|$ is strictly larger than 
$|\Omega|$, and hence there are distinct $x_1$ and $x_2$ 
in $Q'$ that have the same equivalence class.
The fact that $[x_1] = [x_2]$
means that the computations 
of $B$ on $x_1$ and $x_2$ are exactly the same outside the 
blocks in $T$, and hence $C(x_1)$ and $C(x_2)$ can 
only disagree on the output bits that are set inside the 
blocks in $T$. 

By construction the number of those output 
bits will be small. Moreover, since  $C$ is injective, 
$C(x_1)$ and $C(x_2)$ are distinct.
 The distance between $C(x_1)$ and 
$C(x_2)$ will be the desired bound on the minimum distance 
of $C$. 





 

\subsection{Proof outline}
Assume for the moment that $t = \Theta(n)$. We will deal with 
the more general case when we are done by working 
more carefully  with the constants. So  say 
that 
\begin{equation}\label{ac}
\mbox{$t = an$ and $l = cn$}, 
\end{equation}
where $a,c\geq 1$ are constants 
($a,c\geq 1$ because $C$ is injective). 


\begin{itemize}
\item[A)] We modify the branching program so that it is leveled
\footnote{
This can be done by a classical procedure. 
Construct a leveled directed
graph of $l$ levels where each level consists  of a copy of all the 
nodes of the original branching program together with the related output labels. 
Connect the nodes in each two 
consecutive levels according to the graph of the original 
branching program. Associate the end states not in the last level with 
arbitrary input variables.
Connect the end states in any two consecutive levels by two arrows 
labeled respectively by one and zero. Finally, 
remove all the nodes (together with the related edges)
that are not accessible from the start state in the first level
or cannot reach an end state in the last level.
The start state 
of the new branching program is the remaining state 
in the first level, and its end states are   those  remaining 
in the last level. 
}. The modified   
          branching program computes the same function, i.e. $B$ computes 
          $C$. The length of the resulting branching program 
$B$ is $l$, its time is $t$, its
           size is at most $Sl$, and its width is at most $S$.
The difference is that now  edges are  only between consecutive levels, and 
each computation takes exactly $l$ steps.


\item[B)]  Partition $B$  into $b$ consecutive blocks, each consisting 
of $s_1$ or $s_2$ levels, where 
\begin{equation}\label{s1s2} 
s_1 = \left\lfloor \frac{l}{b} \right\rfloor = \left\lfloor \frac{cn}{b} \right\rfloor \mbox{ and  }
s_2 = \left\lceil \frac{l}{b} \right\rceil = \left\lceil \frac{cn}{b} \right\rceil. 
\end{equation}
Assume for now that in general $1 \leq b \leq l$ so that 
$1 \leq s_1, s_2 \leq l$.
We will 
optimize over the integer $b$ later.



\item[C)] 
\begin{lemma}\label{lemmasi} There exist: 
\begin{itemize}
\item[a)]  absolute constants 
$h,\alpha> 0 $, 
\item[b)] $Q'\subset \{0,1\}^n$ such that  
\begin{equation}\label{qprime}
|Q'| \geq \frac{2^n}{(Sb)^k}, 
\end{equation}
where 
\begin{equation}\label{kk}
k = \lfloor c \rfloor,
\end{equation}
\item[c)] a set  of blocks $T$, 
\begin{equation}\label{tt}
1\leq |T|\leq k,
\end{equation}
\item[d)] and a sequence $\{\s_i\}_{i=1}^{|T|}$ 
of states in the lower boundary levels of the blocks in  
$T$,  
\end{itemize}
such that  for each $x$ in $Q'$:
\begin{itemize}
\item[1)] the computation of $B$ on $x$ 
contains $\{\s_i\}_i$,  
\item[2)] at most  
\begin{equation}\label{ww}
w = \frac{hs_1 t}{l}
\end{equation}
output bits 
of $C(x)$ are set in each block in $T$
during the computation of $B$ on $x$, and 
\item[3)] the number of variables in $x$ 
that are accessed only in the blocks in $T$ 
during the computation of $B$ on $x$ 
is at least 
\[
\frac{\alpha n}{b^k}.
\]
\end{itemize} 
\end{lemma}
{\bf Proof.} See Section \ref{prfc}.  \finito 

\item[D)]
Now we will modify the branching program $B$ again 
so that  $B$  is forced 
to pass through a state  in $\{\s_i\}_i$ each time it attempts 
to leave a lower boundary level of a block in $T$,
while  guaranteeing that
$B$  behaves exactly 
like the old $B$ on the inputs in $Q'$, i.e. it computes $C(x)$ 
for each $x$ in $Q'$.



We can do this 
by simply connecting (on both inputs) all the states 
in the level above that of  $\s_i$ to $\s_i$, for each $i$.
Note that $B$ need not compute an injective 
function anymore, so it may not read all the input 
variables on some inputs.
It  may also leave some of the output 
variables unset, but this is not a problem since 
we can assume that the output 
variables were arbitrarily preset. 


\item[E)] Finally, we bound in Section \ref{dtle}
the minimum distance of $C$ 
by exhibiting distinct $x_1$ and $x_2$  in $Q'$ such that 
the distance between 
$C(x_1)$ and $C(x_2)$ is 
$O(n({\log{S}\over{n}})^{1\over{k}})$.


\item[F)] In Section \ref{suplin}, 
we explain 
how to 
drop the assumption  $t = \Theta(n)$.


\end{itemize}

\subsection{Proof of Lemma \ref{lemmasi}}\label{prfc} 
Consider any input $x$ in $\{0,1\}^n$.
\begin{itemize}
\item 
Let $k\geq 1$ be  an integer, and let $h>0$.
We will set $k$ then $h$ as we go on. 
\item 
Let $R_x$  be the set 
of blocks that each sets at most  
\[
w \equiv  \frac{hs_1t}{l}
\]
bits of $C(x)$
during the computation of $B$ on $x$.
\item 
Let $D_x$ be the 
set of input variables that are read in at most $k$ states during 
the computation of $B$ on $x$.
\item
And let $D'_x$ be the set of input variables in $D_x$ 
that are read only in blocks in $R_x$ during 
the computation of $B$ on $x$.
\end{itemize}
First recall from (\ref{ac})
that $a,c\geq 1$ are the constants satisfying  
\[
\mbox{$t \equiv an$ and  $l \equiv cn$.}
\]
Recall also from (\ref{s1s2}) that
\[
s_1 \equiv \left\lfloor \frac{l}{b} \right\rfloor =  \left\lfloor \frac{cn}{b} \right\rfloor \geq 1
\mbox{ and } s_2 \equiv  \left\lceil \frac{l}{b} \right\rceil =  \left\lceil \frac{cn}{b} \right\rceil. 
\]
Some bounds:
\begin{itemize}
\item From  the definition of $R_x$, we must have 
$w(b - |R_x|) \leq t$, thus 
\begin{equation}\label{Rxb}
|R_x| \geq  b - \frac{l}{hs_1} \geq b\left( 1  -  \frac{2}{h}\right),
\end{equation}
where the first inequality follows from 
$w = hs_1t/l$,  and the 
second from the bound $s_1 = \lfloor l /b \rfloor \geq l/(2b)$. 
\item Since $C$ is injective, each  
input variable must be read at least once, so from the definition 
of $D_x$, we must have 
\[
|D_x| + (k+1)(n-|D_x|)\leq l,
\]
 i.e. 
$|D_x| \geq  n(1 - \frac{c-1}{k})$  because $l=cn$. 
Thus if we set 
\[
k \equiv \lfloor c \rfloor,
\] 
we get 
\begin{equation}\label{Dpx}
|D_x| \geq  n(1 - \e),
\mbox{  where } 
\e \equiv \frac{c-1}{k} <1.
\end{equation}
\item
The number of 
input variables read in blocks outside  $R_x$ is  at most
\[ 
(b - |R_x|)s_2 \leq \frac{2bs_2}{h} \leq n\frac{4c}{h},
\]
where the first inequality follows from (\ref{Rxb}),  
and the second from the bound 
$s_2 = \lceil cn /b \rceil \leq 2 cn/b$.
Thus, by the definition of $D'_x$, we must have 
\[
|D'_x| \geq |D_x| - n\frac{4c}{h} \geq 
n\left( 1-\e - \frac{4c}{h}\right),
\]
where the second inequality follows from (\ref{Dpx}). 

Let $h$ be sufficiently large such that 
\begin{equation}\label{alpha}
\alpha \equiv 1-\e - \frac{4c}{h}>0.
\end{equation}
Note 
that this implies also that $1  - \frac{2}{h} > 0$ since $c\geq 1$, 
i.e. 
\[
|R_x|> 0, 
\]
by (\ref{Rxb}).


\end{itemize}
To sum up, we have fixed some constants $h,\alpha > 0$,  
and  specified  $k \equiv \lfloor c \rfloor$, so that 
\begin{equation}\label{rxdx}
1\leq |R_x|  \leq b \mbox{ and }  
|D'_x| \geq \alpha n.
\end{equation}
Now, 
keep the definition of $R_x$ in mind,
ignore  $D_x$,  and recall that 
$D_x'$  is a set of input variables 
such that:
\begin{itemize}
\item each 
input variable in $D'_x$ 
is read in at most $k$ levels during 
the computation of $B$ on $x$,  and 
\item each of those 
levels belongs to a block in $R_x$. 
\end{itemize}
Recall also that so far we are fixing an input $x$ in $\{0,1\}^n$.



Consider 
all the $k$-sets in  $R_x$, i.e. the 
subsets of $R_x$ of size at most $k$.
Each input variable in $D'_x$ is read in such 
a $k$-set during the computation on $x$, 
and there are at most $|R_x|^k$ 
such $k$-sets, so there are at least 
\[
\frac{|D'_x|}{|R_x|^k} \geq \frac{\alpha n}{b^k}  
\]
variables in 
$D'_x$ read in the same $k$-set of blocks, 
where we have used (\ref{rxdx}) to obtain the estimate.
 Let 
$U_x$ be such a set of variables in  $D'_x$,  and 
let $T_x$ be such a $k$-set of blocks in $R_x$.
So 
\[
1 \leq  |T_x|\leq k \mbox{ and } |U_x| \geq \frac{\alpha n}{b^k}.
\]
Note that $T_x$ is nonempty since $C$ is injective. 


For each $x$ in $\{0,1\}^n$, there is such a $U_x$ 
and $T_x$.  Fix any such correspondence.
There are at most $b^k$ such  $T_x$,
so there is a subset $Q\subset \{0,1\}^n$ and a $k$-set 
of blocks $T$ such that 
\[
|Q|\geq \frac{2^n}{b^k},
\]
and 
$T = T_x$ for each $x$ in $Q$. 
Now consider the lower boundary levels  $L_1, \ldots, L_{|T|}$
of the blocks in $T$ ordered by the level index. 
There are at most $S^k$ state sequences  in 
$L_1\times \ldots \times  L_{|T|}$, 
and for each $x$ in $Q$, the computation 
of $B$ on $x$ contains such a sequence, so there 
is a sequence $\{\s_i\}_{i=1}^{|T|}$  
of states in $L_1\times \ldots \times  L_{|T|}$ and 
a  subset $Q'\subset Q$ such that  
\[
|Q'|\geq \frac{|Q|}{S^k} \geq \frac{2^n}{(Sb)^k},  
\]
and 
the computation of $B$ on $x$ contains $\{\s_i\}_i$, 
for each $x$ in $Q'$.








\subsection{Bounding the minimum distance}\label{dtle} 
Now we are ready  to find
the two distinct messages $x_1$ and $x_2$ 
that $C$ maps to close codewords.

 
Using $T$, for each $x$ in $\{0,1\}^n$, let
 $I_x$ be 
the set of input variables that are not read 
during the 
computation of $B$ on $x$ in blocks outside $T$.
Note that we need a double negation since some of the 
input variables may not be read at all because 
we modified the branching program in (D).


So, by (C.3),  for each $x$ in $Q'$, 
\begin{equation}\label{Ixb}
|I_x| \geq  \frac{\alpha n}{b^k}.  
\end{equation}

Using $T$, define the equivalence relation $\sim$ on $\{0,1\}^n$ 
by $x\sim y$ if:
\begin{itemize}
\item $I_x = I_y$, and 
\item $x$ agrees with 
$y$ on  the bits outside $I_x$.
\end{itemize}
In other words, 
$x$ and $y$ are equal when restricted to the complement of $I_{[x]} = I_{[y]}$, where $[x]$ 
denotes the equivalence class of $x$.


Given any $x$ in $\{0,1\}^n$, each $y\sim x$ can only disagree
with $x$ on $I_x$. Conversely, if $y$ 
disagrees with $x$ only inside $I_x$, it must be the case that 
$y\sim x$. To see why this is true, note 
that we forced in (D) all the  computations of $B$ to leave the 
blocks in $T$ in the same states: the $\{\s_i\}_i$ that we exhibited 
in (C.1).  So the computations 
of $B$ on $x$ and $y$ are exactly the same outside the 
blocks in $T$, and hence any bit accessed on $x$  
outside $T$ will be accessed on  $y$ outside $T$  
and none of the bits in $I_x$ will be accessed  
on $y$ outside $T$.  It follows that  
\[
|[x]| = 2^{|I_x|}.
\]


Thus, by (\ref{Ixb}), for each $x$ in $Q'$, 
\[
|[x]|\geq 2^{\alpha n/b^k}.
\]  
Let $\Omega$ be the set of equivalence classes whose size 
is at least $2^{\alpha n/b^k}$. So, $[x]$ is 
in $\Omega$ for each $x$ in $Q'$. Besides, since 
the equivalence classes are disjoint,  we must have 
$|\Omega|2^{\alpha n/b^k} \leq 2^n$, i.e.
\begin{equation}\label{omegab}
|\Omega| \leq \frac{2^n}{2^{\alpha n/b^k}}.
\end{equation}
If we can guarantee that 
\begin{equation}\label{cond2}
|Q'|> |\Omega|,
\end{equation}
we get that there should be  $x_1\neq x_2$ in 
$Q'$ such that $[x_1] = [x_2]$. The fact that $[x_1] = [x_2]$
means that the computations 
of $B$ on $x_1$ and $x_2$ are exactly the same outside the 
blocks in $T$, and hence $C(x_1)$ and $C(x_2)$ can 
only disagree on the output bits that are set inside the 
blocks in $T$. But, by  (C.2), we constructed $Q'$ 
in such a way 
that the computation of $B$ on any  $x$ in $Q'$ 
can set at most  $w$ bits of $C(x)$
in each block in $T$. Thus $C(x_1)$ and $C(x_2)$ 
can disagree on at most 
\begin{equation}\label{mdb}
2|T|w \leq 2k \frac{hs_1a}{c}  \leq \frac{2khcna}{bc} = \frac{2khan}{b}
\end{equation}
bits, where the  first inequality follows from 
$|T|\leq k$ (by (\ref{tt})) and $w = hs_1t/l = hs_1a/c$ (by (\ref{ww}) and
(\ref{ac})), and the 
second follows from 
$s_1 = \lfloor l /b \rfloor = \lfloor cn /b \rfloor 
  \leq   cn /b$ (by (\ref{s1s2})).



Moreover, $C(x_1)$ and $C(x_2)$ 
must disagree on at least one bit 
since $x_1\neq x_2$,  and $C$ is injective.


Using (\ref{qprime}) and (\ref{omegab}), 
condition (\ref{cond2}) can be guaranteed to hold if 
\[
2^{\alpha n/b^k} > (Sb)^k, 
\]
which is  fulfilled when  
\[
{1\over{b}} >  \left({k\log{(Scn)}\over{\alpha n}}\right)^{1/k},
\]
 since $b\leq  l = cn$.
If we can select $b$ 
so that this holds, we get that the minimum distance 
of $C$ is at most $2khan/b$. The only restriction
we have on $b$ is  $1\leq  b \leq l$,  so use
\begin{equation}\label{setb}
 b  \equiv \left\lceil \left({\alpha n \over{k\log{(Scn)}}}\right)^{1/k} \right\rceil - 1.
\end{equation}
This is always below  $l$,  and it cannot  go below $1$ unless 
$S \geq 2^{\alpha n/k2^k - \log{(nc)}}$ in which case 
the statement of the theorem is trivial.  Thus, via 
(\ref{mdb}),
the minimum distance of $C$ is at most 
\[
{2khan\over{\left\lceil\left({\alpha n \over{k\log{(Scn)}}}\right)^{1/k}  \right\rceil - 1}} 
= O\left(n\left({\log{S}\over{n}}\right)^{1\over{k}} \right).
\]


\subsection{Dropping the linear time assumption}\label{suplin}
Now we drop the assumption that $t = \Theta(n)$, thus $a$ and $c$ 
need not be constants. Since $l\leq t$, we use $a$ as an upper bound on 
$c$. Assume that $a$ grows with $n$ and assume also that 
it is $O(\log{n})$ since otherwise the statement of the 
theorem is trivial. 
We will no longer set $k = \lfloor c \rfloor $. 
Going back to (\ref{Dpx}) and (\ref{alpha}),
we have 
\[
\alpha = 1 - \frac{c-1}{k} - \frac{4c}{h} >  1 - \frac{a-1}{k} - \frac{4a}{h},
\]
a value that we need to 
keep bounded away from zero by a 
positive constant. Set $k = \lceil 2(a-1) \rceil$  and $h = \lceil 16a \rceil $,
thus $\alpha > 1/4$.


 By using the same choice of $b$ in (\ref{setb})
and the same bound on the 
minimum distance of $C$ in (\ref{mdb}),  but 
with the new values of $k$ and $h$ and the bound $c\leq a$,
we get that the minimum distance of $C$  is at most 
\[
{2 \lceil 2(a-1) \rceil \lceil 16a \rceil an
\over{\left\lceil\left({n/4 \over{\lceil 2(a-1) \rceil\log{(San)}}}\right)^{1/\lceil 2(a-1)\rceil} \right\rceil  - 1}} 
= O\left(a^3 n\left({\log{S}\over{n}}\right)^{1\over{2a}}\right).
\]










\section{When the encoder is a constant-depth AND-OR circuit}
To outline the boundaries of the picture, we consider the same 
problem but from the perspective of 
the circuit complexity of the encoder.
Here we note that not much can be said 
other than what  is essentially expected.  
Since we know from \cite{Spi96} that there are asymptotically 
good  codes that are encodable by linear-size and  
logarithmic-depth circuits, we are  left  with constant-depth 
circuit encoders with unbounded fanin.


First, given a code $C:\{0,1\}^n \rightarrow \{0,1\}^m$
(i.e. an injective map in general),  
we say that $C$ is encodable by a depth $d$ 
AND-OR circuit with unbounded fanins if  each of the $m$ output 
bits is computable by a depth  $d$ circuit where: 
1) the only allowed gates are AND/OR 
gates with possibly negated inputs, and 2) the number of 
inputs per gate is unbounded. The size of the circuit
is the total number of gates.



We argue by a direct 
application of Hastad's switching lemma
that a polynomial size  constant-depth circuit cannot encode 
an asymptotically good code (actually as long as 
the circuit size is subexponential in a 
power of the block length inversely proportional to the circuit depth).
This is not surprising since in the special case of linear codes, 
a small depth circuit encoder corresponds to a code with a low density 
generator matrix.



\begin{lemma}{\em Hastad's switching lemma \cite{has}}:
Let $f:\{0,1\}^n\rightarrow \{0,1\}$ be computable
by an unbounded-fanin depth-$d$ AND-OR circuit of size $M$. 
Consider a random restriction $\rho$ that independently 
keeps each input bit unset with probability 
$p = 1/(20k)^d$, sets it to $1$ with probability $(1 - p)/2$, and to $0$ 
with probability $(1 - p)/2$. Then the probability, over the choice of 
$\rho$, that $f$, when restricted to the values  set by $\rho$, cannot 
be evaluated by a decision tree of depth $k$ is at 
most $M2^{-2k}$.
\end{lemma}
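
As a toy illustration (our own sketch, not part of the lemma), sampling 
such a random restriction looks as follows:

\begin{verbatim}
# Sketch: sampling a random restriction rho as in the switching lemma.
import random

def random_restriction(n, p):
    """None marks a bit left unset (probability p); otherwise the bit
    is fixed to 0 or 1, each with probability (1 - p)/2."""
    rho = []
    for _ in range(n):
        if random.random() < p:
            rho.append(None)                  # left unset
        else:
            rho.append(random.randint(0, 1))  # fixed uniformly at random
    return rho
\end{verbatim}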



Note that a decision tree computing a binary function 
$b: \{0,1\}^n\rightarrow \{0,1\}$ on $n$ variables 
is a binary tree where each node is associated with one 
of the input variables, and each leaf is associated with 
a $0$ or $1$ setting of the single output variable.
This implies that  if we fix any setting of the input variables, 
there are at most $k$ variables that when negated will 
affect the value of $b$, where $k$ is the depth of the tree.
In other words, when $k$ is small, $b$ has low sensitivity.
Thus if a code $\{0,1\}^n\rightarrow \{0,1\}^m$
(an injective map)
is encodable by $m$ decision trees each of depth $k$, 
a direct  counting argument  shows that its minimum 
distance can be at most $km/n$. 
Hastad's switching lemma essentially reduces the circuit
case to this situation.
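
Spelling out the count: on any fixed input, the $m$ trees together read 
at most $km$ variable occurrences, so some input variable is read by at 
most $km/n$ of the trees; flipping that variable changes at most $km/n$ 
output bits, while injectivity forces at least one bit to change.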




\begin{theorem} Let 
$C:\{0,1\}^n \rightarrow \{0,1\}^{m}$, $m = \Theta(n)$, be a code 
(i.e. in general an injective map)
encodable by an unbounded-fanin 
AND-OR circuit of size $S$ and depth $l$.
Then the minimum distance of $C$ is 
\[
O((20)^l \log^{l+1}{mS}).
\]
Thus, $C$ is asymptotically bad when $l = O(1)$ and $S = 2^{o(m^{1/l})}$.
\end{theorem}
{\bf Proof.} 
Let $x_1, \ldots, x_n$ be the input variables,  
$A_1, \ldots, A_m$ the circuits 
that compute the output variables  $y_1, \ldots, y_m$,  
and $a = m/n$. Thus 
$a\geq 1$ is constant, the size of each $A_i$ is at most $S$, and the 
depth of each $A_i$ is at most $l$. 

Hit the $x_i$'s with a random 
restriction $\rho$  that keeps each $x_i$ unset with probability 
$p = 1/(20k)^l$, sets $x_i$ to $1$ with probability $(1 - p)/2$, and to $0$ 
with probability $(1 - p)/2$. 


Then, for each  $A_i$, 
from Hastad's switching lemma,  the probability that $A_i$ does not 
collapse to a decision tree of depth $k$ is at most $S2^{-2k}$.  
Thus the probability that one of the $A_i$'s does  not collapse, or the 
number of remaining (unset) variables 
is below $np/2$
is  at most 
\[
P = mS2^{-2k}+\frac{4(1-p)}{n p},
\]
where the latter 
term comes from Chebyshev's inequality.



Fix $k = \log{Sm}$ so that $P \leq  1/(Sm) +  4(20\log{Sm})^l/n < 1$   
when $n$ is large enough and  
$S$ is subexponential in $n^{1/l}$. Note that when $S$ 
is exponential in $n^{1/l}$, the statement of the theorem is trivial.


So, fix any restriction $\rho$ with the property that: 
\begin{itemize}
\item 
 the set $I$ of 
input variables left unset by $\rho$ has size at least 
$np/2$, and 
\item 
  each of  the $A_i$'s collapses 
under $\rho$ to a decision tree $T_i$ of depth $k$, where 
$k =\log{Sm}$ and $p = 1/(20k)^l$.  
\end{itemize}
Consider any setting of the 
variables in $I$, and let $I_i$ be the set of variables in $I$ 
read by $T_i$ on this setting. 
Each $I_i$  contains at most $k$ variables, and the output 
of $T_i$ can only be affected when we change some  
of the variables in $I_i$. So there should be a variable 
in $I$ that appears in at most 
\[ 
\frac{\sum_j |I_j|}{|I|} \leq \frac{km}{np/2} = \frac{2ka}{p}
\]
of the 
$I_i$'s. By flipping this variable, we can affect
at most  $2ka/p$ output bits, and at least one output bit 
since $C$ is injective. Hence the minimum distance of $C$ 
is at most 
\[
\frac{2ka}{p} = O( (20)^l\log^{l+1}{Sm} ).
\]
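
Spelling out the last equality (a routine check of the constants): with 
$k = \log{Sm}$ and $p = 1/(20k)^l$, 
\[
\frac{2ka}{p} = 2a\,(20)^l k^{l+1} = O\left((20)^l\log^{l+1}{Sm}\right), 
\]
since $a = m/n = O(1)$.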
\finito 


\section{Open questions}



  Using branching program techniques
  introduced by Ajtai \cite{ajt},
  we argued in Theorem \ref{th1} that there are no asymptotically 
  good codes that are encodable in linear time 
  and sublinear space in the most general 
  sense.
  When the time is linear,  
  the sublinear memory  requirement  
  is asymptotically  tight  for encoding asymptotically
  good codes (See Example \ref{ex4}).

  On the other extreme, 
  quadratic encoding time  is achievable by random
linear codes while requiring  minimal encoding memory 
 in the branching program model 
(See Example \ref{ex5}).  


We conjecture that, in general, the following holds:

\begin{conjecture}
If $C: \{0,1\}^n \rightarrow \{0,1\}^m$, $m = O(n)$,
is a code (an injective map) that is computable 
by a branching program of memory $M$ and time $T$,
where $MT = o(n^2)$, then the minimum distance 
of $C$ must be $o(n)$.
\end{conjecture}




Proving the conjecture or finding the correct time-space 
tradeoffs for encoding asymptotically  good codes 
when the encoding time is  superlinear and subquadratic
is very desirable. 




\section{Acknowledgment} 
We would like to thank 
Daniel Spielman and Ruediger Urbanke 
for stimulating discussions of this material.


\begin{thebibliography}{10}

\bibitem{ajt}
Miklos Ajtai,  {\em 
``Determinism versus Non-Determinism for Linear Time RAMs''},  
Thirty-First Annual ACM Symposium on Theory of Computing (STOC),  1999, pp. 632-641.

   
\bibitem{tlcd} 
L. Bazzi, M.  Mahdian, D. Spielman.
{\em ``The Minimum Distance of Turbo-Like Codes''}, Submitted to IEEE 
 Transactions on Information Theory,  2003.



\bibitem{ctc} C. Berrou, A. Glavieux, and P. Thitimajshima, 
{\em ``Near Shannon Limit error-correcting coding and decoding: Turbo-codes''},
in Proc., IEEE Int. Conf. on Commun., 1993, pp. 1064--1070. 

	
\bibitem{rac1}  D. Divsalar,  H. Jin,   R. J. McEliece,  
{\em ``Coding Theorems for Turbo-Like Codes''}, 
 Proc.  of 36th  Annual Allerton Conference on
 Communication, Control and Computing, 1998, pp. 201-210. 




\bibitem{has} Johan  Hastad, 
{\em ``Computational Limitations for Small Depth Circuits''}, PhD dissertation,  
MIT, Cambridge, MA, 1986. 

\bibitem{ku}
N.~Kahale and R.~Urbanke.
{\em ``On the Minimum Distance of Parallel and Serially Concatenated
  Codes''}, 
to appear in IEEE Transactions on Information Theory, 1997.


\bibitem{Spi96}  Daniel A. Spielman, 
{\em ``Linear-time encodable and decodable error-correcting codes''}, 
IEEE
Transactions on Information Theory, 1996, Vol 42, No 6, pp. 1723-1732.











\end{document}

----- End forwarded message -----
