\documentclass{article}
\usepackage{amsmath,amssymb}
\input{preamble.tex}
\newcommand{\ppoly}{P/$poly$} %% P/Poly shorthand
\begin{document}
\lecture{8}{March 4, 1999}{Dan Spielman}{Jon Feldman}
This lecture concerns randomized complexity classes. We begin by
looking at the problem of finding polynomial identities in \ppoly,
which leads to an interesting open question. We will then give the
definitions of the two major randomized complexity classes, RP and
BPP. Finally, we prove BPP $\subseteq$ \ppoly{} using probability
amplification.
\section{Finding Polynomial Identities in co-RP and in \ppoly}
A zero polynomial is one in which all coefficients are zero.
An algorithm that tests whether an input polynomial is the zero
polynomial would be useful: for example, you could test whether two
polynomials are equal by subtracting one from the other and testing
whether the result is zero.
We will give such an algorithm, but first we need a fact about
polynomials that will be useful: if you evaluate a non-zero
polynomial at a random point, you usually don't get zero.
\subsection{The Schwartz-Zippel Lemma}
Let $P$ be a non-zero polynomial of degree $d$ in $m$ variables.
Choose the values of the variables $x_1, ..., x_m$ independently and
uniformly at random from some large finite set $S$. Then
\begin{center}
Pr[$P(x_1, ..., x_m) = 0$] $\leq \frac{md}{|S|}$
\end{center}
\subsection{The Language POLY-IDENT}
Let's define the language for which we want to test membership:
\begin{center}
POLY-IDENT = $\{ \langle P(y_1, ... ,y_k), q_1(x_1, ..., x_m), ...,
q_k(x_1, ..., x_m) \rangle \ |\ P(q_1, ..., q_k) = 0 \}$
\end{center}
The Schwartz-Zippel lemma suggests an obvious algorithm to test
membership in POLY-IDENT: evaluate the input polynomial on a bunch of
random points from a huge field, and if it ever evaluates to something
other than zero, reject, otherwise accept. This algorithm has
\textit{one-sided error}, since it will always accept if the input is
in the language, and it will usually reject if the input is not in the
language. So, POLY-IDENT $\in$ co-RP, which will make sense once we
define RP later in the lecture. First, let's see why POLY-IDENT $\in$
\ppoly.
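As an illustration, here is a minimal Python sketch of this randomized tester, with the polynomial modeled as a callable; the number of trials and the size of the sample set are arbitrary illustrative choices, not values from the lecture.

```python
import random

def is_zero_poly(poly, num_vars, degree, trials=20, set_size=10**9):
    """Randomized zero test: evaluate poly at random points.

    Accepts (returns True) only if every evaluation is zero.
    By Schwartz-Zippel, each trial catches a non-zero polynomial
    except with probability at most num_vars * degree / set_size,
    so the error is one-sided: zero polynomials are always accepted.
    """
    for _ in range(trials):
        point = [random.randrange(set_size) for _ in range(num_vars)]
        if poly(*point) != 0:
            return False  # witness found: definitely non-zero
    return True  # probably the zero polynomial

# Example: (x + y)^2 - (x^2 + 2xy + y^2) is identically zero,
# while xy - 1 is not.
zero = lambda x, y: (x + y)**2 - (x**2 + 2*x*y + y**2)
nonzero = lambda x, y: x*y - 1
```

Note the one-sided error in action: `is_zero_poly(zero, 2, 2)` always returns True, while `is_zero_poly(nonzero, 2, 2)` returns False unless every random point happens to be a root, which is overwhelmingly unlikely.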
\begin{theorem}
POLY-IDENT $\in$ \ppoly
\end{theorem}
\noindent \textbf{Proof:}
To show POLY-IDENT $\in$ \ppoly, we need to define an advice sequence
such that for each polynomial of a certain length, the advice string
for that length will let a machine decide deterministically whether the
polynomial is zero. We will use a set of points as our advice string
for each length, so that we can deterministically evaluate the
polynomial at those points and accept if every evaluation is zero.
So, if we show the following, an advice sequence must exist, and the
theorem follows:\\
\begin{itemize}
\item
$\forall$ polynomials $W$ s.t. $|W| = n$,
\item
$\exists X = (x_1, ..., x_m), x_i \in \{1, ..., 5^n\}$ s.t.
\item
$W \in $POLY-IDENT $\iff W(x_1, ..., x_m) = 0$
\end{itemize}
To show that such a point set exists, we will use the probabilistic
method. This is a way to show the existence of an object by showing
that the probability of choosing it randomly is greater than zero.
This is a strange concept to get used to, since one usually thinks of
it the other way around. But it turns out to be a very clean and
concise way to prove many things that would otherwise be very
difficult to prove.
So, for a given $n$, we choose $x_1, ..., x_m$ at random. The event
${\cal E}$ we want to occur is:
\begin{center}
${\cal E}$: $\forall W $ s.t. $|W| = n$, $W \in $POLY-IDENT $\iff W(x_1,
..., x_m) = 0$
\end{center}
If we show that the probability over all choices of $X$ that ${\cal E}$
occurs is greater than zero, then an $X$ must exist that will serve as
our advice string. We will do this by working with $\overline{{\cal
E}}$:
\begin{center}
$\overline{{\cal E}}$:
$\exists W $ s.t. $|W| = n$, $W \notin $POLY-IDENT
and $W(x_1, ..., x_m) = 0$
\end{center}
If we show $\underset{X}{Pr}[\overline{{\cal E}}] < 1$ then our
existence proof is complete. Using a union bound, where the
probability of a union of events is less than or equal to the sum of
the probabilities of the individual events, we have:
$$
\underset{X}{Pr}[\overline{{\cal E}}] \leq
\sum_{
\tiny{
\begin{array}{c}
W\ \mathrm{s.t.}\ |W| = n \\
W \notin \mathrm{POLY-IDENT}
\end{array}
}
}
\underset{X}{Pr}[W(X) = 0]
$$
We can bound the number of variables in $W$ by $n$, since a string of
length $n$ can mention at most $n$ distinct variables. We can bound
the degree of $W$ by $2^n$, since an exponent written in $n$ symbols
can denote a degree of at most $2^n$. Finally, there are at most
$2^n$ polynomials of description length $n$. So, by Schwartz-Zippel,
we have:
$$
\underset{X}{Pr}[\overline{{\cal E}}] \leq
(2^n)\left( \frac{(n)(2^n)}{5^n} \right) = \frac{n \cdot 4^n}{5^n} < 1
$$
for all sufficiently large $n$; for the finitely many small lengths,
the advice string can simply encode the answers directly.
$\square$
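As a quick sanity check on this estimate: the bound $n \cdot 4^n / 5^n$ tends to zero, since $(4/5)^n$ shrinks geometrically while the factor $n$ grows only linearly. A small Python sketch (purely illustrative, not part of the proof):

```python
def union_bound(n):
    """Union-bound estimate n * 4**n / 5**n on Pr[not E]:
    2^n polynomials of length n, each vanishing at a random point
    with probability at most n * 2^n / 5^n by Schwartz-Zippel."""
    return n * 4**n / 5**n
```

For example, at $n = 20$ the estimate is already below $1/4$, and it keeps shrinking from there.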
An interesting open question is whether there is a concrete way to
find such a valid set of points in polynomial time.
\section{Randomized Complexity Classes}
A randomized poly-time TM is a TM that runs in polynomial time and can
flip coins to make decisions. The class BPP corresponds to languages
that we can decide in polynomial time with a reasonable probability
of coming out with the right answer.\\
\begin{definition}
A language $L$ is in the class \textbf{BPP} if $\exists$ a randomized
poly-time TM $M$ s.t.
\end{definition}
\begin{itemize}
\item
$x \in L \implies $Pr[$M(x)$ accepts] $> \frac{2}{3}$
\item
$x \notin L \implies $Pr[$M(x)$ accepts] $< \frac{1}{3}$ \\
\end{itemize}
Let's be a bit more formal just for fun:\\
\begin{definition}
A language $L$ is in the class \textbf{BPP} if $\exists$ $A \in P$,
$f(n) \leq n^{{\cal O}(1)}$, s.t.
\end{definition}
\begin{itemize}
\item
$x \in L \implies$
$\underset{r\ \mathrm{s.t.}\ |r| = f(|x|)}
{\mathrm{Pr}}[\langle x,r \rangle \in A] > \frac{2}{3}$
\item
$x \notin L \implies$
$\underset{r\ \mathrm{s.t.}\ |r| = f(|x|)}
{\mathrm{Pr}}[\langle x,r \rangle \in A] < \frac{1}{3}$\\
\end{itemize}
The class RP will be a bit stronger. Algorithms in RP have the
advantage of only one-sided error; if the string is in the language,
we still have a good probability of accepting. But if the string isn't
in the language, we will always reject.
\begin{definition}
A language $L$ is in the class \textbf{RP} if $\exists$ a randomized
poly-time TM $M$ s.t.
\end{definition}
\begin{itemize}
\item
$x \in L \implies $Pr[$M(x)$ accepts] $> \frac{2}{3}$
\item
$x \notin L \implies $Pr[$M(x)$ accepts] $= 0$ \\
\end{itemize}
If a TM has one-sided error on the other side (it always accepts
strings in the language, and usually rejects strings outside it), its
language is in co-RP; POLY-IDENT is such a language.
\subsection{The Chernoff Bound}
The \textbf{Chernoff Bound} is very useful for showing that the sum of
independent, identically distributed random variables does not drift
far from its expectation:
Let $y_1, ..., y_n$ be independent, identically distributed
random variables taking values in $[0,1]$. Let $\mu = E[Y]$,
where $Y = \sum_{i=1}^n y_i$. For any $\epsilon$ between 0 and 1,
\begin{itemize}
\item
Pr[$Y < (1 - \epsilon) \mu$ ] $< e^{- \frac{\mu \epsilon^2}{3}}$ and
\item
Pr[$Y > (1 + \epsilon) \mu$ ] $< e^{- \frac{\mu \epsilon^2}{3}}$\\
\end{itemize}
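To get a feel for the bound, one can compare the empirical upper-tail frequency of a Bernoulli sum against $e^{-\mu\epsilon^2/3}$; the parameters below are arbitrary illustrative choices.

```python
import math
import random

def chernoff_demo(n=1000, p=0.5, eps=0.2, trials=2000, seed=0):
    """Empirically compare Pr[Y > (1 + eps) * mu] for a sum of n
    Bernoulli(p) variables against the Chernoff bound e^(-mu eps^2 / 3)."""
    rng = random.Random(seed)
    mu = n * p
    bound = math.exp(-mu * eps * eps / 3)
    exceed = 0
    for _ in range(trials):
        y = sum(rng.random() < p for _ in range(n))  # one Bernoulli sum
        if y > (1 + eps) * mu:
            exceed += 1
    return exceed / trials, bound
```

With these parameters the bound is $e^{-500 \cdot 0.04 / 3} \approx 0.0013$, and the observed tail frequency comes out far below it.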
\subsection{Probability Amplification}
We will use the Chernoff Bound to show that even if you have a TM
with a very small gap between the probabilities on either side of its
error, you can build another TM that ``blows up'' the chance of
successfully accepting or rejecting by running the original TM
repeatedly and taking a threshold vote over its answers.
So, let's redefine BPP one more time, and formalize this theorem:
\begin{theorem}
\noindent Let $L$ be a language s.t. $\exists$
a randomized poly-time TM $M$, functions $c(n)$, $s(n)$, $a > 0$ s.t.
\end{theorem}
\begin{itemize}
\item
$x \in L \implies $Pr[$M(x)$ accepts] $\geq c(|x|)$
\item
$x \notin L \implies $Pr[$M(x)$ accepts] $\leq s(|x|)$
\item
$c(n) - s(n) > \frac{1}{n^a}$
\end{itemize}
\noindent \textit{Then, $\forall b > 0,\ \exists $ a randomized poly-time TM $M'$ s.t.}
\begin{itemize}
\item
$x \in L \implies $Pr[$M'(x)$ accepts] $\geq 1 - 2^{-|x|^b}$
\item
$x \notin L \implies $Pr[$M'(x)$ accepts] $\leq 2^{-|x|^b}$
\end{itemize}
\noindent \textbf{Proof:}
\noindent $M'$ does the following on input $x$:
\begin{enumerate}
\item
Run $M$ $k$ times, where $k = 12|x|^{3a+b}$.
\item
Accept if $M$ accepts more than
$\left( \frac{c(|x|)+s(|x|)}{2} \right) k$ times.
\end{enumerate}
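A minimal Python sketch of this procedure, with the base machine $M$ modeled as a biased coin; the acceptance probabilities and repetition count below are illustrative stand-ins for $c$, $s$, and $k$, not values from the theorem.

```python
import random

def amplify(run_m, c, s, k):
    """Run the base machine k times and accept iff the number of
    accepting runs exceeds the threshold ((c + s) / 2) * k."""
    accepts = sum(run_m() for _ in range(k))
    return accepts > ((c + s) / 2) * k

# Model M as a coin: it accepts with probability 0.55 on
# yes-instances and 0.45 on no-instances -- a gap of only 0.1,
# far narrower than the 2/3-vs-1/3 gap in the BPP definition.
def yes_machine():
    return random.random() < 0.55

def no_machine():
    return random.random() < 0.45
```

With enough repetitions (say `k = 5000`), the vote lands on the correct side of the threshold except with exponentially small probability, exactly as the Chernoff bound predicts.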
\noindent Let $(y_1, ...,y_k)$ be indicator variables, where $y_i = 1$
if $M$ accepts on its $i^{th}$ run.
Let $\mu = E[Y]$, where $Y = \sum_{i=1}^k y_i$.
We want to express the event that our algorithm fails when $x \in L$
in terms of the deviation of $Y$ from the mean.
If $x \in L$, then $\mu \geq c(|x|)k$.
If $M'$ rejects, then $Y \leq$
$\left( \frac{c(|x|)+s(|x|)}{2} \right) k$.
Set $\epsilon = \frac{1}{2} - \frac{k\ s(|x|)}{2 \mu}$,
so that $M'$ rejecting implies $Y \leq (1 - \epsilon) \mu$.
By the Chernoff bound, \\
\noindent Pr[$M'$ rejects $x$] $\leq$ Pr[$Y \leq (1 - \epsilon) \mu$]
$< e^{- \frac{\mu \epsilon^2}{3}}$
$\leq e^{-|x|^b}$
$\leq 2^{-|x|^b}$.
A symmetric argument shows the desired result for the case where $x
\notin L$. $\square$\\
\subsection{Randomness is Subsumed by Non-Uniformity}
A corollary of the amplification theorem we just proved is that BPP
$\subseteq$ \ppoly. We prove this by taking advantage of the fact
that the probability that an amplified BPP TM makes an error is
so small that we can union bound over all the
possible inputs of a given length and conclude that there is a single
random string for that length on which the machine never makes an
error. We can then use one such random string for each length as an
advice sequence.
We first look at a reinterpretation of the amplification theorem, then
we apply some simple probability to complete the existence proof of
such a random string.
\begin{theorem}
BPP $\subseteq$ \ppoly
\end{theorem}
\noindent \textbf{Proof:}
\noindent Let $L$ be a language in BPP. By the amplification theorem
(taking $b = 2$, say, so that the error probability is at most
$2^{-|x|^2} \leq 2^{-|x|-2}$ for $|x| \geq 2$),
$\exists$ function $f(n) = n^{{\cal O}(1)}$, $A \in P$ s.t.
\begin{itemize}
\item
$x \in L \implies$
$\underset{r\ \mathrm{s.t.}\ |r| = f(|x|)}
{\mathrm{Pr}}[\langle x,r \rangle \in A] > 1 - 2^{-|x|-2}$
\item
$x \notin L \implies$
$\underset{r\ \mathrm{s.t.}\ |r| = f(|x|)}
{\mathrm{Pr}}[\langle x,r \rangle \in A] < 2^{-|x|-2}$\\
\end{itemize}
We construct an advice sequence containing one string $r$ for each
length $n$. We prove the existence of a good string $r$ for length
$n$ by choosing $r$ of length $f(n)$ at random.
\begin{eqnarray*}
\Pr[\forall x \mbox{ s.t. } |x| = n, x \in L, \langle x,r \rangle \in A]
& = &
1 - \Pr[\exists x \mbox{ s.t. } |x| = n, x \in L, \langle x,r \rangle
\notin A] \\
&\geq &
1 - \underset{x \in L, |x| = n}{\sum}
\Pr[\langle x,r \rangle \notin A] \\
&\geq& 1 - 2^n 2^{-n-2} \geq \frac{3}{4}
\end{eqnarray*}
\begin{eqnarray*}
\Pr[\forall x \mbox{ s.t. }|x| = n, x \notin L, \langle x,r \rangle \notin A]
& = &
1 - \Pr[\exists x \mbox{ s.t. } |x| = n, x \notin L, \langle x,r \rangle
\in A] \\
&\geq& 1 - \underset{x \notin L, |x| = n}{\sum}
\Pr[\langle x,r \rangle \in A]\\
&\geq& 1 - 2^n 2^{-n-2} \geq \frac{3}{4}.
\end{eqnarray*}
\noindent Each of these events fails with probability at most
$\frac{1}{4}$, so both hold simultaneously with probability at least
$\frac{1}{2} > 0$. Therefore, $\exists r$ s.t.
\begin{center}
$\forall x$ s.t. $|x| = n, x \in L \iff \langle x,r \rangle \in A$
\end{center}
\noindent Using $r$ as an advice string for length $n$, the theorem follows. $\square$\\
\end{document}