\documentclass{article}
\usepackage{amsmath}
\input{preamble.tex}
\begin{document}
\lecture{20}{April 25, 2001}{Dan Spielman}{Jan Vondr\'ak}
\section*{The Multilinearity Test}
In this lecture, we describe the {\em multilinearity
test} which is used by the verifier in the proof of
$\textrm{NEXP} \subseteq \textrm{PCP(poly,poly)}$.
For a finite field $F$ and a function $f : F^n \rightarrow F$,
given by a table of values, we want to decide if $f$ is "close"
to a multilinear function, by looking at a polynomial number of
entries in the table. Our test will be randomized and we would
like to have the following properties:
\begin{itemize}
\item If $f$ is multilinear then $Pr[\mbox{test accepts}] = 1$.
\item If $Pr[\mbox{test accepts}] > {1 \over 2}$ then there is a
multilinear function $L$ which differs from $f$ at most in a
${1 \over n^k}$-fraction of places (for some $k$ fixed).
\end{itemize}
The test is surprisingly simple. However, its analysis is more
difficult.
\vspace{20pt}
{\bf The Test:}
\vspace{10pt}
Repeat $t$ times:
\begin{itemize}
\item Choose $(x_1, \ldots, x_n) \in F^n$ uniformly at random.
\item For $i = 1 \ldots n$, let
$ a_i(z) = f(x_1, \ldots, x_i' = z, \ldots, x_n).$
\item If for any $i$,
$ a_i(x_i) \neq a_i(0) + x_i (a_i(1) - a_i(0)),$
stop and reject.
\end{itemize}
Accept if all iterations have been completed successfully.
\begin{definition}
Let $ML_n$ denote the class of all multilinear functions
from $F^n$ to $F$. For functions $f, g: F^n \rightarrow F$,
define their distance as
$$ d(f,g) = \underset{x \in F^n}{Pr}[f(x) \neq g(x)].$$
\end{definition}
\begin{lemma}
If $|F| = q > 6n$ then for any two different $L_1, L_2 \in ML_n$,
$$ d(L_1, L_2) > {5 \over 6}. $$
\end{lemma}
{\bf Proof:}
If $L_1 \neq L_2$ then $L_1 - L_2$ is a non-zero multilinear function.
By Schwartz's lemma (with degree at most 1 in each variable),
$$ \underset{x \in F^n}{Pr}[(L_1 - L_2)(x) = 0] \leq {n \over q}
< {1 \over 6}.$$
As a corollary, for any $f$ there can be at most one multilinear
function "close" to $f$ (satisfying $d(f,L) < {1 \over 3}$, for instance).
\vspace{10pt}
First, we demonstrate how the test works in two variables.
Let $f: F^2 \rightarrow F$. If $f$ is multilinear then the test
clearly accepts with probability 1.
So suppose $f$ is not multilinear. From $f$, we derive
two functions which are linear in $x_1$ and $x_2$, respectively.
$$ f^1(x_1, x_2) = f(0, x_2) + x_1 (f(1, x_2) - f(0, x_2)), $$
$$ f^2(x_1, x_2) = f(x_1, 0) + x_2 (f(x_1, 1) - f(x_1, 0)). $$
The way the test works, we are actually testing the difference
between $f$ and $f^1, f^2$. If $d(f,f^i) = \epsilon_i$, the
test for the $i$-th coordinate rejects with probability $\epsilon_i$.
Therefore,
$$ Pr[\mbox{test accepts}] \leq (1 - \max\{d(f,f^1), d(f,f^2)\})^t.$$
Our goal is to show that if the test accepts with high probability,
$f$ is not only close to $f^1$ and $f^2$ but it is close to some
multilinear function. Call $(x_1, x_2) \in F^2$ {\em good} if
$f(x_1, x_2) = f^1(x_1, x_2) = f^2(x_1, x_2)$. In other words,
the test passes an iteration iff it chooses a good point.
\begin{lemma}
$$ \underset{x_1,x_2}{Pr}[(x_1,x_2) \mbox{ is good}] >
1 - \epsilon \Rightarrow \exists L \in ML_2; d(f,L) < 5 \epsilon.$$
\end{lemma}
{\bf Proof:}
If
$$\underset{x_1,x_2}{Pr}[(x_1,x_2) \mbox{ is good}]
> 1 - \epsilon$$
then
$$\underset{x_1,x_2}{Pr}[(x_1,x_2) \mbox{ is bad}]
< \epsilon$$
and by Markov's inequality, for at least 1/2 values of $x_1$,
$$\underset{x_2}{Pr}[(x_1,x_2) \mbox{ is bad}]
< 2 \epsilon.$$
Let $c, d$ be two values of $x_1$ for which this is true
and define a multilinear function
$$ L(x_1, x_2) = f^2(c, x_2) + {f^2(d, x_2) - f^2(c, x_2)
\over d - c} (x_1 - c).$$
At least a $(1 - 4 \epsilon)$-portion of $x_2$'s are good for
both $c$ and $d$; i.e., $f^1(c,x_2) = f^2(c,x_2) = L(c,x_2)$ and
$f^1(d,x_2) = f^2(d,x_2) = L(d,x_2)$. For such $x_2$, $L(x,x_2)$
and $f^1(x,x_2)$ are equal as functions of $x$ (because a linear
function is determined by its values at two points). Thus
$$ d(f^1, L) = \underset{x_1, x_2}{Pr}[L(x_1,x_2) \neq f^1(x_1,x_2)]
\leq 4 \epsilon. $$
Obviously, $d(f, f^1) \leq \epsilon$ and so
$$ d(f, L) \leq 5 \epsilon.$$
\vspace{10pt}
For $\epsilon = {1 \over n^k}$, we repeat our test $n^k$ times and
then if the acceptance probability is still high, the lemma implies
that $f$ is closer than $O({1 \over n^k})$ to a multilinear function.
Unfortunately, this proof does not generalize to an arbitrary
number of variables. We sketch out how the test can be analyzed for
$n$ variables.
From $f: F^n \rightarrow F$, we derive $f^1, \ldots, f^n$
linear in $x_1, x_2, \ldots, x_n$ respectively. They key
observation is that $d(f^1, ML_n)$ is reasonably approximated
by $d(f^1_{x_1 = c}, ML_{n-1})$, i.e. instead of $f^1$ we consider
the slice taken at a random point $x_1 = c$, which reduces the
number of variables by one.
More precisely, we give the following without proof.
\begin{lemma}
If $f^1$ is linear in $x_1$, then either
$$ \underset{c \in F}{Pr}[d(f^1, ML_n) - d(f^1_{x_1=c}, ML_{n-1})
\leq {1 \over \sqrt{q}}] \geq 1 - {1 \over \sqrt{q}}$$
or $ d(f^1_{x_1=c}, ML_{n-1}) > {1 \over 6} $
for all but one value of $c_1$.
\end{lemma}
For now, we ignore the latter possibility and apply this idea
successively to all variables. For "most" values of $c_1, c_2, \ldots$,
we have
$$ d(f, ML_n) \leq d(f, f^1) + d(f^1, ML_n) \leq d(f, f^1) +
d(f^1_{x_1=c_1}, ML_{n-1}) + {1 \over \sqrt{q}} $$
$$ \leq d(f, f^1) + d(f^1_{x_1=c_1}, f^2_{x_1=c_1}) +
d(f^2_{x_1=c_1}, ML_{n-1}) + {1 \over \sqrt{q}} $$
$$ \leq d(f, f^1) + d(f^1_{x_1=c_1}, f^2_{x_1=c_1}) +
d(f^2_{x_1=c_1, x_2=c_2}, ML_{n-2}) + {2 \over \sqrt{q}}$$
$$ \ldots $$
$$ \leq d(f, f^1) + d(f^1_{x_1=c_1}, f^2_{x_1=c_1}) +
d(f^2_{x_1=c_1,x_2=c_2}, f^3_{x_1=c_1,x_2=c_2}) + \ldots +
{n \over \sqrt{q}}. $$
This holds for all but a ${1 \over \sqrt{q}}$-fraction of
each $c_i$; by averaging over the $c_i$'s, we get an additional
error term of ${1 \over \sqrt{q}}$ for each variable:
$$ d(f, ML_n) \leq d(f, f^1) + d(f^1, f^2) + d(f^2, f^3) +
\ldots + d(f^{n-1}, f^n) + {2n \over \sqrt{q}} $$
$$ \leq 2 \sum_{i=1}^n{d(f, f^i)} + {2n \over \sqrt{q}}.$$
So far, we have ignored the possibility that $d(f^i, ML_{n-i})
> 1/6$ for all but one value of $c_i$. However, if we multiply
the right hand side by a factor of 6, the inequalities will
be satisfied trivially in this case.
\begin{theorem}
$$ d(f, ML_n) \leq 12 \left(\sum_{i=1}^n{d(f, f^i)} +
{n \over \sqrt{q}} \right).$$
\end{theorem}
This means that if our test accepts with high probability
and the distances $d(f, f^i)$ are small then $f$ must
be very close to a multilinear function.
\end{document}