# Some news regarding the Paley graph

Let $\mathbb{F}_p$ denote the field with $p$ elements, and let $Q_p$ denote the multiplicative subgroup of quadratic residues. For each prime $p\equiv 1\bmod 4$, we consider the Paley graph $G_p$ with vertex set $\mathbb{F}_p$, where two vertices are adjacent whenever their difference resides in $Q_p$. For example, the following illustration from Wikipedia depicts $G_{13}$:

The purpose of this blog entry is to discuss recent observations regarding the Paley graph.

— New bound on the clique number —

There are several proofs that $\omega(G_p)\leq \sqrt{p}$. In 2006, Maistrelli and Penman proved $\omega(G_p)\leq \sqrt{p-4}$, and in 2013, Bachoc, Matolcsi and Ruzsa proved that $\omega(G_p)\leq \sqrt{p}-1$ holds for most $p$. This contrasts with the best-known lower bound of $\Omega(\log p)$. Considering this disparity between upper and lower bounds, I was excited to see progress on the upper bound this year:

Theorem (Hanson–Petridis 2019).    $\displaystyle \omega(G_p)\leq \sqrt{\frac{p}{2}-\frac{1}{4}}+\frac{1}{2}.$

The proof uses Stepanov’s method of auxiliary polynomials: Given $A \subseteq \mathbb{F}_p$ such that $A-A \subseteq Q_p \cup \{0\}$, Hanson and Petridis construct a polynomial $F$ of degree $(p-1)/2$ with coefficients in $\mathbb{F}_p$ such that every member of $A$ is a root of $F$ of multiplicity at least $|A|-1$. It follows that $|A|(|A|-1) \leq (p-1)/2$.

Considering $\omega(G_q)=\sqrt{q}$ when $q$ is an even power of an odd prime, the proof necessarily makes use of the fact that $p$ is prime. We highlight how in the following proof sketch:

Fix $d=(p-1)/2$, $M=|A|$, and $A=\{a_1,\ldots,a_M\}$, and let $c_1,\ldots,c_M\in \mathbb{F}_p$ denote the unique solution to the linear system

$\displaystyle \sum_{k=1}^M c_k(-a_k)^n =\left\{ \begin{array}{cl} 1 & \text{if } n=M-1\\ 0 & \text{if } n\in\{0,\ldots,M-2\}.\end{array} \right.$

(Indeed, this Vandermonde system is invertible since the $a_k$‘s are distinct.) Define

$\displaystyle F(x) := -1 + \sum_{k=1}^M c_k(x-a_k)^{d+M-1}.$

By our choice of $c_k$‘s, an application of the binomial theorem reveals that $F$ has degree $d$. (Here, it’s important that $p$ is a prime rather than a prime power, since otherwise the coefficient of $x^d$ might be zero.) Since $A-A \subseteq Q_p \cup \{0\}$, we have $(a-a')^{d+1}=a-a'$ for every $a,a'\in A$. This identity allows us to relate $F$ to the polynomial

$\displaystyle G(x) := \sum_{k=1}^M c_k(x-a_k)^{M-1},$

which equals 1 by the binomial theorem and our choice of $c_k$‘s. In particular, for every $a\in A$, it holds that

\displaystyle \begin{aligned} F(a) &= -1 + \sum_{k=1}^M c_k(a-a_k)^{d+M-1}\\ &= -1 + \sum_{k=1}^M c_k(a-a_k)^{M-1} = -1 + G(a) = 0.\end{aligned}

To show that each $a\in A$ is a root of $F$ with multiplicity at least $M-1$, we take derivatives of $F$ before evaluating at $a$. For example,

\displaystyle \begin{aligned} F'(a) & = (d+M-1)\sum_{k=1}^M c_k(a-a_k)^{d+M-2} \\ & = (d+M-1)\sum_{k=1}^M c_k(a-a_k)^{M-2} =\frac{d+M-1}{M-1}\cdot G'(a) =0.\end{aligned}

(Notice that division by $M-1$ might not be possible if $p$ were a prime power rather than a prime.) Similarly, the $j$th derivative of $F$ vanishes on $A$ for each $j\in[M-2]$. This gives the desired result.

— 3-point bounds on $\omega(G_p)$ are computationally fast —

Recall that the Lovasz theta number provides a semidefinite relaxation of the clique number of the complement graph. To see this, let $1_A\in\mathbb{R}^V$ denote the indicator vector of any vertex subset $A\subseteq V$ of a graph $G=(V,E)$. Defining

$\displaystyle X_A:=\frac{1}{|A|}1_A1_A^\top,$

then one may write

$\displaystyle \omega(G) = \max ~\sum_{ij}(X_A)_{ij}~ \text{ s.t. } A \text{ is a clique in } G.$

One may relax this integer program by recording convex constraints that $X_A$ satisfies:

$\displaystyle \vartheta(\bar{G}) = \max ~\sum_{ij}X_{ij}~ \text{ s.t. } \textrm{tr}X=1, X_{ij}=0 \text{ for } (i,j)\not\in E, X\succeq 0.$

This is known as the Lovasz theta number of $\bar{G}$, and by virtue of the relaxation, it holds that $\omega(G)\leq \vartheta(\bar{G})$. When Lovasz first introduced the theta function, he proved that

$\vartheta(G)\vartheta(\bar{G})\geq |V|,$

with equality when $G$ is vertex-transitive. Since $G_p$ is both vertex-transitive and self-complementary, it follows that $\omega(G)\leq\vartheta(\bar{G})=\sqrt{p}$.

For a sharper bound on the clique number, one might consider a higher-order member of an appropriate SDP hierarchy. For example, Gvozdenovic, Laurent and Vallentin proposed such a hierarchy in 2009, and they evaluated its performance on the Paley graph. In this setting, it has become popular to refer to the classical Lovasz program as a 2-point bound, while higher-order programs produce $k$-point bounds for $k>2$. As an application, Gvozdenovic, Laurent and Vallentin computed 3-point bounds for $G_p$, denoting them by $L_+(\mathrm{TH}(G_p))$; see their Table 2 for explicit estimates. Unfortunately, going up the SDP hierarchy comes at a substantial computational price. Case in point, it took 4.5 hours to compute $L_+(\mathrm{TH}(G_{809}))$ on a 3GHz processor with 1GB of RAM. In joint work with Mark Magsino and Hans Parshall, we managed to exploit symmetries in the Paley graph to reduce this runtime to under 20 seconds. Thanks to this speedup, we were able to run extensive experiments that suggest that $L_+(\mathrm{TH}(G_p))$ is always $O(1)$ away from the Hanson–Petridis bound:

Above, blue dots correspond to the 3-point bound, whereas the red curve gives the closed-form Hanson–Petridis bound. In particular, the 3-point bound improves upon the Hanson–Petridis bound in what appears to be a fraction of primes.

An easy interpretation of the 3-point bound is as follows: Since $G_p$ is vertex-transitive, we know that $0$ resides in a maximum clique. As such, we can restrict our search to the local graph $L_p$, i.e., the subgraph of $G_p$ induced by the neighborhood $Q_p$ of $0$. In particular, it holds that

$\omega(G_p) \leq \vartheta(\bar{L}_p)+1.$

Next, the vertices $Q_p$ of $L_p$ form a cyclic multiplicative group, and $L_p$ is circulant over this group. Since $L_p$ is circulant, the Lovasz SDP is invariant under conjugation by circulant permutations, and so we may average over these orbits to reduce to a linear program (this last move is common practice). While our numerical experiments solved this linear program for all primes less than 3000, we suspect that it’s possible to compute with $p$ in the millions by exploiting a fast Fourier transform in the LP solver. It would be interesting if one could construct feasible points for the dual LP that improve upon the Hanson–Petridis bound infinitely often. Our experiments suggest that the improvement would be an additive constant of 1 infinitely often, much like Bachoc, Matolcsi and Ruzsa’s improvement over the “trivial” bound.

— Random Paley subgraphs satisfy a Kesten–McKay law —

Select $q\in(0,1)$, and consider the random graph $G_{p,q}$ obtained by deleting vertices of $G_p$ independently with probability $1-q$. What can be said about the statistics of such a random graph?

Intuitively, when $q$ is small, this random graph resembles the Erdos–Renyi graph with edge probability 1/2. The adjacency matrix of the latter graph is given by

$\displaystyle A=\frac{1}{2}\Big(W+11^\top-I\Big),$

where $W$ is symmetric with zeros on the diagonal and independent Rademacher variables on the off-diagonal. Since $W$ is a Wigner matrix, its spectrum follows a semi-circle law, and so $A$ enjoys a similar spectrum with an additional spike corresponding to the all-ones contribution. To corroborate our intuition, consider the following histograms; one illustrates the bulk of the spectrum of a 998-vertex instance of $G_{p,q}$ with $p=10^6+33$ and $q=10^{-3}$, and the other corresponds to an instance of the Erdos–Renyi graph on 998 vertices with edge probability 1/2. The largest eigenvalue of both graphs is off the chart at about 498. (Can you tell which is which?)

When $q$ is larger, we expect $G_{p,q}$ to inherit more of the structure from its parent graph $G_p$, and therefore lose the semicircular behavior. For example, the following is a histogram of the bulk of the spectrum of an instance of $G_{p,q}$ with $p=10^4+9$ and $q=0.3$:

In general, it turns out that the limiting spectral distribution for $G_{p,q}$ (where we fix $q$ and send $p\to\infty$) is the so-called Kesten–McKay distribution. In joint work with Mark Magsino and Hans Parshall, we prove this using the moment method.

This result corresponds to a recent observation by Haikin, Zamir and Gavish: If $X$ is a random $d\times qrd$ submatrix of a $d\times rd$ equiangular tight frame, then the eigenvalues of $X^\top X$ appear to exhibit a Wachter distribution with parameters determined by $r$ and $q$. This observation will likely have implications for compressed sensing, where it’s important to have control over the conditioning of submatrices. To see the connection with the Paley graph, recall that the Paley equiangular tight frame is a real $d\times 2d$ matrix $\Phi$ with $2d=p+1$ such that

$\displaystyle \sqrt{p}\cdot(\Phi^\top \Phi-I)=\left[\begin{array}{ll}0 & 1^\top \\ 1 & S\end{array}\right],$

where $S$ is the Seidel adjacency matrix of the Paley graph $G_p$; that is, $S_{ii}=0$, $S_{ij}=-1$ if $(i,j)\in E(G_p)$, and otherwise $S_{ij}=1$. Furthermore, in this special case where $r=2$, the Wachter distribution reduces to the Kesten–McKay distribution.

We’re excited to tackle more of the Haikin–Zamir–Gavish observations in the coming months.