Deterministic RIP matrices: Breaking the square-root bottleneck, III

This blog entry is the third in a series which seeks to maximize \epsilon_0 such that there exist explicit M\times N matrices of arbitrarily large aspect ratio N/M which are RIP for K=\Omega(M^{1/2+\epsilon_0}). Continuing our study of the breakthrough paper by Bourgain et al., this entry will identify what the set \mathcal{A} needs to satisfy, how to construct it, and we will also use this set to complete the proof of the main result. In particular, we will prove a general version of the result which allows for optimizing constants, and then we will perform an optimization routine that improves \epsilon_0 by a few orders of magnitude.

— Everything you need to know about \mathcal{A}

The main result will use an even number m as a free parameter, which in turn determines another parameter \alpha=\alpha(m). For a given choice of m, the proof of the main result requires a subset \mathcal{A}\subseteq\mathbb{F}_p such that (i)

|\mathcal{A}|\leq p^\alpha \quad (2.3)

(i.e., \mathcal{A} is sufficiently small), and (ii) for any a\in\mathcal{A}, the only way for a_1,\ldots,a_{2m}\in\mathcal{A}\setminus\{a\} to satisfy

\displaystyle{\sum_{j=1}^m\frac{1}{a-a_j}=\sum_{j=m+1}^{2m}\frac{1}{a-a_j} \quad (2.4)}

is by taking (a_1,\ldots,a_m) and (a_{m+1},\ldots,a_{2m}) to be permutations of each other. Here, division (and addition) is taken in the field \mathbb{F}_p.

Unfortunately, these requirements on \mathcal{A} lead to very little intuition compared to our current understanding of \mathcal{B}. From my perspective, this choice for \mathcal{A} is merely sufficient to produce the remaining flat-RIP-type cancellations we seek, so there may be room for improvement here. Regardless, we will continue by considering how Bourgain et al. constructs \mathcal{A}. The following lemma describes their construction and makes a slight improvement to \alpha:

Lemma. Pick \epsilon>0 and take L:=\lfloor p^{1/(4m-2+\epsilon)(2m-1+\epsilon)}\rfloor and U:=\lfloor L^{4m-2+\epsilon}\rfloor. Then

\mathcal{A}:=\{x^2+Ux:1\leq x\leq L\}

satisfies (i) and (ii) above if we take


and p is a sufficiently large prime.

Proof: For (i), we have |\mathcal{A}|=L\leq p^\alpha. For (ii), we claim it suffices to show that for any n\in\{1,\ldots,2m\} [note: the paper contains a typo here, namely n\geq 2m], any distinct x,x_1,\ldots,x_n\in\{1,\ldots,L\}, and any nonzero integers \lambda_1,\ldots,\lambda_n such that |\lambda_1|+\cdots+|\lambda_n|\leq 2m, then

\displaystyle{V=\sum_{j=1}^n\frac{\lambda_j}{(x-x_j)(x+x_j+U)}\quad (1)}

is nonzero (in \mathbb{F}_p). To see this, define a:=x^2+Ux and a_j:=x_j^2+Ux_j. Then


As such, if (ii) fails to hold, then subtracting the right-hand side of (2.4) from the left-hand side produces an example of V which is zero, violating our statement. Thus, our statement implies (ii) by the contrapositive.

We now seek to prove our statement. To this end, define D_1:=\prod_{j=1}^n(x-x_j) and D_2:=\prod_{j=1}^n(x+x_j+U). Note that (1) being nonzero in \mathbb{F}_p is equivalent to having p not divide D_1D_2V as an integer. To prove this, we will show that

(a) p does not divide D_1,

(b) D_2V is nonzero, and

(c) |D_2V|<p.

Indeed, (b) and (c) together imply that p does not divide D_2V, which combined with (a) implies that p does not divide D_1D_2V.

For (a), we have D_1\neq0 since x and the x_j‘s are distinct by assumption, and since x and each x_j has size at most L, we also have |D_1|\leq L^{2m}. To complete the proof of (a), it then suffices to have

(a’) L^{2m}<p,

which we will verify later.

For (b), we will prove that D_1D_2V is nonzero. We first write


For each term in the above sum, note that both fractions are integers, and for every j\neq1, x+x_1+U is a factor of D_2/(x+x_j+U). As such, the entire sum is congruent to the first term modulo x+x_1+U:


Each factor of the form x+x_j+U can be further simplified:

x+x_j+U=(x_j-x_1)+(x+x_1+U)\equiv x_j-x_1\mod(x+x_1+U),

and so

\displaystyle{D_1D_2V\equiv V_1\mod(x+x_1+U)},



To prove that D_1D_2V is nonzero, it suffices to show that x+x_1+U does not divide V_1. To this end, we first note that V_1 is nonzero since x and the x_j‘s are distinct by assumption. Next, since |\lambda_1|\leq 2m and x and each x_j is at most L, we have |V_1|\leq 2mL^{2n-2}\leq 2mL^{4m-2}, and so it suffices to have

(b’) 2mL^{4m-2}\leq U

since U<x+x_1+U. We will verify this later.

For (c), we have

\displaystyle{|D_2V|\leq\sum_{j=1}^n\frac{|\lambda_j|}{|x-x_j|}\prod_{\substack{j'=1\\j'\neq j}}^n|x+x_{j'}+U|}\leq\bigg(\sum_{j=1}^n|\lambda_j|\bigg)\cdot(2L+U)^{n-1}.

Considering our assumption on the \lambda_j‘s we then have |D_2V|\leq 2m(2L+U)^{n-1}\leq2m(2L+U)^{2m-1}, and so it suffices to have

(c’) 2m(2L+U)^{2m-1}<p.

Overall, we have shown it suffices to have

(a’) L^{2m}<p,

(b’) 2mL^{4m-2}\leq U, and

(c’) 2m(2L+U)^{2m-1}<p.

To satisfy (b’) for sufficiently large L, we take U:=\lfloor L^{4m-2+\epsilon}\rfloor. Then L=o(U), and so 2m(2L+U)^{2m-1}<U^{2m-1+\epsilon}\leq L^{(4m-2+\epsilon)(2m-1+\epsilon)} for sufficiently large U. As such, for (c’), it suffices to take L:=\lfloor p^{1/(4m-2+\epsilon)(2m-1+\epsilon)}\rfloor, which also satisfies (a’).     \Box

— The main result —

Before things get messy, let’s discuss the structure of the proof of the main result (i.e., Lemma 2 in the paper). The goal is to prove flat-RIP-type cancellations, namely that

\displaystyle{|S(A_1,A_2)|:=\bigg|\sum_{\substack{a_1\in A_1\\a_2\in A_2}}\sum_{\substack{b_1\in\Omega_1(a_1)\\b_2\in\Omega_2(a_2)}}\bigg(\frac{a_1-a_2}{p}\bigg)e_p\bigg(\frac{(b_1-b_2)^2}{2(a_1-a_2)}\bigg)\bigg|\leq p^{1-\epsilon_1-\epsilon}}

for subsets A_1,A_2\subseteq\mathcal{A}. (Actually, we get to assume that these subsets and the \Omega_i(a_i)‘s satisfy certain size constraints since we have an extra -\epsilon in the power of p; having this will imply the general case without the \epsilon.) Thanks to the construction of \mathcal{B}, Lemma 9 gives sufficient cancellation when each A_i is sufficiently small. In the remaining case, Bourgain et al. prove sufficient cancellation by invoking Lemma 10, which concerns the following quantity:

\displaystyle{T_{a_1}(A_2,B):=\sum_{\substack{b_1\in B\\a_2\in A_2,~b_2\in\Omega_2(a_2)}}\bigg(\frac{a_1-a_2}{p}\bigg)e_p\bigg(\frac{(b_1-b_2)^2}{4(a_1-a_2)}\bigg)}.

Specifically, Lemma 10 gives that |T_{a_1}(A_2,B)| is small provided B has sufficient additive structure. In the proof of Lemma 2, they take a maximal subset B_0\subseteq\Omega_1(a_1) such that |T_{a_1}(A_2,B_0)| is small, and then they use Lemma 10 show that \Omega_1(a_1)\setminus B_0 necessarily has little additive structure. By Lemma 9, this in turn forces |T_{a_1}(A_2,B_1)| to be small, and so |T_{a_1}(A_2,\Omega_1(a_1))| (and furthermore |S(A_1,A_2)|) are also small due to a triangle inequality.

The remainder of this section mimics the proofs of Lemmas 2 and 10, but instead of arbitrarily picking certain constants, we leave them as variables which must satisfy certain constraints. We then solve the corresponding constrained optimization in the next section. What follows is a generalized version of the statement of Lemma 10, which we then use to prove a generalized version of Lemma 2. Later, we will prove a generalized version of Lemma 10 (which says that certain constraints imply the following statement, and thus Lemma 2):

Definition. Let \mathrm{L10}=\mathrm{L10}[\alpha_1,\alpha_2,k_0,k_1,k_2,m,y] denote the following statement about the construction of \mathcal{A} and \mathcal{B}: For every \epsilon>0, there exists P>0 such that for every prime p\geq P the following holds:

Take \Omega_1,\Omega_2\subseteq\mathcal{A}\times\mathcal{B} such that

|A_2|\geq p^{y}, \quad (6.2)

and for which there exist powers of two M_1,M_2 such that

\frac{M_i}{2}\leq |\Omega_i(a_i)|<M_i \quad (2.9\mbox{a})


|A_i|M_i\leq 2\sqrt{p} \quad (2.9\mbox{b})

for i=1,2 and for every a_i\in A_i. Then for every B\subseteq\mathbb{F}_p such that

p^{1/2-\alpha_1}\leq|B|\leq p^{1/2} \quad (6.3)


|B-B|\leq p^{\alpha_2}|B|, \quad (6.4)

we have

|T_{a_1}(A_2,B)|\leq|B|p^{1/2-\epsilon_2+\epsilon} \quad (6.5)

with \epsilon_2=k_0y-(k_1\alpha_1+k_2\alpha_2)/m for every a_1\in A_1.

Lemma 2′. Pick \epsilon>0 and an even integer m. According to the construction of \mathcal{A}, define \alpha=1/(2(2m-1)^2). According to the construction of \mathcal{B}, take M=\lceil 2^{8m^2-1+\epsilon}\rceil, and let \tau denote the solution to


Pick \epsilon_1,\gamma,\ell,x such that L10 holds with

\epsilon_1+\epsilon<\alpha_1-\alpha-(4/3)x-\epsilon, \quad (A)

\ell\leq 1/2+(4/3)x-\alpha_1+\epsilon/2, \quad (B)

(2\tau-1)(\ell-\gamma)\geq10\gamma, \quad (C)

\epsilon_1+\epsilon<\gamma/4-y/4-\epsilon, \quad (D)

\alpha_2\geq 9x+\epsilon, \quad (E)

\epsilon_2\leq x/8-\alpha/4, \quad (F)

\epsilon_1+\epsilon<\epsilon_2. \quad (G)

Then there exists P=P(\epsilon) such that the following holds for every prime p\geq P:

Take \Omega_1,\Omega_2\subseteq\mathcal{A}\times\mathcal{B} for which there exist powers of two M_1,M_2 such that

\frac{M_i}{2}\leq |\Omega_i(a_i)|<M_i \quad (2.9\mbox{a})


|A_i|M_i\leq 2\sqrt{p} \quad (2.9\mbox{b})

for i=1,2 and for every a_i\in A_i. Then |S(A_1,A_2)|\leq p^{1-\epsilon_1-\epsilon}.

Proof: First note that |A_1|M_1<p^{1/2+(4/3)x+\alpha-\alpha_1+\epsilon} implies that


by (2.9a), and by (2.9b), we also have


As such, the triangle inequality gives that

|S(A_1,A_2)|\leq|A_1||A_2||\Omega_1(a_1)||\Omega_2(a_2)|\leq 2p^{1+(4/3)x+\alpha-\alpha_1+\epsilon}\leq p^{1-\epsilon_1-\epsilon},

where the last step uses the fact that

\epsilon_1+\epsilon<\alpha_1-\alpha-(4/3)x-\epsilon. \quad (A)

Thus, we can assume |A_1|M_1\geq p^{1/2+(4/3)x+\alpha-\alpha_1+\epsilon}, and so (2.3) gives

M_1\geq\frac{1}{|A_1|}p^{1/2+(4/3)x+\alpha-\alpha_1+\epsilon}\geq p^{1/2+(4/3)x-\alpha_1+\epsilon}. \quad (6.1)

Applying (2.9a) and (6.1) then gives

|\Omega_1(a_1)|\geq\frac{M_1}{2}\geq\frac{1}{2}p^{1/2+(4/3)x-\alpha_1+\epsilon}>p^{1/2+(4/3)x-\alpha_1+\epsilon/2}\geq p^\ell,

where the last step uses the fact that

\ell\leq 1/2+(4/3)x-\alpha_1+\epsilon/2. \quad (B)

Note that we can redo all of the preceding analysis by interchanging indices 1 and 2. As such, we also have |\Omega_2(a_2)|>p^\ell. (This will enable us to use Corollary 4.) At this point, we bound


using Lemma 9:


Next, since (C) holds and |\Omega_1(a_1)|,|\Omega_2(a_2)|>p^\ell, Corollary 4 with \epsilon\leftarrow4\epsilon gives


At this point, the triangle inequality gives

\displaystyle{|S(A_1,A_2)|\leq\sum_{\substack{a_1\in A_1\\a_2\in A_2}}\bigg|\sum_{\substack{b_1\in\Omega_1(a_1)\\b_2\in\Omega_2(a_2)}}e_p\bigg(\frac{(b_1-b_2)^2}{4(a_1-a_2)}\bigg)\bigg|\leq\sum_{\substack{a_1\in A_1\\a_2\in A_2}}|\Omega_1(a_1)|^{7/8}|\Omega_2(a_2)|^{7/8}p^{1/8-\gamma/4+\epsilon}}

which can be further bounded using (2.9a) and (2.9b):

\leq 2^{7/4}|A_1|^{1/8}|A_2|^{1/8}p^{1-\gamma/4+\epsilon}

Thus, if |A_1|,|A_2|<p^y, then

|S(A_1,A_2)|\leq2^{7/4}p^{1+y/4-\gamma/4+\epsilon}\leq p^{1-\epsilon_1-\epsilon},

where the last step uses the fact that

\epsilon_1+\epsilon<\gamma/4-y/4-\epsilon. \quad (D)

As such, we may assume that either |A_1| or |A_2| is \geq p^y. Without loss of generality, we assume

|A_2|\geq p^y. \quad (6.2)

(This will enable us to use L10.) At this point, take B_0\subseteq\Omega_1(a_1) to be a maximal subset satisfying (6.5) for B=B_0, and denote B_1:=\Omega_1(a_1)\setminus B_0. Then the triangle inequality gives

\displaystyle{|T_{a_1}(A_2,B_1)|\leq\sum_{a_2\in A_2}\bigg|\sum_{\substack{b_1\in B_1\\b_2\in\Omega_2(a_2)}}e_p\bigg(\frac{(b_1-b_2)^2}{4(a_1-a_2)}\bigg)\bigg|},

and then Lemma 9 gives

\displaystyle{\leq\sum_{a_2\in A_2}|B_1|^{1/2}E(B_1,B_1)^{1/8}|\Omega_2(a_2)|^{1/2}E(\Omega_2(a_2),\Omega_2(a_2))^{1/8}p^{1/8}}.

This can be bounded further by applying E(\Omega_2(a_2),\Omega_2(a_2))\leq|\Omega_2(a_2)|^3, (2.9a), (2.9b) and (2.3):

\leq 2^{7/8}|B_1|^{1/2}E(B_1,B_1)^{1/8}p^{\alpha/8+9/16}. \quad (*)

At this point, we claim that E(B_1,B_1)\leq p^{-x}M_1^3. To see this, suppose otherwise. Then |B_1|^3\geq E(B_1,B_1)>p^{-x}M_1^3, implying

|B_1|>p^{-x/3}M_1, \quad (**)

and by (2.9a), we also have

E(B_1,B_1)>p^{-x}M_1^3>p^{-x}|\Omega_1(a_1)|^3\geq p^{-x}|B_1|^3.

Thus, Corollary 1 with K=p^x produces a subset B_1'\subseteq B_1 such that

|B_1'|\geq\frac{|B_1|}{20p^x}>\frac{M_1}{20p^{(4/3)x}}\geq\frac{1}{20}p^{1/2-\alpha_1+\epsilon}\geq p^{1/2-\alpha_1}

where the second and third inequalities follow from (**) and (6.1), respectively, and

|B_1'-B_1'|\leq 10^7p^{9x}|B_1|\leq p^{9x+\epsilon}|B_1|\leq p^{\alpha_2}|B_1|,

where the last step follows from the fact that

\alpha_2\geq 9x+\epsilon. \quad (E)

As such, |B_1'| satisfies (6.3) and (6.4), implying that B=B_1' satisfies (6.5) by L10. By the triangle inequality, B=B_0\cup B_1' also satisfies (6.5), contradicting B_0‘s maximality. We conclude that E(B_1,B_1)\leq p^{-x}M_1^3, and continuing (*) gives


Now we apply (6.5) to B=B_0 and combine with this to get



Applying |B_0|,|B_1|\leq|\Omega_1(a_1)|\leq M_1 by (2.9a) then gives

|T_{a_1}(A_2,\Omega_1(a_1))|\leq M_1p^{1/2-\epsilon_1}+2^{7/8}M_1^{7/8}p^{9/16+\alpha/8-x/8}.

Now we apply the triangle inequality to get

\displaystyle{|S(A_1,A_2)|\leq\sum_{a_1\in A_1}|T_{a_1}(A_2,\Omega_1(a_1))|\leq|A_1|\Big(M_1p^{1/2-\epsilon_1}+2^{7/8}M_1^{7/8}p^{9/16+\alpha/8-x/8}\Big)}

and applying (2.9b) and (2.3) then gives

\leq 2p^{1-\epsilon_2}+2^{7/4}p^{1+\alpha/4-x/8}\leq2p^{1-\epsilon_2}+2^{7/4}p^{1-\epsilon_2}\leq p^{1-\epsilon_1-\epsilon},

where the last steps use the facts that

\epsilon_2\leq x/8-\alpha/4 \quad (F)


\epsilon_1+\epsilon<\epsilon_2. \quad (G)

This completes the proof.     \Box

Now we wish to determine when L10 is true:

Lemma 10′. L10 is true with k_0=c_0/8, k_1=1/4 and k_2=9/8 provided

my\leq\min\{\frac{1}{2}-\alpha_1,\frac{1}{2}-\alpha_2\} \quad (H)


3\alpha_2-2\alpha_1\leq(2-c_0)my. \quad (I)

Proof: [The proof is identical to the original until immediately after equation (6.14).] At this point, we have

\lambda_1(\xi):=\frac{\lambda(\xi)}{|A_2|^m}, \quad \|\lambda\|_2^2\leq m!|A_2|^m.



where the last step uses (6.2). Next, (6.3) and (6.4) together give

|B-B|\geq|B|\geq p^{1/2-\alpha_1}


|B-B|\leq p^{\alpha_2}|B|\leq p^{1/2+\alpha_2}.


\|\lambda_1\|_2+|B-B|^{-1/2}+|B-B|^{1/2}p^{-1/2}\leq \sqrt{m!}p^{-my/2}+p^{\alpha_1/2-1/4}+p^{\alpha_2/2-1/4}

\leq p^{-my/2+4m\epsilon},

where the last step follows from the fact that

my\leq\min\{\frac{1}{2}-\alpha_1,\frac{1}{2}-\alpha_2\}. \quad (H)

So, by (6.12) and (6.14) [following the original proof], we have

\|\zeta*\zeta\|_2\ll |A_2|^{2m}p^{-(c_0/2)my+4c_0m\epsilon}|B-B|^{3/2},

and subsequent application of (6.9), (6.10), and (6.11) gives

\displaystyle{\sum_{b_1,b\in B}|F(b,b_1)|^m\leq(\tfrac{m}{2})!(M_2|A_2|)^m|A_2|^{-m/2}|B-B||B+B|}

+O(M_2^m|A_2|^m|B-B|^{3/4}|B+B|^{3/4}p^{-(c_0/4)my+2c_0m\epsilon}p^{1/4}). \quad (***)

By Lemma 4 (which states that |A+A|\leq|A-A|^2/|A|), condition (6.4) implies that

|B+B|\leq\frac{|B-B|^2}{|B|}\leq p^{2\alpha_2}|B|.

We now use this with (2.9b), (6.2) and (6.4) to bound (***):


Next, the left-hand inequality of (6.3) gives that p^{1/4}\leq|B|^{1/2}p^{\alpha_1/2}, leading to the following bound:


Overall, we have

\displaystyle{\sum_{b_1,b\in B}|F(b,b_1)|^m\leq|B|^2p^{m/2+\alpha_1/2+(9/4)\alpha_2-(c_0/4)my+2m\epsilon}},

since c_0<1 and

3\alpha_2-2\alpha_1\leq(2-c_0)my. \quad (I)

Thus, (6.8) gives



Finally, taking square roots produces the result.     \Box

— Maximizing \epsilon_0

In this last section, we optimize the constants from the previous section to make \epsilon_0 as large as possible. If we ignore the \epsilon‘s, then the inequalities (A)(I) can be simplified quite a bit. In fact, the existence of x and \ell satisfying these is equivalent to having

\displaystyle{\left\{\begin{array}{rcl}  2\alpha+8\epsilon_2&<&\min\{\frac{3}{4}(\alpha_1-\alpha-\epsilon_2),\frac{\alpha_2}{9}\}\\  \frac{2\tau+9}{2\tau-1}(4\epsilon_2+y)&<&\min\{\frac{1}{2}-\alpha-\epsilon_2,\frac{1}{2}-\alpha_1+\frac{4\alpha_2}{27}\}\\  my&<&\min\{\frac{1}{2}-\alpha_1,\frac{1}{2}-\alpha_2\}\\  3\alpha_2-2\alpha_1&<&(2-c_0)my  \end{array}\right.}

Note that m determines \alpha and \tau, and so for each m, the above system determines a polytope over which we seek to maximize the (linear) objective


I performed this optimization for various values of m, and for m=53,000,000, I got \epsilon_2\approx 8.8933\times 10^{-24}. [I wanted to upload the Mathematica notebook file I used to simplify the system of inequalities and optimize, but it doesn’t seem to be an allowed file type for WordPress…] Dividing by 2 then gives a new world record:

\epsilon_0\approx 4.4466\times 10^{-24}.

While this optimization did make a substantial improvement (this is over 300 times larger than the previous record, and over 8,000 times larger than the original record of Bourgain et al.), the constant is still tiny! For this particular construction of \mathcal{A} and \mathcal{B}, I think the remaining bottlenecks lie at the very foundations of additive combinatorics. For example, if c_0=1/2, then taking m=10,000 leads to \epsilon_2 being on the order of 10^{-12}.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s