Achieving the Welch bound with difference sets

Equiangular tight frames (ETFs) are of particular interest throughout frame theory, and they are used in various sparse signal processing applications such as compressed sensing. The following paper describes how to build what are called harmonic ETFs:

Achieving the Welch bound with difference sets

Pengfei Xia, Shengli Zhou, Georgios B. Giannakis

Let F denote the N\times N discrete Fourier transform (DFT) matrix, and build \Phi by collecting M rows of F and normalizing the columns. Such frames are called harmonic frames, and it’s easy to verify that \Phi is an M\times N unit norm tight frame. We are particularly interested in the inner products between columns of \Phi. Specifically, we want to minimize the size of the largest such inner product, called the worst-case coherence. Recall the Welch bound: the worst-case coherence is \geq\sqrt{\frac{N-M}{M(N-1)}}, with equality precisely when \Phi is an equiangular tight frame (ETF), meaning every pair of distinct frame elements has the same sized (small) inner product.

The main result of this paper characterizes which choices of rows from F make \Phi an ETF:

Theorem (Xia, Zhou, Giannakis). Build \Phi by collecting rows from the DFT which are indexed by \mathcal{M}. Then \Phi is an ETF if and only if \mathcal{M} is a difference set for \mathbb{Z}_N, that is, there exists a positive integer \lambda such that every nonzero member of \mathbb{Z}_N can be expressed as the difference of members of \mathcal{M} in exactly \lambda different ways.

[UPDATE: Apparently, the “if” direction of the above theorem was first suggested by Strohmer and Heath using the Singer difference set and analysis from Konig.]

This characterization is particularly useful, considering difference sets are well-studied as combinatorial designs; this makes the construction of (harmonic) ETFs particularly easy, and this is precisely the message of the paper. After proving the above theorem, the paper provides a laundry list of difference sets, but I would instead consult this book for more information on difference sets (or any other sort of combinatorial design, for that matter). The remainder of the paper is, in my opinion, less exciting; the authors propose a numerical routine to find frames of small coherence, and they give an improvement to the Welch bound for the case where the number of vectors is larger than the square of the dimension.

The heart of the paper is identifying difference sets as the way to choose rows of a DFT to build harmonic ETFs. To make this clear, this blog entry will show why such a choice works (thereby proving the “if” direction of the above theorem). First, let’s understand how the Gram matrix \Phi^*\Phi of a harmonic frame \Phi is related to the choice of rows. To this end, note that the (n,n')th entry of \Phi^*\Phi is the inner product

\displaystyle{\langle \varphi_n,\varphi_{n'}\rangle  =\sum_{m\in\mathcal{M}}\bigg(\frac{1}{\sqrt{M}}e^{-2\pi imn/N}\bigg)\overline{\bigg(\frac{1}{\sqrt{M}}e^{-2\pi imn'/N}\bigg)}  =\frac{1}{M}(F\mathbf{1}_\mathcal{M})[n-n']}.

Note that we take F to be the DFT defined by F[m,n]=e^{-2\pi imn/N}, so we don’t have a \sqrt{N} factor floating around. The above equation shows that \Phi^*\Phi is circulant, meaning all of the rows are translated versions of each other. Furthermore, the zeroth row of \Phi^*\Phi is a multiple of the Fourier transform of (the indicator function of) \mathcal{M}. Thus, minimizing the coherence of a harmonic frame corresponds to minimizing the largest nonzero Fourier coefficient of an indicator function.

Now that we’ve identified the Gram matrix with the Fourier transform of \mathcal{M}, let’s check that \Phi is an ETF when \mathcal{M} is a difference set. Considering the previous discussion, it suffices to study the nonzero Fourier coefficients of \mathcal{M}:

\displaystyle{|(F\mathbf{1}_\mathcal{M})[n]|^2  =(F\mathbf{1}_\mathcal{M})[n]\overline{(F\mathbf{1}_\mathcal{M})[n]}  =\sum_{m\in\mathcal{M}}\sum_{m'\in\mathcal{M}}e^{-2\pi i(m-m')n/N}}

\displaystyle{=\sum_{m\in\mathcal{M}}1+\sum_{\substack{m,m'\in\mathcal{M}\\ m\neq m'}}e^{-2\pi i(m-m')n/N}}.

Since \mathcal{M} is a difference set, we know every nonzero member of \mathbb{Z}_N can be expressed as a difference in exactly \lambda different ways:

\displaystyle{|(F\mathbf{1}_\mathcal{M})[n]|^2  =M+\lambda\sum_{m\in\mathbb{Z}_N\setminus\{0\}}e^{-2\pi imn/N}  =M-\lambda},

where the last step uses the fact that n\neq 0. The nonzero Fourier coefficients correspond to off-diagonal Gram matrix entries, and so \Phi is equiangular since these entries all have equal size. Actually, we can make this more evident by demonstrating Welch-bound equality. Since every nonzero member of \mathbb{Z}_N must be a difference in exactly \lambda different ways, and since there are M(M-1) total differences available, we have \lambda=\frac{M(M-1)}{N-1}. Thus,

\displaystyle{|\langle\varphi_n,\varphi_{n'}\rangle|^2=\bigg|\frac{1}{M}(F\mathbf{1}_\mathcal{M})[n-n']\bigg|^2=\frac{1}{M^2}(M-\lambda)=\frac{N-M}{M(N-1)}},

as desired.

Taking a step back, this characterization of harmonic ETFs leads to a particularly interesting idea for compressed sensing. By now, it is well known that selecting rows at random from a DFT will, with high probability, form a sensing matrix which satisfies the restricted isometry property (RIP). Here’s what’s interesting: A semi-recent development in additive combinatorics is the notion of Fourier-pseudorandomness. Specifically, \mathcal{M}\subseteq\mathbb{Z}_N is considered to be Fourier-pseudorandom if the largest nonzero Fourier coefficient of \mathcal{M} is small. Since random subsets are Fourier-pseudorandom with high probability, this definition is reasonable. It’s also useful, as it guarantees certain desirable properties, such as random-like intersections. The reader is invited to google “Fourier-pseudorandom” for more information. All I care to mention is the following: Random sets correspond to RIP harmonic frames in the same way that Fourier-pseudorandom sets correspond to incoherent harmonic frames. But perhaps Fourier-pseudorandomness guarantees something much stronger. Certainly, low coherence doesn’t imply random-like RIP, but it might when you restrict to harmonic frames. This would be a monumental result for deterministic compressed sensing, and as this paper illustrates, it would have important consequences in number theory.

4 thoughts on “Achieving the Welch bound with difference sets

    1. It’s certainly relevant. Two things comes to mind. First, Meyer’s paper is explicitly motivated by the fact that harmonic frames with a prime number of columns map sparse vectors injectively. This is an immediate consequence of a theorem by Chebotarev, as I discuss and generalize here:

      Full spark frames

      Second, Meyer’s approach is quite different, since it deals with continuous functions as opposed to finite-dimensional vectors. The results are presented in the context of sampling and interpolation, which are introduced rather nicely here:

      Sampling and Interpolation

      It’s striking that Meyer gets L1-minimization guarantees from mere injectivity. (Is the reconstruction stable if the measurements are corrupted with noise?) The guarantee requires the sparse function to be nonnegative, but is there an analogous result with finite-dimensional harmonic frames?

Leave a comment