Introduction to the polynomial method (and other similar things)

The polynomial method has made some waves recently (see this and that, for instance), and last week, Boris Alexeev gave a very nice talk on various applications of this method. This post is loosely based on his talk. All errors are my own.

It’s hard to pin down what exactly the polynomial method is. It’s a technique in algebraic extremal combinatorics, where the goal is to provide bounds on the sizes of objects with certain properties. The main idea is to identify the desired cardinality with some complexity measure of an algebraic object (e.g., the dimension of a vector space, the degree of a polynomial, or the rank of a tensor), and then use algebraic techniques to estimate that complexity measure. If at some point you use polynomials, then you might say you applied the polynomial method.

What follows is a series of instances of this meta-method.

Continue reading Introduction to the polynomial method (and other similar things)

Recent developments in equiangular tight frames, II

Equiangular tight frames (ETFs) are optimal packings of lines through the origin. At the moment, they are the subject of a rapidly growing literature. In fact, there have been quite a few updates since my last post on this subject (less than five months ago), and I’ve revamped the table of ETFs accordingly. What follows is a brief discussion of the various developments:

1. There is an ETF of 76 vectors in \mathbb{C}^{19}

See this paper. Last time, I mentioned a recent proof that there is no ETF of 76 vectors in \mathbb{R}^{19}. It turns out that a complex ETF of this size does exist. To prove this, it actually seems more natural to view the vectors as columns of a 20\times 76 matrix whose rows sum to zero, so that the columns lie in the 19-dimensional orthogonal complement of the all-ones vector. As a lower-dimensional example, consider the following matrix:

\displaystyle{\left[\begin{array}{cccccccccc}+&+&+&+&+&+&+&+&+&+\\+&+&+&+&-&-&-&-&-&-\\+&-&-&-&+&+&+&-&-&-\\-&+&-&-&+&-&-&+&+&-\\-&-&+&-&-&+&-&+&-&+\\-&-&-&+&-&-&+&-&+&+\end{array}\right]}
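As a sanity check (a sketch of my own, not from the paper), the following numpy snippet verifies that these columns form an ETF for the 5-dimensional orthogonal complement of the all-ones vector in \mathbb{R}^6: the columns sum to zero, every pair of distinct columns has inner product \pm 2 (equiangularity, giving coherence 2/6=1/3 after normalization), and the frame operator acts as 12 times the identity on that subspace (tightness).

```python
import numpy as np

# The 6x10 sign matrix displayed above, with + = +1 and - = -1.
signs = ["++++++++++",
         "++++------",
         "+---+++---",
         "-+--+--++-",
         "--+--+-+-+",
         "---+--+-++"]
Phi = np.array([[1 if s == "+" else -1 for s in row] for row in signs])

# Rows sum to zero, i.e., every column is orthogonal to the all-ones
# vector, so the columns live in a 5-dimensional subspace of R^6.
assert (Phi.sum(axis=0) == 0).all()

# Equiangular: all off-diagonal Gram entries have the same magnitude.
G = Phi.T @ Phi
print(np.unique(np.abs(G[~np.eye(10, dtype=bool)])))  # [2], so coherence 2/6 = 1/3

# Tight: Phi Phi^T = 12I - 2J, which acts as 12 times the identity
# on the orthogonal complement of the all-ones vector.
print(np.array_equal(Phi @ Phi.T, 12 * np.eye(6) - 2 * np.ones((6, 6))))  # True
```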

Continue reading Recent developments in equiangular tight frames, II

SOFT 2016: Summer of Frame Theory

This month, several experts in frame theory will be visiting my department, and so Matt Fickus and I decided to organize a workshop in the style of AIM. Considering the recent progress we’ve made on equiangular tight frames (ETFs) — namely, one, two, three, and four — we are hoping this workshop will spur further progress in this area. To kick off the month, I asked a few people to prepare hour-long chalk talks, and what follows are the extended abstracts:

1. Introduction to ETFs (Dustin G. Mixon)

Given a d-dimensional Hilbert space \mathbb{H}_d and a positive integer n, we are interested in packing n lines through the origin so that the interior angle between any two is as large as possible. It is convenient to represent each line by a unit vector that spans it, and doing so reduces the problem to finding unit vectors \{\varphi_i\}_{i=1}^n that minimize coherence:

\displaystyle{\mu\Big(\{\varphi_i\}_{i=1}^n\Big)=\max_{i\neq j}|\langle \varphi_i,\varphi_j\rangle|.}

This minimization is a nonconvex optimization problem. To construct provably optimal packings, one must prove a lower bound on \mu for a given n and spatial dimension d, and then construct an ensemble that achieves equality in that bound. To date, we know of three bounds that are sharp:
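The list itself is after the jump, but the most familiar of these is the Welch bound, \mu\geq\sqrt{\frac{n-d}{d(n-1)}}, which is achieved with equality precisely by ETFs. As a quick illustration (my own sketch, not from the talk), the following compares a random packing to this bound; the 10-vector example from the previous post achieves it with n=10 and d=5, where both sides equal 1/3.

```python
import numpy as np

def coherence(Phi):
    """Max |<phi_i, phi_j>| over i != j, assuming unit-norm columns."""
    G = np.abs(Phi.conj().T @ Phi)
    np.fill_diagonal(G, 0)
    return G.max()

def welch_bound(n, d):
    """Welch lower bound on the coherence of n unit vectors in d dimensions."""
    return np.sqrt((n - d) / (d * (n - 1)))

# A random packing of 10 unit vectors in R^5 versus the bound it can't beat.
rng = np.random.default_rng(0)
Phi = rng.standard_normal((5, 10))
Phi /= np.linalg.norm(Phi, axis=0)
print(coherence(Phi))      # typically well above...
print(welch_bound(10, 5))  # ...1/3, which the ETF from the previous post achieves
```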

Continue reading SOFT 2016: Summer of Frame Theory

The Voronoi Means Conjecture

UPDATE (July 26, 2016): Boris Alexeev recently disproved the Voronoi Means Conjecture! In particular, he found that certain stable isogons fail to exhibit the conjectured behavior, and his solution suggests a certain refinement of the conjecture. I asked him to write a guest blog entry about his solution, so expect to hear more in the coming weeks.

Suppose you’re given a sample from an unknown balanced mixture of k spherical Gaussians of equal variance \sigma^2 in dimension d:

[Figure: scatter plot of a sample drawn from a mixture of three spherical Gaussians in the plane.]

In the above example, k=3 and d=2. How do you estimate the centers \{\gamma_i\}_{i=1}^k of each Gaussian from the data? In this paper, Dasgupta provides an algorithm in which you project the data onto a randomly drawn subspace of some carefully selected dimension so as to concentrate the data points towards their respective centers. After doing so, there will be k extremely popular regions of the subspace, and for each region, you can average the corresponding points in the original dataset to estimate the corresponding Gaussian center. With this algorithm, Dasgupta proved that

\displaystyle{\mathrm{MSE}:=\frac{1}{k}\sum_{i=1}^k\|\hat{\gamma}_i-\gamma_i\|^2\lesssim d\sigma^2 \qquad\text{whp}}

provided \mathrm{SNR}:=\frac{1}{\sigma^2}\min_{i\neq j}\|\gamma_i-\gamma_j\|^2\gtrsim d.
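To make the pipeline concrete, here is a toy sketch (mine, not Dasgupta's actual algorithm: in particular, I use vanilla k-means on the projected data as a stand-in for finding the k popular regions, and I pick the projection dimension by hand rather than by his analysis):

```python
import numpy as np
from itertools import permutations
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
k, d, sigma, m = 3, 50, 1.0, 3000

# Balanced mixture of k spherical Gaussians with variance sigma^2.
centers = 10 * rng.standard_normal((k, d))
labels = rng.integers(k, size=m)
X = centers[labels] + sigma * rng.standard_normal((m, d))

# Project onto a randomly drawn subspace. (Dasgupta's analysis prescribes
# a carefully selected projection dimension; 5 is an arbitrary stand-in.)
Q, _ = np.linalg.qr(rng.standard_normal((d, 5)))
Y = X @ Q

# Find the k popular regions of the subspace; k-means is a stand-in here.
regions = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(Y)

# Average the corresponding points in the ORIGINAL space to estimate centers.
est = np.array([X[regions == i].mean(axis=0) for i in range(k)])

# MSE, minimized over the k! ways of matching estimates to true centers.
mse = min(np.mean(np.sum((est[list(p)] - centers) ** 2, axis=1))
          for p in permutations(range(k)))
print(mse, d * sigma**2)  # the first should be well below the second
```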

Continue reading The Voronoi Means Conjecture

The HRT Conjecture

A couple of weeks ago, I attended a workshop hosted by Darrin Speegle on the HRT Conjecture. First posed twenty years ago by Chris Heil, Jay Ramanathan, and Pankaj Topiwala in this paper, the conjecture states that every finite collection of time-frequency shifts of any nonzero function in L^2(\mathbb{R}) is linearly independent. In this post, I will discuss some of the key ideas behind various attempts at chipping away at HRT, and I’ll describe what appear to be fundamental barriers to a complete solution. For more information, see these survey papers: one and two.

First, some notation: Denote the translation-by-a and modulation-by-b operators by

(T_af)(x):=f(x-a),\qquad (M_bf)(x):=e^{2\pi i bx}f(x),

respectively. Then the formal conjecture statement is as follows:

The HRT Conjecture. For every 0\neq f\in L^2(\mathbb{R}) and every finite \Lambda\subset\mathbb{R}^2, the collection \{M_bT_af\}_{(a,b)\in\Lambda} is linearly independent.
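One cheap way to build intuition (a toy of my own, not one of the methods below) is to play with the finite-dimensional analogue, where translation is cyclic and modulation uses N-th roots of unity; there, linear independence of a given family is just a rank computation. (Results of Lawrence, Pfander, and Walnut treat this finite setting when N is prime; the conjecture itself lives in L^2(\mathbb{R}), where no such computation is available.)

```python
import numpy as np

def tf_shift(f, a, b):
    """Finite analogue of M_b T_a: cyclically translate f by a,
    then modulate by the character x -> exp(2*pi*i*b*x/N)."""
    N = len(f)
    return np.exp(2j * np.pi * b * np.arange(N) / N) * np.roll(f, a)

rng = np.random.default_rng(0)
N = 12
f = rng.standard_normal(N) + 1j * rng.standard_normal(N)

# A finite set Lambda of time-frequency shift parameters (a, b).
Lambda = [(0, 0), (1, 0), (0, 1), (3, 7), (5, 5)]
M = np.column_stack([tf_shift(f, a, b) for (a, b) in Lambda])

# The shifts are linearly independent iff M has full column rank.
print(np.linalg.matrix_rank(M) == len(Lambda))  # True for this draw of f
```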

What follows are some of the popular methods for tackling HRT:

Continue reading The HRT Conjecture

On the low-rank approach for semidefinite programs arising in synchronization and community detection

From MaxCut to PhaseLift, semidefinite programming has proven to be rather powerful, especially for convex relaxation. SDP solvers take polynomial time, but the exponent is large, and anyone who’s run an SDP on CVX has experienced some frustration with the runtime. In practice, the SDP-optimal matrix tends to have extremely low rank, and so one may apply a rank constraint to facilitate the search for the SDP’s solution. This heuristic was first introduced by Burer and Monteiro, and it works well in practice, but the rank-constrained program is nonconvex and the theory is scant. Recently, the theory gap started to close with this paper:

On the low-rank approach for semidefinite programs arising in synchronization and community detection

Afonso S. Bandeira, Nicolas Boumal, Vladislav Voroninski

As the title suggests, this paper provides strong performance guarantees for the Burer-Monteiro heuristic in the particular cases of \mathbb{Z}_2 synchronization and community detection. I was very excited to see this paper, and so I interviewed one of the authors (Nicolas Boumal). His responses, which I’ve lightly edited for formatting and hyperlinks, appear after the jump; first, a quick illustration of the heuristic itself.
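Here is a minimal sketch (mine; the paper works with Riemannian optimization and makes precise choices of the rank p) of Burer-Monteiro for \mathbb{Z}_2 synchronization: factor the SDP variable as X=YY^\top with unit-norm rows of Y, and locally maximize \mathrm{tr}(CYY^\top) by projected gradient ascent.

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, sigma = 200, 3, 0.5

# Z_2 synchronization: observe a noisy rank-one signal C = z z^T + sigma*W.
z = rng.choice([-1.0, 1.0], size=n)
W = rng.standard_normal((n, n)); W = (W + W.T) / np.sqrt(2)
C = np.outer(z, z) + sigma * W

# Burer-Monteiro: replace the SDP variable X >= 0 with X = Y Y^T, where
# Y is n x p with unit-norm rows (encoding diag(X) = 1), and locally
# maximize tr(C Y Y^T) by projected gradient ascent.
Y = rng.standard_normal((n, p))
Y /= np.linalg.norm(Y, axis=1, keepdims=True)
step = 1.0 / np.abs(np.linalg.eigvalsh(C)).max()   # crude step size
for _ in range(200):
    Y += step * (C @ Y)                            # grad of tr(C Y Y^T) is 2 C Y
    Y /= np.linalg.norm(Y, axis=1, keepdims=True)  # project rows to the sphere

# Round: the top left singular vector of Y estimates z up to a global sign.
u = np.linalg.svd(Y, full_matrices=False)[0][:, 0]
print(abs(np.sign(u) @ z) / n)  # correlation with the truth; near 1 here
```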

Continue reading On the low-rank approach for semidefinite programs arising in synchronization and community detection

Clustering noisy data with semidefinite relaxations

Soledad Villar recently posted our latest paper on the arXiv (this one coauthored by her advisor, Rachel Ward). The paper provides guarantees for the k-means SDP when the points are drawn from a subgaussian mixture model. This blog entry will discuss one of the main ideas in our analysis, which we borrowed from Guedon and Vershynin’s recent paper.

Let’s start with two motivating applications:

The first application comes from graph clustering. Consider the stochastic block model, in which n vertices are secretly partitioned into two communities, each of size n/2; edges between vertices of a common community are drawn iid with some probability p, and all other edges are drawn with probability q<p. The goal of community estimation is to estimate the communities given a random draw of the graph. For this task, you might be inclined to find the maximum likelihood estimator, but this results in an integer program. Relaxing the program leads to a semidefinite program, and amazingly, this program is tight and recovers the true communities with high probability when p=(\alpha\log n)/n and q=(\beta\log n)/n for good choices of (\alpha,\beta). (See this paper.) These edge probabilities scale like the connectivity threshold for Erdos-Renyi graphs, and this makes sense, since we wouldn’t know how to assign vertices in isolated components. If instead the probabilities were to scale like 1/n, then we would be in the “giant component” regime, so we’d still expect enough signal to correctly assign a good fraction of the vertices, but the SDP is not tight in this regime.
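For concreteness, here is a small cvxpy sketch (mine, using one standard formulation of the relaxation; the paper cited above works with a variant) that plants two communities, solves the SDP, and rounds by the top eigenvector:

```python
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
n = 40                                   # small: generic SDP solvers are slow
p = 8.0 * np.log(n) / n                  # within-community edge probability
q = 1.0 * np.log(n) / n                  # cross-community edge probability

# Plant two communities of size n/2, labeled +1 and -1.
z = np.concatenate([np.ones(n // 2), -np.ones(n // 2)])
prob = np.where(np.outer(z, z) > 0, p, q)
A = (rng.random((n, n)) < prob).astype(float)
A = np.triu(A, 1); A = A + A.T           # symmetric adjacency, no self-loops

# SDP relaxation: maximize <A, X> over X >= 0 with unit diagonal; the
# sum-to-zero constraint encodes the two equal-sized communities.
X = cp.Variable((n, n), symmetric=True)
constraints = [X >> 0, cp.diag(X) == 1, cp.sum(X) == 0]
cp.Problem(cp.Maximize(cp.trace(A @ X)), constraints).solve()

# When the SDP is tight, X = z z^T exactly; round via the top eigenvector.
z_hat = np.sign(np.linalg.eigh(X.value)[1][:, -1])
print(abs(z_hat @ z) / n)                # 1.0 on exact recovery
```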

Continue reading Clustering noisy data with semidefinite relaxations