## Optimal line packings from association schemes

Joey Iverson recently posted our paper with John Jasper on the arXiv. As the title suggests, this paper explains how to construct optimal line packings (specifically, equiangular tight frames) using association schemes. In particular, we identify many schemes whose adjacency algebra contains the Gram matrix of an ETF. This unifies several existing constructions, and also enabled us to construct the first known infinite family of ETFs that arise from nonabelian groups.

John is on the job market this year, and when reading his research statement, I was struck by his discussion of our paper, so I asked him to expand his treatment to a full blown blog entry. Without further ado, here is John’s guest blog post (which I’ve lightly edited for hyperlinks and formatting):

## Recent developments in equiangular tight frames, II

Equiangular tight frames (ETFs) are optimal packings of lines through the origin. At the moment, they are the subject of a rapidly growing literature. In fact, there have been quite a few updates since my last post on this subject (less than five months ago), and I’ve revamped the table of ETFs accordingly. What follows is a brief discussion of the various developments:

1. There is an ETF of 76 vectors in $\mathbb{C}^{19}$

See this paper. Last time, I mentioned a recent proof that there is no ETF of 76 vectors in $\mathbb{R}^{19}$. It turns out that a complex ETF of this size does exist. To prove this, it actually seems more natural to view the vectors as columns of a $20\times 76$ matrix whose row vectors sum to zero. As a lower-dimensional example, consider the following matrix:

$\displaystyle{\left[\begin{array}{cccccccccc}+&+&+&+&+&+&+&+&+&+\\+&+&+&+&-&-&-&-&-&-\\+&-&-&-&+&+&+&-&-&-\\-&+&-&-&+&-&-&+&+&-\\-&-&+&-&-&+&-&+&-&+\\-&-&-&+&-&-&+&-&+&+\end{array}\right]}$

## The Voronoi Means Conjecture

UPDATE (July 26, 2016): Boris Alexeev recently disproved the Voronoi Means Conjecture! In particular, he found that certain stable isogons fail to exhibit the conjectured behavior, and his solution suggests a certain refinement of the conjecture. I asked him to write a guest blog entry about his solution, so expect to hear more in the coming weeks.

Suppose you’re given a sample from an unknown balanced mixture of $k$ spherical Gaussians of equal variance in dimension $d$:

In the above example, $k=3$ and $d=2$. How do you estimate the centers $\{\gamma_i\}_{i=1}^k$ of each Gaussian from the data? In this paper, Dasgupta provides an algorithm in which you project the data onto a randomly drawn subspace of some carefully selected dimension so as to concentrate the data points towards their respective centers. After doing so, there will be $k$ extremely popular regions of the subspace, and for each region, you can average the corresponding points in the original dataset to estimate the corresponding Gaussian center. With this algorithm, Dasgupta proved that

$\displaystyle{\mathrm{MSE}:=\frac{1}{k}\sum_{i=1}^k\|\hat{\gamma}_i-\gamma_i\|^2\lesssim d\sigma^2 \qquad\text{whp}}$

provided $\mathrm{SNR}:=\frac{1}{\sigma^2}\min_{i\neq j}\|\gamma_i-\gamma_j\|^2\gtrsim d$.

## Clustering noisy data with semidefinite relaxations

Soledad Villar recently posted our latest paper on the arXiv (this one coauthored by her advisor, Rachel Ward). The paper provides guarantees for the k-means SDP when the points are drawn from a subgaussian mixture model. This blog entry will discuss one of the main ideas in our analysis, which we borrowed from Guedon and Vershynin’s recent paper.

The first application comes from graph clustering. Consider the stochastic block model, in which the $n$ vertices are secretly partitioned into two communities, each of size $n/2$, and edges between vertices of a common community are drawn iid with some probability $p$, and all other edges are drawn with probability $q. The goal of community estimation is to estimate the communities given a random draw of the graph. For this task, you might be inclined to find the maximum likelihood estimator for this model, but this results in an integer program. Relaxing the program leads to a semidefinite program, and amazingly, this program is tight and recovers the true communities with high probability when $p=(\alpha\log n)/n$ and $q=(\beta\log n)/n$ for good choices of $(\alpha,\beta)$. (See this paper.) These edge probabilities scale like the threshold for connected Erdos-Renyi graphs, and this makes sense since we wouldn’t know how to assign vertices in isolated components. If instead, the probabilities were to scale like $1/n$, then we would be in the “giant component” regime, so we’d still expect enough signal to correctly assign a good fraction of the vertices, but the SDP is not tight in this regime.

## Recent developments in equiangular tight frames

Equiangular tight frames (ETFs) are optimal packings of lines through the origin. Last year, Matt and I tabulated all known existence results for ETFs. Since then, there have been a few interesting developments on the subject, so I’ll have to update the table soon. In the meantime, this blog entry covers the highlights.

First, some context: In the real case, ETFs are in one-to-one correspondence with certain strongly regular graphs (SRGs). Waldron’s paper on the subject is the standard modern treatment of this correspondence. SRGs have been an active area of research for a few decades, and Brouwer’s table provides a survey of the various known constructions. This suggests a straightforward program for constructing real ETFs: Go to Brouwer’s table, find an SRG of the requisite size, investigate the reference that constructs that graph, follow Waldron’s treatment to construct the corresponding Gram matrix, and then decompose the Gram matrix to get the desired ETF.

Now for the news:

## Probably certifiably correct algorithms

This post is based on two papers (one and two). The task is to quickly solve typical instances of a given problem, and to quickly produce a certificate of that solution. Generally, problems of interest are NP-hard, and so we consider a random distribution on problem instances with the philosophy that real-world instances might mimic this distribution. In my community, it is common to consider NP-hard optimization problems:

minimize $f(x)$ subject to $x\in S$.     (1)

In some cases, $f$ is convex but $S$ is not, and so one might relax accordingly:

minimize $f(x)$ subject to $x\in T$,     (2)

where $T\supseteq S$ is some convex set. If the minimizer of (2) happens to be a member of $S$, then it’s also a minimizer of (1) — when this happens, we say the relaxation is tight. For some problems (and distributions on instances), the relaxation is typically tight, which means that (1) can be typically solved by instead solving (2); for example, this phenomenon occurs in phase retrieval, in community detection, and in geometric clustering. Importantly, strong duality ensures that solving the dual of the convex relaxation provides a certificate of optimality.

## Recent advances in mathematical data science

Part of the experience of giving a talk at Oberwolfach is documentation. First, they ask you to handwrite the abstract of your talk into a notebook of sorts for safekeeping. Later, they ask you to tex up an extended abstract for further documentation. This time, I gave a longer version of my SPIE talk (here are the slides). Since I posted my extended abstract on my blog last time, I figured I’d do it again:

This talk describes recent work on three different problems of interest in mathematical data science, namely, compressive classification, $k$-means clustering, and deep learning. (Based on three papers: one, two, three.)

First, compressive classification is a problem that comes on the heels of compressive sensing. In compressive sensing, one exploits the underlying structure of a signal class in order to exactly reconstruct any signal from the class given very few linear measurements of the signal. However, many applications do not require an exact reconstruction of the image, but rather a classification of that image (for example, is this a picture of a cat, or of a dog?). As such, it makes intuitive sense that the classification task might succeed given far fewer measurements than are necessary for compressive sensing.