Joey Iverson recently posted our latest paper with John Jasper on the arXiv. This paper can be viewed as a sequel of sorts to our previous paper, in which we introduced the idea of hunting for Gram matrices of equiangular tight frames (ETFs) in the adjacency algebras of association schemes, specifically group schemes. In this new paper, we focus on the so-called Schurian schemes. This proved to be a particularly fruitful restriction: We found an alternate construction of Hoggar’s lines, we found an explicit representation of the “elusive” packing from the real packings paper (based on a private tip from Henry Cohn), we found an packing involving the Mathieu group (this one beating the corresponding packing in Sloane’s database), we found some low-dimensional mutually unbiased bases, and we recovered nearly all small sized ETFs. In addition, we constructed the first known infinite family of ETFs with Heisenberg symmetry; while these aren’t SIC-POVMs, we suspect they are related to the objects of interest in Zauner’s conjecture (as in this paper, for example). This blog entry briefly describes the main ideas in the paper.
Marco Mondelli recently posted his latest paper on the arXiv (joint work with Andrea Montanari). This paper proves sharp guarantees for weak recovery in phase retrieval. In particular, given phaseless measurements against Gaussian vectors, they demonstrate that a properly tuned spectral estimate exhibits correlation with the ground truth, even when the sampling rate is at the information-theoretic limit. In addition, they show that their spectral estimate empirically performs well even when the measurements follow a more realistic coded diffraction model. I decided to reach out to Marco to learn more, and what follows is my interview. I’ve lightly edited his responses for formatting and hyperlinks:
DGM: Judging by your website, this project in phase retrieval appears to be a departure from your coding theory background. How did this project come about?
MM: Many of the tools employed in information and coding theory are very general and they prove useful also to solve problems in other fields, such as, compressed sensing, machine learning or data analysis. So this is the general philosophy that motivated my “detour”.
This summer, I participated in several interesting conferences. This entry documents my slides and describes a few of my favorite talks from the summer. Here are links to my talks:
- Packings in real projective spaces, FoCM and SPIE
- Explicit restricted isometries, ILAS
- Probably certifiably correct k-means clustering, ILAS
- Equiangular tight frames from association schemes, SIAM AG17
- Open problems in finite frame theory, SIAM AG17
UPDATE: SIAM AG17 just posted a video of my talk.
Now for my favorite talks from FoCM, ILAS, SIAM AG17 and SPIE:
In machine learning, you hope to fit a model so as to be good at prediction. To do this, you fit to a training set and then evaluate with a test set. In general, if a simple model fits a large training set pretty well, you can expect the fit to generalize, meaning it will also fit the test set. By conventional wisdom, if the model happens to fit the training set exactly, then your model is probably not simple enough, meaning it will not fit the test set very well. According to Ben, this conventional wisdom is wrong. He demonstrates this by presenting some observations he made while training neural nets. In particular, he allowed the number of parameters to far exceed the size of the training set, and in doing so, he fit the training set exactly, and yet he still managed to fit the test set well. He suggested that generalization was successful here because stochastic gradient descent implicitly regularizes. For reference, in the linear case, stochastic gradient descent (aka the randomized Kaczmarz method) finds the solution of minimal 2-norm, and it converges faster when the optimal solution has smaller 2-norm. Along these lines, Ben has some work to demonstrate that even in the nonlinear case, fast convergence implies generalization.
There has been a lot of work recently on constructing line packings that achieve equality in either the Welch bound or the orthoplex bound. It has proven much harder to pack in regimes where these bounds are not tight. To help fill this void, about a month ago, I posted a new paper with Matt Fickus and John Jasper on the arXiv. We provide a few results to approach this case, which I outline below.
1. The minimal coherence of 6 unit vectors in is 1/3.
The Welch bound is known to not be tight whenever lies strictly between and (see the next section for a proof sketch). As such, new techniques are required to prove optimality in this range. We leverage ideas from real algebraic geometry to show how to solve the case of vectors in for all sufficiently small . For example, our method provides a new proof of the optimality of 5 non-antipodal vertices of the icosahedron in , as well as the optimality of Sloane’s packing of 6 lines in .
Vlad Voroninski recently posted an arXiv preprint with Paul Hand that provides compressed sensing guarantees using a neural net-based generative signal model. This offers some theoretical justification for the shocking empirical results presented in the “Compressed sensing using generative models” paper, which demonstrates compressed sensing with 10 times fewer measurements than conventional compressed sensing (the source code is available here). I was especially excited to see this paper, having recently read Michael Elad’s editorial on deep learning. To learn more, I interviewed Vlad (see below); I’ve lightly edited his responses for formatting and hyperlinks:
DGM: What is the origin story of this project? Were you and Paul inspired by the “Compressed sensing using generative models” paper?
VV: I have been working extensively with applied deep learning for the last year or so, and have been inspired by recent applications of deep generative image priors to classical inverse problems, such as the super resolution work by Fei Fei Li et al. Moreover, recent work on regularizing with deep generative priors for synthesizing the preferred inputs to neural activations, by Yosinski et al., made me optimistic that GAN-based generative priors are capturing sophisticated natural image structure (the synthetic images obtained in this paper look incredibly realistic).
Last week, I visited Joey Iverson at the University of Maryland, and we spent a lot of time working through different approaches to Zauner’s conjecture. In general, my relationship with this problem is very similar to Steve Flammia‘s description, as paraphrased by smerkel on Physics Stack Exchange:
[Flammia described] the SIC-POVM problem as a “heartbreaker” because every approach you take seems super promising but then inevitably fizzles out without really giving you a great insight as to why.
Case in point, Joey and I identified a promising approach involving ideas from our association schemes paper. We were fairly optimistic, and Joey even bet me $5 that our approach would work. Needless to say, I now have this keepsake from Joey:
While our failure didn’t offer any great insights (as Flammia predicted), the experience forced me to review the literature on Zauner’s conjecture a lot more carefully. A few things caught my eye, and I’ll discuss them here. Throughout, SIC denotes “symmetric informationally complete line set” and WH denotes “the Weyl-Heisenberg group.”
Two years ago, I blogged about Tuan-Yow Chien’s PhD thesis, which proved Zauner’s conjecture in dimension 17. The idea was to exploit certain conjectures on the field structure of SIC-POVM fiducial vectors so as to round numerical solutions to exact solutions. This week, the arXiv announced Chien’s latest paper (coauthored with Appleby, Flammia and Waldron), which extends this work to find exact solutions in 8 new dimensions.
The following line from the introduction caught my eye:
For instance the print-out for exact fiducial 48a occupies almost a thousand A4 pages (font size 9 and narrow margins).
As my previous blog entry illustrated, the description length of SIC-POVM fiducial vectors appears to grow rapidly with . However, it seems that the rate of growth is much better than I originally thought. Here’s a plot of the description lengths of the known fiducial vectors (the new ones due to ACFW17 — available here — appear in red):