This summer, I participated in several interesting conferences. This entry documents my slides and describes a few of my favorite talks from the summer. Here are links to my talks:
Now for my favorite talks from FoCM, ILAS, SIAM AG17 and SPIE:
Ben Recht — Understanding deep learning requires rethinking generalization
In machine learning, you hope to fit a model so as to be good at prediction. To do this, you fit to a training set and then evaluate with a test set. In general, if a simple model fits a large training set pretty well, you can expect the fit to generalize, meaning it will also fit the test set. By conventional wisdom, if the model happens to fit the training set exactly, then your model is probably not simple enough, meaning it will not fit the test set very well. According to Ben, this conventional wisdom is wrong. He demonstrates this by presenting some observations he made while training neural nets. In particular, he allowed the number of parameters to far exceed the size of the training set, and in doing so, he fit the training set exactly, and yet he still managed to fit the test set well. He suggested that generalization was successful here because stochastic gradient descent implicitly regularizes. For reference, in the linear case, stochastic gradient descent (aka the randomized Kaczmarz method) finds the solution of minimal 2-norm, and it converges faster when the optimal solution has smaller 2-norm. Along these lines, Ben has some work to demonstrate that even in the nonlinear case, fast convergence implies generalization.
Continue reading Talks from the Summer of ’17
Vlad Voroninski recently posted an arXiv preprint with Paul Hand that provides compressed sensing guarantees using a neural net-based generative signal model. This offers some theoretical justification for the shocking empirical results presented in the “Compressed sensing using generative models” paper, which demonstrates compressed sensing with 10 times fewer measurements than conventional compressed sensing (the source code is available here). I was especially excited to see this paper, having recently read Michael Elad’s editorial on deep learning. To learn more, I interviewed Vlad (see below); I’ve lightly edited his responses for formatting and hyperlinks:
DGM: What is the origin story of this project? Were you and Paul inspired by the “Compressed sensing using generative models” paper?
VV: I have been working extensively with applied deep learning for the last year or so, and have been inspired by recent applications of deep generative image priors to classical inverse problems, such as the super resolution work by Fei Fei Li et al. Moreover, recent work on regularizing with deep generative priors for synthesizing the preferred inputs to neural activations, by Yosinski et al., made me optimistic that GAN-based generative priors are capturing sophisticated natural image structure (the synthetic images obtained in this paper look incredibly realistic).
Continue reading Global Guarantees for Enforcing Deep Generative Priors by Empirical Risk
Last week, I visited Joey Iverson at the University of Maryland, and we spent a lot of time working through different approaches to Zauner’s conjecture. In general, my relationship with this problem is very similar to Steve Flammia‘s description, as paraphrased by smerkel on Physics Stack Exchange:
[Flammia described] the SIC-POVM problem as a “heartbreaker” because every approach you take seems super promising but then inevitably fizzles out without really giving you a great insight as to why.
Case in point, Joey and I identified a promising approach involving ideas from our association schemes paper. We were fairly optimistic, and Joey even bet me $5 that our approach would work. Needless to say, I now have this keepsake from Joey:
While our failure didn’t offer any great insights (as Flammia predicted), the experience forced me to review the literature on Zauner’s conjecture a lot more carefully. A few things caught my eye, and I’ll discuss them here. Throughout, SIC denotes “symmetric informationally complete line set” and WH denotes “the Weyl-Heisenberg group.”
Continue reading Notes on Zauner’s conjecture
Two years ago, I blogged about Tuan-Yow Chien’s PhD thesis, which proved Zauner’s conjecture in dimension 17. The idea was to exploit certain conjectures on the field structure of SIC-POVM fiducial vectors so as to round numerical solutions to exact solutions. This week, the arXiv announced Chien’s latest paper (coauthored with Appleby, Flammia and Waldron), which extends this work to find exact solutions in 8 new dimensions.
The following line from the introduction caught my eye:
For instance the print-out for exact fiducial 48a occupies almost a thousand A4 pages (font size 9 and narrow margins).
As my previous blog entry illustrated, the description length of SIC-POVM fiducial vectors appears to grow rapidly with . However, it seems that the rate of growth is much better than I originally thought. Here’s a plot of the description lengths of the known fiducial vectors (the new ones due to ACFW17 — available here — appear in red):
Continue reading Zauner’s conjecture is true in dimensions 18, 20, 21, 30, 31, 37, 39 and 43
Soledad Villar recently posted her latest paper on the arXiv (joint work with Afonso Bandeira, Andrew Blumberg and Rachel Ward). This paper reduces an instance of cutting-edge data science (specifically, shape matching and point-cloud comparison) to a semidefinite program, and then investigates fast solvers using non-convex local methods. (Check out her blog for an interactive illustration of the results.) Soledad is on the job market this year, and I read about this paper in her research statement. I wanted to learn more, so I decided to interview her. I’ve lightly edited her responses for formatting and hyperlinks:
Continue reading A polynomial-time relaxation of the Gromov-Hausdorff distance
Joey Iverson recently posted our paper with John Jasper on the arXiv. As the title suggests, this paper explains how to construct optimal line packings (specifically, equiangular tight frames) using association schemes. In particular, we identify many schemes whose adjacency algebra contains the Gram matrix of an ETF. This unifies several existing constructions, and also enabled us to construct the first known infinite family of ETFs that arise from nonabelian groups.
John is on the job market this year, and when reading his research statement, I was struck by his discussion of our paper, so I asked him to expand his treatment to a full blown blog entry. Without further ado, here is John’s guest blog post (which I’ve lightly edited for hyperlinks and formatting):
Continue reading Optimal line packings from association schemes