Smoothed analysis for low-rank solutions to semidefinite programs in quadratic penalty form

Smoothed analysis was introduced by Spielman and Teng to help explain how the simplex method efficiently solves real-world linear programs despite having a large worst-case complexity. Perhaps a natural analog to the simplex method for semidefinite programs (SDPs) is the Burer–Monteiro factorization method, in which the $n\times n$ decision variable matrix $X\succeq 0$ is assumed to have rank at most $k$, and is therefore replaced by $X=UU^T$ for some $n\times k$ matrix $U$. Recently, Srinadh Bhojanapalli, Nicolas Boumal, Prateek Jain and Praneeth Netrapalli posted a paper on the arXiv that uses smoothed analysis to analyze the landscape of this factorized problem. In particular, they show that a large class of SDPs have the property that for every cost matrix, a random perturbation of the cost matrix will ensure that, with high probability, all approximate local optima of the Burer–Monteiro problem are approximately optimal in the perturbed SDP. (To be explicit, these approximate local optima are approximate second-order stationary points, or approximate SOSPs, which exhibit a small gradient and a nearly positive semidefinite Hessian.) I reached out to the authors to learn more, and ended up interviewing both Nicolas and Srinadh. I’ve lightly edited their responses for formatting and hyperlinks:

Fundamental Limits of Weak Recovery with Applications to Phase Retrieval

Marco Mondelli recently posted his latest paper on the arXiv (joint work with Andrea Montanari). This paper proves sharp guarantees for weak recovery in phase retrieval. In particular, given phaseless measurements against Gaussian vectors, they demonstrate that a properly tuned spectral estimate exhibits correlation with the ground truth, even when the sampling rate is at the information-theoretic limit. In addition, they show that their spectral estimate empirically performs well even when the measurements follow a more realistic coded diffraction model. I decided to reach out to Marco to learn more, and what follows is my interview. I’ve lightly edited his responses for formatting and hyperlinks:

DGM: Judging by your website, this project in phase retrieval appears to be a departure from your coding theory background. How did this project come about?

MM: Many of the tools employed in information and coding theory are very general and they prove useful also to solve problems in other fields, such as, compressed sensing, machine learning or data analysis. So this is the general philosophy that motivated my “detour”.

Global Guarantees for Enforcing Deep Generative Priors by Empirical Risk

Vlad Voroninski recently posted an arXiv preprint with Paul Hand that provides compressed sensing guarantees using a neural net-based generative signal model. This offers some theoretical justification for the shocking empirical results presented in the “Compressed sensing using generative models” paper, which demonstrates compressed sensing with 10 times fewer measurements than conventional compressed sensing (the source code is available here). I was especially excited to see this paper, having recently read Michael Elad’s editorial on deep learning. To learn more, I interviewed Vlad (see below); I’ve lightly edited his responses for formatting and hyperlinks:

DGM: What is the origin story of this project? Were you and Paul inspired by the “Compressed sensing using generative models” paper?

VV: I have been working extensively with applied deep learning for the last year or so, and have been inspired by recent applications of deep generative image priors to classical inverse problems, such as the super resolution work by Fei Fei Li et al. Moreover, recent work on regularizing with deep generative priors for synthesizing the preferred inputs to neural activations, by Yosinski et al., made me optimistic that GAN-based generative priors are capturing sophisticated natural image structure (the synthetic images obtained in this paper look incredibly realistic).

A polynomial-time relaxation of the Gromov-Hausdorff distance

Soledad Villar recently posted her latest paper on the arXiv (joint work with Afonso Bandeira, Andrew Blumberg and Rachel Ward). This paper reduces an instance of cutting-edge data science (specifically, shape matching and point-cloud comparison) to a semidefinite program, and then investigates fast solvers using non-convex local methods. (Check out her blog for an interactive illustration of the results.) Soledad is on the job market this year, and I read about this paper in her research statement. I wanted to learn more, so I decided to interview her. I’ve lightly edited her responses for formatting and hyperlinks:

On the low-rank approach for semidefinite programs arising in synchronization and community detection

From MaxCut to PhaseLift, semidefinite programming has proven to be rather powerful, especially for convex relaxation. SDP solvers take polynomial time, but the exponent is large, and anyone who’s run an SDP on CVX has experienced some frustration with the runtime. In practice, the SDP-optimal matrix tends to have extremely low rank, and so one may apply a rank constraint to facilitate the search for the SDP’s solution. This heuristic was first introduced by Burer and Monteiro, and it works well in practice, but the rank-constrained program is nonconvex and the theory is scant. Recently, the theory gap started to close with this paper:

On the low-rank approach for semidefinite programs arising in synchronization and community detection

Afonso S. Bandeira, Nicolas Boumal, Vladislav Voroninski

As the title suggests, this paper provides strong performance guarantees for the Burer-Monteiro heuristic in the particular cases of $\mathbb{Z}_2$ synchronization and community detection. I was very excited to see this paper, and so I interviewed one of the authors (Nicolas Boumal). I’ve lightly edited his responses for formatting and hyperlinks:

Phase retrieval from coded diffraction patterns

In a previous post, I described a paper I wrote with Afonso and Yutong about how to design $O(\log n)$ masked illuminations that enable efficient phase retrieval of $n$-dimensional signals for X-ray crystallography and related applications. This masked-illumination methodology was originally proposed in this paper, and our phase retrieval guarantee was based on a recovery method known as polarization. This week, the following paper was posted online, and it gives the first guarantee of this kind for a more popular recovery method called PhaseLift:

Phase retrieval from coded diffraction patterns

Emmanuel J. Candes, Xiaodong Li, Mahdi Soltanolkotabi

In particular, this paper shows that $O(\log^4 n)$ masked illuminations enable PhaseLift recovery, and their result actually holds for a wide assortment of masks. Since this paper is so related to my research, I decided to interview one of the authors (Mahdi Soltanolkotabi). I’ve lightly edited his responses for formatting and hyperlinks:

Determination of all pure quantum states from a minimal number of observables

Last year, I wrote a blog post that describes how phase retrieval applies to quantum mechanics.  The main idea is that the probability distribution of outcomes of a particular observable provide intensity measurements of the (pure) state vector with the observable’s eigenstates. In other words, each observable corresponds to a unitary matrix, with which we will effectively take intensity measurements of the desired state vector.  In a paper from 1978, Andrew Vogt mentioned Wright’s conjecture – that 3 observables suffice to determine the state vector up to phase.  This was inadvertently disproved in 2011 by Teiko Heinosaari, Luca Mazzarella and Michael M. Wolf (see this paper), who showed that $4M-o(M)$ measurements are necessary for injectivity over $\mathbb{C}^M$.  The natural question is then:

Do 4 observables suffice to completely determine a pure state?

The answer (in the affirmative) appeared on the arXiv today:

Determination of all pure quantum states from a minimal number of observables