Phase retrieval from power spectra of masked signals

I’ll start with the commercial:  I have a new paper on the arXiv with Afonso S. Bandeira and Yutong Chen.  I’m pretty excited about what this paper has to offer, but before I explain, I want to give some background information.

From my perspective, the latest resurgence in phase retrieval research has been in the wake of five important papers:

While each of these papers give a lot of information, I will simplify their contributions with the following narrative:  The first paper used algebraic geometry to demonstrate the existence of O(M) injective intensity measurements for \mathbb{C}^M.  The second provided a fast reconstruction algorithm from O(M^2) intensity measurements.  The key trick here was to lift the intensity measurements into the M^2-dimensional vector space of self-adjoint operators, in which the measurements are actually Hilbert-Schmidt inner products.  In this way, the M^2 inner products completely determine the signal provided the lifted measurement vectors are linearly independent.  The third paper then takes inspiration from the lifting trick, but exploits additional information about the desired signal to reduce the number of measurements necessary (think compressed sensing).  In fact, there is a lot of additional information available: the lifted signal of interest is rank-1 and positive semidefinite.  Relaxing the obvious rank-minimization program then leads to the semidefinite program known as PhaseLift, for which stability results are reported in the fourth paper.  To some extent, this work culminated in the fifth paper, where it is established that O(M) intensity measurements suffice for efficient reconstruction with PhaseLift.

While there are numerous applications of phase retrieval reported in these papers, there is one of particular interest.  Specifically, “Phase retrieval via matrix completion” proposed a new methodology to image small objects.  It is already common to observe the diffraction pattern of an unknown object and discern from it various characteristics of the object (this is how the structure of DNA was originally deduced).  But there isn’t enough information in the diffraction pattern to completely determine the object.  To be clear, the diffraction pattern is the modulus-squared of the Fourier transform of the object, so we are missing a lot of phase information.  Traditionally, scientists overcome this by applying additional time-domain assumptions on the object, but this has found limited success.  In their paper, Candes et al. propose to use an ensemble of masks to change the appearance of the object, and then fuse the information from multiple diffraction patterns to reconstruct the object.  This is nicely illustrated in the following figure from their paper:

They provide some theoretical results that demonstrate the plausibility of this setup.  For example, they show that three carefully designed masks can be used to uniquely determine almost every object.  However, they do not guarantee that PhaseLift will reconstruct the object in these cases.  Nevertheless, their numerical simulations clearly demonstrate that PhaseLift actually works, so the task has since been to prove this behavior.  The other two papers on PhaseLift prove theorems based on an assumption that the measurement vectors are drawn from a unitarily invariant probability distribution, so while these results are important, they don’t exactly correspond to the masked illuminations setting.

In the time since “the PhaseLift revolution,” I have worked on a reconstruction alternative that my coauthors and I have been calling polarization.  I wrote a couple of blog entries on this (one and two), and we also have a paper that goes into more detail.  The main difference between PhaseLift and polarization is that we require a certain type of structure in the measurement vectors in order to run polarization, whereas PhaseLift is rather flexible.  However, if your particular application of phase retrieval allows you to design your measurement vectors with the structure that polarization requires, then you’re in business.  In our most recent paper, we show that the masked illuminations setting is one such polarization-friendly application.

The main idea behind polarization is that you design a graph whose vertices correspond to measurement vectors, and each edge corresponds to three polarized combinations of vertex vectors.  When you measure with all of these vectors, you can determine the relative phases between adjacent vertex measurements by applying the polarization identity to the three corresponding edge measurements.  You can’t do this if one of the vertex measurements is zero (relative phase is not well defined in this case), and so such measurements have the effect of removing vertices from the graph.  But if your vertex vectors are full spark, you won’t be forced to remove too many vertices.  Also, if your graph is sufficiently connected, then you’ll still have a large component after the deletion of vertices.  Thus, after you find a large component, you can propagate the relative phases to determine the vertex inner products up to a global phase factor, and then find the least-squares estimate of the signal.

Designing the vertex vectors to have the masked Fourier structure was not a problem.  But we also wanted the edge vectors to be implementable using masks as well.  This was a little tricky, but doable – the idea (encapsulated in Lemma 3) is to draw edges in parallel between pairs of vertex masks.  Imagine two copies of \mathbb{Z}_M and draw an edge from one copy to another if the corresponding points have difference a.  In this way, the masked Fourier design criteria imposed certain structure on the graph.  At this point, we needed to ensure that the graph was sufficiently connected, which we can evaluate by observing the spectral gap, and as we show in Lemma 5, the spectral gap of our graph is a decreasing function of the Fourier bias of the set of offsets a used to draw the graph.  This is a concept from additive combinatorics that measures the extent to which a 01-sequence is not random, so this indicates that random offsets are good for our graph.  (Incidentally, the natural appearance of Fourier bias in our problem led this paper to be my first that cites both Candes and Tao, but not as coauthors.)  We then show that random offsets make our graph sufficiently connected provided we use a total of O(\log M) masks, and we also show that the spectral gap of our graph is bounded away from zero only if we use \Omega(\log M) masks, meaning our analysis is tight.

The end of the paper reports numerical simulations, which compare polarization to several different applications of PhaseLift.  Overall, we find that polarization is extremely fast compared to PhaseLift (i.e., seconds versus hours), but PhaseLift offers slightly more stable estimates of the signal.  This trade-off makes intuitive sense for a couple of reasons.  For polarization, the computational bottleneck is the least-squares step at the end (meaning it’s basically as slow as reconstruction would be if you were given the phases in the beginning).  On the other hand, PhaseLift is a semidefinite program, which are notoriously slow in high-dimensional settings.  Also, we expect PhaseLift to be at least a little more stable to noise because it uses the measurements in a more democratic fashion, compared to the disenfranchising pruning processes we apply in polarization.

5 thoughts on “Phase retrieval from power spectra of masked signals

Leave a comment