This last semester, I was a long-term visitor at the Simons Institute for the Theory of Computing. My time there was rather productive, resulting in a few (exciting!) arXiv preprints, which I discuss below.
Suppose you have a bunch of points in high-dimensional Euclidean space, some labeled “cat” and others labeled “dog,” say. Can you find a low-rank projection such that after projection, cats and dogs remain separated? If you can implement such a projection as a sensor, then that sensor collects enough information to classify cats versus dogs. This is the main idea behind compressive classification.
At its heart, this problem concerns linear dimensionality reduction. For the sake of illustration, suppose we want to find an appropriate projection for this dataset:
The gut-instinct method of dimensionality reduction is PCA, but this delivers poor results:
Of course, PCA ignores labels. Instead, you could run PCA on the differences between points of different labels, but in this case, you’d still get a dominant z-component, so this doesn’t help. Alternatively, you could run LDA, which projects onto the difference between class centroids (times an inverse covariance matrix). This also produces poor results:
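To make the comparison concrete, here is a minimal numpy sketch of the two baselines on a synthetic dataset of my own devising (not the dataset pictured above): PCA projects onto the top eigenvector of the pooled covariance, while LDA projects onto the within-class-covariance-whitened difference of class centroids.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic toy data (an assumption, not the post's dataset): two classes
# separated along x, with a dominant high-variance nuisance z-direction.
n = 200
cats = rng.normal([-2, 0, 0], [0.5, 0.5, 5.0], size=(n, 3))
dogs = rng.normal([2, 0, 0], [0.5, 0.5, 5.0], size=(n, 3))
X = np.vstack([cats, dogs])

# PCA: top eigenvector of the pooled covariance -- it ignores labels,
# so it locks onto the high-variance z-axis.
pooled_cov = np.cov(X.T)
w_pca = np.linalg.eigh(pooled_cov)[1][:, -1]

# LDA: (within-class covariance)^{-1} times the difference of centroids.
Sw = np.cov(cats.T) + np.cov(dogs.T)
w_lda = np.linalg.solve(Sw, dogs.mean(0) - cats.mean(0))
w_lda /= np.linalg.norm(w_lda)

print("PCA direction:", np.round(w_pca, 2))  # dominated by z
print("LDA direction:", np.round(w_lda, 2))  # dominated by x
```

On this particular toy, LDA happens to succeed; the dataset in the post is constructed so that even the whitened centroid difference fails. The sketch only illustrates why PCA latches onto the high-variance nuisance direction while a label-aware method need not.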
Intuitively, we want to thumb through all possible projections to find a good one. This is what SqueezeFit does:
In particular, SqueezeFit is an SDP relaxation of the problem “find the minimum-rank orthogonal projection that keeps points of different labels separated.” In the paper, we prove some performance guarantees before applying SqueezeFit to real data. Overall, SqueezeFit provides an improvement over the standard approaches for linear dimensionality reduction. We’re excited to apply variants of SqueezeFit to various important settings.
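Concretely, the relaxation can be sketched as follows (my paraphrase, with the separation scale normalized to 1): writing $\ell(i)$ for the label of $x_i$, replace the rank objective with the trace and relax the set of orthogonal projections to its convex hull:

$$
\begin{array}{ll}
\text{minimize} & \operatorname{tr}(M)\\[2pt]
\text{subject to} & (x_i - x_j)^\top M (x_i - x_j) \ge 1 \quad \text{whenever } \ell(i) \neq \ell(j),\\[2pt]
& 0 \preceq M \preceq I.
\end{array}
$$

As is typical for trace relaxations of rank, one can then round the optimizer $M$ to a bona fide projection, e.g., onto its dominant eigenvectors.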
There has been a lot of effort lately to use mathematical tools to help detect partisan gerrymandering. However, any detection procedure requires a technical definition of “excessively favoring one party over the other,” and any choice of definition can be perceived as arbitrary, or even as “sociological gobbledygook.”
Considering this difficulty of fighting partisan gerrymandering in the courts, one might instead prevent gerrymandering from happening in the first place. Almost half of the states in the country use some sort of redistricting commission to draw a new map after the decennial census. Gamified redistricting offers a protocol for a bipartisan redistricting commission that leads to provably beneficial results. For example, the I-cut-you-freeze protocol is a modification of the I-cut-you-choose solution to the fair cake-cutting problem that provides a beautiful votes–seats curve in the limit as the number of districts goes to infinity (see Figure 1 in the paper).
Sadly, for smaller numbers of districts (which we frequently encounter in the real world), I-cut-you-freeze gives significant advantage to the first player. As an alternative, we propose Utility Ghost, which is a modification of the word game Ghost in which players take turns assigning precincts to districts. In the paper, we prove that in an idealized setting, if both players have the same number of votes, then under optimal play, they end up with the same number of seats.
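To illustrate the turn-taking structure, here is a toy Python sketch (my own simplification, not the paper’s exact formulation): precincts carry signed Democratic vote margins, the two players alternately assign an unassigned precinct to any district with remaining capacity, and optimal play on a tiny instance is computed by brute-force minimax.

```python
from functools import lru_cache

# Hypothetical precinct margins (they sum to zero, i.e., equal votes),
# to be packed into two districts of two precincts each.
MARGINS = (3, 2, -1, -4)
N_DISTRICTS = 2
CAPACITY = 2

def seats(totals):
    """Democratic seats: districts with a positive total margin."""
    return sum(1 for t in totals if t > 0)

@lru_cache(maxsize=None)
def minimax(assigned, totals, counts, dem_to_move):
    # assigned: bitmask of placed precincts; totals/counts: per district.
    if assigned == (1 << len(MARGINS)) - 1:
        return seats(totals)
    best = None
    for p in range(len(MARGINS)):
        if assigned & (1 << p):
            continue
        for d in range(N_DISTRICTS):
            if counts[d] == CAPACITY:
                continue
            t = list(totals); t[d] += MARGINS[p]
            c = list(counts); c[d] += 1
            v = minimax(assigned | (1 << p), tuple(t), tuple(c),
                        not dem_to_move)
            if best is None or (v > best if dem_to_move else v < best):
                best = v
    return best

dem_seats = minimax(0, (0,) * N_DISTRICTS, (0,) * N_DISTRICTS, True)
print("Democratic seats under optimal play:", dem_seats)
```

With two districts and a zero statewide margin, the district totals are $s$ and $-s$, so (barring ties) each party gets exactly one seat no matter how the game is played; the interesting strategic behavior analyzed in the paper emerges with more districts and asymmetric vote shares.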
We also show that Utility Ghost performs well in real-world settings. For example, consider the case of New Hampshire, which is made up of 10 counties and two U.S. congressional districts. If we don’t split counties, there are seven ways to partition New Hampshire into two districts with roughly equal populations:
Here, proto-districts are colored according to the 2016 presidential election returns; in New Hampshire, Hillary Clinton received 47.62 percent of the vote and Donald Trump received 47.25 percent. Since there are only two districts, the I-cut-you-freeze protocol is not helpful: the first player becomes de facto map maker, while the second player has no say in the matter. In particular, if the Democrats play first, they get to select the map in which they win both seats. This seems unfair, considering nearly half of the voters are Republican. Utility Ghost, by contrast, avoids this map under optimal play, regardless of who plays first. Time will tell whether such gamified redistricting will be incorporated into protocols for bipartisan commissions following Census 2020.
In compressed sensing, we encode sparse signals with random measurements, and then reconstruct the signals using L1 minimization. Here, the number of random measurements scales roughly linearly with the sparsity level. There has been some work to replicate this encoding performance with deterministic measurements, but the best theory to date requires a number of measurements that scales almost quadratically with the sparsity level.
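As a refresher, L1 minimization with equality constraints is a linear program: split the variable as $x = u - v$ with $u, v \ge 0$, so that $\|x\|_1 = \mathbf{1}^\top(u + v)$. Here is a minimal scipy sketch (the dimensions and Gaussian measurements are arbitrary illustrative choices, not tied to the paper):

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)

# Illustrative sizes: a 3-sparse signal in R^50 observed through
# m = 25 random Gaussian measurements.
n, m, s = 50, 25, 3
x = np.zeros(n)
x[rng.choice(n, s, replace=False)] = rng.normal(size=s)
A = rng.normal(size=(m, n)) / np.sqrt(m)
y = A @ x

# L1 minimization as a linear program: with x = u - v and u, v >= 0,
#   minimize 1^T (u, v)  subject to  [A, -A](u, v) = y,  (u, v) >= 0.
res = linprog(c=np.ones(2 * n),
              A_eq=np.hstack([A, -A]),
              b_eq=y,
              bounds=(0, None))
x_hat = res.x[:n] - res.x[n:]

print("max reconstruction error:", np.abs(x_hat - x).max())
```

At these sizes the Gaussian matrix recovers the sparse signal exactly (up to solver tolerance), consistent with measurements scaling roughly linearly in the sparsity level.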
Instead, one might attempt to minimize the number of random bits needed to accomplish the desired linear scaling. To this end, a previous paper leveraged pseudorandom properties of the Legendre symbol to derandomize sensing matrices composed of ±1 entries. Our new paper provides a more general treatment of derandomization for compressed sensing. As a special instance of our result, we can accommodate ±1 entries by sampling rows from an orthogonal array. The resulting sensing matrix uses slightly fewer measurements and slightly more randomness than the Legendre symbol–based construction. Our methods also provide derandomization by sampling from mutually unbiased bases.
In practice, reconstruction performance is identical to that of the fully random counterparts:
Still, our sensing matrices require a number of random bits that fails to break the “Johnson–Lindenstrauss bottleneck” identified in this paper. Is this a fundamental barrier to derandomized compressed sensing?