## Zero to One: Notes on Startups, or How to Build the Future

I’ve been thinking a lot about my place in the world lately. I’m interested in doing math that makes a difference, and considering much of the breakthroughs in our society have come from various startups, I decided to investigate the startup culture. How might academia benefit from startup culture? One could easily imagine a hip research environment adorned with beanbag chairs and foosball tables, but these perks aren’t the stuff that makes a startup successful. To catch a glimpse, I turned to a book recently written by Peter Thiel (of PayPal fame):

## Probably certifiably correct algorithms

This post is based on two papers (one and two). The task is to quickly solve typical instances of a given problem, and to quickly produce a certificate of that solution. Generally, problems of interest are NP-hard, and so we consider a random distribution on problem instances with the philosophy that real-world instances might mimic this distribution. In my community, it is common to consider NP-hard optimization problems:

minimize $f(x)$ subject to $x\in S$.     (1)

In some cases, $f$ is convex but $S$ is not, and so one might relax accordingly:

minimize $f(x)$ subject to $x\in T$,     (2)

where $T\supseteq S$ is some convex set. If the minimizer of (2) happens to be a member of $S$, then it’s also a minimizer of (1) — when this happens, we say the relaxation is tight. For some problems (and distributions on instances), the relaxation is typically tight, which means that (1) can be typically solved by instead solving (2); for example, this phenomenon occurs in phase retrieval, in community detection, and in geometric clustering. Importantly, strong duality ensures that solving the dual of the convex relaxation provides a certificate of optimality.

## Recent advances in mathematical data science

Part of the experience of giving a talk at Oberwolfach is documentation. First, they ask you to handwrite the abstract of your talk into a notebook of sorts for safekeeping. Later, they ask you to tex up an extended abstract for further documentation. This time, I gave a longer version of my SPIE talk (here are the slides). Since I posted my extended abstract on my blog last time, I figured I’d do it again:

This talk describes recent work on three different problems of interest in mathematical data science, namely, compressive classification, $k$-means clustering, and deep learning. (Based on three papers: one, two, three.)

First, compressive classification is a problem that comes on the heels of compressive sensing. In compressive sensing, one exploits the underlying structure of a signal class in order to exactly reconstruct any signal from the class given very few linear measurements of the signal. However, many applications do not require an exact reconstruction of the image, but rather a classification of that image (for example, is this a picture of a cat, or of a dog?). As such, it makes intuitive sense that the classification task might succeed given far fewer measurements than are necessary for compressive sensing.

## Applied Harmonic Analysis and Sparse Approximation

Last week, I attended this workshop at Oberwolfach. The weather was great, and I was rather impressed by the quality of the talks. Throughout the workshop, I paid special attention to the various problems that different people are thinking about. In what follows, I briefly discuss several of these problems.

## Frames and Combinatorial & Algebraic Geometry

This last week, I attended a workshop at Universitat Bremen on Frames and Combinatorial & Algebraic Geometry. The organizers (Emily King and Eva Maria Feichtner) did a phenomenal job, and I was impressed by the quality of the talks (see this page for the slides). Below are a few of the things that we discussed throughout the workshop:

— Strohmer’s Problem —

At this year’s SampTA conference, Thomas Strohmer posed an interesting frame theory problem during his plenary talk. On the first day of the workshop, Ole Christensen suggested that we study the problem, and it ended up being one of our main focuses. Here’s a technical statement of the problem sans frame-theoretic jargon:

Strohmer’s Problem. Let $A$ be a known $n\times d$ matrix, let $D$ be an unknown $n\times n$ diagonal matrix, and let $X$ be an unknown $d\times m$ matrix. Under what conditions does $y=DAX$ uniquely determine $D$ and/or $X$?

This problem is motivated by the real-world problem of self-calibration (see for example this blog entry). In particular, the columns of $X$ represent unknown test signals, whereas $D$ represents unknown parameters of a linear sensor $DA$. The goal of self-calibration is to calibrate your sensor without requiring knowledge of the input (this is particularly important for applications like deep-space imaging).

## A relaxation of deep learning?

Jesse Peterson and I recently arxiv’d our paper for Wavelets and Sparsity XVI at SPIE this year. This paper focuses on learning functions $f\colon\{\pm1\}^n\rightarrow\{\pm1\}$ of the form

$\displaystyle{f(x_1,\ldots,x_n) = \mathrm{sign}\bigg(\sum_{i=1}^ka_i\prod_{j\in S_i}x_j\bigg)}, \qquad (*)$

where $k$ is small, $a_1,\ldots,a_k\in\mathbb{R}$, and $S_1,\ldots,S_k\subseteq\{1,\ldots,n\}$. Notice that any such Boolean function can be viewed as a labeling function of strings of $n$ bits, and so learning the function from labeled instances amounts to a binary classification problem.

If we identify $\{\pm1\}^n$ with $\mathbb{Z}_2^n$, then the $a_i$‘s are essentially the big entries of the Walsh–Hadamard transform of $f$, and these entries are indexed by the $S_i$‘s. As such, functions of the form $(*)$ are essentially the Boolean functions of concentrated spectra. These functions have been shown to well approximate the Boolean functions with sufficiently simple circuit implementations (e.g., see one, two, three), and given the strong resemblance between Boolean circuits and neural networks, the following hypothesis seems plausible:

## Conjectures from SampTA

Back in May, I attended this year’s SampTA at American University. I spoke in a special session on phase retrieval, and as luck would have it, Cynthia Vinzant spoke in the same session about her recent solution of the 4M-4 conjecture. As you might expect, I took a moment during my talk to present the award I promised for the solution:

Recall that Cynthia (and coauthors) first proved part (a) of the conjecture, and then recently disproved part (b). During her talk, she also provided a refinement of part (b). Before stating the conjecture, recall that injectivity of the mapping $x\bmod\mathbb{T}\mapsto |Ax|^2$ is a property of the column space $\mathrm{im}(A)$.