A relaxation of deep learning?

Jesse Peterson and I recently arxiv’d our paper for Wavelets and Sparsity XVI at SPIE this year. This paper focuses on learning functions f\colon\{\pm1\}^n\rightarrow\{\pm1\} of the form

\displaystyle{f(x_1,\ldots,x_n) = \mathrm{sign}\bigg(\sum_{i=1}^ka_i\prod_{j\in S_i}x_j\bigg)}, \qquad (*)

where k is small, a_1,\ldots,a_k\in\mathbb{R}, and S_1,\ldots,S_k\subseteq\{1,\ldots,n\}. Notice that any such Boolean function can be viewed as a labeling function of strings of n bits, and so learning the function from labeled instances amounts to a binary classification problem.
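For a concrete (hypothetical) instance of (*), take the majority function on three bits, i.e., k=3, each a_i=1, and each S_i a singleton. A brute-force computation of its Fourier coefficients over \{\pm1\}^3 shows the spectrum is concentrated on just four subsets:

```python
import itertools

# A hypothetical example (not from the paper): the majority function on
# n = 3 bits has the form (*) with k = 3, a_i = 1, and singleton S_i's.
def f(x):
    return 1 if x[0] + x[1] + x[2] > 0 else -1

n = 3

# Brute-force Walsh-Hadamard transform: for each subset S of {1,...,n},
# \hat{f}(S) = 2^{-n} * sum_x f(x) * prod_{j in S} x_j.
coeffs = {}
for r in range(n + 1):
    for S in itertools.combinations(range(n), r):
        total = 0
        for x in itertools.product([-1, 1], repeat=n):
            chi = 1
            for j in S:
                chi *= x[j]
            total += f(x) * chi
        coeffs[S] = total / 2 ** n

# The spectrum is concentrated: only 4 of the 8 coefficients are nonzero,
# and Parseval gives a sum of squares equal to 1.
support = sorted(S for S, c in coeffs.items() if abs(c) > 1e-9)
print(support)                               # [(0,), (0, 1, 2), (1,), (2,)]
print(sum(c ** 2 for c in coeffs.values()))  # 1.0
```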

If we identify \{\pm1\}^n with \mathbb{Z}_2^n, then the a_i’s are essentially the big entries of the Walsh–Hadamard transform of f, and these entries are indexed by the S_i’s. As such, functions of the form (*) are essentially the Boolean functions of concentrated spectra. These functions have been shown to well approximate the Boolean functions with sufficiently simple circuit implementations (e.g., see one, two, three), and given the strong resemblance between Boolean circuits and neural networks, the following hypothesis seems plausible:

Continue reading A relaxation of deep learning?

Conjectures from SampTA

Back in May, I attended this year’s SampTA at American University. I spoke in a special session on phase retrieval, and as luck would have it, Cynthia Vinzant spoke in the same session about her recent solution of the 4M-4 conjecture. As you might expect, I took a moment during my talk to present the award I promised for the solution:

award

Recall that Cynthia (and coauthors) first proved part (a) of the conjecture, and then recently disproved part (b). During her talk, she also provided a refinement of part (b). Before stating the conjecture, recall that injectivity of the mapping x\bmod\mathbb{T}\mapsto |Ax|^2 is a property of the column space \mathrm{im}(A).

Continue reading Conjectures from SampTA

Three Paper Announcements

I’ve been pretty busy lately with writing and researching with visitors. These announcements serve as a quick summary of what I’ve been up to:

1. Tables of the existence of equiangular tight frames (with Matthew Fickus). Today, there’s quite a bit known about equiangular tight frames (ETFs), but what is known seems to be scattered across different papers. This paper surveys everything that is known, and tabulates all of the known real and complex ETFs with sufficiently few vectors in sufficiently small dimension. The tables were generated by coding up existence theorems in MATLAB so as to minimize errors. This serves as a “solution” to problem 21 in this documentation of the open problems discussed at the AIM workshop Frame theory intersects geometry. Recently, Matt and I have made a few ETF discoveries with John and Jesse, so you can expect this table to be updated after we announce these discoveries in the coming months.
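For a sense of what such a verification looks like (a Python sketch rather than the paper’s MATLAB, run on the classical three-vector “Mercedes-Benz” frame in \mathbb{R}^2, not on anything from the new tables):

```python
import numpy as np

# A minimal check of the two defining ETF properties, using the
# well-known three-vector "Mercedes-Benz" frame in R^2.
d, N = 2, 3
angles = 2 * np.pi * np.arange(N) / N
F = np.vstack([np.cos(angles), np.sin(angles)])   # d x N, unit-norm columns

# Equiangular: all off-diagonal Gram entries share one absolute value.
G = F.T @ F
off = np.abs(G[~np.eye(N, dtype=bool)])
print(np.allclose(off, off[0]))                   # True

# Tight: the frame operator is a multiple of the identity, F F^T = (N/d) I.
print(np.allclose(F @ F.T, (N / d) * np.eye(d)))  # True

# For an ETF the common angle meets the Welch bound with equality.
welch = np.sqrt((N - d) / (d * (N - 1)))
print(np.isclose(off[0], welch))                  # True (both equal 1/2)
```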

Continue reading Three Paper Announcements

The Signal and the Noise

I recently finished Nate Silver’s famous book. Some parts were more fun to read than others, but overall, it was worth the read. I was impressed by Nate’s apparently vast perspective, and he did a good job of pointing out how bad we are at predicting certain things (and explaining some of the bottlenecks).

Based on the reading, here’s a brief list of stars that need to align in order to succeed at prediction:

Continue reading The Signal and the Noise

Zauner’s conjecture is true in dimension 17

Congrats to Tuan-Yow Chien for proving Zauner’s conjecture in dimension 17! The proof appears in his PhD thesis, which was written under the supervision of Shayne Waldron. Recall that Zauner’s conjecture claims that SIC-POVMs (i.e., collections of d^2 equiangular lines in \mathbb{C}^d) exist for every dimension d. To date, SIC-POVMs are only known to exist in a few dimensions, and a recent numerical study suggests that they exist whenever d\leq 67. At some level, I’m surprised that an infinite family of SIC-POVMs has yet to emerge despite the apparent widespread interest in the problem (for example, see these statements of interest by Scott Aaronson and Peter Shor).

I find Tuan-Yow’s solution to be particularly interesting because of the techniques he uses. In particular, he first looks to this paper for the numerical approximation of a suspected SIC-POVM in dimension 17. We note that these SIC-POVMs are generated by taking the orbit of some vector (called the fiducial vector) under the Heisenberg-Weyl group. He increases the precision of this approximation by using it to initialize an iterative procedure that one might suspect converges to a SIC-POVM. After obtaining 2000 digits of precision, he then considers the projection operator onto the span of the numerical fiducial vector and decomposes it in terms of a certain basis of operators (namely, d^2 orthogonal members of the Heisenberg-Weyl group). Call the coefficients in this basis the overlaps.
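To fix ideas, here is a sketch of the orbit construction with a random stand-in for the fiducial vector (the actual 17-dimensional fiducial from the thesis is not reproduced here, and I use d=5 for speed):

```python
import numpy as np

# Heisenberg-Weyl generators in dimension d.
d = 5
omega = np.exp(2j * np.pi / d)
X = np.roll(np.eye(d), 1, axis=0)   # cyclic shift: X e_j = e_{j+1 mod d}
Z = np.diag(omega ** np.arange(d))  # modulation ("clock") operator

rng = np.random.default_rng(0)
v = rng.standard_normal(d) + 1j * rng.standard_normal(d)
v /= np.linalg.norm(v)              # random stand-in "fiducial" vector

# Orbit of v under the d^2 operators X^a Z^b.
orbit = [np.linalg.matrix_power(X, a) @ np.linalg.matrix_power(Z, b) @ v
         for a in range(d) for b in range(d)]

# The orbit always consists of d^2 unit vectors; v is a SIC-POVM fiducial
# exactly when every pairwise |inner product|^2 equals 1/(d+1), which a
# random v will of course fail.
print(len(orbit))                                            # 25
print(np.allclose([np.linalg.norm(u) for u in orbit], 1.0))  # True
```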

Continue reading Zauner’s conjecture is true in dimension 17

The 4M-4 Conjecture is False!

Congratulations to Cynthia Vinzant for disproving the 4M-4 conjecture! The main result of her 4-page paper is the existence of a 4\times 11 matrix \Phi such that x\mapsto |\Phi x|^2 is injective modulo a global phase factor (indeed, 11<12=4(4)-4). This is not Cynthia’s first contribution to this problem—her recent paper with Conca, Edidin and Hering proves that the conjecture holds for infinitely many M.
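For intuition about the “modulo a global phase factor” caveat: intensity measurements are blind to a global phase no matter which \Phi you pick, so this is the best injectivity one can hope for. A quick numerical sketch (my own, not from the paper):

```python
import numpy as np

# With the columns of a 4 x 11 matrix Phi as measurement vectors, the
# intensity measurements cannot distinguish x from e^{i*theta} * x.
rng = np.random.default_rng(1)
M, N = 4, 11
Phi = rng.standard_normal((M, N)) + 1j * rng.standard_normal((M, N))

def measure(x):
    # |<phi_i, x>|^2 for each column phi_i of Phi
    return np.abs(Phi.conj().T @ x) ** 2

x = rng.standard_normal(M) + 1j * rng.standard_normal(M)
same = np.allclose(measure(x), measure(np.exp(0.7j) * x))
print(same)  # True
```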

I wanted to briefly highlight the main idea behind this paper: She provides an algorithm that, on input of a 4\times 11 matrix \Phi with complex rational entries, either outputs “not known to be injective” or outputs “injective” along with a certificate of injectivity. The algorithm is fundamentally based on the following characterization of injectivity:

Continue reading The 4M-4 Conjecture is False!

Cone Programming Cheat Sheet

I’m pretty excited about Afonso’s latest research developments (namely, this and that), and I’ve been thinking with Jesse Peterson about various extensions, but we first wanted to sort out the basics of linear and semidefinite programming. Jesse typed up some notes and I’m posting them here for easy reference:

Many optimization problems can be viewed as a special case of cone programming. Given a closed convex cone K in \mathbb{R}^n, we define the dual cone as

K^*=\{y : \langle x,y\rangle\geq 0~\forall x\in K\}.

Examples: The positive orthant and the positive semidefinite cone are both self-dual, and the dual of a subspace is its orthogonal complement. Throughout, we assume we have c\in\mathbb{R}^n, b\in\mathbb{R}^m, closed convex cones K\subseteq\mathbb{R}^n and L\subseteq\mathbb{R}^m, and a linear operator A\colon\mathbb{R}^n\rightarrow\mathbb{R}^m. We then have the primal and dual programs

\mbox{(P)} \quad \max \quad \langle c,x\rangle \quad \mbox{s.t.} \quad b-Ax \in L,\quad x \in K

\mbox{(D)} \quad \min \quad \langle b,y\rangle \quad \mbox{s.t.} \quad A^\top y-c\in K^*,\quad y\in L^*

Notice that when L and K are both the positive orthant, this is a standard linear program. Further, considering the space of n\times n real symmetric matrices as \mathbb{R}^{n(n+1)/2}, when L=\{0\} and K is the positive semidefinite cone, this is a standard semidefinite program. We consider several standard results (weak duality, strong duality, complementary slackness) in terms of the general cone program.
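As a sanity check of weak duality in the LP case (my sketch, not from Jesse’s notes), one can build primal- and dual-feasible points by hand and verify \langle c,x\rangle\leq\langle b,y\rangle:

```python
import numpy as np

# Weak duality for K = L = positive orthant: for primal-feasible x and
# dual-feasible y,
#   <b,y> - <c,x> = <b - Ax, y> + <x, A^T y - c> >= 0,
# since each inner product pairs a vector in a cone with one in its dual.
rng = np.random.default_rng(0)
m, n = 5, 3
A = rng.standard_normal((m, n))

# Build feasible points by construction: choose x, y >= 0 first, then pick
# b and c so that b - Ax >= 0 and A^T y - c >= 0 hold.
x = rng.random(n)
y = rng.random(m)
b = A @ x + rng.random(m)    # b - Ax in L
c = A.T @ y - rng.random(n)  # A^T y - c in K^*

gap = np.dot(b, y) - np.dot(c, x)
print(gap >= 0)              # True: weak duality
print(np.isclose(gap, np.dot(b - A @ x, y) + np.dot(x, A.T @ y - c)))  # True
```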

Continue reading Cone Programming Cheat Sheet