- Packings in real projective spaces, FoCM and SPIE
- Explicit restricted isometries, ILAS
- Probably certifiably correct k-means clustering, ILAS
- Equiangular tight frames from association schemes, SIAM AG17
- Open problems in finite frame theory, SIAM AG17

Now for my favorite talks from FoCM, ILAS, SIAM AG17 and SPIE:

**Ben Recht — Understanding deep learning requires rethinking generalization**

In machine learning, you hope to fit a model so as to be good at prediction. To do this, you fit to a training set and then evaluate with a test set. In general, if a simple model fits a large training set pretty well, you can expect the fit to generalize, meaning it will also fit the test set. By conventional wisdom, if the model happens to fit the training set exactly, then your model is probably not simple enough, meaning it will not fit the test set very well. According to Ben, this conventional wisdom is wrong. He demonstrates this by presenting some observations he made while training neural nets. In particular, he allowed the number of parameters to far exceed the size of the training set, and in doing so, he fit the training set exactly, and yet he still managed to fit the test set well. He suggested that generalization was successful here because stochastic gradient descent implicitly regularizes. For reference, in the linear case, stochastic gradient descent (aka the randomized Kaczmarz method) finds the solution of minimal 2-norm, and it converges faster when the optimal solution has smaller 2-norm. Along these lines, Ben has some work to demonstrate that even in the nonlinear case, fast convergence implies generalization.
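To see the linear-case intuition concretely, here is a minimal numpy sketch (my own illustration, using plain full-batch gradient descent rather than SGD, which shares the same implicit bias in this setting): started from zero on an underdetermined least-squares problem, gradient descent fits the data exactly and lands on the minimum-2-norm solution.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 20, 100                      # underdetermined: more parameters than equations
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)

# Full-batch gradient descent on the least-squares objective, started from zero.
x = np.zeros(n)
step = 1.0 / np.linalg.norm(A, 2) ** 2
for _ in range(20000):
    x -= step * A.T @ (A @ x - b)

# The minimum-2-norm interpolant, computed directly via the pseudoinverse.
x_min_norm = np.linalg.pinv(A) @ b
```

Since the iterates never leave the row space of the data matrix, the interpolant they converge to is forced to be the one of minimal 2-norm.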

**Afonso Bandeira — The sample complexity of multi-reference alignment**

How do you reconstruct a function given a collection of noisy translations of that function? Intuitively, you might use one of the noisy translations as a template and then find the translation of best fit for each of the others before averaging. Perhaps surprisingly, this fails miserably. For example, if you find the translation of best fit for a bunch of pure noise functions, then the average appears to approach the template, thereby demonstrating so-called model bias. Another approach is to collect translation-invariant features of the function. For example, you can estimate the average value of the function, the power spectrum, and the bispectrum from the samples, with the number of samples required growing with the noise level. It turns out that the bispectrum determines generic functions up to translation, but is there an alternative that provides smaller sample complexity? Afonso’s main result here: No. In fact, there are pairs of functions that are confusable unless you see sufficiently many samples. I wonder what sort of improvements can be made given additional structural information on the function.
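A quick numpy illustration of the translation-invariant features in question, using cyclic shifts of a discrete signal (the `bispectrum` helper is my own): the mean, the power spectrum, and the bispectrum are all unchanged by translation.

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.standard_normal(64)
x_shift = np.roll(x, 17)                      # a cyclic translation of the same signal

# The mean and the power spectrum are invariant to translation.
power = np.abs(np.fft.fft(x)) ** 2
power_shift = np.abs(np.fft.fft(x_shift)) ** 2

# So is the bispectrum B[k1, k2] = X[k1] X[k2] conj(X[k1 + k2]):
# a shift multiplies X[k] by a phase that cancels in this product.
def bispectrum(F):
    k = np.arange(len(F))
    return F[:, None] * F[None, :] * np.conj(F[(k[:, None] + k[None, :]) % len(F)])

X, X_shift = np.fft.fft(x), np.fft.fft(x_shift)
```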

**Vern Paulsen — Quantum chromatic numbers via operator systems**

Given a graph, color the vertices so that adjacent vertices receive distinct colors. The chromatic number of the graph is the smallest number of colors you need to accomplish this task. Here’s another way to phrase the coloring task: put Alice and Bob in separate rooms, and simultaneously ask them the color of certain vertices. If the vertices you ask about are adjacent, Alice and Bob must report different colors. If the vertices are identical, they must report the same color. The chromatic number is the smallest number of colors for which Alice and Bob have a winning strategy without communicating. If you allow Alice and Bob access to a common random source, then this smallest number of colors does not change. However, if you allow them access to entangled particles, then the smallest number of colors frequently does change. This suggests a new graph invariant called the *quantum chromatic number*. Interestingly, the quantum version is sometimes much smaller and much easier to calculate than the classical version. For example, for the Hadamard graph of a given parameter, the classical chromatic number is only known up to wide bounds, whereas the quantum chromatic number is known exactly. Developing a quantum protocol for any given graph amounts to finding interesting arrangements of subspaces, which I think would appeal to the frame theory community.
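For concreteness, here is a brute-force sketch of the classical chromatic number (a hypothetical `chromatic_number` helper of my own, suitable for tiny graphs only):

```python
from itertools import product

def chromatic_number(n_vertices, edges):
    """Smallest k admitting a proper k-coloring (brute force; tiny graphs only)."""
    for k in range(1, n_vertices + 1):
        for coloring in product(range(k), repeat=n_vertices):
            if all(coloring[u] != coloring[v] for u, v in edges):
                return k

# The 5-cycle is an odd cycle, so it is not 2-colorable and needs 3 colors.
c5 = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0)]
```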

**Hamid Javadi — Non-negative matrix factorization via archetypal analysis**

Given a collection of data points, how do you find a small number of “archetypes” such that each data point is close to the convex hull of the archetypes and each archetype is close to the convex hull of the data points? This problem has a number of applications in data science, and if we further ask for the archetypes to be entrywise nonnegative, it is equivalent to the problem of nonnegative matrix factorization (NMF). A lot of work in NMF has used a generative model with a so-called *separability* assumption, which asks for each archetype to be one of the data points. Other work by Cutler and Breiman relaxed the separability assumption, merely asking for each archetype to lie in the convex hull of the data points. Unfortunately, these assumptions break down if the data points avoid the corners of the hull of the archetypes. So how can we hope to reconstruct the archetypes in such cases? Well, instead of constraining the archetypes to the convex hull of the data points, you can penalize their distance from that convex hull. This amounts to regularizing the objective to encourage archetype-ness. The following illustration from the paper is helpful:

The top left illustrates how the data points were generated from the unknown true archetypes, the top right shows the output of a method that assumes separability, the bottom left assumes each archetype lies in the convex hull of the data, and the bottom right gives the regularized reconstruction, which is closest to the ground truth. Figure 3 of the paper illustrates how they can robustly reconstruct molecule spectra from mixtures better than the competition.
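The regularizer in question penalizes each archetype’s distance to the convex hull of the data points. As a sketch, that distance can be computed with an off-the-shelf solver (the `dist_to_hull` helper below is my own illustration, not code from the paper):

```python
import numpy as np
from scipy.optimize import minimize

def dist_to_hull(point, vertices):
    """Euclidean distance from `point` to the convex hull of the rows of `vertices`."""
    k = len(vertices)
    res = minimize(
        lambda w: np.sum((w @ vertices - point) ** 2),
        x0=np.full(k, 1.0 / k),                        # start at the barycenter
        bounds=[(0.0, 1.0)] * k,
        constraints=[{"type": "eq", "fun": lambda w: w.sum() - 1.0}],
        method="SLSQP",
        options={"ftol": 1e-12, "maxiter": 200},
    )
    return np.sqrt(res.fun)

triangle = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
```

For a point inside the triangle the distance is zero, while a point outside is measured to the nearest face.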

**Venkat Chandrasekaran — Relative entropy relaxations for signomial optimization**

A *signomial* is a function of the form

$$f(x) \;=\; \sum_{j=1}^m c_j\, e^{\alpha_j \cdot x},$$

where each $\alpha_j \in \mathbb{R}^n$ and each $c_j \in \mathbb{R}$. How can we certify whether $f(x)$ is nonnegative for every $x \in \mathbb{R}^n$? In the special case where the $\alpha_j$’s have nonnegative integer entries, $f$ can be expressed as a polynomial in the variables $e^{x_i}$, and so we can attempt to show that this polynomial is a sum of squares. Instead, Venkat’s paper provides an analogous decomposition: he uses the AM-GM inequality to certify the nonnegativity of certain signomials with at most one negative $c_j$, and then he provides a tractable routine for testing whether a given signomial is a sum of such functions, i.e., a sum of AM-GM exponentials, or *SAGE*. Interestingly, testing for SAGE is often faster than testing for SOS. In fact, if you want to test whether a polynomial is nonnegative over the positive orthant, this suggests that performing a change of variables to obtain a signomial and testing for SAGE might be a better alternative.
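As a toy instance of the AM-GM certificate, the signomial $f(x) = e^{2x} + e^{-2x} - 2$ has exactly one negative coefficient, and AM-GM gives $e^{2x} + e^{-2x} \geq 2\sqrt{e^{2x}e^{-2x}} = 2$, so $f \geq 0$. A quick numerical sanity check:

```python
import numpy as np

# f(x) = e^{2x} + e^{-2x} - 2 has one negative coefficient, and AM-GM certifies
# nonnegativity: (e^{2x} + e^{-2x}) / 2 >= sqrt(e^{2x} * e^{-2x}) = 1.
def f(x):
    return np.exp(2 * x) + np.exp(-2 * x) - 2

grid = np.linspace(-5, 5, 10001)
values = f(grid)
```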

**Yaniv Plan — De-biasing low-rank projection for matrix completion**

In the real world, when you’re asked to do matrix completion, the matrix entries you’re given are far from uniformly distributed, and you don’t have time to run an SDP. Yaniv investigated how one might get around both of these bottlenecks. First, in the uniform case, instead of running an SDP, you get decent performance by just grabbing the top singular vectors of the incomplete matrix. As such, for runtime considerations, it makes sense to replicate this spectral-type approach in the non-uniform case. For the non-uniform case, notice that if you don’t see any entries of a given row or column, then the singular vectors will zero out that row/column. The quality of reconstruction should therefore be measured in terms of how well a given row or column is sampled. To quantify the extent to which a row/column is sampled, just grab the top left and right singular vector of the 0-1 matrix with 1s at the sampled locations. This weighting actually serves two purposes: It helps to evaluate the quality of reconstruction, and it also allows one to “de-bias” the matrix samples before running the spectral matrix completion method. In particular, if we suppose that the weighting corresponds to a probability distribution over which the samples were drawn, then if we entrywise divide a random incomplete matrix by the weighting matrix, the expected quotient will be the desired completed matrix. As such, one should divide by the weighting and complete with the top singular vectors, and then the weighted Frobenius norm of the error is guaranteed to be small.
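Here is a toy numpy sketch of the de-biasing step (my own illustration, with uniform sampling for simplicity, so the rank-one weighting is nearly constant; Yaniv’s setting is the non-uniform case):

```python
import numpy as np

rng = np.random.default_rng(2)
n, r, p = 200, 2, 0.5
M = rng.standard_normal((n, r)) @ rng.standard_normal((r, n))  # rank-2 ground truth
mask = rng.random((n, n)) < p                                  # uniform sampling, for simplicity

def top_r_approx(A, r):
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return (U[:, :r] * s[:r]) @ Vt[:r]

# Rank-1 weighting estimated from the sampling pattern itself; under uniform
# sampling its entries all hover around p.
U, s, Vt = np.linalg.svd(mask.astype(float), full_matrices=False)
W = s[0] * np.outer(U[:, 0], Vt[0])

biased = top_r_approx(M * mask, r)              # spectral completion, no de-biasing
debiased = top_r_approx(M * mask / W, r)        # divide by the weighting first

err_biased = np.linalg.norm(biased - M) / np.linalg.norm(M)
err_debiased = np.linalg.norm(debiased - M) / np.linalg.norm(M)
```

Without the division, the spectral estimate is scaled down by roughly the sampling rate; dividing by the weighting removes this bias in expectation.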

**Deanna Needell — Tolerant compressed sensing with partially coherent sensing matrices**

Compressed sensing tells us that you can reconstruct any sparse vector from its product with a short, fat matrix that has incoherent columns. But what if the columns are coherent? Of course, if two columns are identical, then you won’t be able to distinguish vector supports that include one from supports that include the other, but in applications like radar, it is permissible to confuse certain entries. For example, suppose you are willing to confuse entries whose indices are within some tolerance of each other. Then we can get away with nearby columns of the sensing matrix being coherent, as long as distant columns are incoherent. In particular, the support of any sparse vector can be recovered to within that tolerance provided the vector’s nonzero entries are sufficiently spread apart. This reminds me a lot of the CLEAN algorithm that’s commonly used in radar, and I find it interesting that you can get a guarantee that allows for tolerance in support recovery. For this reason, this seems fundamentally different from the superresolution work that concerns conditions for exact recovery. I wonder if it’s possible to accommodate more nonzero entries with the help of randomness (a la RIP).


**1. The minimal coherence of 6 unit vectors in $\mathbb{R}^4$ is 1/3.**

The Welch bound is known not to be tight whenever the number of vectors lies strictly between $d+1$ and the lower endpoint of the Gerzon range (see the next section for a proof sketch). As such, new techniques are required to prove optimality in this regime. We leverage ideas from real algebraic geometry to show how to solve the case of $d+2$ vectors in $\mathbb{R}^d$ for all sufficiently small $d$. For example, our method provides a new proof of the optimality of 5 non-antipodal vertices of the icosahedron in $\mathbb{R}^3$, as well as the optimality of Sloane’s packing of 6 lines in $\mathbb{R}^4$.

Our method hinges on an application of the Tarski–Seidenberg theorem, whose statement requires a definition: A *semialgebraic set* is any subset of a finite-dimensional real vector space that can be expressed in terms of a finite collection of polynomial equalities and inequalities. For example, the positive definite cone is semialgebraic by Sylvester’s criterion.

**Tarski–Seidenberg Theorem.** Any projection of a semialgebraic set is semialgebraic.
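As a toy instance, projecting the unit circle (a semialgebraic subset of $\mathbb{R}^2$) onto the $x$-axis eliminates the quantified variable in favor of a polynomial inequality:

```latex
\pi\bigl(\{(x,y)\in\mathbb{R}^2 : x^2+y^2=1\}\bigr)
  \;=\; \{x\in\mathbb{R} : \exists\, y,\ x^2+y^2=1\}
  \;=\; \{x\in\mathbb{R} : 1-x^2 \geq 0\}.
```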

The proof of this theorem amounts to an explicit algorithm that finds the polynomial relations of the projection of a given semialgebraic set onto a hyperplane. One can then iterate this to project onto any lower-dimensional subspace. Notice that our packing problem is equivalent to finding the smallest $c$ for which there exists a Gram matrix of $n$ unit vectors in $\mathbb{R}^d$ whose largest squared off-diagonal entry is at most $c$. Since the set of all such pairs (Gram matrix, $c$) forms a semialgebraic set, one may run the Tarski–Seidenberg algorithm to project onto the $c$ coordinate. The projection has the form $\{c : c \geq c_\star\}$, and so one may conclude that $\sqrt{c_\star}$ is the tight lower bound on coherence in this case.

While this provides a finite-time algorithm for solving the packing problem, you wouldn’t want to solve the problem this way, since the Tarski–Seidenberg algorithm is way too slow. Instead, you’d use alternatives like cylindrical algebraic decomposition (CAD), but even this takes doubly exponential time in the number of variables. Considering our Gram matrix has on the order of $n^2$ variables, we are inclined to use more information about our problem to decrease this number. To this end, you can show that for an optimal packing, the locations in the Gram matrix that achieve the coherence correspond to the adjacency matrix of a so-called **secure graph**, and such graphs tend to have many edges. When $n$ is close to $d$ (e.g., $n = d+2$), this reduces the number of variables considerably, which is far more palatable.

We used this technique to solve the two cases described above. We applied Mathematica’s built-in implementation of CAD, and the code is available here and here. Both codes are fast, proving tight bounds in less than 30 seconds. We found that the runtime was extremely sensitive to the order of the variables, and we didn’t have the patience to work out the next case. I mentioned this in a talk at SIAM AG17, and I was informed of even faster alternatives to CAD. We are currently investigating whether those alternatives will make larger cases more accessible.

**2. The Welch bound is within a constant factor of optimal in the Gerzon range.**

Recall that unit vectors achieve equality in the Welch bound only if they are equiangular. By lifting each vector $\varphi_i$ to its outer product $\varphi_i\varphi_i^\top$, one can see that the Gram matrix of these outer products is the entrywise square of the original Gram matrix. By equiangularity, the Gram matrix of the outer products has the form $(1-\mu^2)I + \mu^2 J$, where $J$ is the matrix of all ones and $\mu$ denotes the common coherence (provided $\mu < 1$). As such, this Gram matrix is positive definite, meaning the outer products are linearly independent. Since these outer products lie in the $\frac{d(d+1)}{2}$-dimensional vector space of real symmetric $d \times d$ matrices, this implies that the Welch bound is tight only if $n \leq \frac{d(d+1)}{2}$. Considering Welch bound equality is closed under the Naimark complement, the same argument applied to the complement produces a lower bound on the number of vectors in the nontrivial case where $n > d$. Combined, we get the so-called *Gerzon range*:

$$d + \tfrac{1}{2}\bigl(1 + \sqrt{8d+1}\bigr) \;\leq\; n \;\leq\; \frac{d(d+1)}{2}.$$
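For reference, the Welch bound for $n$ unit vectors $\varphi_1,\ldots,\varphi_n$ in $\mathbb{R}^d$ reads

```latex
\max_{i\neq j}\,|\langle \varphi_i, \varphi_j\rangle| \;\geq\; \sqrt{\frac{n-d}{d(n-1)}},
```

and the lifting argument above caps the number of vectors at $\tfrac{d(d+1)}{2}$, the dimension of the space of real symmetric $d \times d$ matrices.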

Even in this range, it is known that Welch bound equality is an uncommon occurrence, since certain integrality conditions must be satisfied. As such, it is interesting to consider how close the bound is to being tight in this range. To this end, we provide a linear-time constant-factor approximation algorithm for packing. In particular, given any $(d, n)$ in the Gerzon range, we provide an explicit packing whose coherence is guaranteed to be no larger than 49 times the Welch bound. Our construction is based on the following complex construction, which in turn is based on a famous character sum estimate due to André Weil:

**Theorem** *(sketch)*. Let $\psi$ be a nontrivial additive character of the finite field $\mathbb{F}_p$. To each member $f$ of an appropriate family of low-degree polynomials over $\mathbb{F}_p$, associate the unit vector in $\mathbb{C}^p$ with entries $\frac{1}{\sqrt{p}}\,\psi(f(x))$, indexed by $x \in \mathbb{F}_p$. The inner product between two such vectors is then a character sum, which Weil’s estimate bounds in terms of the degree, thereby bounding the coherence of the packing.

We convert this complex packing to a real packing by replacing each complex entry with a 2-by-2 matrix involving its real and imaginary parts. This conversion doesn’t hurt the coherence. Also, in cases where the dimension is not twice a prime, we pad with zeros, but this doesn’t hurt too much thanks to Bertrand’s postulate. Surprisingly, the left-most edge of the Gerzon range was hardest to analyze, and we had to play interesting games with Naimark complements to obtain good constructions there.
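For a tiny sanity check of Welch-bound equality, the three “Mercedes-Benz” unit vectors in the plane (120 degrees apart) form an equiangular tight frame whose coherence meets the Welch bound exactly:

```python
import numpy as np

n, d = 3, 2
angles = 2 * np.pi * np.arange(n) / n
Phi = np.stack([np.cos(angles), np.sin(angles)])   # 3 unit vectors in R^2, 120 degrees apart

G = Phi.T @ Phi                                    # Gram matrix
coherence = np.max(np.abs(G - np.eye(n)))
welch = np.sqrt((n - d) / (d * (n - 1)))           # Welch bound: 1/2 in this case
```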

We did not attempt to optimize our analysis, and judging by Sloane’s database, we suspect the optimal constant is less than 2. We think follow-on work along these lines would make a nice project for a student.

**3. We found two new infinite families of locally optimal packings.**

Since proving global optimality is so hard, we also looked into how to prove local optimality. We can reformulate the problem as minimizing the coherence subject to the Gram matrix lying in the manifold of Gram matrices of spanning packings of $n$ unit vectors in $\mathbb{R}^d$. To see how to prove local optimality of a given packing, consider the following illustration:

We want to show that a given point is a local minimizer of the objective subject to the manifold constraint. To this end, we can locally model the sublevel set and the manifold (left) with the descent cone and the tangent space (right). If 0 is the unique member of the intersection of the descent cone with the tangent space (which can be certified using a dual linear program provided the descent cone is a polytope), then the point is a local minimizer.

To use this theory, we studied the packings in Sloane’s database, and we observed some interesting patterns. For example, some of Sloane’s putatively optimal packings arise by removing a vector from an equiangular tight frame (ETF). Also, in certain cases, his putatively optimal packings are frequently orthobiangular and can be constructed by lifting smaller ETFs. We show that all such packings are locally optimal by constructing the appropriate dual certificates. Furthermore, some of our packings beat Sloane’s putatively optimal packings. See Table 1 in the paper for the improvements (denoted by stars).

**4. Many of Sloane’s putatively optimal packings are tight frames with few angles.**

When looking through Sloane’s database, we found that surprisingly many of the packings happen to form tight frames. Furthermore, most of these tight frames have few angles. This suggests a generalization of the Welch bound in which equality is characterized by a generalization of ETFs. In the absence of this more general theory, we developed short descriptions of each of Sloane’s putatively optimal packings that happen to be tight with small angle sets. Most of these packings involve classical objects like polytopes or lattices, incidence structures like Steiner ETFs, or “marriages” like those introduced by Bodmann and Haas.

We were able to generalize some of the constructions based on incidence structures to infinite families whose coherence is within a constant factor of the Welch bound. We were also able to prove that these orthobiangular tight packings are optimal when restricting to packings that are both orthobiangular and tight. It would be interesting to see computational evidence of their optimality in larger dimensions than those investigated by Sloane.


**DGM:** What is the origin story of this project? Were you and Paul inspired by the “Compressed sensing using generative models” paper?

**VV:** I have been working extensively with applied deep learning for the last year or so, and have been inspired by recent applications of deep generative image priors to classical inverse problems, such as the super-resolution work by Fei-Fei Li et al. Moreover, recent work on regularizing with deep generative priors for synthesizing the preferred inputs to neural activations, by Yosinski et al., made me optimistic that GAN-based generative priors capture sophisticated natural image structure (the synthetic images obtained in that paper look incredibly realistic).

I’ve been aware of and excited about the idea of using deep learning to improve compressed sensing since CVPR 2016, but the numerics and theory provided by Bora et al. tipped me over the edge. In particular, the numerical evidence in Bora et al. is pretty strong, both because of the 10X reduction in sample complexity relative to traditional CS and the fact that SGD worked out of the box on the empirical risk. It struck me as significant that MRI could potentially be sped up by another factor of 10X by these techniques.

My spike in interest in using generative models for CS coincided with a visit Paul Hand made to the Bay Area, during which we planned to initiate a new deep learning collaboration; we laid down the initial theoretical and empirical groundwork for our paper during that same visit.

**DGM:** Do you have any intuition for why there are two basins of attraction? What is the significance of the second critical point?

**VV:** We have not found a particularly satisfying answer to this question. The behavior of the empirical risk objective at the distinguished negative multiple of the true latent code is a bit subtle; for instance, the expected Hessian there is positive semi-definite but not strictly so, so it’s unclear whether it’s a local extremum or a saddle (we suspect it’s a degenerate saddle from some preliminary calculations). Empirically, it appears that there are potentially two critical points which get pushed closer and closer to this point as one cranks up the expansivity of the layers. The expected gradient there is zero, and one interpretation is that this has to do with the peculiarity of the ReLU activation function, in how it “restricts” movement in the higher layers. Note that in the one-layer case there is only one basin of attraction, so this double-basin phenomenon only manifests itself for two or more layers.

A more technical explanation is as follows: consider the case of a two-layer generator. Note that any nonzero vector and any negative multiple of it get mapped to disjoint-support (and thus orthogonal) vectors by any map of the form $z \mapsto \mathrm{ReLU}(Wz)$, where $W$ is a linear transformation. Because of the form of the ReLU function, any small enough perturbation only changes the positive activations of the first layer, but the expected gradient of the second layer with respect to the first can be made zero along the dimensions corresponding to the positive activations by choosing a particular perturbation.
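The disjoint-support observation is easy to verify numerically: for any nonzero $x$ and $\alpha > 0$, the vectors $\mathrm{ReLU}(Wx)$ and $\mathrm{ReLU}(W(-\alpha x))$ have disjoint supports, since each row’s inner products with $x$ and $-\alpha x$ have opposite signs.

```python
import numpy as np

rng = np.random.default_rng(3)
W = rng.standard_normal((50, 10))
x = rng.standard_normal(10)

relu = lambda z: np.maximum(z, 0)
pos = relu(W @ x)
neg = relu(W @ (-0.5 * x))       # a negative multiple of x

# For each row w of W, the inner products w.x and w.(-0.5 x) have opposite
# signs, so at most one of the two activations survives the ReLU.
supports_overlap = bool(np.any((pos > 0) & (neg > 0)))
```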

**DGM:** Do your landscape results reflect the empirical behavior of (non-random) generative models that are actually learned from data?

**VV:** The main evidence for deep CS using non-random generative models that I can point to are the empirical results in Bora et al. where indeed the observed behavior matches our theory, in that gradient descent on empirical risk converges to the desired solution. We have also done extensive numerics on random instances, for which the same is true. Regarding the assumptions of our theory, the expansivity of the layers seems to be realistic (an ideal generator should not be collapsing information between layers, and better compression corresponds to a smaller latent code space), and the independence of weights in each layer may be closer to reality than it appears at first glance, since an ideal generator may strive for independence between layer representations to maximize information efficiency.

**DGM:** What is Helm.ai? Do your results have any implications for autonomous navigation?

**VV:** Helm.ai is a startup I’m working on, which builds robust perception systems for autonomous navigation. We are tackling the most challenging aspects of the technology required to reach full autonomy for self-driving cars, drones and consumer robots. Semi-supervised learning is a large component of what we work on, and deep generative models are certainly relevant toward that goal.

There is always a gap between theory and practice, but conceptually I believe that using deep generative models for CS will have wide implications, including for autonomous navigation. For instance, there are companies out there building LIDAR sensors with a higher resolution per time to cost ratio (which is necessary for applications) by using concepts from compressed sensing. If and when DCS takes off, we should see benefits to such efforts, but of course it takes years for new algorithmic techniques to trickle down… it took 12 years to get from the first convincing results on CS to an FDA approved CS-based MRI machine which is 10X faster.

**DGM:** What’s next for this line of investigation? Denoising? Phase retrieval? Other inverse problems?

**VV:** There are (too) many interesting follow-up directions! There are of course many technical extensions, which we will tackle in the journal version of the paper, but I will comment below on what I find as the most interesting high level direction, on which we are currently preparing an ArXiv submission.

The theoretical framework we propose potentially applies to any inverse problem for which deep generative priors may be obtained, especially when empirical risk minimization is an appropriate reconstruction method in un-regularized versions of those inverse problems. Given the work by Candes et al. on Wirtinger Flow, and work by John Wright et al. on the geometry of quadratic recovery problems, empirical risk should be a reasonable approach to enforcing generative priors for phase retrieval.

I am particularly excited about using generative priors for phase retrieval, because of the potential for tangibly improving performance in applications and due to recent evidence of the more severe sample complexity bottlenecks in sparsity-based compressive phase retrieval as compared to traditional CS. Phase retrieval is inherently ill-posed, and classical approaches at overcoming this ill-posedness involve enforcing numerous instance-specific constraints, which is tedious and requires fairly specific expertise. More modern proposals, for instance by Candes et al., are to take redundant measurements with different “masks”, but this technique is not readily physically realizable and requires blasting the sample of interest multiple times, which rapidly degrades or destroys the sample at hand (which is typically difficult to prepare/acquire in the first place). Thus, it would be beneficial to use a minimal number of observations, without changing the measurement modality, by exploiting signal structure.

Recent attempts to combine classical sparsity-based compressed sensing with phase retrieval toward this goal have been met with potential computational complexity bottlenecks, since current methods seem to require a number of observations that grows quadratically with the sparsity, which makes it all the more important to exploit more sophisticated structure of natural signals. Meanwhile, building generative priors is a purely data-driven approach, which doesn’t require building new physical apparatus or acquisition methodologies, nor does it necessarily require intimate knowledge of the specific problems at hand. All it would require is a large and diverse enough dataset of reconstructed biological structures and a powerful enough deep generative model. Enforcing such a deep generative prior then becomes a purely algorithmic challenge, without putting any extra onus on experimental scientists; in fact, it reduces the amount of modeling they typically have to do.


[Flammia described] the SIC-POVM problem as a “heartbreaker” because every approach you take seems super promising but then inevitably fizzles out without really giving you a great insight as to why.

Case in point, Joey and I identified a promising approach involving ideas from our association schemes paper. We were fairly optimistic, and Joey even bet me $5 that our approach would work. Needless to say, I now have this keepsake from Joey:

While our failure didn’t offer any great insights (as Flammia predicted), the experience forced me to review the literature on Zauner’s conjecture a lot more carefully. A few things caught my eye, and I’ll discuss them here. Throughout, SIC denotes “symmetric informationally complete line set” and WH denotes “the Weyl-Heisenberg group.”

**1. WH over $(\mathbb{Z}_2)^k$ produces a SIC only if $k \in \{1, 3\}$.**

Of course, this works for $k = 1$, since this reduces to WH over a cyclic group. The case $k = 3$ corresponds to the famous Hoggar lines (introduced here). Last week, I learned that Godsil and Roy proved that this doesn’t work in general (see Lemma 3.1 here). What’s the obstruction? The relevant system of equations doesn’t have a solution except in these two special cases. Sadly, this is not terribly enlightening.

Before seeing this result, I had assumed that SICs would arise from WH over all finite abelian groups. After seeing Godsil and Roy’s result, I wrote an interpretation of the numerical optimization code briefly described in the computer study paper, and I failed to find SICs from WHs over other small non-cyclic abelian groups. Apparently, the cyclic groups and $(\mathbb{Z}_2)^3$ are special, but I have no idea why.

**2. SICs can be generated by groups other than WH.**

Back in 2003, Renes et al. performed numerical optimization to determine whether SICs could be obtained from non-WH groups. To this end, they churned through certain members of the SmallGroups Library and found that the groups G(36,11), G(36,14), G(64,8) and G(81,9) all lead to SICs. Later, Grassl found exact SICs for the first and third of these groups by computing the appropriate Groebner bases (he also points out that G(36,14) is actually WH). For the record, the exact coordinates in these cases are about as ugly as the coordinates for the WH SICs. But as far as I can tell, these alternative constructions (which reside in dimensions 6 and 8) have been forgotten by the modern SIC literature. For example, they do not appear as “sporadic SICs” in the Exact SICs table.

**3. Prime-dimensional group-generated SICs are necessarily generated by WH.**

This was established by Huangjun Zhu in this paper back in 2010, and it suggests that WH is the “right” group to work with (if the substantial evidence in favor of Zauner’s conjecture weren’t enough). Unfortunately, the description length of exact fiducial vectors over WH scales poorly with the dimension. One is inclined to compress these descriptions into shorter, workable representations before attempting pattern recognition for theorem discovery. Based on my experience with constructing infinite families of ETFs, this is the most promising approach for a constructive proof of Zauner’s conjecture.

**4. It looks like WH SICs are always determined by a small system of explicit equations.**

Fuchs et al. recently posed what they call *the 3d conjecture*, which asserts that the WH SICs are precisely the solutions to the equations they give in (27)–(29) of their paper. The conjecture provably holds in small dimensions, and it has held up to numerical scrutiny in many more. This suggests a couple of new approaches: (1) prove the 3d conjecture, or (2) prove that 3d implies Zauner. I wouldn’t be surprised if it’s easier to determine whether the 3d equations admit solutions, so this could be an interesting route to a conditional proof of Zauner.

**5. A constructive proof of Zauner’s conjecture may require progress on Hilbert’s 12th problem.**

A constructive proof requires a finite-length description of an infinite family of SICs, since the proof would contain such a description. For all known non-maximal ETFs (see this paper for a survey), the Gram matrix can always be phased in such a way that all of the entries are cyclotomic, and furthermore, expressing the Gram matrix entries in this way allows patterns to emerge that enable both a short description and a proof of ETF-ness for an infinite family. (For an illustrative example, consider the harmonic ETFs.)

As established back in 2012, all of the known exact WH SICs have the property that the orthogonal projection onto the line spanned by the fiducial vector has matrix entries that lie in an abelian extension of a real quadratic field. Since these entries lie in an abelian extension of an abelian extension of $\mathbb{Q}$, they are expressible by radicals, and this is the representation of choice in the Exact SICs table. However, Hilbert’s 12th problem suggests that a better representation might be available. By analogy, the Kronecker–Weber theorem gives that every abelian extension of $\mathbb{Q}$ is cyclotomic, and the Kronecker Jugendtraum gives that every abelian extension of an imaginary quadratic field can be obtained from values of certain elliptic functions. By contrast, we are looking at abelian extensions of a *real* quadratic field, which is not a solved case of Hilbert’s 12th. Still, one might leverage the Stark conjectures to find a suitable basis. Apparently, the computer algebra system PARI/GP makes this a plausible enterprise, but I haven’t found the time to write the necessary code. (I’m still recoiling from my latest Zauner burn with Joey.)


The following line from the introduction caught my eye:

> For instance the print-out for exact fiducial 48a occupies almost a thousand A4 pages (font size 9 and narrow margins).

As my previous blog entry illustrated, the description length of SIC-POVM fiducial vectors appears to grow rapidly with the dimension $d$. However, it seems that the rate of growth is much better than I originally thought. Here’s a plot of the description lengths of the known fiducial vectors (the new ones due to ACFW17 — available here — appear in red):

Note that the vertical axis has a logarithmic scale. Contrary to my interpretation from two years ago, the description lengths appear to exhibit subexponential growth in $d$. Putting the horizontal axis in log scale says even more:

The dotted line depicts a power-law trend, which suggests that the description length scales with the number of entries in the Gram matrix.

For context, let’s consider the more general problem of constructing equiangular tight frames (ETFs) of $n$ vectors in dimension $d$; see this paper for a survey. In the real case, it suffices to determine the sign pattern of an ETF’s Gram matrix, which can be naively described in $O(n^2)$ bits. However, there are several infinite families of real ETFs with much shorter description length. Indeed, the sign patterns are determined by certain strongly regular graphs, many of which enjoy a straightforward algebro-combinatorial construction.

In the case of SIC-POVMs, the Gram matrix is complex, so it doesn’t correspond to a strongly regular graph in the same way, but the conjectures used in ACFW17 suggest that the Gram matrix may be selected so as to satisfy certain group- and number-theoretic properties. But even after reducing to such specific structure, the description length appears to scale with the size of the Gram matrix (i.e., the naive scaling in the real case). As such, an infinite family of explicit SIC-POVMs will likely require the identification of additional structure. This is shocking, considering the conjectured structures currently in use already seem miraculous.

]]>

**DGM:** How were you introduced to this problem? Do you have any particular applications of shape matching or point-cloud comparison in mind with this research?

**SV:** This problem was introduced to me by Andrew Blumberg in the context of topological data science. Andrew is an algebraic topologist who is also interested in applications, in particular in computational topology. There is a vast literature on the registration problem for 3D shapes and surfaces, but the existing methods are usually tailored to the geometric properties of the underlying space and rely on strong geometric assumptions. Our goal was to study this problem in an abstract setting that could have potential impact on spaces with unusual geometry. In particular, we are thinking of spaces of phylogenetic trees, protein-protein interaction data, and text processing. We don’t have experimental results for those problems yet, but we are working on it.

A reason it is so hard to obtain meaningful results for these “real data” problems is that it is hard to validate whether a method’s output is meaningful. A simple way for a mathematician like me to validate the performance of our methods and algorithms is to compare against problems where the ground truth solution is known (like the teeth classification and shape matching), and this is what we did in the paper.

For future scientific applications, I’m working with Bianca Dumitrascu, who is a graduate student in computational biology at Princeton. Bianca works with large datasets of protein-protein interaction information. She has the intuition that the existence of isometries between protein interaction measurements in different biological systems should be correlated with similar roles for the corresponding proteins. However, such behavior is very hard to test in real data because of scalability issues, the large amount of noise present in the data, and the lack of a theoretical ground truth in most cases.

**DGM:** Do you have any intuition for why your polynomial-time lower bound on Gromov-Hausdorff distance satisfies the triangle inequality?

**SV:** The intuitive answer: I think this is a phenomenon aligned with the “data is not the enemy” philosophy. The Gromov-Hausdorff distance is NP-hard to compute in the worst case, but it is actually computable in polynomial time for a generic set of metric spaces. Since our relaxed distance coincides with the Gromov-Hausdorff distance at small scales, we could intuitively expect that it is actually a distance (and therefore satisfies the triangle inequality).

The practical answer: Considering the relations that realize the Gromov-Hausdorff distances between the first and second spaces and between the second and third, there is a straightforward way to define a relation between the first and third spaces whose Gromov-Hausdorff objective value is at most the sum of the other two: just consider the composition! If the result of our semidefinite program is interpreted as a soft assignment between points from one metric space to another, then it is natural to ask what the composition of soft assignments is, whether it is feasible for the semidefinite program, and whether its objective value is bounded by the sum. This is basically why the triangle inequality holds.

**DGM:** You proved that each generic finite metric space enjoys a neighborhood of spaces whose Gromov-Hausdorff distance from the original space equals your lower bound (i.e., your bound is generically tight for small perturbations). However, the size of the allowable perturbation seems quite small. Later, you mention that you frequently observe tightness in practice. Do you think that tightness occurs for much larger perturbations in the average case over some reasonable distribution?

**SV:** I think tightness occurs for relatively large perturbations of the isometric case, provided the data is well conditioned. However, in an extreme case, if all pairwise distances are the same, then the solution of the semidefinite program is not unique, and therefore tightness will not occur. When studying the distance from the topological point of view, a result of the form “there exists a local neighborhood in which the distances coincide” is relevant. From an applied mathematical perspective, it would be interesting to quantify how large a perturbation the semidefinite program tolerates while remaining tight. The techniques I know for obtaining such a result rely on the construction of dual certificates. The dual certificates I managed to construct also had a dependency on the minimum nonzero pairwise distance, due to degeneracy issues. I think it should be possible to obtain a tightness result for larger perturbations, but it may be a hard problem. The way I would start thinking about this is with numerical experiments and a conjectured phase transition for tightness of the semidefinite program as a function of noise, for different instances.

**DGM:** How frequently does your local method GHMatch recover the Gromov-Hausdorff distance in practice? Is there a way to leverage the smallest eigenvector of an appropriate matrix to get a better initialization (à la Wirtinger Flow for phase retrieval)?

**SV:** The algorithm GHMatch often gets stuck in local minima. In many non-convex optimization problems, a good initialization is enough to guarantee convergence to a global optimum after some steps of gradient descent. However, our optimization problem has nonnegativity constraints, which make it significantly harder: the variable needs to be (at least) thresholded to be nonnegative after each iteration. There is a class of algorithms that attempts to do this for synchronization problems, namely projected power methods (see for example this paper). But the right algorithm is not to project naively like that; rather, one should weight the iterates carefully with approximate message passing, as they do, for example, in this paper.
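
To make the projection-and-thresholding idea concrete, here is a minimal sketch of a power iteration that thresholds to the nonnegative orthant after each step. This is my own illustration of the generic technique, not the GHMatch algorithm or the methods of the cited papers:

```python
import numpy as np

# Generic projected power iteration with nonnegativity thresholding
# (illustrative sketch only).
def projected_power(A, iters=200, seed=0):
    rng = np.random.default_rng(seed)
    x = np.abs(rng.standard_normal(A.shape[0]))  # nonnegative start
    x /= np.linalg.norm(x)
    for _ in range(iters):
        x = np.maximum(A @ x, 0)   # power step, then project onto the orthant
        x /= np.linalg.norm(x)     # renormalize
    return x

# For an entrywise-positive symmetric matrix, the projection is inactive
# and the iteration recovers the (nonnegative) Perron eigenvector.
A = np.array([[2.0, 1.0], [1.0, 3.0]])
x = projected_power(A)
lam = x @ A @ x
assert np.allclose(A @ x, lam * x, atol=1e-8)
```

The interesting (and harder) regime for synchronization-type problems is when the nonnegativity constraint actually binds, which is where the careful weighting mentioned above comes in.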

]]>

John is on the job market this year, and when reading his research statement, I was struck by his discussion of our paper, so I asked him to expand his treatment into a full-blown blog entry. Without further ado, here is John’s guest blog post (which I’ve lightly edited for hyperlinks and formatting):

An equiangular tight frame (ETF) is a set of $n$ unit vectors $\{\varphi_i\}_{i=1}^n$ in $\mathbb{C}^d$ that achieves equality in the so-called *Welch bound*:

$$\max_{i\neq j}\ |\langle \varphi_i,\varphi_j\rangle| \;\geq\; \sqrt{\frac{n-d}{d(n-1)}}.$$

One elegant construction of ETFs relies on a group-theoretic/combinatorial object called a difference set. If $G$ is a finite group and $D \subseteq G$, then we say that $D$ is a *difference set* if the cardinality of $\{(d_1,d_2) \in D \times D : d_1 d_2^{-1} = g\}$ is independent of the choice of nonidentity $g \in G$.

The construction of so-called *harmonic ETFs* is as follows. Let $G$ be a finite abelian group, and let $\hat{G}$ be the group of characters on $G$. If $D \subseteq \hat{G}$ is a difference set and $\mathbf{1}_D$ is the indicator function of $D$, then $\{\rho(g)\check{\mathbf{1}}_D\}_{g\in G}$ is an ETF for its span, where $\rho$ denotes the left regular representation, and $\check{\mathbf{1}}_D$ is the inverse Fourier transform of $\mathbf{1}_D$. A frame generated by the orbit of a vector under the action of a group representation is called a *group frame*.
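
As a concrete sanity check, here is a quick numerical construction of a small harmonic ETF, using the classical $(7,3,1)$ difference set $\{1,2,4\}$ (the quadratic residues mod 7): keep the rows of the $7\times 7$ DFT indexed by the difference set and rescale the columns to unit norm.

```python
import numpy as np

# Harmonic ETF from the (7,3,1) difference set D = {1,2,4} in Z_7.
G, D = 7, [1, 2, 4]
F = np.array([[np.exp(2j * np.pi * j * k / G) for k in range(G)] for j in D])
Phi = F / np.sqrt(len(D))  # 3 x 7 matrix with unit-norm columns

gram = Phi.conj().T @ Phi
off = np.array([abs(gram[i, j]) for i in range(G) for j in range(G) if i != j])
welch = np.sqrt((G - len(D)) / (len(D) * (G - 1)))
assert np.allclose(off, welch)  # equiangular, meeting the Welch bound
assert np.allclose(Phi @ Phi.conj().T, (G / len(D)) * np.eye(len(D)))  # tight
```

Here 7 unit vectors in 3 dimensions attain coherence $\sqrt{2}/3$, exactly the Welch bound for $(d,n) = (3,7)$.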

It is natural to ask whether nonabelian groups can be used in a similar way to construct ETFs. Indeed, the most famous open problem in the theory of ETFs, Zauner’s conjecture, asks for an ETF of $d^2$ vectors in $\mathbb{C}^d$ given as a group frame under the action of the Heisenberg group. The main obstruction is the fact that for nonabelian groups, the Pontryagin dual does not naturally form a group. Thus, it is not immediately clear what could stand in for a difference set. Our approach involves generalizing beyond nonabelian groups to association schemes.

A collection of 0-1 matrices is called an *association scheme* if it contains the identity matrix, the matrices sum to the all-ones matrix, and their span is a commutative $*$-algebra.

Given a finite abelian group $G$, we can construct an association scheme by taking the translation operators on $G$, one for each group element. We will refer to this as an *abelian group scheme*. Not all association schemes arise in this way, and thus association schemes can be seen as a generalization of abelian groups. Moreover, we can do harmonic analysis on association schemes, as we shall see.

Since the scheme spans a commutative $*$-algebra, the spectral theorem provides another basis for this span, consisting of the orthogonal projections onto the maximal common eigenspaces. If we expand a given matrix $M$ in the basis of 0-1 matrices $A_0,\dots,A_m$ and in the basis of projections $P_0,\dots,P_m$,

$$M \;=\; \sum_{i=0}^m a_i A_i \;=\; \sum_{j=0}^m \hat{a}_j P_j,$$

then we can think of $(\hat{a}_j)$ as the *Fourier transform* of $(a_i)$. For an abelian group scheme as described above, the indexing sets of $(A_i)$ and $(P_j)$ can be taken to be $G$ and $\hat{G}$, respectively. With this identification, the above definition of the Fourier transform is identical to the usual one. Thus, in a general association scheme, we can think of the 0-1 matrices as playing the role of the group, while the projection matrices play the role of the dual group.

Since the set of projections generates the algebra, it naturally carries the algebraic structure that the dual object lacks when the underlying group is nonabelian. This allows us to define a *hyperdifference set*, which generalizes the concept of a difference set to the setting of association schemes. For any subset $S$ of the projections, we define

$$P_S \;=\; \sum_{P \in S} P.$$

Since the projections are mutually orthogonal, it is immediate that $P_S$ is a projection. If an appropriate scalar multiple of $P_S$ is the Gram matrix of an ETF, then we call $S$ a *hyperdifference set*. See our recent paper for all the details on the construction of ETFs from association schemes.

In the case of abelian group schemes, hyperdifference sets are exactly difference sets, but association schemes and hyperdifference sets generalize more than just harmonic ETFs. Indeed, any real ETF with centroidal symmetry (see this paper) is an instance of this construction, as are the ETFs constructed by Renes and Strohmer. But our goal is to use nonabelian groups to make ETFs, so let’s get on with it.

Given a finite *nonabelian* group $G$, one can construct an association scheme called the *group scheme*: its 0-1 matrices are indexed by the conjugacy classes of $G$, and its projections correspond to the irreducible characters of $G$.

In this case, the resulting projection is, up to scaling, the Gram matrix of an ETF of $|G|$ vectors in a space of dimension equal to the rank of the projection.

There are several known infinite families of difference sets in abelian groups, and thus there are infinite families of harmonic ETFs. However, until our recent paper, there was no known infinite family of ETFs constructed as group frames generated by a nonabelian group. In our construction, the groups are formed by a “twisted” cross product of two vector spaces over the field with two elements. It turns out that these are instances of *Suzuki 2-groups* (see this paper). We then construct a set of irreducible characters satisfying the hyperdifference set condition, thus giving a hyperdifference set in the group scheme. In the end, this gives us an infinite family of ETFs, one for each positive integer parameter.

]]>

It’s hard to pin down what exactly the polynomial method is. It’s a technique in algebraic extremal combinatorics, where the goal is to provide bounds on the sizes of objects with certain properties. The main idea is to identify the desired cardinality with some complexity measure of an algebraic object (e.g., the dimension of a vector space, the degree of a polynomial, or the rank of a tensor), and then use algebraic techniques to estimate that complexity measure. If at some point you use polynomials, then you might say you applied the polynomial method.

What follows is a series of instances of this meta-method.

**— Linear algebraic bounds —**

In this section, we identify certain combinatorial structures with vectors in a vector space. After identifying that these vectors must be linearly independent, we conclude an upper bound on the cardinality of these structures (namely, the dimension of the vector space).

**Problem (from the Tricki).** Suppose $n$ is even. What is the maximum number of subsets of $\{1,\dots,n\}$ with the property that each has odd size and the intersection between any two has even size?

For each set $S$ in such a family, consider its indicator function $\mathbf{1}_S \in \mathbb{F}_2^n$. (We let our indicator functions take values in the field of order 2 since parity plays a leading role in this problem.) We seek indicator functions of odd support that are pairwise orthogonal. In this case, orthogonality implies linear independence (why?), and so we know that the maximum possible number is at most $n$. Furthermore, we can saturate this bound by selecting all singletons, and so the bound is sharp.
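
For small $n$, the bound is also easy to confirm by brute force (a quick illustration, not part of the proof):

```python
from itertools import combinations

# Largest family of odd-size subsets of {1,...,n} with pairwise even
# intersections, found by exhaustive search (feasible for small n).
def max_oddtown(n):
    universe = range(1, n + 1)
    odd_sets = [frozenset(c) for r in range(1, n + 1, 2)
                for c in combinations(universe, r)]
    for r in range(len(odd_sets), 0, -1):
        for family in combinations(odd_sets, r):
            if all(len(a & b) % 2 == 0 for a, b in combinations(family, 2)):
                return r
    return 0

assert max_oddtown(2) == 2
assert max_oddtown(4) == 4  # matches the linear-algebraic bound n
```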

Next, consider a balanced incomplete block design with parameters $(v,k,\lambda)$, that is, a collection of $b$ size-$k$ subsets of $\{1,\dots,v\}$, called blocks, with the property that every point belongs to exactly $r$ blocks, and every pair of distinct points is contained in exactly $\lambda$ blocks. We take $k < v$ to prevent degenerate cases.

**Fisher’s inequality.** A $(v,k,\lambda)$-balanced incomplete block design exists only if $b \geq v$.

Consider the $b \times v$ matrix $M$ whose $i$th row is the indicator function of the $i$th block. (Here, we view the entries as real numbers.) First, $k < v$ implies $r > \lambda$ (why?). Next, the definition gives $M^\top M = (r-\lambda)I + \lambda J$, where $J$ denotes the all-ones matrix. Since $(r-\lambda)I + \lambda J$ is positive definite, it has rank $v$, which in turn implies that $M$ has rank $v$, which is only possible if $b \geq v$.

Notice that in this case, the linearly independent vectors are the columns of $M$, as opposed to the blocks’ indicator functions. The block designs which achieve equality in Fisher’s inequality are called symmetric designs. The Fano plane is an example.
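
The proof is easy to watch in action for the Fano plane, here taken with its standard line set (a $(7,3,1)$ design with $r=3$):

```python
import numpy as np

# Fano plane as a 7x7 incidence matrix M (rows = lines, columns = points).
lines = [(1, 2, 3), (1, 4, 5), (1, 6, 7), (2, 4, 6),
         (2, 5, 7), (3, 4, 7), (3, 5, 6)]
M = np.zeros((7, 7))
for i, line in enumerate(lines):
    for p in line:
        M[i, p - 1] = 1

r, lam = 3, 1
# M^T M = (r - lambda) I + lambda J, which is positive definite...
assert np.allclose(M.T @ M, (r - lam) * np.eye(7) + lam * np.ones((7, 7)))
# ...so M has full column rank, forcing b >= v (here with equality).
assert np.linalg.matrix_rank(M) == 7
```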

The following result has a similar proof:

**The Gerzon bound.** Take $\varphi_1,\dots,\varphi_n \in \mathbb{C}^d$ of unit norm such that $|\langle\varphi_i,\varphi_j\rangle| = \alpha$ whenever $i \neq j$. Then $n \leq d^2$.

Here, the matrices $\varphi_i\varphi_i^*$ are linearly independent in the $d^2$-dimensional real vector space of self-adjoint matrices, which can be seen from their Gram matrix. The ensembles which achieve equality in the Gerzon bound are called symmetric informationally complete positive operator–valued measures (SIC-POVMs), and they are conjectured to exist for every dimension $d$.
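
For $d=2$, a SIC-POVM is easy to verify numerically using the well-known fiducial vector for the Weyl–Heisenberg group (the specific fiducial below is the usual textbook choice):

```python
import numpy as np

# d = 2 SIC-POVM: orbit of a fiducial under the Weyl-Heisenberg group
# gives n = d^2 = 4 equiangular lines with |<u,v>|^2 = 1/(d+1) = 1/3.
d = 2
x = np.sqrt((1 + 1 / np.sqrt(3)) / 2)
y = np.sqrt((1 - 1 / np.sqrt(3)) / 2)
fiducial = np.array([x, y * np.exp(1j * np.pi / 4)])

X = np.array([[0, 1], [1, 0]])                 # shift operator
Z = np.diag([1, np.exp(2j * np.pi / d)])       # modulation (= diag(1,-1) here)
vectors = [np.linalg.matrix_power(X, a) @ np.linalg.matrix_power(Z, b) @ fiducial
           for a in range(d) for b in range(d)]

for i in range(4):
    for j in range(i + 1, 4):
        assert np.isclose(abs(np.vdot(vectors[i], vectors[j])), 1 / np.sqrt(3))
```

These four vectors saturate the Gerzon bound for $d=2$, and the common inner-product modulus $1/\sqrt{3}$ matches the Welch bound for $(d,n)=(2,4)$.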

**— Simple polynomials have simple zero sets —**

In this section, the main idea is that simple polynomials have simple zero sets. The following is the most basic instance of this idea:

**Theorem.** Let $\mathbb{F}$ be a field. A nonzero polynomial $p \in \mathbb{F}[x]$ of degree at most $k$ has at most $k$ roots.

This immediately produces a similar instance for polynomials of multiple variables:

**Corollary.** Let $\mathbb{F}$ be a field. If $p \in \mathbb{F}[x_1,\dots,x_n]$ has degree at most $k$ and vanishes on more than $k$ points of a line in $\mathbb{F}^n$, then it vanishes on the entire line.

To see this, parameterize the line in terms of a variable $t$, and then plug this parameterization into $p$ to get a polynomial in $t$ of degree at most $k$. By the previous theorem, this polynomial has more than $k$ roots only if it is identically zero, which in turn establishes that $p$ vanishes on the entire line. The following provides yet another example of simple multivariate polynomials having simple zero sets:

**Theorem (Alon’s Combinatorial Nullstellensatz).** Let $\mathbb{F}$ be a field, and suppose $p \in \mathbb{F}[x_1,\dots,x_n]$ has total degree $d_1+\cdots+d_n$. If the coefficient of $x_1^{d_1}\cdots x_n^{d_n}$ is nonzero and $S_1,\dots,S_n \subseteq \mathbb{F}$ satisfy $|S_i| > d_i$ for each $i$, then there exists $x \in S_1 \times \cdots \times S_n$ such that $p(x) \neq 0$.

As an application of this result, consider the following:

**Problem (Problem 6 from IMO 2007).** Let $n$ be a positive integer, and consider

$$S \;=\; \{(x,y,z) \;:\; x,y,z \in \{0,1,\dots,n\},\ x+y+z > 0\}$$

as a set of $(n+1)^3 - 1$ points in three-dimensional space. Determine the smallest number of planes, the union of which contains $S$ but does not include $(0,0,0)$.

There are a couple of obvious choices of $3n$ planes, for example, the shifts $x=i$, $y=i$, $z=i$ for $i \in \{1,\dots,n\}$ of the coordinate planes, or the shifts $x+y+z=i$ for $i \in \{1,\dots,3n\}$ of the orthogonal complement of the all-ones vector. We will show that $3n$ is the smallest possible number of planes by showing that $3n-1$ planes are insufficient. In particular, having only $3n-1$ planes will lead to a polynomial that is too simple for its zero set to satisfy the desired constraints.

Suppose to the contrary that $3n-1$ planes suffice, and let the $i$th plane be the zero set of the degree-1 polynomial $h_i(x,y,z) = a_ix + b_iy + c_iz - e_i$. Put

$$P(x,y,z) \;=\; \prod_{i=1}^{3n-1} h_i(x,y,z).$$

Then the zero set of $P$ is the union of the planes, and so $P$ vanishes on $S$ while $P(0,0,0) \neq 0$ by assumption. Since we want to apply Alon’s Combinatorial Nullstellensatz, we want a polynomial whose zero set contains an entire grid, so we will modify $P$ so as to vanish over all of $\{0,1,\dots,n\}^3$. To this end, we know that

$$Q(x,y,z) \;=\; \prod_{i=1}^{n} (x-i)(y-i)(z-i)$$

is also zero on $S$, and furthermore, $Q(0,0,0) = (-1)^n(n!)^3 \neq 0$. As such,

$$R \;=\; P - \frac{P(0,0,0)}{Q(0,0,0)}\,Q$$

will serve as the desired modification. This polynomial has total degree $3n$, and the coefficient of $x^ny^nz^n$ is $-P(0,0,0)/Q(0,0,0) \neq 0$. Furthermore, taking $S_1 = S_2 = S_3 = \{0,1,\dots,n\}$ satisfies the hypothesis of Alon’s Combinatorial Nullstellensatz, which then gives that $R$ fails to vanish on all of $\{0,1,\dots,n\}^3$, a contradiction.

**— Simple sets are zero sets of simple polynomials —**

Given a sufficiently small set, there is a low-degree nonzero polynomial that vanishes on that set. For example:

**Theorem.** Let $\mathbb{F}$ be a field, and take $S \subseteq \mathbb{F}$ of size at most $k$. Then there exists a nonzero polynomial of degree at most $k$ that vanishes on $S$.

This can be proven explicitly or implicitly. For the explicit version, just take $p(x) = \prod_{s \in S}(x-s)$. For the implicit version, consider the linear operator that maps each polynomial of degree at most $k$ to its vector of evaluations at the points of $S$. Since the subspace of polynomials in $\mathbb{F}[x]$ of degree at most $k$ has dimension $k+1 > |S|$, this mapping has a nontrivial nullspace, thereby producing the desired nonzero polynomial. This latter proof can be used to obtain a more general lemma:

**Theorem.** Let $\mathbb{F}$ be a field, and take $S \subseteq \mathbb{F}^n$ of size strictly less than $\binom{k+n}{n}$. Then there exists a nonzero polynomial of degree at most $k$ that vanishes on $S$.

Indeed, the dimension of the subspace of $\mathbb{F}[x_1,\dots,x_n]$ of polynomials with degree at most $k$ is the number of monomials of degree at most $k$, which equals the number of $n$-tuples of nonnegative integers whose sum is at most $k$. A stars and bars argument then gives that this dimension is

$$\sum_{j=0}^{k} \binom{j+n-1}{n-1} \;=\; \binom{k+n}{n},$$

where the last step follows from Pascal’s identity by induction on $k$. From this, one may conclude (for example) that any two points in the plane lie on a line, any five points lie on a (possibly degenerate) conic section, and any nine points lie on a (possibly degenerate) cubic curve.
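
The implicit argument is effectively a nullspace computation, which we can carry out numerically; the five sample points below are my own arbitrary choice:

```python
import numpy as np

# The space of degree-<=2 polynomials in two variables has dimension 6,
# so any 5 points lie on a (possibly degenerate) conic: find it as a
# nullspace vector of the 5 x 6 evaluation map.
pts = np.array([[0, 0], [1, 0], [0, 1], [2, 3], [1, 1]], dtype=float)
monos = [(a, b) for a in range(3) for b in range(3) if a + b <= 2]
E = np.array([[x ** a * y ** b for (a, b) in monos] for x, y in pts])

coef = np.linalg.svd(E)[2][-1]    # unit vector in the nullspace (rank <= 5 < 6)
assert np.allclose(E @ coef, 0)   # the conic vanishes at all 5 points
```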

This theorem was used to prove the following:

**Theorem (Finite field Kakeya conjecture, Dvir 2008).** Let $\mathbb{F}$ be a finite field, and suppose $K \subseteq \mathbb{F}^n$ contains a line in every direction (i.e., $K$ is a Kakeya set). Then $K$ has size at least $c_n|\mathbb{F}|^n$, where $c_n > 0$ depends only on $n$ (and not on $\mathbb{F}$).

More explicitly, we will show that $K$ has size at least

$$\binom{|\mathbb{F}|+n-2}{n}.$$

(Note that this differs slightly from Terry Tao’s exposition.) To do this, we suppose to the contrary that $|K| < \binom{|\mathbb{F}|+n-2}{n}$. Then by the previous result, there exists a nonzero polynomial $p \in \mathbb{F}[x_1,\dots,x_n]$ of degree $k \leq |\mathbb{F}|-2$ that vanishes on $K$. Use this to form a homogeneous polynomial $P \in \mathbb{F}[x_1,\dots,x_n,z]$ by multiplying each term of $p$ of degree $j$ by $z^{k-j}$. Then $P(x,1) = p(x)$. By assumption, we have that for every direction $y \neq 0$, there exists $x_0$ such that the line $\{x_0 + ty : t \in \mathbb{F}\}$ is in $K$, hence in the zero set of $p$. This then implies that $(x_0+ty,\,1)$ is in the zero set of $P$ for every $t$. Note that for every $t \neq 0$, the homogeneity of $P$ gives

$$0 \;=\; t^{-k}\,P(x_0+ty,\,1) \;=\; P(t^{-1}x_0+y,\;t^{-1}).$$

That is, $s \mapsto P(sx_0+y,\,s)$ has degree at most $k$ and is zero at $s = t^{-1}$ for every nonzero $t$. Since this makes up $|\mathbb{F}|-1 > k$ different points on a line, the corollary of the previous section gives that it is also zero when $s = 0$, namely, at $(y,0)$. Since the choice of $y$ was arbitrary, this means $P(y,0) = 0$ for every $y \in \mathbb{F}^n$. But $P(\cdot\,,0)$ is the polynomial made up of the terms of maximum degree in $p$, which cannot be identically zero (by Alon’s Combinatorial Nullstellensatz, for example, though this is overkill).
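
In tiny cases one can probe Kakeya sets exhaustively. The following sketch (my own illustration) builds Kakeya sets in $\mathbb{F}_3^2$ by choosing one line per direction and minimizing the size of the union:

```python
from itertools import product
from math import comb

q, n = 3, 2
# one representative direction per parallel class of lines in F_q^2
dirs = [(1, a) for a in range(q)] + [(0, 1)]

def line(x0, y0, dx, dy):
    return frozenset(((x0 + t * dx) % q, (y0 + t * dy) % q) for t in range(q))

best = q ** n
for offsets in product(range(q), repeat=len(dirs)):
    K = set()
    for (dx, dy), s in zip(dirs, offsets):
        x0, y0 = ((0, s) if dx else (s, 0))  # parametrize the parallel class
        K |= line(x0, y0, dx, dy)
    best = min(best, len(K))

# the smallest such union is a proper subset of the plane, yet respects
# a Dvir-type lower bound of binomial(q + n - 2, n)
assert comb(q + n - 2, n) <= best < q ** n
```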

**— Capset bounds —**

A particularly recent application of the polynomial method made a surprising contribution to the following open problem:

**The Capset Problem.** How large can a subset $A \subseteq \mathbb{F}_3^n$ be if it contains no lines?

Such a set is known as a * capset*. The obvious bounds on the maximum size are $2^n$ and $3^n$, and with a bit of extra work, these were improved to $2.2174^n$ and $O(3^n/n)$. On his blog, Terry Tao suspected that the lower bound could be improved all the way up to $(3-o(1))^n$, and in one of his MathOverflow answers, he suspected that the polynomial method might be leveraged to improve the known bounds. He was recently proven both wrong and right when the polynomial method was used to establish the following:

**Theorem (Ellenberg–Gijswijt 2016).** If $A \subseteq \mathbb{F}_3^n$ contains no lines, then $|A| = O(2.756^n)$.

The proof uses ideas from Croot–Lev–Pach 2016. First observe the following identity for $a,b,c \in A$:

$$\mathbf{1}[a+b+c=0] \;=\; \mathbf{1}[a=b=c].$$

Indeed, $a+b+c=0$ precisely when either $(a,b,c)$ forms a line in $\mathbb{F}_3^n$ or $a=b=c$. Since $A$ contains no lines by assumption, both sides of the above identity are zero unless $a=b=c$, in which case both sides are 1.

The above identity equates functions of the form $A \times A \times A \to \mathbb{F}_3$. View such a function as a 3-way tensor whose entries lie in $\mathbb{F}_3$ and whose rows, columns and tubes are indexed by members of $A$. We will estimate a certain complexity measure of 3-way tensors known as * slice rank*: Here, a tensor $T$ is said to be decomposable if it can be expressed as

$$T(a,b,c) \;=\; f(a)g(b,c), \qquad f(b)g(a,c), \qquad \text{or} \qquad f(c)g(a,b),$$

and a tensor has slice rank $r$ if there are $r$ decomposable tensors that sum to it and no fewer. As one might expect, the slice rank of a diagonal tensor is the number of nonzero diagonal entries, and so the right-hand side of the above identity has slice rank $|A|$. As such, it suffices to bound the slice rank of the left-hand side.

To this end, we view the left-hand side as a subtensor of $\mathbf{1}[a+b+c=0]$, defined over all $a,b,c \in \mathbb{F}_3^n$, whose slice rank is an upper bound on the desired slice rank. The following lemma estimates the slice rank of this larger tensor:

**Lemma.** The slice rank of $\mathbf{1}[a+b+c=0]$ over $\mathbb{F}_3^n$ is at most $3N$, where $N$ denotes the number of monomials $x_1^{d_1}\cdots x_n^{d_n}$ with each $d_i \in \{0,1,2\}$ and total degree at most $2n/3$.

The result then follows from analyzing the asymptotic behavior of $N$ (see this post for details). The proof of the lemma is perhaps more interesting, considering this is where polynomials actually play a role. The first step is to express the tensor as a polynomial. To do this, note that over $\mathbb{F}_3$, we have $\mathbf{1}[x=0] = 1-x^2$. As such,

$$\mathbf{1}[a+b+c=0] \;=\; \prod_{i=1}^{n} \big(1-(a_i+b_i+c_i)^2\big).$$

Bounding the slice rank then amounts to decomposing the above polynomial into only a few terms, where either $a$-, $b$-, or $c$-dependence can be factored out of each term.

To this end, consider the $a$-, $b$- and $c$-degrees of each monomial, the sum of which gives the total degree of that monomial. Observe from the polynomial’s definition that each monomial has total degree at most $2n$, implying that one of the $a$-, $b$- or $c$-degrees is at most $2n/3$. Partition the monomials according to which degree is smallest (breaking ties arbitrarily). For the moment, let’s focus on the monomials for which the $a$-degree is smallest. Then we combine like terms according to the contribution from the $a$ variables, resulting in combined terms of the form $a_1^{d_1}\cdots a_n^{d_n} \cdot g(b,c)$. How many such terms are there? Considering the polynomial’s definition, we know that each $d_i$ lies in $\{0,1,2\}$, and here $d_1+\cdots+d_n \leq 2n/3$, so $N$ counts the total number of possible combined terms that were combined according to $a$. Since the same number arises from combining terms according to $b$ or $c$, we end up with the bound $3N$.
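
Both the polynomial identity and the count $N$ are easy to check by machine in small cases (an illustration, not part of the proof):

```python
from itertools import product

# N = number of monomials x1^d1...xn^dn with di in {0,1,2} and total
# degree at most 2n/3; the lemma bounds the slice rank by 3N.
def slice_rank_bound(n):
    N = sum(1 for d in product(range(3), repeat=n) if sum(d) <= 2 * n / 3)
    return 3 * N

# verify 1[a+b+c=0] = prod_i (1 - (a_i+b_i+c_i)^2) over F_3 for n = 2
n = 2
for a, b, c in product(product(range(3), repeat=n), repeat=3):
    lhs = all((ai + bi + ci) % 3 == 0 for ai, bi, ci in zip(a, b, c))
    rhs = 1
    for ai, bi, ci in zip(a, b, c):
        rhs = rhs * (1 - (ai + bi + ci) ** 2) % 3
    assert int(lhs) == rhs

assert slice_rank_bound(3) == 30  # versus 3^3 = 27 points per axis
```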

]]>

**1. There is an ETF of 76 vectors in $\mathbb{C}^{19}$**

See this paper. Last time, I mentioned a recent proof that there is no ETF of 76 vectors in $\mathbb{R}^{19}$. It turns out that a complex ETF of this size does exist. To prove this, it actually seems more natural to view the vectors as columns of a matrix whose row vectors sum to zero. As a lower-dimensional example, consider the following matrix:

Here, the columns are all possible vectors of $\pm 1$s that sum to zero (modulo antipodes), and in this case, they happen to form an ETF for their span, namely, the orthogonal complement of the all-ones vector. ETFs like this (where the entries are all $\pm 1$s and the row vectors sum to zero) are particularly well-suited as * supersaturated designs*. Unfortunately, the naive generalization of this construction fails to produce ETFs. However, a generalization of sorts does exist: there is a Steiner-type construction with the incidence matrix of any finite projective plane that contains something called a * hyperoval*.

**2. Certain generalized quadrangles lead to new complex ETFs**

See this paper. For this construction, first imagine taking any incidence matrix of a Steiner system, for example:

(The blank entries denote zeros.) Here, each of the 12 rows is the indicator function of a line containing 3 points, and there are a total of 9 points (columns). This particular example is the incidence matrix of an affine plane of order 3. Every point is contained in 4 lines, and so each column has squared norm 4. Also, two points determine a line, so the supports of every pair of columns overlap in exactly one entry. As such, you can think of the columns as being equal-norm and equiangular, even if you replace each 1 with an arbitrary unimodular constant. With this freedom, we can attempt to design unimodular constants in such a way that the columns of the resulting matrix form an ETF for their span. Amazingly, this is possible:

Indeed, the above columns form an ETF for a 6-dimensional subspace of $\mathbb{C}^{12}$. In general, one may remove the spread from something called an * abelian generalized quadrangle* and use the remaining incidence structure as instructions for producing such an ETF. This results in an infinite family of ETFs, most of which are real (and whose strongly regular graphs were previously discovered by Godsil). However, the complex ETFs in this infinite family are new.

**3. There is no ETF of 96 vectors in $\mathbb{R}^{20}$**

See this paper. Last time, I pointed to a similar paper which disproved the existence of a real ETF of 76 vectors in $\mathbb{R}^{19}$. This new paper uses similar techniques (and again, lots of computation) to establish that no real ETF of 96 vectors in $\mathbb{R}^{20}$ exists. This is the third nonexistence result for real ETFs that goes beyond the necessary integrality conditions and the Gerzon bound. Considering such results are so hard to come by, I’m always happy to learn of new necessary conditions like this.

**4. There are new line packings that meet the orthoplex and lifted Toth bounds (!)**

See this paper and that paper. This is a slight stretch, since neither of these are ETF constructions, but they are similarly important because they are provably optimal packings of lines through the origin.

Recall that ETFs are known to be optimal line packings because they meet the Welch bound. There are actually a few lower bounds on coherence that packings might meet. For example, maximal mutually unbiased bases are known to be optimal packings because they meet the orthoplex bound. Recently, Bodmann and Haas used a completely different approach to find infinite families of packings that meet this bound in totally new dimensions. Their main idea is to take a large ETF with all unimodular entries and union it with an identity basis. Such a packing will be too large for the Welch bound to be sharp (due to the Gerzon bound), but it is straightforward to show that the packing meets the orthoplex bound.

One way to prove the Welch and orthoplex bounds is to lift to the space of self-adjoint matrices, and then project onto the orthogonal complement of the identity matrix. Such a mapping sends each unit vector in $\mathbb{C}^d$ to a “lifted traceless” real space of dimension $d^2-1$, and the squared modulus of a given inner product in the original space can be expressed in terms of an inner product in the new space (no modulus). Through this mapping, line packings are converted into spherical codes, and so the Rankin bound may be applied to the lifted traceless space to produce bounds on line packings. In the special case where $d=2$, the lifted traceless space is 3-dimensional, and spherical codes in this dimension also satisfy the so-called * Toth bound*. This bound is known to be sharp in the cases of the equilateral triangle, the regular tetrahedron, the regular octahedron, and the regular icosahedron. Furthermore, each of these corresponds to an optimal packing of lines in $\mathbb{C}^2$ (the last of these was recently established by Casazza and Haas).

]]>

**1. Introduction to ETFs (Dustin G. Mixon)**

Given a $d$-dimensional Hilbert space and a positive integer $n$, we are interested in packing $n$ lines through the origin so that the interior angle between any two is as large as possible. It is convenient to represent each line by a unit vector that spans it, and in doing so, the problem amounts to finding unit vectors $\varphi_1,\dots,\varphi_n$ that minimize * coherence*:

$$\mu \;:=\; \max_{i\neq j}\ |\langle \varphi_i,\varphi_j\rangle|.$$

This minimization amounts to a nonconvex optimization problem. To construct provably optimal packings, one must prove a lower bound on the coherence for a given number of vectors $n$ and spatial dimension $d$, and then construct an ensemble which meets equality in that bound. To date, we know of three bounds that are sharp:

- **Trivial bound.** $\mu \geq 0$, sharp only if $n \leq d$.
- **Welch bound.** $\mu \geq \sqrt{\frac{n-d}{d(n-1)}}$, sharp only if $n \leq d^2$.
- **Orthoplex bound.** $\mu \geq \frac{1}{\sqrt{d}}$, sharp only if $n \leq 2(d^2-1)$.

Of course, equality in the trivial bound occurs precisely when the vectors are orthonormal. It turns out that equality in the Welch bound occurs precisely when there exist constants $\alpha$ and $\beta$ such that

$$|\langle \varphi_i,\varphi_j\rangle| \;=\; \alpha \quad \text{for all } i \neq j, \qquad \sum_{i=1}^{n} \langle x,\varphi_i\rangle \varphi_i \;=\; \beta x \quad \text{for all } x.$$

In words, the ensemble is $\alpha$-equiangular and $\beta$-tight, and so we call the ensemble an * equiangular tight frame* (ETF). Far less is known about ensembles that achieve equality in the orthoplex bound, though Bodmann and Haas have recently made an important stride in this direction.
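
As a toy example of equality in the Welch bound, consider the Mercedes-Benz frame of three unit vectors at 120° angles in $\mathbb{R}^2$ (a standard small ETF):

```python
import numpy as np

# Mercedes-Benz frame: n = 3 unit vectors in R^2 at 120-degree angles.
n, d = 3, 2
angles = [np.pi / 2 + 2 * np.pi * k / 3 for k in range(3)]
Phi = np.array([[np.cos(t), np.sin(t)] for t in angles]).T  # d x n

gram = Phi.T @ Phi
coherence = max(abs(gram[i, j]) for i in range(n) for j in range(n) if i != j)
welch = np.sqrt((n - d) / (d * (n - 1)))
assert np.isclose(coherence, welch)                    # equiangular at the Welch bound
assert np.allclose(Phi @ Phi.T, (n / d) * np.eye(d))   # tight: (n/d)-scaled identity
```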

ETFs were first introduced by Strohmer and Heath in 2003, and in the time since, they have proven to be notoriously difficult to construct. Still, they have found interesting connections with various combinatorial designs, and these connections have been particularly fruitful for constructing new infinite families of ETFs. In this talk we will discuss some of these success stories, paying particular attention to the research programs that led to their discovery.

**2. Nonabelian Harmonic ETFs (Joseph W. Iverson)**

A classic construction of equiangular tight frames relies on the discrete Fourier transform (DFT) matrix of a finite abelian group $G$. Each column of the DFT corresponds to an element of $G$, and each row gives the values of a homomorphism from $G$ to the unit circle, also called a * character*. In particular, the entries of the DFT are unimodular. The DFT is a scalar multiple of a unitary matrix, so if we pull out any subset of rows, the resulting short, fat matrix will be an equal-norm tight frame. We obtain an *equiangular* tight frame precisely when the selected rows correspond to a difference set.

Recent papers by the speaker (here) and by Thill and Hassibi (there) suggest a generalization of this procedure for a * non*abelian group $G$. In this setting, we replace characters with irreducible unitary representations $\pi : G \to U(d_\pi)$, where $d_\pi$ is the dimension of the representation and $U(d_\pi)$ is the group of $d_\pi \times d_\pi$ unitary matrices. As in the abelian case, we can list the values of the irreducible representations in one big matrix. The only difference is that now it will be convenient to scale the values of each representation by the square root of its dimension. For a small nonabelian group, for instance, we get a matrix like the following:

Like before, we pull out any set of rows:

Once we collapse the columns, we get an equal-norm tight frame:

Thill and Hassibi have a way of picking the rows in this process that ensures the resulting frame has low coherence. Let $\hat{G}$ be the set of irreducible representations of $G$, up to unitary equivalence. Any group of automorphisms of $G$ acts on $\hat{G}$ by precomposition: an automorphism $\sigma$ sends a representation $\pi$ to $\pi \circ \sigma$.

Fix an irreducible representation $\pi$, and let $\mathcal{O}$ be the orbit of $\pi$ under this action. Now choose * all* of the rows of the big matrix that correspond to the representations in the orbit $\mathcal{O}$, and make the resulting tight frame. The main result of Thill and Hassibi puts an upper bound on the coherence of this frame. In general, these will not be equiangular tight frames, but in at least one (abelian) example, Thill and Hassibi do obtain an ETF.

**3. Polyphase ETFs and abelian generalized quadrangles (Matthew Fickus)**

We discuss a new way to construct ETFs which involves signing/phasing the incidence matrix of a balanced incomplete block design (BIBD). As we’ll see, these phased BIBD ETFs are naturally represented as the columns of a rank-deficient, tall-skinny matrix. In this form, it will be obvious that these vectors are equiangular but hard to see that they form a tight frame for their span. This contrasts with many other known constructions of ETFs, such as harmonic ETFs and Steiner ETFs, where tightness is obvious but equiangularity is not.

To date, we have constructed three infinite families of phased BIBD ETFs, and one of these contains an infinite number of new complex ETFs. For all of these, what we actually construct is a matrix obtained from a BIBD’s incidence matrix by replacing each of its nonzero entries with a monomial in the ring of polynomials (the convolution algebra) over a finite abelian group. Evaluating the matrices at any nontrivial character of this group produces a phased BIBD ETF.

Being matrices with polynomial entries, any such generalized ETF is an example of a polyphase matrix of a filter bank. As we will explain, the filter banks corresponding to our phased BIBD ETFs are closely related to special types of combinatorial designs known as generalized quadrangles (GQs). GQs have a rich literature, and we explain how each of our three infinite families of phased BIBD ETFs relates to it.

Our construction is also related to another recently introduced method for constructing complex ETFs, namely the one given in the recent paper “Equiangular lines and covers of the complete graph” by Coutinho, Godsil, Shirazi and Zhan. Their construction generalizes a well-known connection between real ETFs and strongly regular graphs (SRGs), identifying certain types of complex ETFs with abelian distance-regular antipodal covers of complete graphs (DRACKNs).

Overall, we will see that certain special types of filter banks simultaneously yield ETFs, abelian GQs and abelian DRACKNs. We also discuss some partial converses to these results. For example, the existence of a certain type of phased BIBD ETF actually implies the existence of such a filter bank, which in turn implies the existence of certain GQs and DRACKNs. This opens up some exciting new possibilities for future research: for a long time, frame theory has been leveraging the rich literature of combinatorial designs in order to construct ETFs; these results allow us to use frame theory to prove new results in combinatorial design.

**4. Maximal ETFs by combinatorial techniques (John Jasper)**

The most famous open problem in the study of equiangular tight frames (ETFs) concerns *maximal ETFs*, that is, ETFs with $d^2$ vectors in $\mathbb{C}^d$. To those studying quantum information theory, the collection of outer products of a maximal ETF is called a symmetric, informationally complete, positive operator-valued measure (SIC-POVM).

Zauner’s original conjecture was actually more detailed regarding how one can obtain a SIC-POVM. In particular, he conjectured that for each $d$ there is a maximal ETF formed by taking the orbit of a single vector, called a fiducial vector, under the action of the Heisenberg group. A great deal of effort has been put into proving Zauner’s conjecture. Exact solutions are known in a few dozen dimensions, and numerical solutions are known in many more. With one exception, these solutions are all obtained by finding a fiducial vector for the Heisenberg group.
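Dimension $d=2$ is small enough to see the fiducial-vector mechanism explicitly: the known qubit fiducial is the state with Bloch vector $(1,1,1)/\sqrt{3}$, and its orbit under the Heisenberg group (here, the Pauli shift and phase operators) is a maximal ETF. A sketch, as a numerical sanity check:

```python
import numpy as np

# Known d = 2 fiducial: the state with Bloch vector (1, 1, 1)/sqrt(3).
theta = np.arccos(1 / np.sqrt(3))
psi = np.array([np.cos(theta / 2),
                np.exp(1j * np.pi / 4) * np.sin(theta / 2)])

# In d = 2, the Heisenberg group is generated by the Pauli matrices X, Z.
X = np.array([[0, 1], [1, 0]])
Z = np.array([[1, 0], [0, -1]])
orbit = [psi, X @ psi, Z @ psi, X @ Z @ psi]   # d^2 = 4 vectors

# A maximal ETF has every pairwise |inner product|^2 equal to 1/(d+1).
for i in range(4):
    for j in range(i + 1, 4):
        assert np.isclose(abs(np.vdot(orbit[i], orbit[j])) ** 2, 1 / 3)
```

Geometrically, the four orbit vectors form a regular tetrahedron on the Bloch sphere.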

There is some good evidence that the group-theoretic approach suggested by Zauner may have some fundamental limitations (see this blog entry). Looking at the study of ETFs in general, we see that ETFs generated by a group are important, for example, harmonic ETFs. However, several other constructions are more purely combinatorial in nature, including Steiner ETFs, Kirkman ETFs, Tremain ETFs, and many more. In this talk, we will discuss some alternate approaches to finding maximal ETFs. One of the smallest examples of a Steiner ETF is actually a maximal ETF in dimension 3, and this is the only instance so far in which these combinatorial constructions have yielded a maximal ETF. However, we have recently seen some tantalizing evidence that previously unknown combinatorial structure is lurking inside maximal ETFs, illuminating new connections to existing constructions. Our hope is that with some work, we can leverage this combinatorial structure into new constructions of ETFs, perhaps even new maximal ones.

**5. Achieving the orthoplex bound and constructing weighted complex projective 2-designs with Singer sets (Nathaniel Hammen)**

(Based on this paper by Bernhard G. Bodmann and John Haas)

In many situations, we desire a unit-norm frame with a small maximum magnitude among its pairwise inner products, that is, small coherence. By a bound due to Welch, equiangular tight frames are the minimizers of this quantity. However, in a $d$-dimensional complex Hilbert space, the Welch bound is only achievable if the number of vectors in the frame is at most $d^2$. If the number of vectors in the frame is larger than this, then the orthoplex bound serves as an alternative to the Welch bound.

In analogy with the Welch bound, the orthoplex bound is only achievable if the number of vectors in the frame is at most $2(d^2-1)$. In this talk we show that if a unit-norm frame has a maximum magnitude of pairwise inner products that is at most the orthoplex value $1/\sqrt{d}$, and there exists an orthonormal basis such that the inner products between the frame vectors and the basis vectors all have the same magnitude, then the frame formed by the union of these two sets satisfies the orthoplex bound. If the initial frame is tight, then the resulting orthoplectic frame will also be tight. In particular, the union of an equiangular tight frame of at least $d^2-d+1$ vectors with such a basis will always form a tight orthoplectic frame.
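A concrete instance of an orthoplex-optimal configuration (not the difference-set construction below, just a standard example): in $d=2$, the union of the three mutually unbiased bases gives $6 = 2(d^2-1)$ unit vectors whose coherence equals $1/\sqrt{d}$, and the union of orthonormal bases is automatically tight. A quick numerical check:

```python
import numpy as np

d = 2
std = np.eye(d)                                   # standard basis
had = np.array([[1, 1], [1, -1]]) / np.sqrt(2)    # Hadamard basis
cir = np.array([[1, 1], [1j, -1j]]) / np.sqrt(2)  # circular basis
Phi = np.hstack([std, had, cir])                  # 6 unit vectors in C^2

# Coherence equals the orthoplex bound 1/sqrt(d) ...
gram = abs(Phi.conj().T @ Phi)
coherence = (gram - np.eye(6)).max()
print(np.isclose(coherence, 1 / np.sqrt(d)))           # True

# ... and a union of 3 orthonormal bases has frame operator (6/d) I.
print(np.allclose(Phi @ Phi.conj().T, 3 * np.eye(d)))  # True
```

Note that 6 vectors in $\mathbb{C}^2$ exceed the Welch regime ($d^2 = 4$), so the orthoplex bound is the relevant benchmark here.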

Two families of such orthoplectic frames are constructed using cyclic frames generated by difference sets and relative difference sets; in each case, the construction applies whenever an appropriate parameter is a prime power. In addition, the orthoplectic frames that are constructed can also be shown to be weighted 2-designs. These are useful in quantum state tomography.
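To illustrate the difference-set mechanism itself (a classical example, not the orthoplectic construction above): the $(7,3,1)$-difference set $\{1,2,4\}\subset\mathbb{Z}_7$ generates a cyclic unit-norm tight frame that is equiangular at the Welch bound. A sketch:

```python
import numpy as np

n, d = 7, 3
D = [1, 2, 4]                          # (7, 3, 1)-difference set in Z_7
F = np.exp(2j * np.pi * np.outer(D, np.arange(n)) / n)
Phi = F / np.sqrt(d)                   # 7 unit-norm vectors in C^3

gram = abs(Phi.conj().T @ Phi)
offdiag = gram[~np.eye(n, dtype=bool)]
welch = np.sqrt((n - d) / (d * (n - 1)))
print(np.allclose(offdiag, welch))     # True: equiangular, Welch-optimal
```

The difference-set property is what forces every nontrivial character sum over $D$ to have the same magnitude, which is exactly equiangularity.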

**6. Optimal subspace packings (Dustin G. Mixon)**

While the previous talks have discussed packing 1-dimensional subspaces in a finite-dimensional Hilbert space, in this talk, we will pack higher-dimensional subspaces. In the 1-dimensional case, we considered the interior angle between two lines, also known as the *principal angle*. When the subspaces are higher-dimensional, which angle shall we attempt to maximize?

In the higher-dimensional case, there are actually several principal angles to work with. Given subspaces $U$ and $V$, the principal angles are defined iteratively: find unit vectors $u_1\in U$ and $v_1\in V$ that maximize $|\langle u_1,v_1\rangle|$, and then for each $k\geq 2$, find a unit vector $u_k\in U$ orthogonal to $u_1,\ldots,u_{k-1}$ and a unit vector $v_k\in V$ orthogonal to $v_1,\ldots,v_{k-1}$ that together maximize $|\langle u_k,v_k\rangle|$. Then the $k$th principal angle between $U$ and $V$ is $\theta_k = \arccos|\langle u_k,v_k\rangle|$.
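In practice, the iterative definition is equivalent to a single SVD: the cosines of the principal angles are the singular values of $Q_U^* Q_V$, where $Q_U, Q_V$ are orthonormal bases for the two subspaces. A minimal sketch:

```python
import numpy as np

def principal_angles(A, B):
    """Principal angles (radians) between col(A) and col(B) via SVD."""
    Qa, _ = np.linalg.qr(A)            # orthonormal basis for col(A)
    Qb, _ = np.linalg.qr(B)            # orthonormal basis for col(B)
    s = np.linalg.svd(Qa.conj().T @ Qb, compute_uv=False)
    return np.arccos(np.clip(s, -1.0, 1.0))

# xy-plane vs. a plane tilted 45 degrees about the x-axis in R^3
A = np.array([[1.0, 0], [0, 1], [0, 0]])
B = np.array([[1.0, 0], [0, 1], [0, 1]])
print(np.degrees(principal_angles(A, B)))   # approximately [0, 45]
```

Since the singular values come out in decreasing order, the angles come out in increasing order.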

Now that we have principal angles, we can use them to define a worthy packing objective. When Conway, Hardin and Sloane first wrestled with this, they discussed several alternatives, and found that the so-called *chordal distance* was the easiest to work with theoretically:

$$\operatorname{dist}_c(U,V) = \Big(\sum_{k=1}^m \sin^2\theta_k\Big)^{1/2},$$

where $U$ and $V$ are $m$-dimensional and $\theta_1,\ldots,\theta_m$ are the corresponding principal angles. Later, Dhillon, Heath, Strohmer and Tropp introduced another notion of distance, called the *spectral distance*:

$$\operatorname{dist}_s(U,V) = \min_{1\leq k\leq m} \sin\theta_k.$$
When this latter notion of distance is large for all pairs in a given ensemble of subspaces, the ensemble is particularly well-suited for the compressed sensing of block-sparse signals (see this paper).
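Assuming the two distances are computed from the principal angles as above, a short sketch shows how the notions can disagree: two planes that intersect nontrivially have spectral distance zero even though their chordal distance is positive.

```python
import numpy as np

def subspace_distances(A, B):
    """Chordal and spectral distances between col(A) and col(B)."""
    Qa, _ = np.linalg.qr(A)
    Qb, _ = np.linalg.qr(B)
    s = np.linalg.svd(Qa.conj().T @ Qb, compute_uv=False)
    theta = np.arccos(np.clip(s, -1.0, 1.0))
    chordal = np.sqrt(np.sum(np.sin(theta) ** 2))
    spectral = np.sin(theta).min()     # sine of the smallest angle
    return chordal, spectral

# Two planes in R^3 sharing the x-axis: principal angles 0 and 45 degrees.
A = np.array([[1.0, 0], [0, 1], [0, 0]])
B = np.array([[1.0, 0], [0, 1], [0, 1]])
chordal, spectral = subspace_distances(A, B)
print(np.isclose(chordal, np.sqrt(0.5)))   # True: sqrt(0 + sin^2(45))
print(spectral < 1e-6)                     # True: the planes intersect
```

This is why the spectral distance is the right notion for block-sparse recovery: it penalizes any shared direction, not just overall proximity.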

In this talk, we will discuss subspace packings which are chordal- and spectral-distance optimal. In particular, we will discuss generalizations of the Welch bound, as well as known constructions.
