Sorry it took so long to update this blog. Boris will be the first to tell me that I’m in violation of the law, but I assure you I have been rather busy. I started teaching today, so I used the last couple of months to knock out some research before it was too late!

Last time, I discussed three open problems involving short, fat matrices. Each of these problems concerned the conditioning of submatrices, and in this post, I discuss some new and exciting work in this vein. Specifically, this is about a paper that I just wrote with Matt Fickus, John Jasper and Jesse Peterson, which analyzes a special class of short, fat matrices for which it is relatively tractable to estimate extreme singular values over all submatrices of a certain size.

In this paper, there is no single key idea we exploit, but rather a series of important observations. First, we focus on submatrices of columns which are also short and fat; having these sorts of submatrices be well conditioned is important for numerical erasure robustness. For these submatrices, say $\Phi_{\mathcal{K}}$ is the submatrix of $\Phi$ whose columns are indexed by $\mathcal{K}$, the smallest and largest singular values can be visualized in terms of the column vectors that make up the submatrix. Specifically, we get these singular values by minimizing and maximizing $\|\Phi_{\mathcal{K}}^* x\|$ over all unit vectors $x$, and so we expect these values to be more extreme for subcollections of columns which are more bunched up in the sphere.
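To make this concrete, here is a minimal NumPy sketch (not from the paper; the matrix and index set are made up for illustration) checking that the extreme singular values of a column submatrix agree with the extrema of $\|\Phi_{\mathcal{K}}^* x\|$ over unit vectors $x$, here approximated by crude sampling of the sphere:

```python
import numpy as np

rng = np.random.default_rng(0)

# A short, fat matrix Phi (M x N) with unit-norm columns.
M, N = 3, 12
Phi = rng.standard_normal((M, N))
Phi /= np.linalg.norm(Phi, axis=0)

# A short, fat submatrix of columns indexed by K.
K = [0, 2, 5, 7, 9]
Phi_K = Phi[:, K]

# Extreme singular values of Phi_K via the SVD...
s = np.linalg.svd(Phi_K, compute_uv=False)
sigma_max, sigma_min = s[0], s[-1]

# ...bound ||Phi_K^* x|| for every unit vector x, so sampled
# values of that norm must land between sigma_min and sigma_max.
X = rng.standard_normal((M, 200_000))
X /= np.linalg.norm(X, axis=0)
norms = np.linalg.norm(Phi_K.T @ X, axis=0)
assert sigma_min - 1e-9 <= norms.min()
assert norms.max() <= sigma_max + 1e-9
print(sigma_min, norms.min(), norms.max(), sigma_max)
```

The more bunched up the chosen columns are, the more extreme these two singular values become.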

In this case, we can switch the order of optimization: writing $\{\varphi_n\}_{n=1}^{N}$ for the columns of $\Phi$ and minimizing over submatrices of $K$ columns,

$$\min_{|\mathcal{K}|=K}\ \min_{\|x\|=1}\ \sum_{k\in\mathcal{K}}\langle \varphi_k, x\rangle^2 \;=\; \min_{\|x\|=1}\ \min_{|\mathcal{K}|=K}\ \sum_{k\in\mathcal{K}}\langle \varphi_k, x\rangle^2$$

(and similarly with maxima),
so that instead of calculating singular values of all submatrices, we find the columns of $\Phi$ which are closest to each unit-norm vector $x$. Of course, there is a continuum of such $x$'s, i.e., many more than the finitely many submatrices. However, we can approximate the extrema of our objective function over the sphere by sampling the sphere at sufficiently many points. Indeed, this application of $\epsilon$-nets is already old hat in the random matrix community, which uses this sort of discretization of the sphere to perform a union bound. However, they only use the existence of $\epsilon$-nets of a certain size to obtain probability estimates, whereas we explicitly construct $\epsilon$-nets with certain desirable properties.
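The swapped optimization can be sketched as follows; this toy example uses random sphere samples as a crude stand-in for a genuine $\epsilon$-net, and the dimensions are made up. For each sampled $x$, the inner minimum just picks out the $K$ smallest squared inner products:

```python
import numpy as np
from itertools import combinations

rng = np.random.default_rng(1)
M, N, K = 3, 10, 4

Phi = rng.standard_normal((M, N))
Phi /= np.linalg.norm(Phi, axis=0)

# Brute force: smallest singular value over all K-column submatrices.
brute = min(
    np.linalg.svd(Phi[:, list(k)], compute_uv=False)[-1]
    for k in combinations(range(N), K)
)

# Swapped order: for each sampled unit vector x, sum the K smallest
# squared inner products <phi_k, x>^2, then minimize over samples.
X = rng.standard_normal((M, 50_000))
X /= np.linalg.norm(X, axis=0)
ip2 = (Phi.T @ X) ** 2                    # (N, samples)
smallest_K = np.sort(ip2, axis=0)[:K, :]  # K smallest per sample
sampled = float(np.sqrt(smallest_K.sum(axis=0).min()))

# A finite sample can only overestimate the true minimum over the sphere.
assert sampled >= brute - 1e-9
print(brute, sampled)
```

A genuine $\epsilon$-net would turn this one-sided estimate into a two-sided one, since every point of the sphere is within $\epsilon$ of some net point.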

Indeed, we don’t want to have to compute our objective function at every point in an $\epsilon$-net, because the smallest known $\epsilon$-nets of the sphere grow exponentially with the dimension. To decrease the number of operations, we seek a special type of $\epsilon$-net, namely, one which shares a (large) symmetry group with the columns of $\Phi$. To be explicit, denote the columns of $\Phi$ by $\{\varphi_n\}_{n=1}^{N}$, let $G$ be their symmetry group, which is a subgroup of the orthogonal group $O(M)$, and take the $\epsilon$-net to be $G$-invariant. By the definition of the symmetry group of $\{\varphi_n\}_{n=1}^{N}$, we have that for every $Q \in G$, there exists a permutation $\pi$ such that $Q\varphi_n = \varphi_{\pi(n)}$ for every $n$. Using this fact, we have

$$\langle \varphi_n, Qx \rangle = \langle Q^{-1}\varphi_n, x \rangle = \langle \varphi_{\pi^{-1}(n)}, x \rangle,$$

and so the smallest/largest inner products with the $\varphi_n$'s are $G$-invariant as functions of $x$. This means we only need to compute our objective function at one $\epsilon$-net point per $G$-orbit, speeding up our procedure by a factor of $|G|$. As such, this procedure is best suited for $\Phi$'s which have a large symmetry group.
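Here is a small sanity check of this invariance (my own toy example, not from the paper): take the columns of $\Phi$ to be the signed standard basis vectors, whose symmetry group contains every signed permutation matrix, and observe that applying such a $Q$ to $x$ merely permutes the inner products:

```python
import numpy as np

rng = np.random.default_rng(2)
M = 4

# Columns of Phi: the signed standard basis vectors +/- e_i.
# Their symmetry group contains all signed permutation matrices.
I = np.eye(M)
Phi = np.hstack([I, -I])  # M x 2M

# A random signed permutation Q in G: permute columns of I, flip signs.
perm = rng.permutation(M)
signs = rng.choice([-1.0, 1.0], size=M)
Q = I[:, perm] * signs

# Q maps the columns of Phi to a permutation of themselves, so the
# multiset of inner products <phi_n, Qx> matches that of <phi_n, x>.
x = rng.standard_normal(M)
x /= np.linalg.norm(x)
a = np.sort(Phi.T @ x)
b = np.sort(Phi.T @ (Q @ x))
assert np.allclose(a, b)
print(a[0], a[-1])  # smallest/largest inner products: G-invariant
```

In particular, sorting the inner products before comparing is exactly the point: the individual entries get shuffled, but the extremes (and hence the objective function) do not change along a $G$-orbit.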

The magnitude of this speedup is best evaluated in terms of particular groups $G$. To this end, our paper focuses on the group of signed permutation matrices, and we use intuition from Lebesgue integration to construct a $G$-invariant $\epsilon$-net whose number of $G$-orbits grows subexponentially with the dimension. As an example of how significant this speedup is, we apply our $G$-invariant $\epsilon$-net to estimate the extreme singular values of the submatrices of a particular matrix. After three minutes of computation in MATLAB (on an old laptop), we are able to establish an explicit bound on the condition number of every such submatrix. Considering how many such submatrices there are, this is a particularly significant speedup!
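To get a feel for the size of this group: the signed permutations of $\mathbb{R}^M$ number $2^M M!$, and each orbit of a $G$-invariant net has a canonical representative obtained by sorting absolute values. The sketch below (my own illustration, with a made-up dimension) reduces a point to its orbit representative:

```python
import numpy as np
from math import factorial

M = 5

def orbit_rep(x):
    """Canonical representative of x's orbit under signed
    permutations: absolute values sorted in descending order."""
    return np.sort(np.abs(x))[::-1]

rng = np.random.default_rng(3)
x = rng.standard_normal(M)
x /= np.linalg.norm(x)

# Any signed permutation of x reduces to the same representative...
perm = rng.permutation(M)
signs = rng.choice([-1.0, 1.0], size=M)
y = signs * x[perm]
assert np.allclose(orbit_rep(x), orbit_rep(y))

# ...so a G-invariant net needs only one evaluation per orbit,
# a savings of up to |G| = 2^M * M! points.
print(2**M * factorial(M))  # 3840
```

Generic orbits have exactly $2^M M!$ points, which is where the factor-of-$|G|$ savings comes from.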

Check out the paper for more details, and for some related research questions posed in the conclusions section.
