Spherical codes and designs

Later this month, Hans Parshall will participate in a summer school on “Sphere Packings and Optimal Configurations.” In preparation for this event, Hans was assigned the task of writing lecture notes that summarize the main results of the following paper:

P. Delsarte, J. M. Goethals, J. J. Seidel,

Spherical codes and designs,

Geometriae Dedicata 6 (1977) 363–388.

I found Hans’ notes to be particularly helpful, so I’m posting them here with his permission. I’ve lightly edited his notes for formatting and hyperlinks.

Without further ado:

— Spherical codes —

A spherical code is a finite set X of unit vectors in Euclidean space \mathbf{R}^d. Fundamental problems in discrete geometry and communication theory are concerned with the interplay between the size |X| and the set of inner products A(X) := \{\langle x,y \rangle : x,y \in X, x \neq y\}. For instance, the kissing number \tau(d) is the largest number of unit spheres in \mathbf{R}^d that touch S^{d - 1} without overlapping. Equivalently, \tau(d) is the size of the largest spherical code X \subseteq S^{d - 1} with A(X) \subseteq [-1,1/2].

More generally, we call a spherical code X \subseteq S^{d - 1} an A-code when A(X) \subseteq A \subseteq [-1,1). Delsarte, Goethals and Seidel obtain upper bounds for the size |X| of an arbitrary A-code X \subseteq S^{d - 1} in terms of polynomials that interact nicely with the set A. To see the utility of such a strategy, consider their absolute bound:

Theorem 1 (Theorem 4.8 in [DGS77]). If X \subseteq S^{d - 1} is an A-code with s = |A| < \infty,

\displaystyle |X| \leq \binom{d + s - 1}{s} + \binom{d + s - 2}{s - 1}. \qquad (1)

Proof: Observe that the polynomial F(t) = \prod_{\alpha \in A} (t - \alpha)/(1 - \alpha) vanishes on A and has degree s. For each y \in X, define the polynomial function F_y : S^{d - 1} \rightarrow \mathbf{R} by F_y(x) = F(\langle x,y \rangle). For each x,y \in X, notice F_y(x) = \delta_{x = y}, and so each F_y is linearly independent. Moreover, each F_y resides in the vector space of real-valued functions on S^{d - 1} that can be represented by a polynomial of degree at most s. It follows that |X| is at most the dimension of this vector space, which is given by \binom{d + s - 1}{s} + \binom{d + s - 2}{s - 1}; see Section 2.2 here. \Box

Theorem 1 is clearly sharp for s = 1 by considering the d + 1 vertices X \subseteq S^{d - 1} of a regular simplex. However, already for s = 2, equality in (1) is only known to occur for d \in \{2,6,22\}, where the corresponding spherical code X \subseteq S^{d - 1} is constructed from a set of \binom{d + 2}{2} equiangular lines in \mathbf{R}^{d + 1}. In what follows, we derive improved upper bounds on |X| for arbitrary A-codes X \subseteq S^{d - 1} that depend on more detailed information about A.

— Gegenbauer polynomials —

The linear programming bound for spherical codes in \mathbf{R}^d is stated in terms of Gegenbauer polynomials \{Q_k\}_{k = 0}^\infty \subseteq \mathbf{R}[t]. These can be described recursively by Q_0(t) = 1, Q_1(t) = dt, and, for k \geq 2,

\displaystyle \frac{k}{d + 2k - 2} Q_{k}(t) = t Q_{k - 1}(t) - \frac{d + k - 4}{d + 2k - 6}Q_{k - 2}(t). \qquad (2)

Note the dependence on the ambient dimension d. This recursion has the benefit of being concrete and the drawback of being completely unmotivated. Before we see how these polynomials are useful, we give some motivation as to why they naturally appear in the context of spherical codes. The rest of this section is based loosely on the excellent lecture notes by Vallentin.

We want an upper bound on |X|, where X \subseteq S^{d - 1} is an arbitrary A-code. Equivalently, we want an upper bound on the clique number of the infinite graph with vertices S^{d - 1} and an edge between x,y \in S^{d - 1} exactly when \langle x,y\rangle \in A. One such influential bound for finite graphs is given by the Lovasz theta number, which is defined as a semidefinite program. This was strengthened by Schrijver and subsequently extended to infinite graphs by Bachoc, Nebe, Oliveira and Vallentin. To state a specialized version of their bound, let \mathcal{C}(S^{d - 1} \times S^{d - 1}) denote the set of continuous functions K : S^{d - 1} \times S^{d - 1} \rightarrow \mathbf{C}, which we call kernels. A kernel K is called positive, and we write K \succeq 0, if for every set of m points x_1, \ldots, x_m \in S^{d - 1}, the matrix (K(x_i,x_j))_{i,j \in [m]} is positive semidefinite. We have the following bound.

Proposition 2. If X \subseteq S^{d - 1} is an A-code, then

\displaystyle \begin{aligned} |X| \leq \inf \{\lambda :\; &{ \exists K \in \mathcal{C}(S^{d-1} \times S^{d-1})} \text{ such that } K \succeq 0,\\ & K(x,x) = \lambda - 1 \; \forall x \in S^{d - 1}, \text{and } \nonumber K(x,y) \leq -1 \; \forall \langle x,y \rangle \in A\}. \quad (3) \end{aligned}

Proof: Let K \in \mathcal{C}(S^{d - 1} \times S^{d - 1}) be feasible for (3). Since K \succeq 0, we have

\displaystyle \sum_{x,y \in X} K(x,y) = \mathbf{1}^T (K(x,y))_{x,y \in X} \mathbf{1} \geq 0,

and so

\displaystyle 0 \leq \sum_{x,y \in X} K(x,y) = \sum_{x \in X} K(x,x) + \sum_{\substack{x,y \in X \\ x \neq y}} K(x,y) \leq |X|(\lambda - 1) - |X|(|X| - 1).

Rearranging yields |X| \leq \lambda as desired. \Box

In Proposition 2, we may without loss of generality restrict our attention to O(d)-invariant kernels K, where K(Rx,Ry) = K(x,y) for all R \in O(d). In particular, if K is O(d)-invariant, then K(x,y) = F(\langle x,y \rangle) for some continuous function F \in \mathcal{C}(S^{d - 1}). The Peter–Weyl theorem provides the decomposition \mathcal{C}(S^{d-1})=\bigoplus_{k=0}^\infty H_k, where H_k is the vector space of restrictions of homogenous degree-k harmonic polynomials to S^{d - 1}. Hence, we may express O(d)-invariant kernels in terms of the orthogonal projections \pi_k : \mathcal{C}(S^{d - 1}) \rightarrow H_k. Set h_k := \dim (H_k) and let \{v_{ik}\}_{i = 1}^{h_k} be a real orthonormal basis for H_k. A kernel representation for \pi_k is given by

\displaystyle (\pi_k f)(x) = \sum_{i = 1}^{h_k} \langle f, v_{ik} \rangle v_{ik}(x) = \int_{S^{d - 1}} f(y) \sum_{i = 1}^{h_k} v_{ik}(x) v_{ik}(y) \; d\sigma(y) \;\; (f \in \mathcal{C}(S^{d - 1})).

The upshot is that every positive O(d)-invariant kernel K can be expressed as \sum_{k = 0}^\infty f_k K_k with f_k \geq 0, with each kernel defined by the addition formula K_k(x,y) := \sum_{i = 1}^{h_k} v_{ik}(x) v_{ik}(y). Moreover, each kernel K_k can be expressed as K_k(x,y) = F_k(\langle x,y \rangle) for a polynomial F_k \in \mathbf{R}[t] of degree k, and the orthogonality of the spaces \{H_k\}_{k = 0}^\infty leads to the orthogonality relation

\displaystyle \int_{-1}^1 F_j(t) F_k(t) (1 - t^2)^{(d - 3)/2}\;dt = 0 \text{ for } j \neq k.

This determines the polynomials \{F_k\}_{k = 0}^\infty recursively, and rescaling each F_k appropriately yields the Gegenbauer polynomials \{Q_k\}_{k = 0}^\infty described by (2). The choice of scaling is irrelevant for our applications.

— The linear programming bound for spherical codes —

While we could derive the linear programming bound from Proposition 2, we instead give a concrete proof in the spirit of Delsarte, Goethals, and Seidel.

Theorem 3 (Theorem 4.3 in [DGS77]). If X \subseteq S^{d - 1} is an A-code, then

\displaystyle|X| \leq \inf \{F(1) : F = \sum_{k = 0}^\infty f_kQ_k, \; f_0 = 1, \; f_k \geq 0 \; \forall k, \; F(\alpha) \leq 0 \; \forall \alpha \in A\}. \quad(4)

Equality in (4) occurs if and only if F(\alpha) = 0 for all \alpha \in A(X) and

\displaystyle f_k \sum_{x,y\in X} Q_k(\langle x,y \rangle)=0 \text{ for all } k\geq 1.

Proof: Let F = \sum_k f_kQ_k be feasible for (4). The key idea is to consider bounding \sum_{x,y \in X} F(\langle x,y \rangle) from above and below. To begin, expand

\displaystyle \begin{aligned} \sum_{x,y \in X} F(\langle x,y \rangle) &= \sum_{x,y \in X} Q_0(\langle x,y \rangle) + \sum_{k \geq 1} f_k\sum_{x,y \in X} Q_k(\langle x,y \rangle)\\ &= |X|^2 + \sum_{k \geq 1} f_k\sum_{x,y \in X} Q_k(\langle x,y \rangle). \end{aligned}

For k \geq 1, Q_k(\langle x,y \rangle) is a positive kernel, and so \sum_{x,y \in X} F(\langle x,y \rangle) \geq |X|^2. For an upper bound, the constraint F(\alpha) \leq 0 for all \alpha \in A provides

\displaystyle \sum_{x,y \in X} F(\langle x,y \rangle) = \sum_{x \in X} F(\langle x,x \rangle) + \sum_{\substack{x,y \in X \\ x \neq y}} F(\langle x,y \rangle) \leq |X| F(1).

All together, we have |X| \leq F(1) with equality exactly when claimed. \Box

Observe that the only property of the Gegenbauer polynomials that we used for Theorem 3 was that, for each k, \sum_{x,y \in X} Q_k(\langle x,y \rangle) \geq 0 for all finite X \subseteq S^{d - 1}. This is weaker than each Q_k(\langle x,y \rangle) being a positive kernel, and Pfender obtained slight improvements based on this observation.

The case of equality in (4) motivates the following definition. A t-design is a spherical code X \subseteq S^{d - 1} with \sum_{x,y \in X} Q_k(\langle x,y \rangle) = 0 for all 1 \leq k \leq t. Equivalently, for every polynomial f \in \mathbf{R}[x_1, \ldots, x_d] of degree at most t,

\displaystyle \int_{S^{d - 1}} f(x) d\sigma(x) = \frac{1}{|X|} \sum_{x \in X} f(x);

see Section 9.6 here. These highly uniform sets are good candidates for A-codes of maximal cardinality. Indeed, if X \subseteq S^{d - 1} is a t-design with A(X) \subseteq A and F = \sum_{k = 0}^t f_k Q_k is a polynomial that is feasible for (4) that vanishes on A(X), then every A-code has size at most |X|.

This strategy gives the exact values for the kissing numbers \tau(8)=240 and \tau(24) = 196560. For d = 8, the 240 shortest vectors X_8 \subseteq S^7 of the E_8 lattice have inner products A(X_8) = \{-1, -1/2, 0, 1/2\}. Hence, X_8 is a [-1,1/2]-code and \tau(8) \geq 240. Delsarte, Goethals and Seidel showed that X_8 is a 7-design, and later Levenshtein and Odlyzko and Sloane independently showed that

\displaystyle F(t) = \tfrac{320}{3}~(t+1)~(t+\tfrac{1}{2})^2~t^2~(t-\tfrac{1}{2})

satisfies F = \sum_{k = 0}^6 f_kQ_k with f_0 = 1 and f_k \geq 0. Applying Theorem 3 proves \tau(8) \leq F(1) = 240. A similar strategy with the Leech lattice yields \tau(24) = 196560.

Delsarte, Goethals and Seidel again use Gegenbauer polynomials to give a linear programming lower bound (Theorem 5.10 in [DGS77]) on the size |X| of an arbitrary t-design X \subseteq S^{d - 1}. In some sense, this lower bound is dual to Theorem 3 and the proof is similar. They give an upper bound on |X| for spherical t-designs X \subseteq S^{d - 1} with fixed s = |A(X)| and show that, in all cases, t \leq 2s (Theorem 6.6 in [DGS77]).

The linear programming method has been generalized beyond Theorem 3. Musin developed a nonconvex extension to prove \tau(4) = 24. Cohn and Elkies extended the linear programming method to noncompact settings, leading to the resolution of the sphere packing problems in \mathbf{R}^8 by Viazovska and \mathbf{R}^{24} by Cohn, Kumar, Miller, Radchenko, and Viazovska. De Laat and Vallentin identified a general semidefinite programming hierarchy for problems in discrete geometry, the lowest level of which is the linear programming method. For more on the development of these methods, the reader is encouraged to consult the notes of Vallentin and Cohn.

2 thoughts on “Spherical codes and designs

Leave a comment