This year’s Math 256x is a “topics class”
(taught here before, but last in 2001)
that develops the mathematical theory of error-correcting codes
and its connections with other areas of mathematics
ranging from combinatorics to group and representation theory
to projective and algebraic geometry.
We meet Tuesdays and Thursdays from 11:30 AM to 1:00 PM
in Room 309 Room 109
of the Science Center.
If you find a mistake, omission, etc., please let me know by e-mail.
September 3:
Overview
September 5:
Introduction:
Hamming distance and Hamming space;
September 10: Linear codes
(and isometries of Hamming space)
First problem set:
continuity of the asymptotic rate functions;
a bit on spherical codes;
binary codes with
September 12: No Class
September 17:
Linear codes and point configurations in projective space
(and introduction of several structures and constructions that
will recur later in the semester)
Second problem set
[Corrected Sep.18, see 3ii]:
More about point configurations and the associated linear codes
September 19:
Codes and point configurations, cont’d;
Segre’s theorem on ovals in algebraic projective planes of odd order
September 24:
More on conics, especially over finite fields, and projective spaces
in general; rational normal curves and classical Goppa codes.
We’ll use these
Lecture notes
September 26:
Introduction to weight enumerators and the
MacWilliams identity
See also
Part II
of my Notices article
“Lattices, Linear Codes, and Invariants”
(Part I is in the analogous picture for
lattices in Euclidean space, which is developed to an extent
later in the course).
October 1:
Some uses of the MacWilliams identity; Gleason on the
Hamming weight enumerators of self-dual binary codes
October 3:
Gleason for self-dual codes of Type II, III, and IV
October 8:
Existence and uniqueness of the binary Golay codes
Third problem set:
Synthemes, totals, and the projective plane of order 5
October 10:
The binary Golay codes and related structures, cont’d
October 15:
Extremal enumerators and codes; the Mallows-Odlyzko-Sloane theorem
October 17:
The sphere-packing problem; lattices and their theta series
Table listing, in several code and lattice
contexts, the parameters q, F, Δ,
October 22:
Overview of sphere-packing bounds; theta series of unimodular lattices;
extremal theta functions and lattices
October 24:
Type I and Type II lattices (a.k.a. odd and even unimodular lattices)
and their theta series; Construction A
October 29:
Asymptotics of kissing numbers of extremal codes and lattices
via the stationary-phase method
October 31:
Asymptotic impossibility of (nearly-)extremal codes.
Start on Reed-Muller codes
November 5:
Reed-Muller codes, cont’d
November 7:
a bit more on Reed-Muller codes;
introduction to cyclic codes
November 12:
Cyclic codes cont’d: the BCH bound and
BCH
codes;
QR codes,
and Golay codes as QR codes
November 14:
Krawtchouk polynomials
and Lloyd’s theorem
on perfect codes
November 19: Introduction to the
linear programming (LP) bounds
on error-correcting codes
November 21:
orthogonal polynomial basics
November 26:
LP bounds II: an asymptotic upper bound
November 28: No Class
(Thanksgiving break)
December 3:
The ternary Golay codes and related objects
The general motivating setup from information theory for
error-correcting block codes (as opposed to other kinds such as
convolutional, let alone cryptographic: we’re aiming to protect against
error, not eavesdropping).
Natural languages such as English are very suboptimal
error-correcting codes (fingerpuinting?); warning about
real-world™ errors beyond the usual scope of the
mathematical theory (eror-collecting colds).
Coming attractions, drawing on combinatorics,
symmetry and groups, linear algebra, finite geometry,
invariant theory, orthogonal polynomials, etc.
Example: the binary Golay code
[formalities about texts, office hours, grading, etc.; see the initial handout]
General setup for block codes:
Fix a finite set A (“alphabet”) of
q>1 elements (“letters”).
[q=1 is vacuous,
as already observed by Lewis Carroll
— in general, a letter from an alphabet of size q
carries
For any integer n>0 consider the set An
of
A code is a nonempty subset C of Hamming space.
C is said to be binary, ternary, quaternary, “etc.”
if the alphabet is of size
A basic question of coding theory is how large M can get
given q, n, and d
(or more-or-less equivalently how large d can get
given q, n, and M).
As is often the case in this kind of combinatorics,
except in trivial cases it is rare that we can get an exact answer
for a given choice of
q, n, and M or d,
and even the asymptotic behavior is not known: we can only give
lower and upper bounds, which hardly ever coincide.
Here one standard asymptotic direction is to fix q,
let
An elementary upper bound is due to
Singleton (1964):
an
Two trivial examples of MDS codes are An itself
(with d=1) and a one-word code (with d=n+1
— so yes, a singleton code attains equality in the Singleton bound).
Only slightly less trivial is the example with d=n of the
repetition code consisting of the q words of the form
An example of an optimal code that is not MDS is the
Hamming code (1950),
with parameters
In general, the sphere-packing bound
(a.k.a. the Hamming bound)
is the observation that if
What does the Hamming bound look like in the asymptotic regime where we
fix q and let n and d grow
while their ratio δ remains roughly constant?
We need the following fundamental computation.
The number of words at distance exactly e from
a given word is
It soon follows that the number of words at distance at most
e from a given word is given by the same asymptotic formula
Replacing e by d−1 in the Hamming bound yields the
Gilbert-(Shannon-)Varshamov bound, which is a lower bound:
there exists an
It is a perennial source of embarrassment that, while the Gilbert-Varshamov bound is quite weak for small n, the asymptotic form is very hard to beat; the only improvements known use surprisingly sophisticated techniques from number theory (curves with many points over finite fields), and for small q (all q up to about 40) the Gilbert-Varshamov bound is still the best we have for all δ. [There are similar embarrassments elsewhere in combinatorics, as in Ramsey theory and Euclidean sphere packing in high dimensions, where the probabilistic method is asymptotically better than any known explicit construction.]
We usually take for q a prime power, and identify A
with a finite field, which as usual we denote by F
[the alternative k, for German Körper,
is pre-empted as we shall see a couple of paragraphs below].
Hamming space is then the
where “wt” is the (Hamming) weight:
the weight of any word w is its number
A code C ⊆ Fn
is said to be linear if it is closed under
addition and scalar multiplication, that is, if it is
a vector subspace
Suppose C is a linear (n, M, d) code.
Then
The rate
The sphere-packing bound of course holds for linear codes
as a special case. The Gilbert-Varshamov bound holds for linear codes
as well, though here we must prove it anew. One way is to mimic our
proof for arbitrary codes, adding one basis vector at a time while the
Hamming spheres of radius
Often in mathematics when we introduce a new mathematical structure
we follow it up not just with examples but also with operations that
give new examples from old (direct sums and quotients of groups or
vector spaces, field extensions and polynomial rings, etc.).
Such operations exist for codes too, and we’ve seen a few examples
without calling attention to the general constructions
(e.g. a subset of an
[Warning:
since we hardly ever use F⊆R,
there are usually plenty of nonzero v with
As usual
The dual of the Hamming [7, 4, 3] code is the [7, 3, 4] code whose
nonzero codewords are the line complements; this is a
constant-weight code:
all nonzero words have the same weight, here 4.
That
Interlude on isometries:
Hamming space An has two natural sources of isometries:
we can apply a permutation of A to each coordinate,
or an arbitrary permutation to the set of n coordinates.
These generate a group, call it G, of
Two codes C, C’
of the same length over the same alphabet are said to be
isomorphic if
If C is linear then translation by any codeword is
an automorphism. But we usually exclude nonzero translations
of linear codes, because the choice of 0
is part of the vector space structure,
and thus should be fixed by any automorphism.
Of the isometries of Fn that preserve 0,
the only linear ones are the
Putting the signed permutation matrices together with translations
The interlude on Aut(C) also serves as a segue to a
dictionary between linear codes and configurations of points
in finite projective spaces. In applications, and in treatises
aimed at applications, an
Here’s one way to think about it,
leading to a nice geometric description of linear codes.
Hamming space Fn is quite a concrete vector space,
coming to us with a choice of coordinates, ambiguous only up to
permutation and scaling individual coordinates. But C
has no intrinsic structure except for what it gains as a subspace
(*) In general, an
So how to recover the minimum weights d and d*
of C and C⊥ from the
geometry of this configuration SC
in the dual projective space? Nonzero codewords c up to scaling
↔ hyperplanes H in the dual
Since any
Note that if C has neither an identically zero coordinate
nor a
Returning now to our usual setting of a finite field, we can use this
picture to construct and study some basic examples of linear codes.
Suppose for starters that C* is single-error-correcting, i.e. that
When k=2, the Hamming code is MDS, and is the longest
When n is small,
it is easy to describe and enumerate
Once n>6, the number of extensions of an
Still we can say something about the largest n-arcs,
or equivalently the longest MDS codes of dimension 3
(and dually of dimension
Tuesday, Sep. 17:
Linear codes and point configurations in projective space
An [n, k, *] code
without identically-zero coordinates is tantamount to a
spanning (multi)set SC of n points in
This respects the relevant symmetries: two codes are isomorphic iff
their associated configurations are equivalent under
d* is the smallest size of
a linearly dependent subset of S.
(In general, δ points in
a projective space are said to be “linearly dependent”
when they lie on a linear projective subspace of dimension
less than δ−1; here there must be d* points
on a subspace of dimension exactly d−2, else the
minimum dual weight would be even smaller.) So again C is MDS
iff C⊥ is MDS.
Thursday, Sep. 19:
Codes and point configurations, cont’d;
Segre’s theorem
• n = q+2
• n = q+1
The latter is a celebrated
theorem of Segre
(Canad. J. Math. 7 (1955), 414–416
[NB what Segre calls a “Galois field” γ is our
finite field F]);
in our setting, this theorem says that the
The upper bound q+2 has a purely combinatorial proof, and thus holds for
an arbitrary finite projective plane, not just for algebraic planes;
the same is true for the result that if a plane of order q
has a hyperoval then q is even. The former result is easy:
given a point P of an
Still working in the general combinatorial setting, we see that each point P
of an oval O is on exactly one line
[Proof of
Segre’s theorem, where the key step is in effect
that every triangle inscribed in an oval (i.e., formed by three of its points)
has a “symmedian point”, or equivalently that
every triangle circumscribed about an oval (i.e., formed by three of its tangents)
has a Gergonne point. (For more on these, see X(6) and X(7) in the
“Encyclopedia of Triangle Centers”,
which starts with X(1)=incenter and X(2)=centroid, and has reached
X(5394) and beyond.)
Along the way we use in effect the classical
theorems of
Menelaus
See the notes.
Once we know that every conic with a point can be written as
Once we know that every conic C with a rational point is a
rational normal curve of degree 2,
Bézout’s theorem
becomes elementary for the intersection of C with any curve.
Indeed once we identify C with
The (Hamming) weight enumerator of a linear code C
of length n
is a generating function that encodes the counts of words of
weight w for each
Before giving and proving MacWilliams’ formula relating
Theorem (MacWilliams identity for binary codes). The weight enumerators of any linear binary code C and its dual code C⊥ are related by
The proof is an application of
Poisson summation for finite abelian groups.
For any subgroup H of a finite abelian group G,
and any function
Theorem (MacWilliams identity for Hamming weight enumerators). The weight enumerators of any linear code C over a q-element field and its dual code C⊥ are related by
So, for instance, our formula for the weight enumerator of a
single-checksum code generalizes to
We next develop Pontrjagin duality and Fourier analysis for finite abelian groups, where all the relevantOnce q>2 we can form a weight enumerator that keeps track of more information than WC, counting not just the distribution of zero vs. nonzero coordinates in codewords but also which nonzero field elements occur. (Yes, this breaks some of our symmetry, though it still treats all n coordinates equally.) The resulting complete weight enumeratorC-vector spaces are finite dimensional. See for example paragraphs 2 and 3 of page 5 in this chapter of my notes on analytic number theory from 2009 (where this theory is used to describe Dirichlet characters); see also the further remarks on page 8, and Exercise 5 on page 10. The Fourier transform on binary Hamming space(Z/2Z)n is an important special case also known by other names such as the “Hadamard transform”. As usual with Fourier analysis there are several choices of normalization in the literature; we’ll use the same normalization that I chose here (page 1385), where the Fourier transform off :G→C is the function taking any character χ to∑g∈G χ(g) f (g) , which saddles the formula for the inverse Fourier transform with the factor|G|−1 and the complex conjugateχ(g) . [ForG = (Z/2Z)n we need not worry about the complex conjugate becauseχ(g) is always±1 .]Poisson summation figures in analytic number theory too (see this chapter of the 2009 notes), but again we need only the finite case, which is easier than Poisson summation on
Z ⊂ R and entirely elementary. (We may cover Poisson summation onZ ⊂ R , or more generally on latticesin Rn , when/if we get to the connection between linear codes and Euclidean lattices.)
Some applications of the MacWilliams identity:
Now consider what this means for the weight enumerator
A special property of binary codes is that once a linear code
C is self-orthogonal the map taking a codeword c
to its weight modulo 4 is a homomorphism
We can detect whether a self-orthogonal binary code C is
singly or doubly even from its weight enumerator:
The weight enumerator of a Type II code is also invariant under
the substitution of
[Overview of Gleason I–IV in the context of
complex reflections groups.
[Here’s the paper
“Finite unitary reflection groups” by Shephard and Todd
(Canad. J. Math. 6 (1954), 274–304)
that determined all such groups and gave the first proof of the
theorem that these are precisely the finite
subgroups of
[For a picture of the vertices of the regular octahedron permuted by
[...]
Thus the weight enumerator of a Type II code of length n
has only
It is known that these are the only two Type II codes
of length 16, and there are 9 such codes of length 24
(after which combinatorial explosion soon sets in).
This is the first case where Gleason does not determine
WC uniquely, and again (as with Type I)
we can make WC unique by doubling the minimum weight,
in this case getting the enumerator of a
Remarkably it is again true that there is a unique such code,
and that removing any coordinate yields a perfect code,
in this case the binary Golay code
A Type III code is a self-dual ternary code.
(Recall that “ternary code” means
“code over the
We draw the same kind of consequences as before:
it takes only
A Type IV code is a quaternary linear code that is its own
Hermitian dual, i.e. dual under the sesquilinear pairing
where σ: a ↔ a2
is the Galois involution of the
But we already know what quaternary
Type IV is the final variation on our theme,
at least for Hamming weight enumerators:
there are no GV, GVI, etc.,
because (using the classification of finite subgroups of
PGL2(C))
there are no further cases where the MacWilliams involution
together with a map
Beyond Hamming weight enumerators, there are some further examples
(complete or joint weight enumerators, etc.) where elementary observations
plus MacWilliams yield invariance under a complex reflection group,
or a group close enough to being a complex reflection group that one
can still give a satisfactory explanation of its invariant ring.
See for instance Rains and Sloane’s chapter
“Self-Dual Codes” from the Handbook of Coding Theory,
especially Section 6
(“Invariant Theory”, starting on p.29)
on general results and techniques,
and Section 7
(“Gleason’s theorem and generalizations”,
starting on p.47) for many further cases. For example (p.59),
a self-dual quinary code
There are (at least) two natural directions to go with the
Gleason theorems: the extended Golay codes
The Golay codes have symmetry groups that are large
(e.g. large enough to be multiply transitive on the coordinates)
and sporadic. The size indicates that there are various ways to
prove uniqueness: in each case there are various natural objects
on which the group acts uniquely (small subsets of the coordinates,
codewords of given weight, etc.), and we can fix one of them,
try to show there’s still a unique way to fit a code of the
desired properties around it, and then get as a bonus the transitivity
on objects of that type. The fact that the groups are sporadic
suggests that no proof of uniqueness can tell us everything of interest
about the Golay codes: different choices will reveal only different parts
of the structure of the code and its automorphism group, and suggest
generalizations in a different direction. One could give an entire
graduate course on the Golay codes and Mathieu groups (even if
I might not be “one” who
could give such a course), but since that’s not an option
I must choose one approach and more-or-less stick with it.
Given what we’ve seen so far, I’ll take the route via
the subgroup
Suppose, then, that C is a Type II
If we fix one, two, or three of the 24 coordinates, and consider
only those octads (
We already know Π has hyperovals; choose one, and call it O.
Every point outside O is on three lines meeting O in
two points each, and thus gives a “syntheme”,
which in this context is the (somewhat quaint 19th-century) terminology for
a partition of the six points of O into three pairs.
But there are
Each of the O* lines, say l, lies on five syntheme points.
This gives
Before resuming the construction of
We pause to note some exceptional behavior of the permutation groups
A6
Our copy
The other exceptional object is the
triple cover
Further remarks: One can likewise use (hyper)ovals to prove the
uniqueness of the projective planes of orders 2, 3, and 5,
and count their automorphisms.
For q=2 (respectively q=3),
a hyperoval (resp. oval) O has four points, six secants, and
three points off O where two secants meet
(so here we get not an outer automorphism
Let C1 be the code generated by
(characteristic functions) of lines in Π
and C0 be the even subcode
We need to study
[…]
Each of the special codes suggested by Gleason’s Theorem
(extended Hamming,
We can describe each of our four cases as follows.
Dehomogenize WC by setting
Proposition
(Mallows-Odlyzko-Sloane 1975).
Suppose all the coefficients of the power series for
F0 are positive,
and the qk coefficient in the power series expansion of
Δ−1 is positive for all
The hypothesis holds for each of our four Types of self-dual codes,
so we obtain:
Corollary.
Proof of Proposition, using the invariance of the
formal residue of a power series:
see this TeX page.
[This approach is simpler than the original proof from 1975, though
that proof gives further information that we’ll return to next time.
The approach we use here does let us give formulas for the
[General introduction to Euclidean lattices, the
Hermite constant
(best lattice packing of equal spheres), etc.,
with material mostly contained in the first chapter of
“SPLAG” = Sphere Packings, Latiices,
and Groups by Conway and Sloane.
For the analogue in the setting of unimodular lattices and their
theta series, see also
Part I of my Notices article
“Lattices, Linear Codes, and Invariants”.
In this setting, in the Type II (= even unimodular) case,
Δ is actually the modular form usually denoted by Δ,
so its inverse is
The analogue for sphere packings of the Gilbert-Varshamov bound is the
Minkowski-Hlawka bound,
which holds for packings not just of spheres but of translates of
an arbitrary centrally-symmetric convex body B
If L is unimodular, then (as for a self-dual binary code)
the map taking a lattice vector v to its norm
Construction A associates to any binary linear code
[Remarks on some nonlattice periodic sphere packings:
Construction LC makes sense for any binary code
C, whether linear or not; in general LC
is a periodic subset
Back to the case of linear codes C and lattices LC:
the theta series
for Y.
[The notation “L〈2〉” means
the lattice L with all inner products multiplied by 2;
for
This story generalizes in various directions (the space of lattices
Construction A generalizes in several ways, with
[For more on “shadows” and their weight enumerators or
theta series, see also Chapter 5 (starting on page 26)
and also pages 48–49 and 73–75 of
Rains-Sloane,
and also my two papers
“a characterization of the
Zn lattice”
and
“Lattices and codes with long shadows”.]
For a statement and proof sketch of the stationary-phase asymptotics
that we use, see this write-up
(and note the two Exercises at the end).
There are naturally plenty of sources for this and related techniques
in general; for instance it is a recurring [no pun intended] theme
in Stanley’s Enumerative Combinatorics.
See this write-up outlining a proof of
the Mallows-Odlyzko-Sloane theorem.
CORRECTED Nov.2 (and replaced k by F
for the field, because we need k for the dimension):
Reed-Muller codes
The minimum weight is actually easier: I claim that
[The same inductive technique shows that if
To obtain the dimension of
It follows that
We saw that
and each subset of e monomials has weight a power of 2
that’s at least
The weight enumerator of
We easily generalize the construction of Reed-Muller codes by
making F an arbitrary finite field and regarding the
polynomials of degree
Once q>2 we cannot expect that most of these codes will have
any nontrivial divisibility condition on their Hamming weights;
for instance, the single-checksum code of degree
Away from characteristic 2 it is simpler to proceed as follows:
define
At the end of the Krawtchouk/Lloyd notes we need the fact that
the Krawtchouk polynomial
The Krawtchouk polynomials for given q and n
are orthogonal.
We shall apply some general results about orthogonal polynomials,
which are classical but not well-known, at least not known as well
as they would have been ome decades ago. So most of this class
will be devoted to developing the results we shall use.
This material is still standard enough that special-purpose lecture notes
should not be necessary; for references, I think the relevant chapters of
Körner’s Fourier Analysis contain everything
we shall use, while Szegö’s Orthogonal Polynomials
has a more extensive treatment of the topic.
We show that for fixed q and
δ ∈ [0, (q−1)/q]
the asymptotic rate
[...]
When a system {Pi} of orthogonal polynomials
does allow for a closed form for the triple products
The stationary-phase estimates on Ki(x)
show further that the oscillatory region between the smallest and largest
roots of Ki is also the region that contributes
the most to
Here are extended lecture notes
for existence and uniqueness of the extended ternary Golay code
A couple of addenda:
1)
We noted already
that Leech’s construction of his lattice
Λ24
from
2)
[...]
Yet again it turns out (and we may show) that there is a unique such code,
and that removing any coordinate yields a perfect code,
this time the
Tuesday, Oct. 8:
Existence and uniqueness of the binary Golay codes
Thursday, Oct. 10:
The binary Golay codes and related structures, cont’d
Here are some more details about the
(re)construction of the
Tuesday, Oct. 15:
Extremal enumerators and codes; the Mallows-Odlyzko-Sloane theorem
i) A Type I code of length n has minimum distance at most
ii) A Type II code of length n has minimum distance at most
iii) A Type III code of length n has minimum distance at most
iv) A Type IV code of length n has minimum distance at most
Thursday, Oct. 17:
The sphere-packing problem; lattices and their theta series
Tuesday, Oct. 22:
Overview of sphere-packing bounds; theta series of unimodular lattices;
extremal theta functions and lattices
Elkies, N.D., Odlyzko, A.M., and Rush, J.A.:
On the packing densities of superballs and other bodies,
Invent. Math. 105 (1991), 613–639.
For upper bounds on Euclidean sphere-packings,
the (literal!) sphere-packing bound on the density is 1,
and it’s not too hard
to improve this to
Thursday, Oct. 24:
Overview of sphere-packing bounds; theta series of unimodular lattices;
extremal theta functions and lattices
Tuesday, Oct. 29:
Asymptotics of kissing numbers of extremal codes and lattices
via the stationary-phase method
Thursday, Oct. 31:
Asymptotic impossibility of (nearly-)extremal codes.
Start on Reed-Muller codes
Tuesday, Nov. 5:
Reed-Muller codes, cont’d
Tuesday, Nov. 7:
a bit more on Reed-Muller codes;
introduction to cyclic codes
Tuesday, Nov. 12:
Cyclic codes cont’d: the BCH bound and
BCH
codes;
QR codes,
and Golay codes as QR codes
Thursday, Nov. 14:
Krawtchouk polynomials
and Lloyds’s theorem
on perfect codes
Tuesday, Nov. 19:
Introduction to the
linear programming (LP) bounds
on error-correcting codes
Thursday, Nov. 21: orthogonal polynomial basics
[I switched the order of the last two because the three-term recurrence
also gives an easy inductive proof of the interlacing property:
given that
Tuesday, Nov. 26: LP bounds II:
an asymptotic upper bound
we introduced back in mid-September, and

[Taking ι=10/150 and 11/150 in the relation between
δ and ι at the transition points, and multiplying
the resulting δ’s by 150, gives the estimates
Tuesday, Dec. 3:
The ternary Golay codes and related objects