Math 263x: Computational Techniques in Number Theory and Algebraic Geometry (Fall 2021)


Math 263x is a new “topics class” concentrating on some of the computational tools and techniques that can complement theoretical research in number theory, algebraic geometry, and related fields. We meet Mondays and Wednesdays from 12 noon to 1:15 PM in Sever Hall Room 203.

If you find a mistake, omission, etc., please let me know by e-mail. Thanks to Anselm Blumer for alerting me to several typos (or TeX-os or HTML bloopers) which I have now corrected.

September 1: Introduction; example: Fermat’s two-square theorem; interlude: don’t plot partial sums in time $N^2$; introducing Belyi functions
$\phantom\infty$Digression: computing square roots and non-squares in a finite field
September 8: Belyi maps and some of their uses; interlude: rational reconstruction (a theme with many variations to come)
September 13: Start on computation of Belyi functions; interlude: finding duplicates
$\phantom\infty$Example: the modular covers $X_0(2)\to X(1),\ X_0(6)/w_2\to X_0(2)/w_2$ as Belyi maps
September 15: Computation of Belyi polynomials, cont’d
September 20: More on Belyi polynomials etc.
September 22: Counting solutions of $g_0 g_1 g_\infty = {\rm id}$; resultants
September 27: Using multivariate (and usually $p$-adic) Newton’s method
September 29: A cube minus a square
October 4: A cube minus a square, cont’d
October 6: interlude on tables for computing mod $p$; positive- [usually 1-]dimensional families
[October 11: No class: University holiday]
October 13: Curves of genus 0 through 5; equations for some modular curves
October 18: Low-genus curves and modular equations, cont’d; a Weil-Belyi function on an elliptic curve (and parametrizing 5-torsion etc.)
October 20: Overview of complex reflection groups and their invariant rings (which give rise to highly symmetric curves and higher-dimensional varieties)
October 25:
October 27: Introduction to finite subgroups of ${\rm GL}_2({\bf C})$ and their invariants; details of the tetrahedral case
November 1: Finite subgroups of ${\rm GL}_2({\bf C})$ and their invariants, cont’d: octahedral and icosahedral details
[November 3: No class: I’m out of town]
November 8: Explicit generators for the Weil representation (odd $p$); the complete weight enumerator of a self-dual code containing the all-$1$’s word; introduction to $W(F_4)$ and its invariant ring
November 10: Generators of the invariants of $W(F_4)$ and $W(E_6)$
November 15: Introduction to Shioda’s “excellent families” of rational elliptic surfaces with an additive fiber at $t=\infty$
November 17: Shioda’s “excellent families” cont’d: the case of $E_6$; variations: complex reflection groups from $E_8$ and $E_6$, and a Shioda-Usui family for $W(A_5)$
November 22: Another variation on a theme of Shioda: an “excellent family” of rational elliptic surfaces of rank $4$ with a $2$-torsion section and an action of $W(F_4).2$
[November 24: No class: Thanksgiving break]
November 29: Sieves, logical and quantitative (or: Sieves, 0-1 and cumulative)
December 1: Elliptic curves with a configuration of integral points
December 6: Final lecture:

Wednesday, Sep. 1: Introduction

After outlining the general purpose and spirit of the class, we give an example that illustrates some of our concerns in a context that does not require most of the background that will be freely assumed later in the semester. The example is Fermat’s celebrated two-squares theorem: A prime $p$ can be written as a sum of two distinct squares if and only if $p \equiv 1 \bmod 4.$ The representation is unique up to switching the two summands. So take say $p = \lfloor 10^{37} \pi \rfloor = 31415926535897932384626433832795028841.$ Fermat promises an essentially unique solution to the Diophantine equation $p = x^2 + y^2$.

How to actually find this solution?

Trying all $x < p^{1/2}$ works in finite time, but not “finite enough” even with the computer (and if/when the computers catch up I can double the number of digits in $p$…). One proof of the theorem almost yields an efficient algorithm, using an idea attributed to Cornacchia (1908): $x/y$ is a square root of $-1 \bmod p,$ and conversely given such a root we recover $(x,y)$ in time $\ll \log^c \! p$ by lattice reduction (which in two dimensions is basically the Euclidean algorithm). [NB $\log^c \! p$ is “polynomial time” here because it takes $\sim\!\log p$ digits to specify $p.$] All the ingredients we used are already implemented in packages such as gp, so the resulting algorithm can be expressed by a one-liner such as

fermat(p) = qflll([lift(sqrt(Mod(-1,p))),p;1,0])[1,]

[Victor Miller 1992, transcribed some time later into the new gp syntax]. So for instance

fermat(31415926535897932384626433832795028841)

returns [4223562448517994405, -3684758713859920604] in about 0 ms (and this would even be feasible, if arduous, to do by hand).
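For readers without gp at hand, here is a minimal Python sketch of the same idea in its Euclidean-algorithm form (Cornacchia). It is not a transcription of the gp one-liner: the function names `sqrt_minus_one` and `two_squares`, and the bound `search_limit` on the non-residue search, are assumptions made for this illustration, and $p$ is assumed to be a prime congruent to $1 \bmod 4$.

```python
from math import isqrt

def sqrt_minus_one(p, search_limit=1000):
    """Return r with r*r = -1 (mod p), assuming p is a prime with p = 1 (mod 4)."""
    for a in range(2, search_limit):
        # Euler's criterion: a is a quadratic non-residue iff a^((p-1)/2) = -1 (mod p).
        if pow(a, (p - 1) // 2, p) == p - 1:
            # Then a^((p-1)/4) squares to -1.
            return pow(a, (p - 1) // 4, p)
    raise ValueError("no small quadratic non-residue found")

def two_squares(p):
    """Return (x, y) with x*x + y*y == p, assuming p is a prime with p = 1 (mod 4)."""
    r = sqrt_minus_one(p)
    a, b = p, r
    # Euclidean algorithm on (p, r), stopped at the first remainder below sqrt(p).
    while b * b > p:
        a, b = b, a % b
    x = b
    y = isqrt(p - x * x)
    assert x * x + y * y == p
    return x, y

p = 31415926535897932384626433832795028841
x, y = two_squares(p)
print(x, y, x * x + y * y == p)
```

The non-residue is found by trying $a = 2, 3, \ldots$ in turn, which is fast in practice but is exactly the step that is not known to be deterministic polynomial time without further hypotheses.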

[The digits of $\pi$ aren’t special; I chose such a prime rather than a “random” one so that I could not be tempted to cheat by choosing $x$ and $y$ first! Fortunately primes of this size are plentiful enough that one can easily find examples. To be sure, this raises the question of how I knew that $\lfloor 10^{37} \pi \rfloor$ is prime in the first place. For numbers of this size, factorization and primality proving have long been routine; that is an interesting story in its own right, but well known, and too elementary for us to take time to explore it in detail in Math 263x. Likewise for other fundamental tools such as polynomial factorization over finite fields or number fields, which are nontrivial (e.g. polynomial-time factorization in ${\bf Q}[X]$ was the initial application of the LLL algorithm!) but standard and readily available.]

Why did we write that this analysis “almost yields an efficient algorithm”? Well, how do we find the square root $\bmod p$? An embarrassment: it’s easy to evaluate the Legendre symbol, but if it’s $+1$ we generally don’t know how to get a square root in deterministic polynomial time unless we assume the extended Riemann hypothesis for the Legendre character $\bmod p$ (though we can do it in “random polynomial time”). (It is enough to find a single “quadratic nonresidue” of $p$; indeed the two problems are equivalent under polynomial-time reductions.) However, modular square roots of small numbers can be evaluated in polynomial (albeit not practical) time by using the arithmetic of elliptic curves $\bmod p$! That was the application Schoof gave for his algorithm [René Schoof: Elliptic Curves over Finite Fields and the Computation of Square Roots mod p, Math. of Computation 44, pages 483–494 (1985)] for counting rational points on an elliptic curve $\bmod p$. In our case we count points on the curve $Y^2 = X^3 - X,$ which is relevant because it has complex multiplication by a square root of $-1$: the count is $p + 1 \pm 2x$ or $p + 1 \pm 2y,$ from which we recover the two-square representation in deterministic polynomial time.
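The relation between the point count and the two-square representation can be checked on small primes. The sketch below is only a toy: it counts points on $Y^2 = X^3 - X$ naively in time about $p$, which is a stand-in for Schoof's algorithm and not an implementation of it. The names `legendre`, `count_points` and `two_squares_from_count` are assumptions for this illustration.

```python
from math import isqrt

def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    s = pow(a % p, (p - 1) // 2, p)
    return -1 if s == p - 1 else s

def count_points(p):
    """Points on Y^2 = X^3 - X over F_p, including the point at infinity."""
    # Each X contributes 1 + (f(X)/p) affine points: 0, 1 or 2 values of Y.
    return 1 + sum(1 + legendre(x**3 - x, p) for x in range(p))

def two_squares_from_count(p):
    """Recover (x, y) with x^2 + y^2 = p from the point count; p prime, p = 1 (mod 4)."""
    trace = p + 1 - count_points(p)   # equals +-2x for one of the two summands
    x = abs(trace) // 2
    y = isqrt(p - x * x)
    assert x * x + y * y == p
    return x, y

for p in (5, 13, 17, 29, 37, 41, 101):
    print(p, count_points(p), two_squares_from_count(p))
```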

Interlude: even very routine calculations can hide inefficiencies (and opportunities for improvements). For example, suppose we wish to plot the partial sums $s_n := \sum_{k=1}^n a_k$ of some real sequence $a_1,\ldots,a_N$; that is, we want to plot the $N$ points $(n, s_n)$ for $1 \leq n \leq N$. Directly translating this to something like for(n=1;n<=N;n++) plot(n, sum(k=1,n,a_k)) yields code that takes about $N^2/2$ work, while only $\sim\!N$ is needed: s=0; for(n=1;n<=N;n++) { s+=a_n; plot(n,s) }. (One cannot go below $\sim\!N$ because it takes time $N$ just to read or compute the terms $a_k$.) This may seem much too basic to mention in a graduate topics class, but the $N^2/2$ pseudocode comes from a well-known fellow computational number theorist, and if even [redacted] can slip this way then anybody can.
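The two loops above can be compared directly in Python; this is a minimal sketch, with the function names chosen for this illustration and the plotting call replaced by returning the list of partial sums.

```python
from itertools import accumulate

def partial_sums_quadratic(a):
    """Recompute each sum from scratch: about N^2/2 additions in total."""
    return [sum(a[:n]) for n in range(1, len(a) + 1)]

def partial_sums_linear(a):
    """Keep a running total: about N additions in total."""
    return list(accumulate(a))

a = [(-1) ** k * k for k in range(1, 11)]
print(partial_sums_linear(a))
print(partial_sums_quadratic(a) == partial_sums_linear(a))
```

Both functions return the same list; only the amount of work differs.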

Our motivating task for at least the next few weeks will be to compute explicit covers of curves with given ramification. Let $f: X' \to X$ be a map of compact Riemann surfaces with $\deg(f) = n > 1$, and $B \subset X$ the branch locus, which is a finite (possibly empty) set of points. Given $(X, B, n)$ there are finitely many choices of $(X',f)$, corresponding to the index-$n$ subgroups of the fundamental group $\pi_1(X-B)$. Now a compact Riemann surface is an algebraic curve over ${\bf C}$; so we are given a curve $X$, a finite set of points on the curve, and an integer $n$, and construct a list of curves $X'$ and maps $f$. If $(X,B)$ is defined over some field $F \subset \bf C$ then $(X',f)$ is defined over some finite extension $F'/F$. We shall see that a diverse collection of computational problems in number theory and algebraic geometry can be encoded in the problem of recovering $(X',f)$ from $X,B$, and maybe some additional combinatorial data such as ${\rm Gal}(X'/X)$. But this problem is already nontrivial for relatively small $B$ and $n$, even when $X$ is the Riemann sphere ${\bf CP}^1$. Moreover, the topological construction using subgroups of $\pi_1(X-B)$ is fundamentally “transcendental”, since it uses the Riemann existence theorem to identify $X'$ with an algebraic curve over ${\bf C}$; we can use it to get some information about $(X',f)$, such as an upper bound on $[F':F]$, but not to compute explicit equations. Such computations will be our first series of goals.

Wednesday, Sep. 8: Belyi maps [unramified covers of P¹ − {0,1,∞}] and some of their uses

Some examples: if $B = \emptyset$ and $X$ has genus $1$ then we have unramified covers, which are isogenies $X' \to X$ (with $X'$ also of genus $1$); such isogenies also arise from certain covers of ${\bf CP}^1$ branched at $4$ points (the branch points of a degree-$2$ cover $X \to {\bf CP}^1$). More generally, if $X$ has genus $g > 0$ and the cover $X'/X$ is abelian then it corresponds to a finite subgroup of the Jacobian of $X$. If $n=2$ then necessarily $\#B$ is even, and the choice of cover is equivalent to a choice of divisor class $[D]$ of degree $\frac12\#B$ such that $2[D] \sim \sum_{p\in B} p$. If moreover $X = {\bf CP}^1$ then the choice is unique, and $X'$ is a hyperelliptic curve(*) of genus $g$ where $\#B = 2g+2.$ Still assuming $X = {\bf CP}^1$, if $n=3$ and all the branch points are simple then there are $(3^{2g}-1)/2$ choices, corresponding to $3$-element subgroups of the Jacobian of the same hyperelliptic curve $X'$. Modular curves $X_0(N), X_1(N), X(N)$ arise as covers of the $j$-line that are unramified outside the three points with $j=\infty, 0, 1728;$ in general any cover $X' \to X$ of modular curves (associated to groups $\Gamma' \subset \Gamma$ with $[\Gamma : \Gamma'] = n$) is unramified outside the elliptic points and cusps of $X$, which are the points with nontrivial stabilizer in $\Gamma$. One application of our techniques will be finding explicit equations for some modular curves.

(*) for us “hyperelliptic curves” include curves of genus $0$ or $1$ equipped with a degree-$2$ map to ${\bf CP}^1$, which arise for $\#B = 2$ or $4$.

[...]

We usually make the simplest choice ${\bf CP}^1$ of $X$ (which is also the only one without any continuous moduli). Then $\# B > 1$, because both ${\bf CP}^1$ and the once-punctured Riemann sphere (a.k.a. the complex plane) are simply connected. Moreover if $\# B = 2$ then $\pi_1(X-B) = \pi_1({\bf C}^*) = \bf Z$; for each $n$ there is a unique index-$n$ subgroup of $\pi_1(X-B)$, namely $n\cdot \bf Z$, which corresponds to the map $X' = {\bf CP}^1,$ $f: z \mapsto z^n.$ Note that we chose a coordinate on ${\bf CP}^1$ that makes $B = \{0, \infty\}$, which we can do because ${\rm Aut}({\bf CP}^1) = {\rm PGL}_2({\bf C})$ acts doubly transitively.

In fact the action of ${\rm PGL}_2({\bf C})$ on ${\bf CP}^1$ is sharply 3-transitive: for any two ordered triples $(z_1,z_2,z_3)$, $(z'_1,z'_2,z'_3)$ of pairwise distinct points, there exists a unique $g \in {\rm Aut}({\bf CP}^1)$ taking each $z_j$ to the corresponding $z'_j$. Thus for $\# B = 3$ our problem still has no continuous moduli. But here we have a much richer landscape of unramified covers of $X-B$, because the fundamental group is free on two generators, so index-$n$ subgroups correspond to two-generator subgroups of the symmetric group $S_n$, for which there is a large range of choices.

If $\#B = 3$ then $X'$ is defined over some number field (finite extension of ${\bf Q}$), because given $n$ there are only finitely many choices of $X'$ once we have used ${\rm Aut}({\bf CP}^1)$ to find a projective coordinate in which $B = \{0,1,\infty\}$. Remarkably the converse is true: if $X'$ is any algebraic curve defined over a number field then there is a rational function $f$ on $X'$ that is unramified outside $\{0,1,\infty\}$, i.e. outside the poles of $f$ and the zeros of $f$ and $f-1$. This is a famous theorem of Belyi, who moreover proved that for any finite set $B'$ of algebraic points on $X'$ there exists such a function $f$ that maps $B'$ to $B$. Such functions $f$ are thus often called Belyi functions, and their computation will be our first motivating task.

See Serre’s Topics in Galois Theory (Boston: Jones & Bartlett 1992) for the application to the inverse Galois problem (perhaps the best-known arithmetic application) and other results concerning Belyi functions. In algebraic geometry, such functions might be most famous for the equality case in the Hurwitz bound of $84(g-1)$ on the number of automorphisms of a Riemann surface (a.k.a. algebraic curve over ${\bf C}$) of genus $g>1$: if $C$ attains this bound, or more generally has more than $12(g-1)$ automorphisms, then the quotient map $C \to C/{\rm Aut}(C)$ is a Belyi function.

Such functions appear surprisingly often in other contexts; one of these years I might write an article on the ubiquity of Belyi functions. For now, I give references and/or links to some of the places where I’ve run across Belyi functions over the years:

• ABC implies Mordell, International Math. Research Notices 1991 #7, 99–109 [bound with Duke Math. J. 64 (1991)].
• The Klein Quartic in Number Theory (1998, in the MSRI volume The Eightfold Way on Klein’s quartic curve $x^3 y + y^3 z + z^3 x = 0$)
• “slides” from a 1999 talk at MSRI on “Other Arithmetic Manifestations of Branched Covers”
• Shimura curve computations (1998) [especially the curves associated to groups commensurate with arithmetic triangle groups]
• Rational points near curves and small nonzero $|x^3-y^2|$ via lattice reduction (2000) [see the start of Section 4, pages 22–25; some of the other material here will figure later in the course]
• Trinomials $ax^7+bx+c$ and $ax^8+bx+c$ with Galois Groups of Order 168 and $8 \cdot 168$ (with Nils Bruin), Lecture Notes in Computer Science 2369 (proceedings of ANTS-5, 2002; C.Fieker and D.R.Kohel, eds.), 172–188.
• My HCMR article on “The ABC’s of Number Theory”, starting on page 57 of the first issue (2007).

Some more detail on the topology: a Belyi map $C \to {\bf CP}^1$ of degree $n$ is determined by permutations $g_0, g_1, g_\infty$ that satisfy $g_0 g_1 g_\infty = {\rm id}$ and generate a transitive group $G$ of permutations of the $n$ sheets. This group is then the Galois group of the Galois closure of the function-field extension ${\bf C}(C) / {\bf C}(t)$ associated to the cover (where $t$ is a coordinate on ${\bf CP}^1$). Warning: if the cover is defined over a field $F$ that is not algebraically closed then one might have to first take an extension of this ground field before obtaining a function-field extension with Galois group $G$; this is already seen for the (2-point!) Belyi cover $t = z^n$ if $n > 2$ and $F$ is a field such as $\bf Q$ that does not contain the $n$th roots of unity. Also, the $g_i$ are defined only up to conjugation in the normalizer of $G$ in $S_n$. Distinct solutions might still be algebraically conjugate because the generators of $\pi_1({\bf CP}^1 - \{0,1,\infty\})$ are not canonical. The number of solutions of $g_0 g_1 g_\infty = {\rm id}$ given the $G$-conjugacy classes of the $g_i$ can be computed from the character table (see again Serre), though checking whether a given solution actually generates $G$ can be trickier. It has been done for enough examples to show (together with more theory about fields of definition, plus Hilbert’s specialization theorem) that every sporadic group except possibly $M_{23}$ is the Galois group of infinitely many extensions of $\bf Q$! Some of these extensions are so big that we don’t expect to ever see them, but for smaller groups such as $M_{11}$ (and interesting non-sporadic groups) we can actually compute the Belyi covers and specialize to find explicit extensions.
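For small $n$ the solutions of $g_0 g_1 g_\infty = {\rm id}$ can simply be enumerated. The sketch below does this by brute force rather than by the character-table formula, and only in the special case where $g_\infty$ is an $n$-cycle (so the group generated is automatically transitive). It fixes $g_\infty$, runs over all $g_0$ of a given cycle type, and counts solutions up to conjugation by the centralizer of $g_\infty$. The function names are assumptions for this illustration; cycle types are given as lists of cycle lengths including fixed points.

```python
from itertools import permutations

def cycle_type(g):
    """Cycle lengths of the permutation g (tuple with g[i] = image of i), largest first."""
    seen, lengths = set(), []
    for start in range(len(g)):
        if start not in seen:
            length, i = 0, start
            while i not in seen:
                seen.add(i)
                i = g[i]
                length += 1
            lengths.append(length)
    return tuple(sorted(lengths, reverse=True))

def compose(g, h):
    """The permutation i -> g[h[i]]."""
    return tuple(g[i] for i in h)

def inverse(g):
    inv = [0] * len(g)
    for i, j in enumerate(g):
        inv[j] = i
    return tuple(inv)

def count_covers(n, type0, type1):
    """Solutions of g0 g1 c = id with c a fixed n-cycle, up to conjugation by powers of c."""
    c = tuple((i + 1) % n for i in range(n))
    c_inv = inverse(c)
    type0 = tuple(sorted(type0, reverse=True))
    type1 = tuple(sorted(type1, reverse=True))
    solutions = set()
    for g0 in permutations(range(n)):
        if cycle_type(g0) != type0:
            continue
        g1 = compose(inverse(g0), c_inv)      # forces g0 g1 c = id
        if cycle_type(g1) == type1:
            solutions.add(g0)
    orbits = set()
    for g0 in solutions:
        conj, best = g0, g0
        for _ in range(n - 1):
            conj = compose(c, compose(conj, c_inv))   # conjugate by c once more
            best = min(best, conj)
        orbits.add(best)                      # smallest conjugate labels the orbit
    return len(orbits)

print(count_covers(5, [3, 1, 1], [3, 1, 1]))
print(count_covers(6, [3, 2, 1], [3, 1, 1, 1]))
print(count_covers(6, [4, 1, 1], [2, 2, 1, 1]))
```

This runs over all $n!$ permutations, so it is only meant for very small degrees.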

Interlude on rational reconstruction
Often we can closely approximate some target number(s) that we know or expect to be rational, say $r = a/b$. Given an upper bound $H$ on $|a|$ and $|b|,$ there are about $H^2$ choices for $r$, so we had better know $r$ to within about $1/H^2$; in other words, if $a$ and $b$ will be at most $d$-bit (or $d$-digit) numbers, we need at least $2d$ bits (or digits) of precision in $r$. (In practice we will want somewhat more than $2d$ so that we have some confidence that the best possible $a,b$ are significantly smaller than what we would expect for random $r$ — if only because we sometimes make mistakes and have an effectively random $r$ instead of the correct one.) Here we can find $a,b$ in time polynomial in $d = \log H$ by expanding $r$ in a continued fraction, or equivalently applying the Euclidean algorithm to $1$ and $r$ (which have a common factor of $1/b$).
One ubiquitous lesson of modern number theory is to treat archimedean and non-archimedean absolute values on an equal footing. In our setting the close approximation to $r$ will often be $p$-adic, so we will know $r \bmod p^d.$ Using the same counting argument as before, we see that if $|a|,|b| \leq H$ then we need $d$ large enough that $p^d > H^2.$ This is again sufficient, and again reduces to the Euclidean algorithm or continued fractions. (It is also a special case of recovering $a,b$ from $a/b \bmod N$ for $N \gg H^2$, which is the topic of Wikipedia’s entry on “rational reconstruction”.) There is yet another equivalent description of this technique that will be the most productive for generalizations to higher dimension: we reconstruct $r$ from the shortest nonzero vector in a two-dimensional lattice, here $\{(a,b) \in {\bf Z}^2: a \equiv rb \bmod N\}.$ (This picture already appeared last week in the description of Cornacchia’s algorithm.) If $r$ is known as an approximate real number, we can use the lattice ${\bf Z}^2$ with the positive-definite quadratic form $Q(a,b) = (a-rb)^2 + \epsilon (a^2+b^2)$ for some $\epsilon > 0:$ if $Q(a,b) \leq q$ then $|a-rb| \leq q^{1/2}$ and $|a|,|b| \leq (q/\epsilon)^{1/2},$ and conversely (to within a constant factor) if $|a-rb| < q^{1/2}$ and $|a|,|b| \leq (q/\epsilon)^{1/2}$ then $Q(a,b) < 3q$. We shall soon generalize this to simultaneous rational approximation, detection of a $\bf Z$-linear dependence, etc.
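The modular version of rational reconstruction takes only a few lines. The following is a minimal sketch via the extended Euclidean algorithm, assuming $N > 2H^2$ and that a solution with $|a|,|b| \leq H$ and $\gcd(b,N)=1$ exists; the name `rational_reconstruct` and the convention of returning `None` on failure are choices made for this illustration.

```python
from math import gcd

def rational_reconstruct(r, N, H):
    """Find (a, b) with a = r*b (mod N), |a| <= H, 0 < b <= H, gcd(a, b) = 1, else None."""
    # Invariant: r0 = t0 * r and r1 = t1 * r (mod N) throughout the loop.
    r0, r1 = N, r % N
    t0, t1 = 0, 1
    while r1 > H:
        q = r0 // r1
        r0, r1 = r1, r0 - q * r1
        t0, t1 = t1, t0 - q * t1
    a, b = r1, t1
    if b < 0:
        a, b = -a, -b
    if b == 0 or b > H or gcd(a, b) != 1:
        return None
    return a, b

# Example: recover -22/7 from its residue modulo 101^5, i.e. from a 101-adic approximation.
N = 101 ** 5
r = (-22 * pow(7, -1, N)) % N
print(rational_reconstruct(r, N, 100))
```

The successive pairs $(r_1, t_1)$ in the loop are short vectors of the lattice $\{(a,b): a \equiv rb \bmod N\}$, which is the link with the lattice description in the text.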

Postlude on $X_0(5782)$:
In class I improvised an example involving the modular curve $X_0(5782)$ (because Rosh Ha-Shanah), and rashly said it is hopeless to exhibit such a curve by explicit equations. In fact it is not too hard because $5782 = 2 \cdot 49 \cdot 59$ so $X_0(5782)$ is the fiber product of $X_0(2),$ $X_0(49),$ $X_0(59)$ with respect to their maps (of degrees $3, 56, 60$) to the $j$-line $X(1);$ that is, $X_0(5782)$ has an equation $j_2(x_2) = j_{49}(x_{49},y_{49}) = j_{59}(x_{59},y_{59})$, where $x_N$ or $x_N,y_N$ are coordinates on $X_0(N)$, and each $j_N$ gives $j$ as a rational function on $X_0(N).$ For $N=2,49,59$ the curve $X_0(N)$ and the function $j_N$ are still accessible, and the curves are reasonably nice (rational, CM elliptic, and hyperelliptic of genus 5) though $j_{49}$ and $j_{59}$ aren’t pretty. We shall see how to compute such formulas later in the course.

Monday, Sep. 13: start on computing explicit Belyi functions

We start with some of the simpler cases, where $C$ is rational and $g_\infty$ is an $n$-cycle; equivalently, the map $t=f(x)$ becomes a polynomial when we choose the rational coordinate $x$ on $C$ so that $x=\infty$ is the unique preimage of $t=\infty.$ We have already seen the first of these: the two-point cover $t = cx^n$ for some $c \in {\bf C}^\times$, where $g_0 = g_\infty^{-1}$ and $g_1 = {\rm id}$. The next simplest has $g_1$ a simple transposition, which makes $g_0$ a product of cycles of length $a_0,a_1$ for some positive integers $a_i$ with $a_0+a_1=n.$ Here $t=0$ has two preimages, with multiplicities $a_0$ and $a_1;$ by an affine-linear change of coordinates we put them at $x=0$ and $x=1$ respectively. Then $t = c x^{a_0} (1-x)^{a_1},$ with $c$ chosen so that the remaining critical point $x = a_0 / (a_0 + a_1)$ maps to $t=1$ — explicitly, $c = (a_0 + a_1)^{a_0 + a_1} / (a_0^{a_0} a_1^{a_1}) = n^n / (a_0^{a_0} a_1^{a_1}).$
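The normalization of $c$ is easy to confirm in exact arithmetic. A minimal sketch, with the helper name `two_zero_belyi` an assumption for this illustration:

```python
from fractions import Fraction

def two_zero_belyi(a0, a1):
    """Return (c, t) with t(x) = c x^a0 (1-x)^a1 and c = n^n / (a0^a0 a1^a1)."""
    n = a0 + a1
    c = Fraction(n ** n, a0 ** a0 * a1 ** a1)
    def t(x):
        return c * x ** a0 * (1 - x) ** a1
    return c, t

a0, a1 = 3, 2
c, t = two_zero_belyi(a0, a1)
x_crit = Fraction(a0, a0 + a1)
# The logarithmic derivative a0/x - a1/(1-x) vanishes at the critical point,
# and t takes the value 1 there.
print(c, a0 / x_crit - a1 / (1 - x_crit), t(x_crit))
```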

This example (if not the choice of $c$) is familiar from a first course in differential calculus (and from high-school contest math, where the maximum of $x^{a_0} (1-x)^{a_1}$ on $0 < x < 1$ is located using the AM-GM inequality); it also provides one of the ingredients of Belyi’s proof, because it gives for every rational number $r$ a Belyi function sending $r$ to one of $0,1,\infty$: if $r$ is not in $(0,1)$, use $f(1/x)$ or $f((x-1)/x)$. A few covers of modular curves also appear as special cases; the most familiar is the degree-$3$ map $X_0(2) \to X_0(1),$ but there is also the map $X_0(6)/w_2 \to X_0(2)/w_2$ of degree $4$ (see this page for explicit formulas in those two cases), and a few further examples for covers of Shimura modular curves.

Interlude on finding duplicates
Suppose we have reduced some computational problem to finding an element of the intersection of two sets of size $M$ and $N$. Comparing each pair of elements takes $MN$ work. But if the sets are listed in order then only $O(M+N)$ work is needed. So start by sorting each set (even if this requires imposing a mathematically unnatural total order); it is “well-known” — although not obvious — that sorting a list of length $N$ takes only $O(N \log N)$ work, so we find the intersection in $O((M+N) \log (M+N))$ work, which is a huge improvement on $MN$ if $M,N$ are at all large. In particular, we can find duplicates in a list of length $N$ by sorting the list (time $O(N \log N)$) and then comparing consecutive elements ($N-1$ comparisons), which again is much better for large $N$ than comparing all $(N^2-N)/2$ pairs.
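A minimal sketch of both procedures, relying on Python's built-in sort for the $O(N \log N)$ step; the function names are chosen for this illustration.

```python
def sorted_intersection(A, B):
    """Common elements of A and B: sort both, then one linear merge pass."""
    A, B = sorted(A), sorted(B)
    i = j = 0
    common = []
    while i < len(A) and j < len(B):
        if A[i] < B[j]:
            i += 1
        elif A[i] > B[j]:
            j += 1
        else:
            common.append(A[i])
            i += 1
            j += 1
    return common

def duplicates(items):
    """Values occurring more than once: sort, then compare consecutive elements."""
    s = sorted(items)
    return sorted({x for x, y in zip(s, s[1:]) if x == y})

print(sorted_intersection([5, 1, 9, 3], [3, 4, 5, 6]))
print(duplicates([3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]))
```

In practice a hash table does the same job in expected linear time, but sorting needs only a total order on the elements.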

Wednesday, Sep. 15: Computation of Belyi polynomials, cont’d

Next we might make $g_0$ the product of three cycles, of lengths $a_0,a_1,a_2$, so we have $t = c x^{a_0} (x-1)^{a_1} (x-w)^{a_2}$ and must also choose the parameter $w$. For generic $w$, there are two critical points other than $\infty, 0, 1, w,$ namely the roots of the quadratic in the numerator of the logarithmic derivative $ a_0/x + a_1/(x-1) + a_2/(x-w) $ of $t$. There are two ways to make this a Belyi function: either the roots coincide, in which case $g_1$ is a $3$-cycle, or they are distinct but mapped to the same $t$, making $g_1$ a double transposition. The former is simpler, as we see both from the equations and by counting solutions of $g_0 g_1 g_\infty = {\rm id}$ with the appropriate cycle structures. We thus consider first that case, where $ a_0/x + a_1/(x-1) + a_2/(x-w) $ has a double root. The roots coincide if and only if the discriminant of the quadratic in $x$ vanishes. We calculate that this discriminant is $(a_0+a_1)^2 w^2 - 2 (a_0 n - a_1 a_2) w + (a_0+a_2)^2.$ So there are two solutions $w$, and indeed we can see directly that there are two solutions of $g_0 g_1 g_\infty = {\rm id}$ up to $S_n$-conjugacy with the specified cycle structures, provided the $a_i$ are distinct. If not, there’s a single solution, but we might not be able to put the zeros at $x=0,1,w$ because two zeros (or all three) might be algebraic conjugates. Example: $n=5$, $(a_0,a_1,a_2) = (3,1,1)$ yields $$ x^3 (x^2+15x+60) = (x+6)^3 (x^2-3x+6) - 6^4, $$ so $-x^3 (x^2+15x+60) \, / \, 6^4$ [or equivalently $x^3 (x^2-15x+60) \, / \, 6^4,$ with $x$ changed to $-x$] is a quintic Belyi polynomial with group $A_5.$
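Both the discriminant formula and the quintic identity can be checked mechanically. The sketch below uses plain coefficient lists (lowest degree first) and exact rational arithmetic; the helper names are assumptions for this illustration, and the discriminant is only spot-checked at a few numerical values of $(a_0,a_1,a_2,w)$ rather than proved symbolically.

```python
from fractions import Fraction

def pmul(f, g):
    """Product of two polynomials given as coefficient lists, lowest degree first."""
    out = [0] * (len(f) + len(g) - 1)
    for i, u in enumerate(f):
        for j, v in enumerate(g):
            out[i + j] += u * v
    return out

def discriminant_matches(a0, a1, a2, w):
    """Compare the discriminant of the critical-point quadratic with the stated formula."""
    n = a0 + a1 + a2
    # Numerator of a0/x + a1/(x-1) + a2/(x-w):
    #   n x^2 - (a0 (1+w) + a1 w + a2) x + a0 w
    A, B, C = n, -(a0 * (1 + w) + a1 * w + a2), a0 * w
    stated = (a0 + a1) ** 2 * w ** 2 - 2 * (a0 * n - a1 * a2) * w + (a0 + a2) ** 2
    return B * B - 4 * A * C == stated

# x^3 (x^2 + 15x + 60) + 6^4  versus  (x+6)^3 (x^2 - 3x + 6)
lhs = pmul([0, 0, 0, 1], [60, 15, 1])
lhs[0] += 6 ** 4
rhs = pmul(pmul([6, 1], pmul([6, 1], [6, 1])), [6, -3, 1])
identity_holds = lhs == rhs

print(identity_holds)
print(all(discriminant_matches(a0, a1, a2, Fraction(k, 7))
          for (a0, a1, a2) in ((3, 1, 1), (2, 3, 4), (5, 1, 2))
          for k in range(-5, 6)))
```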

In any case the solutions of the quadratic in $w$ cannot be rational, or even real, because the discriminant of that quadratic is $-16 a_0 a_1 a_2 n < 0$; we could also see directly that such a polynomial cannot exist over $\bf R$, because its logarithmic derivative $ a_0/x + a_1/(x-1) + a_2/(x-w) $ would be a strictly decreasing function of $x$ and thus couldn’t have a real critical point. On the other hand, it is possible to have real, and even rational, solutions of the Diophantine equation “$-16 a_0 a_1 a_2 n = {\rm square}$” (with $n = a_0 + a_1 + a_2$) if we allow some $a_i$ to be negative, and this yields Belyi functions of a different kind, where $g_1$ is still a $3$-cycle but each of $g_0,g_\infty$ is a product of two cycles, say of lengths $a_1,a_2$ and $b_1,b_2$ where $a_1 + a_2 = b_1 + b_2.$ The Diophantine equation $$ a_1 + a_2 = b_1 + b_2, \quad a_1 a_2 b_1 b_2 = d^2 $$ gives a double cover of ${\bf P}^2$ that is a rational Del Pezzo surface; for instance, we may choose any rational numbers for $r := a_1/a_2, \ s := b_1/a_2$ subject to $rs = {\rm square},$ and then solve $a_1 + a_2 = b_1 + b_2.$ The first few nontrivial solutions give three Belyi functions of degree $10$, with $\{ a_1, a_2 \}$ and $\{ b_1, b_2 \}$ any two of $\{1,9\},$ $\{2,8\},$ and $\{5,5\}.$ For instance, if we choose $\{1,9\}$ and $\{2,8\}$ then we can choose the coordinate $x$ so our function has the form $t(x) = x^9(x+w)/(x+1)^2$ for some $w$, and then the condition that $t'/t$ have a double root yields $w=2$ or $w=50/49.$ NB in each case $w$ and $w-1$ are $S$-units for $S = \{2, 3, 5, 7\},$ as expected by Beckmann’s theorem that a Belyi map with Galois group $G$ has good reduction outside the prime factors of $\#G$.
Exercise: Verify these values of $w$, and check that in the first case we obtain an identity $$ x^9 (x+2) + (3^9/2^8) (x+1)^2 = (2x+3)^3 \cdot {\rm septic}(x) $$ where the septic has $S$-unit discriminant (indeed a $\{2,3\}$-unit). [The resulting degree-7 extension of ${\bf Q}$ is one of only $10$ degree-7 extensions of ${\bf Q}$ unramified outside $\{2,3\}$, according to this LMFDB search (and the LMFDB’s page on Completeness of number field data).] Obtain the analogous identity for $w = 50/49,$ and/or for one or more of the functions with $b_1 = b_2 = 5.$

Returning to Belyi polynomials: before proceeding to the case that $g_1$ is a double transposition, consider the generalization where $g_1$ is an $(m+1)$-cycle and $g_0$ is the product of $m+1$ cycles of lengths $a_0,a_1,\ldots,a_m.$ (So far we have seen $m=1$ and $m=2$.) Then we expect $m!$ distinct Belyi maps over $\bf C$ assuming the cycle lengths are pairwise distinct, but a unique map if all but one of the cycle lengths are the same.
Exercise: Show how to find the corresponding unique map algebraically; what happens if $m|n$ and all $m$ cycles are of the same length $n/m$?

Suppose that the $a_i$ are distinct. Then the roots of the polynomial $t(x)$ are in the field generated by the coefficients of that polynomial. As before, once $m>1$ this field cannot be ${\bf Q},$ or indeed any subfield of ${\bf R}$, because $t'/t$ is monotone decreasing; but we can still ask to compute those roots as algebraic numbers. There are $m!$ possibilities: starting with the $n$-cycle, we must choose $m+1$ of its vertices to divide the circumference into segments of lengths $a_i$ in any order, and there are $m!$ choices up to rotation along the cycle. (If the points moved by $g_1$ were not in cyclic order on $g_\infty$ then $g_0$ would have fewer than $m+1$ cycles, and the covering curve would have positive genus. Cf. the extensive literature on Grothendieck’s “dessins d’enfants”.) So we write $$ t(x) = x^{a_0} \prod_{i=1}^m (x+w_i)^{a_i} $$ for some distinct nonzero $w_i$. (Note that we do not insist on scaling these to put $w_1$ at $1$, to retain the symmetry among the roots of $t,$ which is parametrized by the point $(w_1 : w_2 : \ldots : w_m)$ in ${\bf P}^{m-1};$ permutations of the roots act on this space by projective linear transformations, and the subgroup that fixes the first root acts by coordinate permutations.) Then the numerator of $t'/t$ is a homogeneous polynomial of degree $m$ in $x$ and the $w_i$. It soon follows that the condition that this numerator be an $m$-th power amounts to $m-1$ homogeneous equations in the $w_i$, of degrees $2, 3, \cdots, m.$ Since we already know to expect $m! = 2 \cdot 3 \cdots m$ solutions, these solutions must constitute the complete intersection of the corresponding $m-1$ hypersurfaces in ${\bf P}^{m-1}$.

Monday, Sep. 20: More on Belyi polynomials etc.

POSTSCRIPT on the Belyi quintic arising from the identity $$ x^3 (x^2+15x+60) = (x+6)^3 (x^2-3x+6) - 6^4 $$ that we obtained for the example of $n=5$, $(a_0,a_1,a_2) = (3,1,1)$: here $g_0$ and $g_1$ are conjugate, and indeed $t=0$ and $t=1$ are equivalent: if we replace $x$ by $-6-x$ we get $$ -(x+6)^3 (x^2-3x+6) = -x^3(x^2+15x+60) - 6^4, $$ and then multiplying each side by $-1$ and adding $6^4$ recovers our original identity with sides reversed. This symmetry also means that we can start from the Belyi map $t(x) = -x^3 (x^2+15x+60) / 6^4$ and compose with the quadratic Belyi map $t_1 = 4t(1-t)$ to get a degree-$10$ polynomial in $x$ that is invariant under $x \leftrightarrow -6-x;$ letting $x_1 = x(6+x)$ we find $$ t_1 = -2^{-6} 3^{-8} x_1^3 (x_1^2 - 15 x_1 + 360) $$ which is itself a quintic Belyi polynomial, with cycle structures $5, 3+1+1, 2+2+1$ for $g_\infty, g_0, g_1$ respectively; indeed $t_1 - 1 = -2^{-6} 3^{-8} (x_1+9) (x_1^2 - 12 x_1 + 216)^2.$ [Note that the quadratic factor $x_1^2 - 12 x_1 + 216$ has discriminant $-720 = -12^2 \cdot 5,$ so again all the primes of singular reduction are factors of the order of the Galois group.] In general if two or all three of $g_0,g_1,g_\infty$ are conjugate then the Belyi function may inherit the symmetry, and this can give a further invariant of Belyi functions beyond the cycle structure and Galois group. END POSTSCRIPT
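The two polynomial identities behind this postscript can be checked with integer coefficient lists (lowest degree first). The sketch below verifies that $x^3 (x+6)^3 (x^2+15x+60)(x^2-3x+6)$, which is $-6^8\, t(1-t)$, equals $x_1^3 (x_1^2 - 15 x_1 + 360)$ with $x_1 = x(6+x)$, and that $x_1^3 (x_1^2 - 15 x_1 + 360) + 2^6 3^8 = (x_1+9)(x_1^2 - 12 x_1 + 216)^2$. The helper names are assumptions for this illustration.

```python
def trim(f):
    """Drop trailing zero coefficients (lists are lowest degree first)."""
    f = list(f)
    while len(f) > 1 and f[-1] == 0:
        f.pop()
    return f

def padd(f, g):
    m = max(len(f), len(g))
    f = list(f) + [0] * (m - len(f))
    g = list(g) + [0] * (m - len(g))
    return trim([u + v for u, v in zip(f, g)])

def pmul(f, g):
    out = [0] * (len(f) + len(g) - 1)
    for i, u in enumerate(f):
        for j, v in enumerate(g):
            out[i + j] += u * v
    return trim(out)

def pcompose(f, g):
    """f(g(x)) by Horner's rule."""
    result = [0]
    for c in reversed(f):
        result = padd(pmul(result, g), [c])
    return result

x1 = [0, 6, 1]                                    # x1 = x (6 + x)
quintic = pmul([0, 0, 0, 1], [360, -15, 1])       # x1^3 (x1^2 - 15 x1 + 360)
cube = pmul([6, 1], pmul([6, 1], [6, 1]))         # (x + 6)^3
# -6^8 t (1 - t), written out in the variable x
product = pmul(pmul([0, 0, 0, 1], [60, 15, 1]), pmul(cube, [6, -3, 1]))

composition_ok = pcompose(quintic, x1) == product
square = pmul([216, -12, 1], [216, -12, 1])       # (x1^2 - 12 x1 + 216)^2
square_ok = padd(quintic, [2 ** 6 * 3 ** 8]) == pmul([9, 1], square)
print(composition_ok, square_ok)
```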

Recall that we postponed till later the case that $g_0$ is the product of only three cycles, so $t = c x^{a_0} (x-1)^{a_1} (x-w)^{a_2}$ for some $w$ (where $a_0,a_1,a_2$ are the cycle lengths), but $g_1$ is a double transposition rather than a $3$-cycle. In this case the count of solutions of $g_0 g_1 g_\infty = {\rm id}$ grows with the degree $n$ even without increasing the number of cycles in $g_0$, so we expect that typically $w$, and thus the polynomial $P(x) := x^{a_0} (x-1)^{a_1} (x-w)^{a_2}$, will have to generate ever larger number fields, but can still ask how to compute them.

In this setting the critical points $x_1, x_2$ at the roots of $ a_0/x + a_1/(x-1) + a_2/(x-w) $ are distinct but satisfy $P(x_1) = P(x_2)$ [and we can normalize the common value $c$ to $1$ by multiplying $P$ by $1/c$ to put the third branch point at $t=1$]. Let $Q(x)$ be the quadratic polynomial with roots $x_1,x_2.$ We could solve $Q(x)=0$ to find $x_1,x_2$ as algebraic functions of $w$, and then work out what $P(x_1) = P(x_2)$ means as an equation for $w$; equivalently, and less laboriously, we could ask that the remainder of $P \bmod Q$ be a constant polynomial: the linear coefficient is some function of $w$, which we would set to zero. But this would still yield an unnecessarily complicated equation, because the values of $w$ that make $x_1 = x_2$ would arise as spurious solutions. Instead we exploit the fact that $P$ must be congruent to a constant not just $\bmod Q$ but even $\bmod Q^2,$ and this condition would fail for $x_1 = x_2$ (when $P$ is congruent to $c$ only $\bmod Q^{3/2}$). So we’ll use for our equation the vanishing of the $x^3$ coefficient of $P \bmod Q^2.$ Exercise: is the vanishing of this coefficient already enough to assure that $P \bmod Q^2$ is a constant polynomial, or might we have to then impose the further conditions that the $x^2$ and $x$ coefficients vanish as well?

Consider for example the case that $n=6$ and $(a_0,a_1,a_2) = (4,1,1).$ As we already saw, the coincidence $a_1 = a_2$ lets us simplify $P$ to the form $Cx^4 (x^2 + ax + b)$ for some nonzero $C$ and parameters $a,b$ determined up to scaling to $\lambda a, \lambda^2 b.$ Even so, we expect two inequivalent solutions (corresponding to the two isomers of cyclohexadiene). This is one of the simplest cases with a double transposition. Here it turns out that for each solution the Galois group is properly contained in $S_6$ (though it cannot be the alternating group $A_6$ or a subgroup of $A_6$, because $g_0$ and $g_\infty$ are odd permutations). When the two “double bonds” are adjacent, we get ${\rm PGL}_2({\bf F}_5)$ (a.k.a. the ($3$-)transitive copy of $S_5$ in $S_6$, a.k.a. the image of the point stabilizer in $S_6$ under an outer automorphism of $S_6$); when the two “double bonds” are opposite, we have the imprimitive $48$-element subgroup of $S_6$ (a.k.a. the image of the pair stabilizer $S_2 \times S_4$ in $S_6$ under an outer automorphism), isomorphic to the symmetries of the octahedron acting on its six vertices as the stabilizer of the partition into three opposite pairs.

[Interlude on the outer automorphism of $S_6$, the Segre cubic, etc.]

Indeed we find that $Q(x) = 6x^2 + 5ax + 4b$, and then that the $x^3$ coefficient of $P \bmod Q^2$ is $144 ab - 100 a^3$. Thus either $a=0$ or $36b = 25a^2.$ The solution $a=0$ makes $P$ a polynomial in $x^2$, which corresponds to the imprimitive solution. Thus the other case must be the ${\rm PGL}_2({\bf F}_5)$ cover. A convenient choice of scaling is $(a,b) = (6,25),$ giving the identity $27 x^4 (x^2+6x+25) = (3x^2-12x+20) \, (3x^2+15x+50)^2 - 50000.$
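The reduction of $P$ modulo $Q^2$ can be checked mechanically. The following is a minimal sketch in plain Python with exact rational arithmetic, taking $C=1$; the helper names `poly_mul`, `poly_rem` and `remainder_mod_Q2` are ours, not part of the lecture. With this normalization the raw $x^3$ coefficient of the remainder comes out as $(144ab - 100a^3)/432$, so it agrees with the expression above up to a constant factor.

```python
from fractions import Fraction

def poly_mul(p, q):
    """Multiply polynomials given as coefficient lists, lowest degree first."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += Fraction(pi) * qj
    return out

def poly_rem(p, d):
    """Remainder of p modulo d, by schoolbook long division over Q."""
    r = [Fraction(c) for c in p]
    while len(r) >= len(d):
        factor = r[-1] / d[-1]
        shift = len(r) - len(d)
        for i, di in enumerate(d):
            r[shift + i] -= factor * di
        r.pop()                      # the leading coefficient is now zero
    return r

def remainder_mod_Q2(a, b):
    """P mod Q^2 for P = x^4 (x^2 + a x + b) and Q = 6 x^2 + 5 a x + 4 b."""
    P = [0, 0, 0, 0, b, a, 1]
    Q = [4 * b, 5 * a, 6]
    return poly_rem(P, poly_mul(Q, Q))

# (6, 25) and (0, 1) should leave a constant remainder; (1, 1) should not.
for a, b in [(6, 25), (0, 1), (1, 1)]:
    rem = remainder_mod_Q2(a, b)
    print((a, b), "x^3 coefficient:", rem[3], "constant term:", rem[0])
```

For $(a,b) = (6,25)$ the remainder is the constant $-50000/27$, which matches the displayed identity after multiplying through by $27$.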

Exercises:
i) Since we just got a sextic cover with Galois group $S_5,$ there must also be a Belyi map of degree $5$ giving the same Galois closure. The cycle structures are 32 / 41 / 221 (corresponding to the $S_6$ cycle structures 6 / 411 / 2211). Find the Belyi map.
ii) What happens for the Belyi polynomials of degree 7 for which $g_1$ is a double transposition and $g_0$ has shape 331 or 421?

Wednesday, Sep. 22: A counting formula; Resultants

For the second exercise:

• 7 / 331 / 22111: here $G$ is necessarily the $168$-element subgroup of $S_7;$ the polynomial is defined over ${\bf Q}(\sqrt{-7}).$ The Galois closure is the Klein quartic, and $x$ is a rational coordinate on the quotient of this quartic by one of the two kinds of index $7$ subgroups of $G$ (both isomorphic with $S_4$ and switched by an outer automorphism of $G$). While the Klein quartic can be defined over ${\bf Q},$ its automorphism group cannot. Again I refer to my article on the Klein quartic.
• 7 / 421 / 22111: here $P = c x^4 (x+1)^2 (x+w)$ and there are four possibilities for $w,$ but they are not all conjugate: two are the roots of $2w^2 + w + 1,$ which again generate ${\bf Q}(\sqrt{-7});$ and the others are roots of $27 w^2 - 18 w - 25,$ and generate ${\bf Q}(\sqrt{21}).$ Indeed there are two possible Galois groups, $G_{168}$ and the full alternating group $A_7;$ which one corresponds to which pair of $w$’s?

In each case one can also describe the solutions to $g_0 g_1 g_\infty = {\rm id}$ starting from $g_0$ and $g_1$: there are two variations of “. _ Δ _ Δ” (depending on the orientation of the first 3-cycle), and $1+3$ variations of “. _ (2-cycle) _ (4-cycle)” or “. _ (4-cycle) _ (2-cycle)”.

The theory of representations of finite groups gives us a systematic way to count (though not to exhibit) solutions in a finite group $G$ of $g_0 g_1 g_\infty = {\rm id},$ or more generally $g_1 g_2 \cdots g_k = {\rm id},$ with each $g_i$ in a specified conjugacy class.

Theorem. (See e.g. Thm. 7.2.1 in Serre’s Topics in Galois Theory.) Let $C_1,\ldots, C_k$ be conjugacy classes in a finite group $G$. The number of solutions of $g_1 g_2 \cdots g_k = {\rm id}$ with each $g_i \in C_i$ is $$ \frac1{|G|} \prod_{i=1}^k |C_i| \sum_\chi \frac{\chi(C_1)\,\chi(C_2)\cdots\chi(C_k)}{(\chi(1))^{k-2}} $$ where $\chi$ ranges over the characters of irreducible representations $V_\chi$ of $G$.

Remark: The cases $k=1$, $k=2$ of this formula are familiar consequences of the orthogonality relations in the theory of representations of finite groups. In general, the trivial character contributes $\frac1{|G|} \prod_{i=1}^k |C_i|$ to the sum; this would be the correct answer if every group element appeared equally often as $g_1 g_2 \cdots g_k$ with each $g_i \in C_i,$ so the summands for nontrivial $\chi$ can be regarded as corrections to this main term. (If $G$ has further $1$-dimensional representations then they contribute further “main terms” that, together with the term $\frac1{|G|} \prod_{i=1}^k |C_i|,$ detect whether the image of $C_1 C_2 \cdots C_k$ in $G^{ab}$ is trivial.)

Proof: Let $A$ be the group algebra ${\bf C}[G]$. For each $i=1,2,\ldots,k$ let $c_i \in A$ be the formal sum $\sum_{g\in C_i} g.$ We want to evaluate the coefficient of the identity in $c_1 c_2 \cdots c_k.$ This coefficient is $|G|^{-1}$ times the trace of $c_1 c_2 \cdots c_k$ acting on the regular representation $A$. We use the decomposition $A = \oplus_\chi V_\chi^{\chi(1)}$ of the regular representation $A$ into isotypic components (corresponding to its decomposition $A = \oplus_\chi {\rm End}(V_\chi)$ as a $\bf C$-algebra). Because each $c_i$ is in the center of $A$, the image of $c_i$ in ${\rm End}(V_\chi)$ is a multiple of the identity; comparing traces we see that this multiple is $|C_i| \, \chi(C_i) / \chi(1).$ Since $V_\chi^{\chi(1)}$ has dimension $\chi(1)^2,$ the trace of $c_1 c_2 \cdots c_k$ on it is $\bigl( \prod_{i=1}^k |C_i| \, \chi(C_i) \bigr) / \chi(1)^{k-2}.$ Summing this over $\chi$ we obtain the trace of $c_1 c_2 \cdots c_k$ acting on $A$. Multiplying by $|G|^{-1}$ we obtain the claimed formula. QED
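To see the theorem in action on the smallest nonabelian case, here is a minimal sketch that compares the character formula with a brute-force count in $G = S_3$. The character table is hard-coded (classes are labelled by their number of fixed points), and the function names `brute_force` and `character_formula` are ours; this is an illustration under those assumptions, not a general implementation.

```python
from fractions import Fraction
from itertools import permutations, product

G = list(permutations(range(3)))            # the six elements of S_3
IDENTITY = (0, 1, 2)

def compose(p, q):
    """Composition 'p after q', i.e. i -> p[q[i]]."""
    return tuple(p[i] for i in q)

def fixed_points(p):
    return sum(p[i] == i for i in range(3))

# Conjugacy classes of S_3 are determined by the number of fixed points:
# 3 = identity, 1 = transpositions, 0 = 3-cycles.
# Values of (trivial, sign, 2-dimensional) characters on each class:
CHARACTERS = {3: (1, 1, 2), 1: (1, -1, 0), 0: (1, 1, -1)}
DEGREES = CHARACTERS[3]

def brute_force(classes):
    """Count tuples (g_1..g_k), g_i in the i-th class, with product = identity."""
    members = [[g for g in G if fixed_points(g) == c] for c in classes]
    count = 0
    for gs in product(*members):
        prod = IDENTITY
        for g in gs:
            prod = compose(prod, g)
        count += (prod == IDENTITY)
    return count

def character_formula(classes):
    """The same count via (1/|G|) prod |C_i| sum_chi prod chi(C_i) / chi(1)^(k-2)."""
    k = len(classes)
    total = Fraction(0)
    for idx, deg in enumerate(DEGREES):
        term = Fraction(1)
        for c in classes:
            term *= CHARACTERS[c][idx]
        total += term / Fraction(deg) ** (k - 2)
    for c in classes:
        total *= sum(fixed_points(g) == c for g in G)   # |C_i|
    return total / len(G)

# Two transpositions and a 3-cycle:
print(brute_force((1, 1, 0)), character_formula((1, 1, 0)))
```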

Note that we do not obtain a formula for the number of such $k$-tuples $(g_1,g_2,\ldots,g_k)$ that generate $G.$ Still we may be able to deduce this number by applying the same formula to proper subgroups $H \subset G$ to account for solutions that generate a subgroup conjugate to $H.$

In practice the sum over $\chi$ often simplifies further because most of the terms vanish. A particularly nice case, which applies to some of our calculations thus far, appears when $G = S_n$ and one of the $C_i$ is the conjugacy class $(n)$ of $n$-cycles: $n$ of the character values of this conjugacy class are $\pm 1$, and all the others are zero! Indeed let $V_1$ be the trace-zero hyperplane of the $n$-dimensional permutation representation, and for $j=0,1,\ldots,n-1$ let $V_j = \wedge^j V_1.$ It is well known that each $V_j$ is an irreducible representation of $S_n$, of dimension ${n-1 \choose j}$. Let $\chi_j$ be the corresponding character. [For example: $\chi_0$ is the trivial character; $\chi_1$ is the character taking each $g \in S_n$ to its number of fixed points minus $1;$ $\chi_{n-1}$ is the sign character $\epsilon,$ and in general $V_{n-1-j} \cong \epsilon \otimes V_j$ so $\chi_{n-1-j} = \epsilon \chi_j.]$ We claim that the character of $(n)$ acting on $V_j$ is $(-1)^j$, and that these are the only nonzero character values of an $n$-cycle. To see the first claim, note that the eigenvalues of an $n$-cycle acting on $V_1$ are the $n-1$ roots of unity $\omega$ with $\omega^n = 1$ and $\omega \neq 1;$ hence the generating function $\sum_{j=0}^{n-1} (-1)^j \chi_j((n)) X^j$ is the product of $1-\omega X$ over all such $\omega,$ which is $(1-X^n)/(1-X) = \sum_{j=0}^{n-1} X^j.$ The first claim follows by comparing coefficients. To prove the second claim we use the identity $\sum_\chi |\chi(C)|^2 = |G|/|C|$ for every conjugacy class $C$ (part of the orthogonality relations). The number of $n$-cycles in $S_n$ is $(n-1)!$, so the sum of $|\chi((n))|^2$ over all characters $\chi$ is $n$ — and we have already accounted for this sum with $\sum_j |\chi_j((n))|^2,$ so all other character values must vanish, and we are done.

We can then use the same generating-function technique to compute the character values of the other $g_i$ on these $V_j$. For example, in the last exercise we considered $n=7$ with cycle structures 7, 331, 22111 and 7, 421, 22111. The eigenvalues of a double transposition acting on $V_1$ are $1, -1$ with multiplicities $4, 2$ so the generating polynomial is $(1-X)^4 (1+X)^2 = 1 - 2X - X^2 + 4X^3 - X^4 - 2X^5 + X^6.$ For 331 the generating polynomial is $(1-X^3)^2 = 1 - 2X^3 + X^6,$ so we obtain a count of $$ \frac{6! \; 105 \cdot 280}{7!} \Bigl(1 + \frac{2 \cdot 4}{6 \choose 3} + 1\Bigr) = 4200 \cdot \frac{12}{5} = 10080 = 2 \cdot 7!, $$ and indeed there are two solutions up to conjugation in $S_7$ (though each generates a 168-element subgroup). Likewise for 421 we compute $(1-X^2)(1-X^4) = 1 - X^2 - X^4 + X^6,$ making the count $$ \frac{6! \; 105 \cdot 630}{7!} \Bigl(1 + \frac{1 \cdot 1}{6 \choose 2} + \frac{1 \cdot 1}{6 \choose 2} + 1\Bigr) = 9450 \cdot \frac{32}{15} = 20160 = 4 \cdot 7!, $$ which agrees with the $4 = 2+2$ solutions up to $S_7$-conjugation that we count by working directly with permutations.
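The two counts above follow a fixed recipe, which can be sketched in a few lines of Python. The function names `hook_values`, `class_size` and `count_solutions` are ours; the sketch assumes that one of the classes is the class of $n$-cycles, so that only the characters $\chi_j$ of the $\wedge^j V_1$ contribute.

```python
from collections import Counter
from fractions import Fraction
from math import comb, factorial

def hook_values(cycle_type):
    """[chi_0(g), ..., chi_{n-1}(g)] for g of the given cycle type, using
    sum_j (-1)^j chi_j(g) X^j = prod_i (1 - X^{l_i}) / (1 - X)."""
    n = sum(cycle_type)
    poly = [1] + [0] * n                       # coefficients of X^0..X^n
    for l in cycle_type:                       # multiply by (1 - X^l)
        poly = [poly[k] - (poly[k - l] if k >= l else 0) for k in range(n + 1)]
    quot, running = [], 0
    for k in range(n):                         # divide by (1 - X): partial sums
        running += poly[k]
        quot.append(running)
    return [(-1) ** j * quot[j] for j in range(n)]

def class_size(cycle_type):
    """Number of permutations in S_n with the given cycle type."""
    denom = 1
    for length, mult in Counter(cycle_type).items():
        denom *= length ** mult * factorial(mult)
    return factorial(sum(cycle_type)) // denom

def count_solutions(*cycle_types):
    """Number of (g_1..g_k) in S_n with g_1...g_k = id and g_i of the i-th
    cycle type.  Only valid when one of the types is the n-cycle (n,)."""
    n = sum(cycle_types[0])
    if (n,) not in cycle_types:
        raise ValueError("one class must be the class of n-cycles")
    k = len(cycle_types)
    values = [hook_values(ct) for ct in cycle_types]
    total = Fraction(0)
    for j in range(n):
        term = Fraction(1)
        for v in values:
            term *= v[j]
        total += term / Fraction(comb(n - 1, j)) ** (k - 2)
    for ct in cycle_types:
        total *= class_size(ct)
    return total / factorial(n)

print(count_solutions((7,), (3, 3, 1), (2, 2, 1, 1, 1)))   # 10080 = 2 * 7!
print(count_solutions((7,), (4, 2, 1), (2, 2, 1, 1, 1)))   # 20160 = 4 * 7!
```

Dividing by $n!$ gives the number of solutions up to conjugation only when no solution has a nontrivial centralizer in $S_n$, which holds here because the subgroup generated is transitive with trivial center.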

Exercise: Check some of our other enumerations this way — at least the one for an $n$-cycle, an $(n-1)$-cycle, and a simple transposition in $S_n$.

Motivation, definition, and properties of resultants of univariate polynomials, which we’ll use to eliminate one of two variables when we’ve brought one of our calculations down to solving two simultaneous nonlinear equations. The Sylvester matrix of polynomials $P,Q \in k[X]$ has corank equal to the degree of $\gcd(P,Q)$, as can be seen by identifying the row kernel with $\{ (A,B) : \deg(A) \lt \deg(Q), \deg(B) \lt \deg(P), AP+BQ=0 \}.$ If $\xi$ is a common zero of $P$ and $Q$ then the column vector $(\xi^{n-1}, \xi^{n-2}, \ldots, \xi^2, \xi, 1)$ (where $n = \deg(P) + \deg(Q)$ is the matrix size) is in the kernel. This fully accounts for the kernel if $\gcd(P,Q)$ has distinct roots. What happens if there are some roots of multiplicity 2 or greater?
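For concreteness, here is a minimal sketch of the Sylvester matrix together with an exact rank and determinant computation, in plain Python. Polynomials are coefficient lists with the highest degree first; the names `sylvester` and `rank_and_det` are ours, and the row convention (shifted copies of $P$, then shifted copies of $Q$) is one common choice among several equivalent ones.

```python
from fractions import Fraction

def sylvester(P, Q):
    """Sylvester matrix of P and Q (coefficient lists, highest degree first):
    deg(Q) shifted rows of P followed by deg(P) shifted rows of Q."""
    m, n = len(P) - 1, len(Q) - 1
    rows = [[0] * i + list(P) + [0] * (n - 1 - i) for i in range(n)]
    rows += [[0] * i + list(Q) + [0] * (m - 1 - i) for i in range(m)]
    return rows

def rank_and_det(M):
    """Rank and determinant of a square matrix by exact Gaussian elimination."""
    A = [[Fraction(x) for x in row] for row in M]
    size, rank, det = len(A), 0, Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(rank, size) if A[r][col] != 0), None)
        if pivot is None:
            det = Fraction(0)          # no pivot in this column: singular
            continue
        if pivot != rank:
            A[rank], A[pivot] = A[pivot], A[rank]
            det = -det
        det *= A[rank][col]
        for r in range(rank + 1, size):
            f = A[r][col] / A[rank][col]
            for c in range(col, size):
                A[r][c] -= f * A[rank][c]
        rank += 1
    return rank, det

# P = x^2 + 1 and Q = x - 1 have no common root; the resultant is P(1) = 2.
print(rank_and_det(sylvester([1, 0, 1], [1, -1])))
# P = (x-1)(x-2) and Q = (x-1)(x+1) share the root 1: corank 1, resultant 0.
print(rank_and_det(sylvester([1, -3, 2], [1, 0, -1])))
```

In the second example every row of the matrix sums to zero, which is the statement that the column vector $(\xi^3,\xi^2,\xi,1)$ with $\xi=1$ lies in the kernel.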

Monday, Sep. 27: Using multivariate (and usually $p$-adic) Newton’s method

POSTSCRIPT
Last time we proved a formula (Serre’s Theorem 7.2.1) for the number of solutions of $g_1 g_2 \cdots g_k = {\rm id}$ with each $g_i$ in some conjugacy class $C_i$ of a finite group $G.$ To generalize from covers of a $k$-punctured Riemann sphere to covers of a $k$-punctured Riemann surface of genus $\gamma,$ we need to count solutions of $g_1 g_2 \cdots g_k = [a_1, b_1] \, [a_2, b_2] \cdots [a_\gamma, b_\gamma],$ where $[\cdot,\cdot]$ is the commutator $[g,h] = g^{-1} h^{-1} g h,$ and the $a_j$ and $b_j$ $(1 \leq j \leq \gamma)$ are arbitrary group elements.
Exercise: Prove that this count is $$ |G|^{2\gamma-1} \prod_{i=1}^k |C_i| \sum_\chi \frac{\chi(C_1)\,\chi(C_2)\cdots\chi(C_k)}{(\,\chi(1))^{k+2\gamma-2}} $$ (which does indeed recover our formula for the Riemann sphere as the special case $\gamma=0$). Check that for $k=0$ and $\gamma=1$ this is equivalent to the known result that the number of $(g,h) \in G \times G$ such that $gh=hg$ is $|G|$ times the number of conjugacy classes. [Start by proving that $$ \mathop{\sum\!\sum}_{g,h \in G} \rho(g^{-1} h^{-1} g h) = \frac{|G|^2}{(\dim(V))^2} \, {\rm id}_V $$ for each irreducible representation $(V,\rho)$ of $G$.]
END POSTSCRIPT
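The known result quoted in the exercise (commuting pairs number $|G|$ times the number of conjugacy classes) is easy to confirm by brute force in small symmetric groups. The sketch below does only that; it does not prove the general formula, and the helper names are ours.

```python
from itertools import permutations

def compose(p, q):
    """Composition 'p after q', i.e. i -> p[q[i]]."""
    return tuple(p[i] for i in q)

def cycle_type(p):
    """Sorted tuple of cycle lengths; in S_n this labels the conjugacy class."""
    seen, lengths = set(), []
    for start in range(len(p)):
        if start not in seen:
            length, i = 0, start
            while i not in seen:
                seen.add(i)
                i = p[i]
                length += 1
            lengths.append(length)
    return tuple(sorted(lengths))

def commuting_pairs(n):
    """Return (#commuting pairs, |S_n|, #conjugacy classes of S_n)."""
    G = list(permutations(range(n)))
    pairs = sum(compose(g, h) == compose(h, g) for g in G for h in G)
    classes = len({cycle_type(g) for g in G})
    return pairs, len(G), classes

for n in (3, 4):
    pairs, order, classes = commuting_pairs(n)
    print(n, pairs, order * classes)
```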

Example: Consider Belyi polynomials of degree $11$ for which $g_0$ and $g_1$ have cycle structures $3^3 1^2$ and $2^4 1^3$ respectively (a.k.a. 33311 and 2222111). There are $10$ up to equivalence (check that this again agrees with our formula/recipe), of which $8$ have Galois group $A_{11}$ and $2$ have Galois group $M_{11}$ (the smallest of Mathieu’s sporadic simple groups). Start by writing $P(x) = C (x^3 + a x + b)^3 (x^2 + c x + d),$ for some nonzero constant $C$; we have translated $x$ to remove the $x^2$ term from the cubic factor. This leaves four parameters $a,b,c,d$ up to scaling $x,a,b,c,d$ by factors $\lambda, \lambda^2, \lambda^3, \lambda, \lambda^2.$ The double zeros of $P-1$ are then the roots of the quartic $Q(x)$ that appears in the numerator of the logarithmic derivative $$ \frac{P'}{P} = 3 \frac{3x^2+a}{x^3+ax+b} + \frac{2x+c}{x^2+cx+d}. $$ As often happens, the first of the resulting equations in $a,b,c,d$ is linear in the highest-weight parameter, which is $d$ in our case. Thus we can solve for $d$, leaving two more complicated weighted-homogeneous equations in $a,b,c.$ We eliminate $b$ by taking a resultant, leaving a $10$th degree equation in $(a:c^2)$ which splits into factors of degrees $2$ and $8.$ Written in terms of the ratio $r = c/a^2,$ the quadratic factor is $1331 r^2 + 363 r + 207,$ with roots in ${\bf Q}(\sqrt{-11})$, while the octic is

907005800872369*r^8 - 85765544120678433*r^7 + 1366333982783796708*r^6
- 8121037774143566646*r^5 + 3063314005545416139*r^4 + 11342026112037644529*r^3
+ 8616230500846047750*r^2 + 1731334328611593750*r + 106036249911328125,
with a root in the field of discriminant $-5^2 7^3 11^6$ generated by a root of $X^8 + 2 X^6 - 3 X^5 + 10 X^4 - 14 X^3 + 14 X^2 - 8 X + 1.$
[This field is not (yet?) in the LMFDB. Fortunately the polynomial in $r,$ complicated though it is, has discriminant $-3^{102} 5^{16} 7^{21} 11^{92} N^2$ where $N = 1.4059\ldots \cdot 10^{41}$ is easy enough for gp to factor (two primes, one of which is $166775929$) that the functions nfdisc and polredabs take a small fraction of a second to compute the field discriminant and a simple generating polynomial.]

Wednesday, Sep. 29: A cube minus a square

[…]

Monday, Oct. 4: A cube minus a square, cont’d

[…]

Wednesday, Oct. 6: Interlude on tables for computing mod $p$; positive- [usually 1-]dimensional families

[…]

An instructive example: polynomials $x(t), y(t)$ of degrees $8, 12$ such that $x^3 - y^2$ is a nonzero polynomial of degree at most $6$ (and at least $5$ by the usual ABC/Wronskian/Riemann-Hurwitz argument). These correspond to rational functions $f = x^3/y^2$ of degree $24$ that give rational maps $f : {\bf P}^1 \to {\bf P}^1$ branched over $4$ points, with monodromy generators of type $3^8$ above $0,$ type $2^{12}$ above $\infty,$ and type $1^6 18$ above $1,$ plus a simple transposition above the image of the extra zero of $f'.$ On the elliptic-surface side, we get elliptic K3 surfaces $Y^2 = X^3 - 3 x X + 2 y$ with a singular fiber of type at least $I_{18}$ at $t=\infty,$ so the moduli space has dimension $20-\rho = 20 - (2+17) = 1$ which (as usual for K3’s) is consistent with the parameter count.

[analysis continues on a separate page]

Wednesday, Oct. 13: Curves of genus 0 through 5; equations for some modular curves

So far we’ve made sure that all our Belyi covers are rational curves; but that’s not always the case, nor the only interesting case. Before giving some examples of Belyi maps on non-rational curves, we need to give (or recall) enough of a description of such curves to understand the form of explicit equations defining the curves and rational functions on them. We’ll stop at genus 5, which is the last case that a generic curve is a complete intersection in projective space, namely the zero-locus of a three-dimensional space of quadrics (homogeneous polynomials of degree $2$) in ${\bf P}^4$. For genus $4,$ it’s the intersection of a quadric and a cubic in ${\bf P}^3$; for genus $3$, a quartic curve in ${\bf P}^2$. In each of these cases the only non-generic exceptions are hyperelliptic curves; in genus $2$ all curves are hyperelliptic. It’s actually the more familiar cases of genus $0$ and $1$ that can get tricky if we want to work over fields like $\bf Q$ or (for families of curves) ${\bf C}(t)$ that are not algebraically closed. [This and most of what follows is standard (neo)classical algebraic geometry; standard references include several books co-authored by Joe Harris, and at Harvard it may be even more convenient to ask a student of Joe Harris. ]

genus 0: over $\bf C$ it’s just the projective line (a.k.a. Riemann sphere), but over a field that’s not algebraically closed (nor finite) even genus-zero curves needn’t be trivial. Such a curve is always a smooth conic in ${\bf P}^2$ (embedded by the space $\Gamma(-K)$ of anticanonical sections, which has dimension $3$ by Riemann-Roch); thus the curve is rational iff it has a rational point (“if” by the usual slope parametrization, and the converse is trivial). More generally a genus-zero curve is rational iff it has a divisor $D$ of odd degree: “if” because some $D+cK$ has degree $1$, and is effective by Riemann-Roch; and again “only if” is trivial. All our genus-zero covers so far came with such a $D$, indeed a distinguished point, which could be described as the unique multiplicity-$m$ preimage of $t$ for some $m \geq 1$ and $t \in \{0,1,\infty\}$ (being careful that we’re not in a case where we’ve had to make two or all three branch points algebraic conjugates). But that need not be the case in general. For example, there’s a unique action defined over $\bf Q$ of the symmetric group $S_4$ on a curve of genus zero; but the curve cannot be rational over $\bf Q$, or even over $\bf R$, because then (by the usual averaging argument) $S_4$ would be contained in $O_2({\bf R}) / \{\pm1\},$ and thus would contain a cyclic group with index at most $2$. [This could also be checked by calculating the Schur indicator of the $2$-dimensional irreducible representation of the relevant central extension $2 S_4,$ or indeed of its subgroup isomorphic with the $8$-element quaternion group.] So the genus-zero curve must be “pointless”; explicitly it is the conic $x^2 + y^2 + z^2 = 0$, with $S_4$ acting by signed coordinate permutations. [… quaternion algebras like Hurwitz $\{2,\infty\}$, ${\rm Br}_2$, …]

genus 1: again, over an algebraically closed field such as $\bf C$ we have a familiar picture, this time an elliptic curve $C$, since there must be a rational point $P$. More generally, any divisor $D$ of positive degree is still effective by Riemann-Roch, and if $\deg(D) = 1$ then $D \sim P_0$ for some rational point $P_0$. We can then use the sections of $3P_0$ to embed $C$ in ${\bf P}^2$ as a cubic in Weierstrass form $y^2 + a_1 x y + a_3 y = x^3 + a_2 x^2 + a_4 x + a_6,$ calculating the coefficients $a_i$ by comparing Laurent expansions about $P_0$ as usual. Sometimes — especially when $C$ arises as a modular curve such as ${\rm X}_0(11)$ — it is more convenient to start from a degree-$2$ function $x$ on $C$ and a holomorphic differential $\omega$, and then set $z = dx / \omega,$ which is anti-invariant under the involution $\iota$ of $C$ satisfying $x \circ \iota = x$ and is regular away from the poles of $x$, and thus satisfies an equation $z^2 = P(x)$ for some polynomial $P$ of degree $3$ or $4$ according as $x$ has one double pole or two simple poles. On modular curves, the $q$-expansions of modular forms often give a convenient handle on rational functions and holomorphic differentials. Here we may tell Sage:

ModularForms(11,prec=14).echelon_basis()

to get the $q$-expansions of a basis of the weight-$2$ modular forms for $\Gamma_0(11)$ (that is, on ${\rm X}_0(11)$) to within $O(q^{14}),$ and get the result

[
1 + 12*q^2 + 12*q^3 + 12*q^4 + 12*q^5 + 24*q^6 + 24*q^7 + 36*q^8 + 36*q^9 + 48*q^10 + 72*q^12 + 24*q^13 + O(q^14),
q - 2*q^2 - q^3 + 2*q^4 + q^5 + 2*q^6 - 2*q^7 - 2*q^9 - 2*q^10 + q^11 - 2*q^12 + 4*q^13 + O(q^14)
]

in which the second generator, call it $\phi_1,$ is a cusp form and thus yields a holomorphic differential $\omega = \phi_1 \, dq/q.$ The ratio $\phi_0 / \phi_1$ (where $\phi_0 = 1 + 12q^2 + \cdots$ is the first generator) then gives us a rational function $x = q^{-1} + 2 + 17 q + 46 q^2 + 116 q^3 + 252 q^4 + 533 q^5 + 1034 q^6 + 1961 q^7 + 3540 q^8 + 6253 q^9 + 10654 q^{10} + 17897 q^{11} + O(q^{12}),$ and we compute $z = q \, (dx/dq) / \phi_1 = -q^{-2} - 2 q^{-1} + 12 + 116 q + 597 q^2 + 2298 q^3 + \cdots.$ Comparing coefficients (or doing something like

Z=subst(z^2,q,serreverse(1/x)); subst(truncate(Z),q,1/X)

in gp) then gives us the equation $z^2 = x^4 - 4x^3 - 88x^2 - 300x - 304 = (x+4) (x^3-8x^2-56x-76)$ for $X_0(11).$ [See the next paragraph for more about this factorization of the quartic.] We can then project the rational zero $x = -4$ to infinity, and normalize the leading coefficient of the resulting cubic (i.e., replace $x$ by $(-11/x) - 4$ and absorb the factor $(22/x^2)^2$ into $z^2$) to get a Weierstrass model $y^2 = x^3 + 14x^2 + 55x + 121/4$; the standard form $y^2 + y = x^3 - x^2 - 10x - 20$ is then recovered by translating $(x,y)$ to $(x-5, y+\frac12).$ (We’ll hopefully come back to questions such as where the $q$-expansions of $\phi_0$ and $\phi_1$ come from, and how the curve ${\rm X}_0(11)$ actually parametrizes $11$-isogenies. For the first question: briefly, $\phi_1$ is an eta product $$ \phi_1 = (\eta_1 \eta_{11})^2 = q \prod_{n=1}^\infty \left( (1-q^n) (1-q^{11n}) \right)^2, $$ and $5 \phi_0 + 12 \phi_1$ is a multiple of an Eisenstein series.)
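The whole computation can be replayed with nothing but truncated integer power series. The sketch below works with $X = qx$ and $Z = q^2 z$ so that everything is a power series in $q$, checks that $\phi_1/q$ agrees with the eta product, and tests the quartic relation (multiplied through by $q^4$) modulo $q^{13}$. The two coefficient lists are copied from the Sage output above; all function names are ours.

```python
N = 13   # all series are truncated modulo q^N

# Coefficients of q^0..q^13, copied from the Sage output above.
PHI0 = [1, 0, 12, 12, 12, 12, 24, 24, 36, 36, 48, 0, 72, 24]
PHI1 = [0, 1, -2, -1, 2, 1, 2, -2, 0, -2, -2, 1, -2, 4]

def mul(f, g):
    """Product of two power series modulo q^N."""
    out = [0] * N
    for i, fi in enumerate(f[:N]):
        for j, gj in enumerate(g[:N - i]):
            out[i + j] += fi * gj
    return out

def inverse(f):
    """1/f modulo q^N, for f with constant term 1."""
    inv = [1] + [0] * (N - 1)
    for n in range(1, N):
        inv[n] = -sum(f[k] * inv[n - k] for k in range(1, n + 1))
    return inv

def shift(f, s):
    """q^s * f modulo q^N."""
    return ([0] * s + f)[:N]

def eta_product():
    """prod_{n>=1} ((1 - q^n)(1 - q^(11n)))^2 modulo q^N."""
    u = [1] + [0] * (N - 1)
    for n in range(1, N):
        for m in (n, n, 11 * n, 11 * n):
            if m < N:
                u = [u[k] - (u[k - m] if k >= m else 0) for k in range(N)]
    return u

u = PHI1[1:]                                     # phi_1 / q
v = inverse(u)
X = mul(PHI0, v)                                 # q * x, where x = phi_0 / phi_1
Z = mul([(k - 1) * X[k] for k in range(N)], v)   # q^2 * z, z = q (dx/dq) / phi_1

X2 = mul(X, X); X3 = mul(X2, X); X4 = mul(X3, X)
ONE = [1] + [0] * (N - 1)
terms = [(1, X4, 0), (-4, X3, 1), (-88, X2, 2), (-300, X, 3), (-304, ONE, 4)]
rhs = [sum(c * shift(p, s)[k] for c, p, s in terms) for k in range(N)]
lhs = mul(Z, Z)

print("eta product matches:", eta_product() == u)
print("quartic relation holds mod q^13:", lhs == rhs)
```

Only the coefficients of $q^{-4}$ through $q^0$ are needed to determine the quartic; the remaining eight coefficients serve as a check on the expansions.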

The quartic factors because the Weierstrass points correspond to self-isogenies of degree $11$ between elliptic curves with CM (complex multiplication) by the imaginary quadratic orders of discriminants $-11$ and $-44$, with one of the former and three of the latter. The rational Weierstrass point is represented by $\tau = \frac12 (1 + \frac{i}{\sqrt{11}});$ indeed taking $q = e^{2\pi i \tau} = -e^{-\pi/\sqrt{11}}$ in the expansions through $O(q^{14})$ already gives a ratio of $-3.9998\ldots,$ and we readily get more digits by increasing prec. Likewise $q = e^{-2\pi/\sqrt{11}},$ corresponding to $\tau = i / \sqrt{11},$ gives $12.82750081382\ldots,$ already close to the real root $12.8275008141\ldots$ of $x^3-8x^2-56x-76.$
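These two evaluations can be repeated to full floating-point accuracy without Sage. The sketch below computes $\phi_1$ from its eta product and recovers $\phi_0$ from the Eisenstein relation, taken here in the normalization $5\phi_0 + 12\phi_1 = 5 + 12\sum_{n \geq 1} \bigl(\sigma(n) - 11\,\sigma(n/11)\bigr) q^n$ (with $\sigma(n/11) = 0$ unless $11 \mid n$); that normalization is our assumption, checked against the coefficients listed above through $q^{13}$. The names `sigma` and `x_value` are ours.

```python
from math import exp, pi, sqrt

def sigma(n):
    """Sum of the positive divisors of n."""
    return sum(d for d in range(1, n + 1) if n % d == 0)

def x_value(q, terms=60):
    """phi_0 / phi_1 at a real q with |q| < 1, using `terms` factors/terms."""
    phi1 = q
    for n in range(1, terms):
        phi1 *= ((1 - q ** n) * (1 - q ** (11 * n))) ** 2
    eis = 5.0
    for n in range(1, terms):
        coeff = sigma(n) - (11 * sigma(n // 11) if n % 11 == 0 else 0)
        eis += 12 * coeff * q ** n
    phi0 = (eis - 12 * phi1) / 5
    return phi0 / phi1

x_rational = x_value(-exp(-pi / sqrt(11)))      # tau = (1 + i/sqrt(11)) / 2
x_cubic = x_value(exp(-2 * pi / sqrt(11)))      # tau = i / sqrt(11)
print(x_rational)
print(x_cubic, x_cubic ** 3 - 8 * x_cubic ** 2 - 56 * x_cubic - 76)
```

The first value should approach the rational root $-4$ of the quartic, and the second should be a root of the cubic factor, up to rounding error.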

But what if there is no divisor of degree 1? Any genus-$1$ curve $C$ comes with a divisor $D$ of some positive degree $d$, and $\Gamma(D)$ has dimension $d$, giving a map to ${\bf P}^{d-1}$ that is an embedding as an “elliptic normal curve” for $d \geq 3$ (for $d=2$ it’s a 2:1 map, giving a model of $C$ of the form $y^2 = {\rm quartic}(x)$). As with curves of increasing genus, elliptic normal curves of increasing degree $d$ get increasingly complicated, and are complete intersections only for the smallest few cases; here these are $d=3$ (a plane cubic) and $d=4$ (the intersection of two quadrics). Fortunately the $d \leq 4$ cases are most if not all the ones we’ll encounter. Still even those cases are much more complicated than the plane conics that are as tricky as genus-zero curves get. Two famous examples already for $d=3$ are $x^3 + p y^3 + p^2 z^3 = 0$ (no $p$-adic solution) and $3x^3 + 4y^3 + 5z^3 = 0$ (Selmer’s example of a cubic with no rational points even though there is no local obstruction). [To be continued…]

genus 2: Once $g \gt 1$, the curve $C$ is of general type (the canonical divisor is positive, of degree $2g-2).$ The space of holomorphic differentials gives a map, the canonical map, to ${\bf P}^{g-1}$, which is either an embedding or a 2:1 map. In the latter case $C$ is hyperelliptic, which is the case for all curves of genus $2$ but only for special curves once $g \geq 3.$ At least if $g$ is even (or if the ground field is algebraically closed, or finite, or more generally is a field with trivial ${\rm Br}_2),$ a hyperelliptic curve of genus $g$ has the form $y^2 = P(x)$ where $P$ is a polynomial of degree $2g+2$ without repeated roots. For $g=2$, the degree-$2$ function $x$ on $C$ is simply the ratio, say $\omega_1/\omega_2,$ of generators of the $2$-dimensional space of holomorphic forms. The hyperelliptic involution $\iota$ then takes any $(x,y)$ to $(x,-y),$ and takes each $\omega_i$ to $-\omega_i$ (as can be seen either from the explicit formulas $(\omega_1, \omega_2) = (x \, dx/y, dx / y)$ or by observing that an $\iota$-invariant holomorphic differential descends to a holomorphic differential on ${\bf P}^1$, and is thus zero). [Either argument generalizes to show that, on a hyperelliptic curve $y^2 = P(x)$ of any genus, the hyperelliptic involution taking $(x,y)$ to $(x,-y)$ induces multiplication by $-1$ on every holomorphic differential.] Thus we can construct $y$ as $dx / \omega_2,$ and then find the hyperelliptic defining equation for $C$ that equates $y^2$ with a sextic polynomial in $x$ (or a quintic if there’s a rational Weierstrass point and $\omega_2$ was chosen to vanish at that point).

Again we give an example of a modular curve, this time ${\rm X}_1(13),$ which is the first ${\rm X}_1(N)$ of genus $\gt 1.$ [For $N \leq 12$ the curve is rational, except for $N=11$ when it is a curve of genus 1 that you should by now know how to compute; we shall see later this term how to describe the elliptic curves with $N$-torsion points that are parametrized by these curves ${\rm X}_1(N)$.] This time we need the two-dimensional space of cuspforms for $\Gamma_1(13),$ which in Sage we can get from

CuspForms(Gamma1(13),prec=14).echelon_basis()

to get

[
q - 4*q^3 - q^4 + 3*q^5 + 6*q^6 - 3*q^8 + q^9 - 6*q^10 - 2*q^12 + 2*q^13 + O(q^14),
q^2 - 2*q^3 - q^4 + 2*q^5 + 2*q^6 - 2*q^8 + q^9 - 3*q^10 + 3*q^13 + O(q^14)
]

We call these $\phi_1$ and $\phi_2$, and set $\omega_i = \phi_i \, dq/q$ $(i=1,2)$ so that $x = \omega_1 / \omega_2$ has a pole at the cusp $q=0.$ It’s more convenient to subtract $1,$ taking $x = q^{-1} + 1 + q + q^2 + q^4 - q^6 - q^8 + O(q^{11})$ rather than $x = q^{-1} + 2 + q + q^2 + q^4 - q^6 - q^8 + O(q^{11}).$ This doesn’t change $y$, but simplifies the equation from $y^2 = x^6 - 8x^5 + 26x^4 - 46x^3 + 53x^2 - 42x + 17$ to $y^2 = x^6 - 2x^5 + x^4 - 2x^3 + 6x^2 - 4x + 1,$ which is the well(?)-known formula for ${\rm X}_1(13)$ up to changing $x$ to $-x$ (which makes all the signs positive).

genus 3 and higher: Here if $C$ is not hyperelliptic then the holomorphic differentials (= sections of the canonical divisor) embed $C$ as a curve of degree $2g-2$ in ${\bf P}^{g-1}$. For $g = 3, 4, 5$ this curve is a complete intersection, as described at the beginning of today’s notes; given generators for the holomorphic differentials, and thus homogeneous coordinates on ${\bf P}^{g-1}$, the defining equations are linear relations in the monomials of appropriate degree, and can be found by linear algebra.

We illustrate with the modular curve ${\rm X}_0(64)$, which has $g=3$. Generically we would expect that expansions through $O(q^{16})$ would suffice to detect the quartic relation (there are ${3+4-1 \choose 4} = 15$ monomials of degree $4$ in $3$ variables), so CuspForms(Gamma0(64),prec=24).echelon_basis() would give more than enough information. This command returns expansions

[
q - 3*q^9 + 2*q^17 + O(q^24),
q^2 - 2*q^10 - 3*q^18 + O(q^24),
q^5 - 3*q^13 + O(q^24)
]

of modular forms $\phi_1,\phi_2,\phi_3$ which are unusually simple. We explain this as the case $d=8$ of the following observations: in general if $N$ has a factor $d^2$ with $d|24$ then the map $\tau \mapsto \tau + 1/d$ normalizes $\Gamma_0(N)$ (ultimately because these are precisely the integers $d$ for which $({\bf Z} / d {\bf Z})^*$ has exponent $1$ or $2$), and thus descends to an automorphism of ${\rm X}_0(N)$; this automorphism multiplies $q$ by $e^{2\pi i/d},$ so diagonalizing it yields a basis of modular forms of the form $q^k f(q^d).$ This simplifies the calculation of the quartic relation, and we can still use Riemann-Roch to prove that this relation is correct if we trust the expansions of the $\phi_i$. But we still prefer to have a few more coefficients as a check against computational error. We thus double prec to 48 and find $$ \begin{array}{l} \phi_1 = q - 3 q^9 + 2 q^{17} - q^{25} + 10 q^{41} + O(q^{48}),\\ \phi_2 = q^2 - 2 q^{10} - 3 q^{18} + 6 q^{26} + 2 q^{34} + O(q^{48}),\\ \phi_3 = q^5 - 3 q^{13} + 5 q^{29} + q^{37} - 3 q^{45} + O(q^{48}), \end{array} $$ which is still consistent with the quartic relation that defines ${\rm X}_0(64)$: $$ \phi_1 \phi_3 (\phi_1^2 + 4 \phi_3^2) = \phi_2^4. $$ In fact this is the Fermat quartic in disguise: the linear change of variable $$ (X,Y,Z) = (\phi_1 - 2 \phi_3, \, 2 \phi_2, \, \phi_1 + 2\phi_3) $$ transforms the equation into $X^4 + Y^4 = Z^4$. […]
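Both the quartic relation and its Fermat form can be tested directly on the truncated expansions. The following minimal sketch uses integer power series modulo $q^{48}$, with the coefficients copied from the display above; the helper names (`series`, `mul`, `power`, `lin`) are ours.

```python
N = 48   # the expansions above are known modulo q^48

def series(terms):
    """Power series modulo q^N from a dict {exponent: coefficient}."""
    f = [0] * N
    for exponent, coeff in terms.items():
        f[exponent] = coeff
    return f

def mul(f, g):
    """Product of two power series modulo q^N."""
    out = [0] * N
    for i, fi in enumerate(f):
        if fi:
            for j in range(N - i):
                out[i + j] += fi * g[j]
    return out

def power(f, e):
    out = [1] + [0] * (N - 1)
    for _ in range(e):
        out = mul(out, f)
    return out

def lin(a, f, b, g):
    """The linear combination a*f + b*g."""
    return [a * x + b * y for x, y in zip(f, g)]

phi1 = series({1: 1, 9: -3, 17: 2, 25: -1, 41: 10})
phi2 = series({2: 1, 10: -2, 18: -3, 26: 6, 34: 2})
phi3 = series({5: 1, 13: -3, 29: 5, 37: 1, 45: -3})

# phi_1 phi_3 (phi_1^2 + 4 phi_3^2) versus phi_2^4
lhs = mul(mul(phi1, phi3), lin(1, power(phi1, 2), 4, power(phi3, 2)))
rhs = power(phi2, 4)
print("quartic relation holds mod q^48:", lhs == rhs)

# Fermat form: X^4 + Y^4 = Z^4 with (X, Y, Z) = (phi1 - 2 phi3, 2 phi2, phi1 + 2 phi3)
X = lin(1, phi1, -2, phi3)
Y = lin(2, phi2, 0, phi2)
Z = lin(1, phi1, 2, phi3)
print("Fermat relation holds mod q^48:",
      lin(1, power(X, 4), 1, power(Y, 4)) == power(Z, 4))
```

The second check is a formal consequence of the first, since $Z^4 - X^4 = (Z-X)(Z+X)(Z^2+X^2) = 16\,\phi_1\phi_3(\phi_1^2 + 4\phi_3^2)$.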

Monday, Oct. 18: Low-genus curves and modular equations, cont'd; a Weil-Belyi function on an elliptic curve (and parametrizing 5-torsion etc.)

You might notice — especially if you ask for longer $q$-expansions of $\phi_1,\phi_2,\phi_3$ — that the nonzero coefficients are even sparser than the formula $q^k f(q^d)$ ($k=1,2,5$) requires, and that further missing coefficients are exactly those for which the exponents (starting with $21, \, 33, \, 42)$ cannot be written as a sum of two squares. This reflects the fact that the $\phi_i$ are all “CM forms” (CM = complex multiplication). Let $\phi_0$ be the sum of $a q^{a\bar a}$ over all $a \in {\bf Z}[i]$ congruent to $1 \bmod 2+2i;$ this $q$-expansion begins $$ \phi_0 = q - 2q^5 - 3q^9 + 6q^{13} + 2q^{17} - q^{25} - 10q^{29} - 2q^{37} + 10q^{41} + 6q^{45} + O(q^{49}) $$ (the coefficients are all integers because terms $a q^{a\bar a}$ and $\bar{a} q^{a\bar a}$ appear together). Then $\phi_0$ generates the space of modular cusp forms for $\Gamma_0(32),$ and equals $\phi_1 - 2 \phi_3,$ while $\phi_2$ is obtained from $\phi_0$ by substituting $2\tau$ for $\tau$ (equivalently, $q^2$ for $q$). Thus each of these modular forms is a linear combination of monomials $q^{m^2+n^2}.$ […]

A curve $C$ of genus $g \gt 1$ is hyperelliptic if and only if the canonical map $C \to {\bf P}^{g-1}$ is not an embedding; in this case the map is 2:1 to its image, which is a curve of genus $0$ and degree $g-1,$ call it $C_0$. Thus if $g$ is even, the quotient curve $C_0$ is rational (the hyperplane section is a divisor of odd degree), and then $C$ has the familiar form $y^2 = P(x)$ for some polynomial $P$ of degree $2g+2$ without repeated roots. The holomorphic differentials are then $A(x) \, dx/y$ where $A$ is an arbitrary polynomial of degree at most $g-1$. If $g$ is odd, $C_0$ may not be a rational curve, but it is always isomorphic to a plane conic $Q(x_0,x_1,x_2) = 0,$ and $C$ can be written as the double cover $y^2 = P(x_0,x_1,x_2)$ for some homogeneous polynomial $P$ of degree $g+1$ such that the curve $P=0$ meets the conic in $2g+2$ distinct points.

Given just $C$ and the holomorphic differentials, we can recognize the hyperelliptic curves as those for which the differentials satisfy too many quadratic relations, $(g-1)(g-2)/2$ as opposed to the generic $(g-2)(g-3)/2.$ If we also have a rational point $p$ on $C$ then we can easily generalize our approach to genus-$2$ curves to find a hyperelliptic equation for $C$. Choose a basis for the holomorphic differentials whose $i$-th element $\omega_i$ $(1 \leq i \leq g)$ vanishes to order exactly $i-1$ at $p$. We’ll use only the last two basis elements, and write $x = \omega_{g-1} / \omega_g,$ a degree-$1$ function on $C_0$. Then the function field of $C$ is generated by $x$ and $y = dx / \omega_g.$ We again give a modular example, this time the curve ${\rm X}_0(41)$ of genus $3.$ (Modular curves often come with involutions, and are thus hyperelliptic much more commonly than one might expect “at random”.) Here the Sage output of CuspForms(41,prec=20).echelon_basis() is


[
q + q^4 - q^5 - 2*q^6 + 2*q^7 - 2*q^8 - 3*q^10 - 2*q^12 + 2*q^14 + 2*q^15 + 3*q^16 - 2*q^17 + 3*q^18 + 2*q^19 + O(q^20),
q^2 - 2*q^4 - q^5 + 3*q^8 + q^9 + q^10 - 2*q^11 - 2*q^12 + 2*q^13 + 2*q^14 - 4*q^16 - 2*q^18 + 2*q^19 + O(q^20),
q^3 - 2*q^4 + q^6 - q^7 + 2*q^8 + 2*q^10 - 3*q^11 - q^12 + 2*q^13 - q^14 - 2*q^15 - 2*q^18 + 3*q^19 + O(q^20)
]

(you may have surmised by now that “CuspForms(41,prec=20)” is actually an abbreviation for “CuspForms(Gamma0(41),prec=20)”, which works as well). Call these forms $\phi_1,\phi_2,\phi_3$ respectively. There’s a rational point at the cusp $q=0,$ and we take that for our base point $p,$ so each $\omega_i$ is the corresponding $\phi_i \, dq/q.$ Here we needn’t explicitly set up a linear system to check for a quadratic relation, because such a relation must write $\phi_1 \phi_3 - \phi_2^2$ as a linear combination of $\phi_2 \phi_3$ and $\phi_3^2$ and we can peel off the coefficients one at a time; here we find that $\phi_1 \phi_3 - \phi_2^2 = -2 \phi_2 \phi_3 + O(q^{21}),$ which is more than enough $q$-adic precision to prove that the identity holds exactly: a nonzero section of $2K$ has only $8$ zeros with multiplicity, so can be written as $f(q) \, dq^2$ with $f(q)$ of valuation at most $10$ at $q=0.$ [What would happen if we didn’t check for quadratic relations and just routinely set out to find a quartic equation satisfied by the $\phi_i$?] So we take $x = (\omega_2 / \omega_3) - 1 = (\phi_2/\phi_3) - 1 = q^{-1} + 1 + 2q + 2q^2 + 3q^3 + 4q^4 + 7q^5 + 8q^6 + 11q^7 + O(q^8)$ and $y = (q \, dx/dq) / \omega_3 = -q^{-4} - 2q^{-3} - 2q^{-2} + q^{-1} + 12 + 42 q + 120 q^2 + \cdots,$ and find the hyperelliptic equation $y^2 = x^8 - 4x^7 - 8x^6 + 10x^5 + 20x^4 + 8x^3 - 15x^2 - 20x - 8$ for ${\rm X}_0(41)$. As before, this octic polynomial has distinct roots modulo all primes other than $2$ and factors of the level (here $41$): the discriminant is $-2^{16} 41^6$, reflecting the curve’s good reduction at all primes not dividing the level — the bad reduction at $2$ is an artifact that can be removed by “uncompleting the square”.
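The "peel off the coefficients one at a time" step can be sketched as follows, using the three expansions exactly as printed above (coefficients of $q^0$ through $q^{19}$). The function names `mul`, `valuation` and `peel` are ours, and the sketch assumes, as is the case here, that the basis series have distinct valuations and leading coefficient $1$.

```python
N = 21   # products of the phi_i are determined modulo q^21

# Coefficients of q^0..q^19, copied from the Sage output above.
PHI1 = [0, 1, 0, 0, 1, -1, -2, 2, -2, 0, -3, 0, -2, 0, 2, 2, 3, -2, 3, 2]
PHI2 = [0, 0, 1, 0, -2, -1, 0, 0, 3, 1, 1, -2, -2, 2, 2, 0, -4, 0, -2, 2]
PHI3 = [0, 0, 0, 1, -2, 0, 1, -1, 2, 0, 2, -3, -1, 2, -1, -2, 0, 0, -2, 3]

def mul(f, g):
    """Product of two power series modulo q^N."""
    out = [0] * N
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            if i + j < N:
                out[i + j] += fi * gj
    return out

def valuation(f):
    """Index of the first nonzero coefficient, or None if f vanishes mod q^N."""
    return next((k for k, c in enumerate(f) if c), None)

def peel(target, basis):
    """Subtract multiples of the basis series from target, matching one
    leading coefficient at a time; returns (coefficients, remainder)."""
    coeffs, rem = [], list(target)
    for b in sorted(basis, key=valuation):
        v = valuation(b)
        c = rem[v] // b[v]
        coeffs.append(c)
        rem = [r - c * bk for r, bk in zip(rem, b)]
    return coeffs, rem

target = [a - b for a, b in zip(mul(PHI1, PHI3), mul(PHI2, PHI2))]
(alpha, beta), rem = peel(target, [mul(PHI2, PHI3), mul(PHI3, PHI3)])
print("phi1*phi3 - phi2^2 =", alpha, "* phi2*phi3 +", beta, "* phi3^2")
print("valuation of what is left (None means zero mod q^21):", valuation(rem))
```

If the expansions are transcribed correctly, the remainder should vanish to the full precision, in agreement with the relation stated above.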

For a general hyperelliptic curve $C$ of genus $3$, the genus-zero quotient curve $C_0$ might not be rational but is always given by the unique quadratic equation satisfied by the holomorphic differentials $\omega_i$. We can then choose the ratio of any two, say $x = \omega_2/ \omega_3,$ and construct an $\iota$-anti-invariant function $y = dx / \omega_3$ (NB same denominator), which has double poles above each pole of $x$ (= each zero of $\omega_3$) and nowhere else. Thus we can write $(\omega_3^2 y)^2$ as a homogeneous polynomial of degree $4$ in the three $\omega_i$, giving a hyperelliptic equation for $C$.

Here’s an example of some of the new considerations that arise when we deal with Belyi functions on curves of positive genus. We’ll find the unique such function $f : E \to {\bf P}^1$ with cycle structures $5, \, 5, \, 221.$ By Riemann-Hurwitz $E$ has genus $1$, and since $E$ has at least one obvious divisor of degree $1$ (the simple preimage of the 221 point), it is an elliptic curve. It might not be clear a priori that the two quintuple points are distinguishable, but for now we put them at $f=0$ and $f=\infty,$ and choose the quintuple pole as the origin $O$ for the group law on $E$. Call the quintuple zero $T$, so $f$ has divisor $(f) = (f)_0 - (f)_\infty = 5(T) - 5(O),$ and write the divisor $(f)_1$ as $(P)+2D$ where $P$ is a point (a divisor of degree $1$) and $D$ has degree $2.$ Now in genus zero any two divisors of the same degree are linearly equivalent, but here $5(T) - 5(O) \sim 0$ is a nontrivial condition, telling us that $T$ is a $5$-torsion point on $E,$ and $f$ is the associated Weil function. [$T$ cannot be a trivial torsion point, because $f(T) \neq f(O)$ implies $T \neq O.$ In general if $n(T) \sim n(O)$ then $T$ is $m$-torsion for some factor $m$ of $n$, and if $1 \lt m \lt n$ then the function with divisor $n(T) - n(O)$ is an imprimitive cover of ${\bf P}^1$, being an ($n/m$)-th power of a Weil function of degree $m$. Here $n=5$ is prime so there is no imprimitive case to consider.]

Curiously we can also predict the simple preimage $P$ of the third branch point $1$: it must be $-2T.$ This exploits a trick that must have been rediscovered many times, though to my surprise I’ve found no explicit mention of it earlier than my ABC⇒Mordell paper of 1991. The idea (which applies to branched covers of any positive genus) is that once we know all the ramification of $f$, we know the divisor of its differential $df$, and the fact that this divisor is canonical gives us additional information (an extra equation in the Jacobian) on the preimages of the branch points. It is more convenient to work with the logarithmic differential $df/f,$ which has a simple pole at each zero or pole of $f$, and a zero of multiplicity $m-1$ wherever $f=t$ has a zero of multiplicity $m$ for some $t$ other than 0 and $\infty$. Here this means the logarithmic differential has divisor $D-(O)-(T),$ so $D \sim (O) + (T);$ since also $(P)+2D \sim 5(O)$, we can eliminate $D$ to find $(P) + 2(T) \sim 3(O),$ whence $P = -2T$ in the group law, as claimed. Thus we can start from any Weil function $w$ with divisor $5(T)-5(O)$ (i.e. any multiple of $f$) and recover $f$ as $w/w(-2T).$

The next step is to parametrize pairs $(E,T)$ where $E$ is an elliptic curve and $T$ is a $5$-torsion point on $E$ (NB: this is much better than starting from a generic $E$ and then choosing one of its 24 nontrivial $5$-torsion points). The following procedure for parametrizing elliptic curves with a torsion point of low order goes at least back to Tate (see the formula for a general curve with a $7$-torsion point at the end of §7 of his paper The Arithmetic of Elliptic Curves (Inventiones Math. 1974)). Suppose $E$ has extended Weierstrass form with coefficients $(a_1,a_2,a_3,a_4,a_6),$ that is $$ y^2 + a_1 x y + a_3 y = x^3 + a_2 x^2 + a_4 x + a_6. $$ Let $T$ be any point other than the group-law origin $O$, and translate $x$ and $y$ to put $T$ at $(0,0)$; this makes $a_6=0.$ The tangent to $E$ at $T$ has slope $a_4/a_3,$ so $T$ is $2$-torsion iff $a_3=0.$ Otherwise, we may translate $y$ by $(a_4/a_3) x,$ keeping $T$ at $(0,0)$ but making $a_4 = 0$ (equivalently: making the tangent to $E$ at $T$ horizontal). At this point we’ve used up all the available changes of variable except multiplying $(x, y)$ by $(\lambda^2, \lambda^3)$ for some nonzero $\lambda$, which multiplies each $a_i$ by $\lambda^i$; thus we have parametrized the space of elliptic curves $E$ together with a nonzero, non-$2$-torsion rational point $T$ by an open set in $(1, 2, 3)$-weighted projective space (not quite the entire projective space, because we must exclude $(a_1 : a_2 : a_3)$ that make $E$ singular). In particular, $a_3$ must not vanish lest $E$ be singular at $T$. Moreover, $T$ is $3$-torsion iff the (horizontal) tangent at $T$ meets $E$ with multiplicity $3$ at $T$, which is the case iff $a_2 = 0.$ Hence if $T$ is not $3$-torsion then neither $a_2$ nor $a_3$ is zero, so we may choose the unique $\lambda$ that makes $a_2 = a_3 = a$ for some nonzero $a$.

Now it’s easy to describe, for small $N \gt 3,$ the pairs $(a_1, a)$ that make $T$ an $N$-torsion point. We illustrate with the case $N=5$ that motivated this excursion. We write the condition $5T = 0$ as $3T = -2T,$ which (since $T \neq 0)$ is equivalent to the condition that $2T$ and $3T$ have the same $x$ coordinate. [These coordinates can be computed in gp with ellpow(ellinit([a1,a,a,0,0]), [0,0], 2) and ellpow(ellinit([a1,a,a,0,0]), [0,0], 3), though here the group-law computations are simple enough to be done unaided.] We find that $2T = (-a, a_1 a - a)$ and $3T = (1-a_1, a_1-a-1),$ so $5T = 0$ iff $a_1 = a+1.$ Therefore the general $5$-torsion point on an elliptic curve is equivalent to the point $(0, 0)$ on the curve with coefficients $(a+1, a, a, 0, 0).$ Exercise: Find the corresponding formulas for $4T = 0,$ $6T = 0,$ and (recovering the formula in Tate’s paper) $7T = 0.$
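The formulas for $2T$ and $3T$, and the criterion $a_1 = a+1$, are easy to double-check numerically. The following is a minimal Python sketch (not the gp commands quoted above) that implements the standard chord-and-tangent formulas for an extended Weierstrass equation in exact rational arithmetic; the helper names `add` and `mul` are ad hoc, and `None` is used as a stand-in for the origin $O$.

```python
from fractions import Fraction

def add(P, Q, c):
    """Sum of points on y^2 + a1*x*y + a3*y = x^3 + a2*x^2 + a4*x + a6.
    Points are pairs of Fractions; None stands for the origin O."""
    a1, a2, a3, a4, a6 = c
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and y1 + y2 + a1 * x2 + a3 == 0:
        return None                      # Q = -P
    if x1 == x2:                         # P = Q: slope of the tangent line
        lam = (3 * x1**2 + 2 * a2 * x1 + a4 - a1 * y1) / (2 * y1 + a1 * x1 + a3)
    else:                                # slope of the chord through P and Q
        lam = (y2 - y1) / (x2 - x1)
    nu = y1 - lam * x1
    x3 = lam**2 + a1 * lam - a2 - x1 - x2
    y3 = -(lam + a1) * x3 - nu - a3
    return (x3, y3)

def mul(k, P, c):
    """k*P by repeated addition (adequate for the small k used here)."""
    R = None
    for _ in range(k):
        R = add(R, P, c)
    return R

T = (Fraction(0), Fraction(0))

# Generic (a1, a): compare with the closed formulas for 2T and 3T.
a1, a = Fraction(5), Fraction(2)
c = (a1, a, a, 0, 0)
two_T, three_T = mul(2, T, c), mul(3, T, c)

# On the curve (a+1, a, a, 0, 0) the point T should be killed by 5.
five_T = [mul(5, T, (b + 1, b, b, 0, 0)) for b in map(Fraction, (-8, 2, 3, 7))]
```

Here `two_T` and `three_T` should agree with $(-a, a_1 a - a)$ and $(1-a_1, a_1-a-1)$, and every entry of `five_T` should be `None`.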

Next step is to find a Weil function $w$. Since $w$ is a section of $5(O)$, it is a linear combination of $xy, x^2, y, x, 1$. There’s a one-dimensional space of combinations that vanish to order at least 4 at $T$, and then the fifth zero is automatically at $T$ as well because $T$ is $5$-torsion. One way to find these combinations is to expand $y$ in a Taylor series about $x=0$ near $T$; we find $y = x^2 - x^3 + x^4 + (a^{-1}-1) x^5 + O(x^6),$ so $w = x^2 - y - xy$ works. An alternative approach, which can be used even for Weil functions of really high degree, is to write $w$ as a product of powers of linear forms. Here the functions $x$ and $y$ on $E$ have divisors $(T) + (-T) - 2(O)$ and $2(T) + (-2T) - 3(O)$ respectively, so $xy^2$ has divisor $5(T) + (-T) + 2(-2T) - 8(O),$ and we need only divide by the equation of the line through $-T$ and $-2T$, which is tangent to $E$ at $-2T$. This gives $w = xy^2 / (x+y+a).$ Rationalizing the denominator and removing a common factor $y$ simplifies this to $(x+1)y - x^2,$ same up to sign as our previous answer.
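The Taylor expansion used above can be reproduced by fixed-point iteration on the curve equation, rewritten as $y = (x^3 + a x^2 - y^2 - (a+1)xy)/a$. The sketch below is one way to do this in Python with exact rationals; the function names are ad hoc, and the choice $a=-8$ in the usage line is just a sample value.

```python
from fractions import Fraction

def mul_trunc(u, v, n):
    """Product of two power series (coefficient lists), truncated mod x^n."""
    w = [Fraction(0)] * n
    for i, ui in enumerate(u[:n]):
        for j, vj in enumerate(v[:n - i]):
            w[i + j] += ui * vj
    return w

def local_expansion(a, n):
    """Coefficients of y as a power series in x near T = (0, 0), mod x^n,
    on the curve y^2 + (a+1)*x*y + a*y = x^3 + a*x^2."""
    a = Fraction(a)
    x = [Fraction(0), Fraction(1)] + [Fraction(0)] * (n - 2)
    y = [Fraction(0)] * n
    for _ in range(n):  # each pass fixes at least one more coefficient
        y2 = mul_trunc(y, y, n)
        xy = mul_trunc(x, y, n)
        y = [((1 if k == 3 else 0) + (a if k == 2 else 0)
              - y2[k] - (a + 1) * xy[k]) / a for k in range(n)]
    return y

def weil_series(a, n):
    """Coefficients of w = x^2 - y - x*y as a power series in x, mod x^n."""
    y = local_expansion(a, n)
    return [(1 if k == 2 else 0) - y[k] - (y[k - 1] if k >= 1 else 0)
            for k in range(n)]

y = local_expansion(-8, 7)
w = weil_series(-8, 7)
```

The coefficients of `y` up to $x^5$ should be $0, 0, 1, -1, 1, a^{-1}-1$, and `w` should begin with five zeros followed by $-1/a$, confirming that $w$ vanishes to order exactly $5$ at $T$.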

We are finally ready to find the value of $a,$ and thus the curve $E$, for which $f = w\,/\,w(-2T) = (x^2 - y - xy) / a^2$ is a $(5, 5, 221)$ Belyi function. There are several ways to go about this. A simple one is to locate the $x$-coordinate of the zeros of $f-1$ by computing the resultant w.r.t. $y$ of $f-1$ with the defining equation of the curve. This yields a quintic in $x$, one of whose roots is $x(-2T) = -a,$ and the other four must come in two equal pairs; that is, the quintic must be $c(x+a)$ times the square of a quadratic polynomial, for some constant $c$. We find that the resultant is $-(x+a) (x^4 - ax^3 + a^2 x^2 + 3 a^2 x + a^2-a^3),$ so the last factor must be a square. As usual we solve for $a$ by comparing with the Laurent expansion of the square root about $x = \infty$ (which here is $x^2 - ax/2 + 3a^2/8 + O(1/x)).$ We find that $a = -8,$ and check that this indeed makes $f$ a Belyi function with the desired cycle structures.
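The last step amounts to asking when the quartic equals the square of the polynomial part $x^2 - ax/2 + 3a^2/8$ of that Laurent expansion. A small Python sketch of the comparison follows (function names are ad hoc; the search range $[-20,20]$ is an arbitrary illustration, not part of the argument):

```python
from fractions import Fraction

def quartic(a):
    """Coefficients, constant term first, of
    x^4 - a*x^3 + a^2*x^2 + 3*a^2*x + a^2 - a^3."""
    return [a**2 - a**3, 3 * a**2, a**2, -a, Fraction(1)]

def candidate_root_squared(a):
    """Square of x^2 - (a/2)*x + 3*a^2/8, constant term first."""
    r = [3 * a**2 / 8, -a / 2, Fraction(1)]
    sq = [Fraction(0)] * 5
    for i, ri in enumerate(r):
        for j, rj in enumerate(r):
            sq[i + j] += ri * rj
    return sq

def is_square(a):
    """True iff the quartic is the square of a monic quadratic.
    (The top three coefficients force the quadratic to be the candidate.)"""
    a = Fraction(a)
    return quartic(a) == candidate_root_squared(a)

solutions = [a for a in range(-20, 21) if a != 0 and is_square(a)]
```

For $a=-8$ the quartic is $x^4 + 8x^3 + 64x^2 + 192x + 576 = (x^2+4x+24)^2$, and `solutions` should contain only $-8$.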

The standard model of $E$ has coefficients $(a_1, a_2, a_3, a_4, a_6) = (1, 1, 1, 22, -9).$ It can be obtained for instance by telling gp


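  E = [-7,-8,-8,0,0];              \\ coefficients (a+1, a, a, 0, 0) with a = -8
  R = ellglobalred(ellinit(E));    \\ R[1] is the conductor, R[2] the change of variables to a minimal model
  ellchangecurve(E, R[2])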

which also shows that the curve has conductor $50$, small enough that it already appears in Tingley’s 50-year-old “Antwerp Tables” (which include all curves of conductor at most $200$). I usually advocate against forcing equations into such forms, which can make the equations more unwieldy and hide features such as the point $(0, 0);$ but the ellglobalred form does have the advantage of being a canonical reduced form, which one can use to tell whether two curves are isomorphic, or to compile tables for future reference. [Once the genus exceeds $1$ it can be much harder to detect and find isomorphisms between two given curves.] Here we learn from the table that $E$ has not just a rational $5$-torsion point, but also a rational $3$-isogeny; indeed it was already known in 1972 that this curve and the $3$-isogenous one with coefficients $(1, 1, 1, -3, 1)$ are the only elliptic curves over $\bf Q$ with both a rational $5$-torsion point and a rational $3$-isogeny. These curves’ appearance here is related with the fact that the Galois closure of our Belyi function is the Bring curve of genus $4$ with automorphism group $S_5$, which has maximal size for a genus-$4$ curve in characteristic zero; I hope I’ll have the time to say more about this in a few weeks.

Wednesday, Oct. 20: Overview of complex reflection groups and their invariant rings (which give rise to highly symmetric curves)

Monday, Oct. 25:

Wednesday, Oct. 27: Introduction to finite subgroups of ${\rm GL}_2({\bf C})$ and their invariants; details of the tetrahedral case

We next work out in some detail the identities related with exceptional finite subgroups of ${\rm GL}_2({\bf C}),$ which give rise to some beautiful mathematics (mostly classical but with various modern links) that should be [i.e. that I wish were] better known.

For starters, suppose $G$ is a finite subgroup (not necessarily a complex reflection group) of ${\rm GL}_2({\bf C}),$ and let $D$ be its normal subgroup of scalar matrices. Recall that any complex representation of a finite group $G$ fixes a positive-definite Hermitian pairing (obtained by averaging, a simple case of the “unitarian trick”), and thus maps $G$ to the unitary group of that pairing. [This is why Shephard and Todd can title their paper “finite unitary reflection groups” and still get a description of all finite complex reflection groups.] So here $G$ is a subgroup of ${\rm U}_2({\bf C})$. It follows that the induced action of $G_0 := G/D$ on ${\bf P}^1({\bf C})$ is an injection into ${\rm PU}_2({\bf C}),$ which is the group ${\rm SO}_3({\bf R})$ of Euclidean rotations of the Riemann sphere ${\bf P}^1({\bf C})$. Now it is known that a discrete subgroup of ${\rm SO}_3({\bf R})$ is cyclic, dihedral, or one of the three exceptional groups $A_4, S_4, A_5.$ The cyclic subgroups yield reducible representations, and dihedral cases yield reflection groups $G(m,p,2).$ We next consider the exceptional cases, in which $G_0$ is the group of orientation-preserving symmetries of the regular tetrahedron, octahedron (dually: cube), or icosahedron (dually: dodecahedron) inscribed in the Riemann sphere.

For any finite subgroup $G_0$ of ${\rm PU}_2({\bf C})$ (or even ${\rm PGL}_2({\bf C})$) its preimage $G_1$ in ${\rm SL}_2({\bf C})$ is in the middle of a short exact sequence $1 \to \{\pm1\} \to G_1 \to G_0 \to 1.$ When $G_0$ is one of the three exceptional groups, or more generally any group containing an involution, the short exact sequence cannot split, because ${\rm SL}_2({\bf C})$ contains no involution other than the central element $-1$. We next describe in each case polynomials that are invariant or at least “covariant” under the action of these groups $2A_4$, $2S_4$, $2A_5$. (We say $P$ is “covariant” under an action of $G$ when there’s a homomorphism $\chi: G \to {\bf C}^*$ such that $gP = \chi(g)P$ for all group elements $g$.) Note that $G_1$ is never a reflection group, because it contains no reflections at all (a complex reflection cannot have determinant 1); but for most of our purposes we need only the projective action, and also once we know the covariant polynomials we can easily describe the invariant rings of each of the reflection groups with the same image in ${\rm PGL}_2({\bf C})$.

A nonzero polynomial $P$ is covariant for $G_1$ iff its zero divisor (which is just a finite multiset in the Riemann sphere) is invariant under $G_0$. Now each of our $G_0$ is the group of rotations of a regular polyhedron with $N$ triangles meeting at each vertex, and acts freely on the Riemann sphere except for the vertices, face centers, and edge centers of the polyhedron, whose stabilizers are cyclic of order $N$, 3, 2 respectively. Here are the familiar counts:

$N$	$G_0$	polyhedron	$\left|G_0\right|$	$V$	$F$	$E$
3	$A_4$	tetrahedron	12	4	4	6
4	$S_4$	octahedron	24	6	8	12
5	$A_5$	icosahedron	60	12	20	30

Now the Euler relation $E = V + F - 2 = (V-1) + (F-1)$ means that once we know the polynomials of degrees $V$ and $F$ we can obtain the third polynomial as the Jacobian determinant of the first two. (The Jacobian cannot vanish because the polynomials are algebraically independent.) Also, since our polyhedron has triangular faces we have $E = 3F/2$, which together with Euler’s formula implies $F = 2(V-2);$ thus we can get the polynomial of degree $F$ as the Hessian (determinant of the matrix of second partial derivatives) of the degree-$V$ polynomial. So it remains to find the $G_0$-orbit of smallest size $V$ in the Riemann sphere. In each case there is a linear relation in degree $\left|G_0\right|$ between the $N$-th power, cube, and square of the covariants of degree $V$, $F$, $E$ respectively. The ratio of these powers gives the quotient map ${\bf P}^1({\bf C}) \to {\bf P}^1({\bf C}) / G_0 = {\bf P}^1({\bf C});$ the target ${\bf P}^1({\bf C})$ arises naturally as a line in ${\bf P}^2({\bf C})$ that intersects the three coordinate lines of that ${\bf P}^2({\bf C})$ at the three branch points of the target ${\bf P}^1({\bf C}).$ In each case this quotient map can also be identified as the covering of modular curves ${\rm X}(N) \to {\rm X}(1),$ with the branch points of order 2, 3, and $N$ at $j = 1728 = 12^3,$ $j=0,$ and $j = \infty$ respectively.
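The numerical coincidences used in this paragraph can be read off the table of counts. As a quick sanity check, here is a short Python sketch (the dictionary layout and the name `consistent` are ad hoc) that tests Euler's relation, the face-edge count $2E = 3F$, the consequence $F = 2(V-2)$, and the equality of degrees $NV = 3F = 2E = \left|G_0\right|$ that makes a linear relation among the three powers possible.

```python
# N -> (group, |G_0|, V, F, E), copied from the table of counts
counts = {
    3: ("A4", 12, 4, 4, 6),
    4: ("S4", 24, 6, 8, 12),
    5: ("A5", 60, 12, 20, 30),
}

def consistent(N, order, V, F, E):
    """Check the relations among N, |G_0|, V, F, E used in the text."""
    return (E == V + F - 2               # Euler's formula
            and 2 * E == 3 * F           # every face is a triangle
            and F == 2 * (V - 2)         # so the Hessian has the right degree
            and N * V == 3 * F == 2 * E == order)  # common degree of the powers

results = {name: consistent(N, order, V, F, E)
           for N, (name, order, V, F, E) in counts.items()}
```

All three entries of `results` should be `True`.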

We next give explicit formulas and commentary in each case, using $x$ and $y$ as homogeneous coordinates and $z = x/y.$

$N=3$: We put the four vertices of the tetrahedron at $z = \infty$ and the cube roots of unity (note that this choice is not consistent with the usual picture of the Riemann sphere with the equator on the unit circle; we shall see that the equator ends up being $|z| = \sqrt{2}.)$ Thus we may take for the first polynomial $A = x^3 y - y^4.$ The Hessian of $A$, divided by $-9$, is $B = x^4 + 8xy^3,$ with roots at $z = 0$, $-2$, and $1 \pm \sqrt{-3}$. Dividing the Jacobian $\partial(A,B) / \partial(x,y)$ by $-4$ yields $C = x^6 - 20 x^3 y^3 - 8y^6,$ with relation $64A^3 - B^3 + C^2 = 0.$
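For readers who want to see the normalizing constants $-9$ and $-4$ appear, here is the computation written out, using $A_x = 3x^2 y,$ $A_y = x^3 - 4y^3,$ $B_x = 4x^3 + 8y^3,$ and $B_y = 24 x y^2$: $$ \begin{aligned} A_{xx} A_{yy} - A_{xy}^2 & = (6xy)(-12y^2) - (3x^2)^2 = -9 \, (x^4 + 8xy^3) = -9B, \\ A_x B_y - A_y B_x & = 72 x^3 y^3 - (x^3 - 4y^3)(4x^3 + 8y^3) = -4 \, (x^6 - 20 x^3 y^3 - 8 y^6) = -4C. \end{aligned} $$ As a spot check of the relation: at $(x,y) = (0,1)$ we have $(A,B,C) = (-1,0,-8)$ and $64A^3 - B^3 + C^2 = -64 - 0 + 64 = 0;$ at $(x,y) = (1,1)$ we have $(A,B,C) = (0,9,-27)$ and $0 - 729 + 729 = 0.$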

Now $G_0$ clearly contains the $3$-cycle $z \mapsto \zeta z$ where $\zeta$ is a cube root of unity. This $3$-cycle lifts to the pair of linear substitutions $(x,y) \mapsto \pm (\zeta^{-1} x, \zeta y)$ in $G_1$, which multiply $A$ by $\zeta$ and $B$ by $\zeta^{-1},$ leaving $C$ fixed. The group $G_0$ also contains a Klein $4$-group, because any four-point subset of ${\bf P}^1$ determines a Klein $4$-group that permutes the set freely and transitively (i.e. sharply $1$-transitively, a consequence of the fact that ${\rm PGL}_2$ acts sharply $3$-transitively on ${\bf P}^1$). For example, $G_0$ contains the involution that takes $1 \leftrightarrow \infty$ and $\zeta \leftrightarrow \zeta^{-1},$ which is $z \leftrightarrow (z+2)/(z-1).$ [In general, a fractional linear transformation $z \mapsto (az+b) \, / \, (cz+d)$ is an involution iff $a+d=0,$ i.e. iff the trace of the corresponding $2 \times 2$ matrix vanishes.] This involution, together with $z \mapsto \zeta z,$ generates $G_0$. The involution lifts to the $4$-cycles $(x,y) \mapsto \pm(-3)^{-1/2} (x+2y,x-y)$ in $G_1$, which leave $A,B,C$ all invariant. Thus $A,B,C$ are covariants of $G_1$ with characters that take our $3$-cycle $(x,y) \mapsto \pm (\zeta^{-1} x, \zeta y)$ to $\zeta, \zeta^{-1}, 1$ respectively. Call the first of these characters $\chi.$ We get the smallest exceptional reflection group (#4 in the Shephard-Todd table) by replacing each element $g$ of $G_1$ by $\chi(g)g$. The resulting subgroup of ${\rm GL}_2({\bf C})$ is a double cover of the same $G_0,$ and is abstractly isomorphic with $G_1$, but contains complex reflections such as $(x,y) \mapsto (x,\zeta^{-1}y).$ Its ring of invariants is generated by $B$ and $C$, with degrees $4$ and $6$. 
The other three reflection groups mapping to $G_0$ are obtained from this one by extending the center from $\{ \pm1 \}$ to $\boldsymbol{\mu}_4$, $\boldsymbol{\mu}_6$, and $\boldsymbol{\mu}_{12}$; their invariant degrees are respectively $(4, 12)$, $(6, 12)$, and $(12, 12)$: change $C$ to $C^2$, change $B$ to $B^3$, or both. These are Shephard and Todd’s groups 6, 5[sic], and 7.
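The claim that the lifted involution fixes $A$, $B$, $C$ can be checked without introducing $\sqrt{-3}$: since the scalar $(-3)^{-1/2}$ contributes $(-3)^{-2} = 1/9$ in degree $4$ and $(-3)^{-3} = -1/27$ in degree $6$, it is equivalent to verify that the unscaled substitution $(x,y) \mapsto (x+2y,\, x-y)$ multiplies $A$ and $B$ by $9$ and $C$ by $-27$. The Python sketch below (function names are ad hoc) tests these identities at $8$ points with distinct ratios $x:y$, which suffices for homogeneous identities of degree at most $6$.

```python
def A(x, y):
    return x**3 * y - y**4

def B(x, y):
    return x**4 + 8 * x * y**3

def C(x, y):
    return x**6 - 20 * x**3 * y**3 - 8 * y**6

# 8 distinct ratios x:y, more than the degree of any identity tested here
points = [(t, 1) for t in range(7)] + [(1, 0)]

def involution_scales(F, factor):
    """True iff F(x+2y, x-y) == factor * F(x, y) at every test point."""
    return all(F(x + 2 * y, x - y) == factor * F(x, y) for x, y in points)

checks = (involution_scales(A, 9),
          involution_scales(B, 9),
          involution_scales(C, -27))
```

All three entries of `checks` should be `True`.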

Besides the $A_4$ cover ${\bf P}^1 \to {\bf P}^1$ (a.k.a. ${\rm X}(3) \to {\rm X}(1)$), these polynomials with tetrahedral symmetry, especially quartics such as $A$ and $B$, arise in the construction of symmetric higher-genus curves and other objects. (See below for sextics such as $C$ which have octahedral symmetry.) Consider first the elliptic curve $w^2 = A(x,y).$ For any homogeneous quartic $f(x,y)$ with distinct roots, the Klein $4$-group of symmetries of the roots lifts to the elliptic curve $w^2 = f(x),$ giving translation by the $2$-torsion points of the curve. For $f=A$ the curve also has a $3$-cycle that is not translation by a torsion point (because it has fixed points), so we get a curve with $j$-invariant $0$. (This is also clear from our formula for $A$; e.g. dehomogenizing by setting $y=1$ yields $w^2 = x^3-1$.) Likewise a quartic such as $x^4 - y^4$ with dihedral symmetry yields an elliptic curve $w^2 = f(x,y)$ with $j$-invariant $1728$. Going beyond genus 1, the quartic plane curve $w^4 = f(x,y)$ has $48$ symmetries, forming a reducible complex reflection group in ${\rm PGL}_3({\bf C})$ (and a dihedral $f$ yields the Fermat quartic, with $96$ symmetries not all of which preserve the map to the $(x:y)$ line). Finally (for now), consider the smooth quartic surface $A(x,y) = A(v,w)$ in ${\bf P}^3({\bf C})$. Schur observed in 1882 that the $12$ symmetries of the tetrahedron yield $64$ lines on this quartic, four for each symmetry plus $4^2 = 16$ joining a root of $A(x,y)$ to a root of $A(v,w)$; this is more than the $48$ of the Fermat quartic surface, though the Fermat quartic surface has more symmetries. Indeed $64$ is the maximal number of lines on a smooth quartic surface over $\bf C$. Segre (1943) published a proof of this; 70 years later Rams and Schütt (Adv. Geom. 14 (2014), 735–756) pointed out a mistake in his argument, but showed that nevertheless the result is correct. 
It also holds in every characteristic other than $2$ (where the maximum is only $60$) and $3$ (where the maximum is the $126$ lines of the Fermat quartic surface $x^4 + y^4 + z^4 + t^4 = 0).$

The $\boldsymbol{\mu}_4$ case (#6) also arises in coding theory; this is the reason I chose $\chi(g) g$ rather than $\chi^{-1}(g) g,$ which is equivalent and has quartic invariant $A$ rather than the “larger” $B$. Let $K$ be a linear code of length $n$ over a finite field of $q$ elements. (Normally one uses not $K$ but $C$ for Code, but this could get confusing here…) Recall that the (Hamming) weight enumerator $W_K(x,y)$ is the homogeneous polynomial of degree $n$ whose $x^{n-w} y^w$ coefficient is the number of codewords of weight $w$ (each $w$ in $[0,n]).$ The weight enumerator of any linear code $K$ is related with the weight enumerator of its dual code $K^\perp$ by the MacWilliams identity $W_{K^\perp}(x,y) = \left|K\right|^{-1} W_K(x+(q-1)y, x-y).$ [The dual code is the annihilator of $K$ with respect to the pairing $(c,c') = \sum_{i=1}^n c_i^{\phantom0} c'_i.]$ Suppose now that $K$ is a “Type III code”, i.e. that $q=3$ and $K$ is self-dual. The first example with $n > 0$ is the “tetracode”, generated by $(1,1,1,0)$ and $(1,-1,0,1),$ with weight enumerator $x^4 + 8 xy^3$ (all eight nonzero words have weight 3). This looks familiar for good reason! The condition $K = K^\perp$ implies that $\left| K \right| = 3^{n/2}$ (in general a self-dual code of length $n$ has dimension $n/2$), and that every word has weight divisible by $3$ (compute the pairing of any word with itself). 
The former property, together with MacWilliams, implies that the weight enumerator $W_K$ is invariant under $(x,y) \leftrightarrow 3^{-1/2} (x+2y, x-y);$ the latter, that $W_K$ is invariant under $(x,y) \mapsto (x,\zeta y).$ Therefore $W_K$ is invariant under the subgroup of ${\rm GL}_2({\bf C})$ generated by these two linear transformations, which is reflection group #6 with center $\boldsymbol{\mu}_4 = \{ \pm 1, \pm i \}$ (note that the scaling coefficient in the MacWilliams identity is $3^{-1/2},$ not $(-3)^{-1/2}).$ This yields Gleason’s theorem for Type III codes: the weight enumerator is a polynomial in $B = x^4 + 8 xy^3$ and $C^2,$ or equivalently in $B$ and $A^3 = y^3 (x^3-y^3)^3.$ In particular, $4 | n,$ which was not obvious (though it can be proved by more direct means). Also, any Type III code of length $8$ has weight enumerator $B^2,$ and a Type III code of length $12$ with no words of weight $3$ must have weight enumerator $B^3 - 24 A^3 = x^{12} + 264 x^6 y^6 + 440 x^3 y^9 + 24 y^{12}.$ It is known that such a code exists, and is unique up to isomorphism; namely it is the extended ternary Golay code, which is also a natural route to the sporadic Mathieu group $M_{12},$ and especially its double cover $2.M_{12}$ — e.g. the 132 pairs of words of weight $6$ are supported on the blocks of the (5,6,12) Steiner system, and the $12$ pairs of words of maximal weight form the unique Hadamard matrix of order $12.$
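The tetracode is small enough to enumerate directly. The following Python sketch (function names are ad hoc) lists its $9$ codewords from the two generators given above, checks self-orthogonality, computes the weight distribution, and tests the self-dual form of the MacWilliams identity, $W_K(x+2y,\, x-y) = \left|K\right| \, W_K(x,y)$, at enough integer points to determine a quartic form.

```python
from itertools import product

GENERATORS = [(1, 1, 1, 0), (1, -1, 0, 1)]  # generators quoted in the text

def codewords(gens, q=3):
    """All F_q-linear combinations of the generators, reduced mod q."""
    n = len(gens[0])
    return {tuple(sum(c * g[i] for c, g in zip(coeffs, gens)) % q
                  for i in range(n))
            for coeffs in product(range(q), repeat=len(gens))}

def is_self_orthogonal(words, q=3):
    return all(sum(a * b for a, b in zip(u, v)) % q == 0
               for u in words for v in words)

def weight_distribution(words):
    """dist[w] = number of words of Hamming weight w."""
    n = len(next(iter(words)))
    dist = [0] * (n + 1)
    for word in words:
        dist[sum(1 for c in word if c != 0)] += 1
    return dist

def W(dist, x, y):
    """Weight enumerator sum_w dist[w] * x^(n-w) * y^w, evaluated at (x, y)."""
    n = len(dist) - 1
    return sum(c * x**(n - w) * y**w for w, c in enumerate(dist))

def macwilliams_self_dual(dist, size, q=3):
    n = len(dist) - 1
    pts = [(t, 1) for t in range(n + 1)] + [(1, 0)]  # n+2 distinct ratios
    return all(W(dist, x + (q - 1) * y, x - y) == size * W(dist, x, y)
               for x, y in pts)

K = codewords(GENERATORS)
dist = weight_distribution(K)
```

The distribution `dist` should be `[1, 0, 0, 8, 0]`, i.e. $W_K = x^4 + 8xy^3 = B$.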

Monday, Nov. 1: Finite subgroups of ${\rm GL}_2({\bf C})$ and their invariants, cont’d: octahedral and icosahedral details

$N=4$: We have two natural choices here. One is to note that the edge-centers of a regular tetrahedron are the vertices of a regular octahedron, while the vertices of the tetrahedron and its dual constitute the eight vertices of the octahedron’s dual cube. We may thus use the above $C$ and $AB = x^7 y + 7 x^4 y^4 - 8 x y^7$ as our sextic and octic polynomials with octahedral symmetry. (As we know, we could also construct $AB$ as a multiple of the Hessian of $C$; as it happens it’s the Hessian divided by $-3600$.) Then $D = \frac16 \, \partial(C,AB) / \partial(x,y) = x^{12} + 88 x^9 y^3 + 704 x^3 y^9 - 64 y^{12}$ is the covariant dodecic(?), with $C^4 + 256(AB)^3 - D^2 = 0.$ [What would happen if we instead took the Hessian of $AB$ to get a covariant polynomial of degree $2(8-2)=12$?] In this picture the symmetry group $S_4$ consists of the known $A_4$ and its composition with the involution $z \leftrightarrow -2/z$ that switches the roots of $A$ and $B.$ The polynomials $AB$, $C$, and $D$ are invariant under $2 A_4$, but the action of $2 S_4$ multiplies $C$ by the nontrivial character $2S_4 \to \{ \pm1 \}$ coming from the sign character of $S_4$, while fixing $AB$ and $D$. This can be seen directly from the action of the determinant-$1$ lifts of $z \leftrightarrow -2/z,$ which are $(x,y) \leftrightarrow (\pm 2^{1/2} y, \mp 2^{-1/2} x).$ It follows that if we lift even permutations from $S_4$ to ${\rm SL}_2({\bf C})$, but odd permutations to matrices of determinant $-1$ in ${\rm GL}_2({\bf C})$, we get another double cover of $S_4$ that does have a polynomial invariant ring, generated by $AB$ and $C$; this is Shephard and Todd’s complex reflection group #12 (where #8 through #11 are larger groups that contain the ${\rm SL}_2({\bf C})$ lift of $S_4$).

It is often more convenient to start from $P = xy (x^4 - y^4),$ whose roots $z = 0, \infty, \pm 1, \pm i$ are vertices of an octahedron inscribed in the sphere with obvious fourfold symmetry $z \mapsto iz$ (and with the equator restored to $\left| z \right| = 1$). Dividing the Hessian of $P$ by $-25$ yields $Q = x^8 + 14 x^4 y^4 + y^8,$ with roots at the vertices of the cube dual to that octahedron. Dividing the Jacobian $D = \partial(P,Q) / \partial(x,y)$ by $-8$ then yields $R = x^{12} - 33 x^8 y^4 - 33 x^4 y^8 + y^{12},$ which has its twelve roots at the eight points $\boldsymbol{\mu}_4 (1 \pm 2^{1/2})$ and the four primitive 8th roots of unity $2^{-1/2} (\pm 1 \pm i).$ The identity relating these covariants is $108 P^4 - Q^3 + R^2 = 0.$
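Both octahedral relations are homogeneous identities of degree $24$, so each can be confirmed by exact evaluation at $25$ or more points with distinct ratios $x:y$. A minimal Python sketch follows (function names are ad hoc; `D12` denotes the normalized dodecic $x^{12} + 88x^9y^3 + 704x^3y^9 - 64y^{12}$ of the first model):

```python
def A(x, y):
    return x**3 * y - y**4

def B(x, y):
    return x**4 + 8 * x * y**3

def C(x, y):
    return x**6 - 20 * x**3 * y**3 - 8 * y**6

def D12(x, y):
    return x**12 + 88 * x**9 * y**3 + 704 * x**3 * y**9 - 64 * y**12

def P(x, y):
    return x * y * (x**4 - y**4)

def Q(x, y):
    return x**8 + 14 * x**4 * y**4 + y**8

def R(x, y):
    return x**12 - 33 * x**8 * y**4 - 33 * x**4 * y**8 + y**12

# 26 distinct ratios x:y, more than the degree 24 of either identity
points = [(t, 1) for t in range(-12, 13)] + [(1, 0)]

first_model = all(
    C(x, y)**4 + 256 * (A(x, y) * B(x, y))**3 - D12(x, y)**2 == 0
    for x, y in points)

second_model = all(
    108 * P(x, y)**4 - Q(x, y)**3 + R(x, y)**2 == 0
    for x, y in points)
```

Both `first_model` and `second_model` should be `True`. For the first model this is also visible from the tetrahedral relation, which gives $C^4 + 256 (AB)^3 = (B^3 + 64 A^3)^2.$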

Here the symmetry group is generated by $z \mapsto iz$ together with the involution $z \leftrightarrow (z+1) \, / \, (z-1),$ which switches $1 \leftrightarrow \infty,$ $0 \leftrightarrow -1,$ and $i \leftrightarrow -i.$ The determinant of the associated linear map $(x,y) \mapsto (x+y,x-y)$ is $-2$, so its lifts to ${\rm SL}_2$ are obtained by dividing by the square roots of $-2$. [...]

$N=5$: We place the $12$ vertices of an icosahedron with one pair at $0$ and $\infty$, so the other $10$ vertices form two orbits under multiplication by $\boldsymbol{\mu}_5$. This makes $A$ a linear combination of $x^{11} y$, $x^6 y^6$, and $x y^{11}$, so we still have one undetermined coefficient even up to scaling $x$ and $y$. One way to find a correct choice is to apply the Hessian-and-Jacobian construction to an arbitrary linear combination and check whether the resulting $A,B,C$ have linearly dependent powers $A^5,B^3,C^2.$ We find that this happens if and only if $A = \alpha x^{11} y + \beta x^6 y^6 + \gamma x y^{11}$ for some coefficients $\alpha,\beta,\gamma$ that satisfy $\beta^2 + 121 \alpha \gamma = 0.$ All such $(\alpha,\beta,\gamma)$ are equivalent up to scaling, so we choose $(1,-11,-1)$ which makes our icosahedron symmetric under $z \leftrightarrow -1/z.$ Then the roots of $A = x^{11} y - 11 x^6 y^6 - x y^{11}$ other than $z=0$ and $z=\infty$ are $\zeta \varphi$ and $\zeta \bar\varphi$ for $\zeta \in \boldsymbol{\mu}_5$ and $\varphi,\bar\varphi = (1 \pm \sqrt{5})/2$ (the golden ratio and its algebraic conjugate). We calculate $$ \begin{aligned} B = -\frac1{11^2} H(A) & = x^{20} + 228 x^{15} y^5 + 494 x^{10} y^{10} - 228 x^5 y^{15} + y^{20}, \\ C = -\frac1{20} \, \frac{\partial(A,B)}{\partial(x,y)} & = x^{30} - 522 x^{25} y^5 - 10005 x^{20} y^{10} - 10005 x^{10} y^{20} + 522 x^5 y^{25} + y^{30}, \end{aligned} $$ and $1728 A^5 - B^3 + C^2 = 0$. [...]
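The icosahedral covariants have coefficients large enough that a machine check is reassuring. Below is a Python sketch of the Hessian-and-Jacobian construction for binary forms, storing a form of degree $d$ as the list of its coefficients of $x^i y^{d-i}$ for $i = 0, \ldots, d$; this representation and all function names are ad hoc choices for illustration.

```python
def mul(f, g):
    """Product of binary forms (index i <-> coefficient of x^i y^(d-i))."""
    h = [0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            h[i + j] += fi * gj
    return h

def sub(f, g):
    return [a - b for a, b in zip(f, g)]

def dx(f):
    """Partial derivative with respect to x."""
    return [i * f[i] for i in range(1, len(f))]

def dy(f):
    """Partial derivative with respect to y."""
    d = len(f) - 1
    return [(d - i) * f[i] for i in range(d)]

def hessian(f):
    fxy = dx(dy(f))
    return sub(mul(dx(dx(f)), dy(dy(f))), mul(fxy, fxy))

def jacobian(f, g):
    return sub(mul(dx(f), dy(g)), mul(dy(f), dx(g)))

def exact_div(f, k):
    assert all(c % k == 0 for c in f)
    return [c // k for c in f]

def power(f, k):
    h = [1]
    for _ in range(k):
        h = mul(h, f)
    return h

# A = x^11 y - 11 x^6 y^6 - x y^11
A = [0] * 13
A[11], A[6], A[1] = 1, -11, -1

B = exact_div(hessian(A), -121)
C = exact_div(jacobian(A, B), -20)
relation = [1728 * a - b + c
            for a, b, c in zip(power(A, 5), power(B, 3), power(C, 2))]
```

The nonzero entries of `B` and `C` should reproduce the coefficients displayed above, and every entry of `relation` should vanish.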

In each of the three cases ($G_0 = A_4$, $S_4$, $A_5$), these polynomials were already known to Klein; there are other choices, but all are equivalent over $\bf C$. Over a ground field that is not algebraically closed, there can be more choices, and a complete description may be hard. We already saw that for $G_0 = S_4$ there are at least two natural choices over $\bf Q$, one (starting from the sextic $x^6 - 20 x^3 y^3 - 8y^6$ that arose in the $A_4$ case), exhibiting symmetry under a $3$-cycle in $G_0$, the other (starting from $x^5 y - x y^5)$ exhibiting a $4$-cycle symmetry. For some purposes, other $G_0$-polynomials may be of interest, such as the $27$ $A_5$-polynomials that together yield the complete solution of the Diophantine equation $X^2 + Y^3 = Z^5$ in coprime integers. (See Table B.2 of J. Edwards’ paper

A Complete Solution to $X^2+Y^3+Z^5=0$, Journal f. d. reine und angew. Math. (Crelle’s Journal) (2004), 213–236,

or Appendix D of his doctoral thesis.) Fortunately it is at least feasible in each case to recognize whether a given homogeneous binary form of degree $V$ has $G_0$-symmetry. For $A_4$ this is easy: a homogeneous quartic $a x^4 + b x^3 y + c x^2 y^2 + d x y^3 + e y^4$ has tetrahedral symmetry if and only if its quadratic invariant $12 a e - 3 b d + c^2$ vanishes. For $S_4$ sextics one could use the Igusa-Clebsch invariants $I_2, I_4, I_6, I_{10},$ which must be proportional to $10 c, -5 c^2, -5 c^3, c^5/4$ for some nonzero $c$; but that is more complicated, and a similar condition for degree-$12$ forms with $A_5$ symmetry would have to be terribly complicated. Happily a uniform description was already found by Paul Gordan in 1887 (Vorlesungen über Invariantentheorie, Teubner, Leipzig 1887; cited by Edwards (op. cit.)). In each case one must check $2V-7$ quadratic conditions on the coefficients:

Let $P(x,y)$ be a homogeneous polynomial of degree $d \geq 4$ without a linear factor of multiplicity $d-1$ or $d.$ Then the fourth transvectant of $P$ vanishes if and only if $d \in \{4,6,12\}$ and $P$ has $A_4, S_4, A_5$ symmetry respectively.

The “fourth transvectant” is an ${\rm SL}_2$-covariant quadratic map from binary forms of degree $d$ to binary forms of degree $2(d-4)$; for example, for $d=4$ the fourth transvectant is a multiple of the quadratic invariant $12 a e - 3 b d + c^2.$ The theory of transvectants is not as familiar today as it was to geometers near the turn of the 20th century; but in our case of a binary form it is conveniently described in terms of the representation theory of ${\rm SL}_2$, which can be assumed familiar enough to the audience of Math 263.

Denote by $V_1$ the defining $2$-dimensional representation of ${\rm SL}_2$; and for each $n=0,1,2,\ldots$ denote by $V_n$ the representation $\mathop{\rm Sym}^n V_1$ of dimension $n+1$ (note that the case $n=1$ does give back $V_1$). On the torus $\mathop{\rm diag}(\lambda,\lambda^{-1})$ each $V_n$ has character $$ \sum_{i=0}^n \lambda^{2i-n} = \lambda^n + \lambda^{n-2} + \lambda^{n-4} + \cdots + \lambda^{-n}. $$ It follows that $V_n \otimes V_n \cong \oplus_{j=0}^n V_{2(n-j)}$, so for each $j = 0,1,2,\ldots,n$ there is a nonzero ${\rm SL}_2$-covariant map $V_n \otimes V_n \to V_{2(n-j)}$ which is unique up to scalar multiple. The “$j$-th transvectant” is one of these choices of scaling. [There is no connection with “transvections” in matrix groups.] The first few examples are familiar: the zeroth transvectant is $f \otimes g \mapsto fg,$ and the first is the Jacobian determinant $f \otimes g \mapsto \partial(f,g) / \partial(x,y).$ Note that we get a symmetric map for $j=0$ and an antisymmetric one for $j=1$. Also, the $n$-th transvectant is a pairing $V_n \otimes V_n \to {\bf C}$ proportional to $$ \Bigl( \sum_{i=0}^n f_i x^i y^{n-i} \Bigr) \otimes \Bigl( \sum_{i=0}^n g_i x^i y^{n-i} \Bigr) \mapsto \sum_{i=0}^n \, (-1)^i \, i! \, (n-i)! \, f_i \, g_{n-i}. $$ In general, the $j$-th transvectant is symmetric for $j$ even and antisymmetric for $j$ odd; this can be seen by computing the characters of the symmetric and alternating squares of $V_n$: $$ \mathop{\rm Sym}\nolimits^2 V_n \cong V_{2n} \oplus V_{2n-4} \oplus V_{2n-8} \oplus \cdots; \quad \mathop{\wedge}\nolimits^2 V_n \cong V_{2n-2} \oplus V_{2n-6} \oplus V_{2n-10} \oplus \cdots. $$ For $j$ even we may thus identify the $j$-th transvectant with the quadratic map sending $f$ to the transvectant of $f \otimes f$. 
For example, for $j=0$ we obtain $f^2$; for $j=2$, the Hessian $f_{xx} \, f_{yy} - f_{xy}^2.$ If $n$ is even then for $j=n$ we get an $\mathop{\rm SL}_2$-invariant quadratic form on $V_n$, proportional to $\bigl( \sum_{i=0}^n f_i x^i y^{n-i} \bigr) \otimes \bigl( \sum_{i=0}^n f_i x^i y^{n-i} \bigr) \mapsto \sum_{i=0}^n \, (-1)^i \, i! \, (n-i)! \, f_i \, f_{n-i}.$ Check that for $n=2$ and $n=4$ this is proportional to the discriminant and the quadratic invariant $12 a e - 3 b d + c^2$; for $n=6$ we of necessity get a multiple of the first Igusa-Clebsch invariant $I_2$. [...]
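The suggested check for $n=2$ and $n=4$ takes only a few lines. In the Python sketch below (the function name `top_transvectant` is ad hoc, and the normalization is the one in the displayed sum, not any standard one), a form of degree $n$ is stored as the list $[f_0, \ldots, f_n]$ with $f_i$ the coefficient of $x^i y^{n-i}$.

```python
from math import factorial

def top_transvectant(f, g):
    """The pairing sum_i (-1)^i i! (n-i)! f_i g_(n-i) on forms of degree n."""
    n = len(f) - 1
    return sum((-1)**i * factorial(i) * factorial(n - i) * f[i] * g[n - i]
               for i in range(n + 1))

def quartic(a, b, c, d, e):
    """Coefficient list of a x^4 + b x^3 y + c x^2 y^2 + d x y^3 + e y^4."""
    return [e, d, c, b, a]

def quadratic_invariant(a, b, c, d, e):
    return 12 * a * e - 3 * b * d + c**2

tetra_A = quartic(0, 1, 0, 0, -1)      # x^3 y - y^4
tetra_B = quartic(1, 0, 0, 8, 0)       # x^4 + 8 x y^3
dihedral = quartic(1, 0, 0, 0, -1)     # x^4 - y^4

values = [top_transvectant(f, f) for f in (tetra_A, tetra_B, dihedral)]

# n = 2: for 2x^2 + 5xy + 3y^2 the pairing is 4*f0*f2 - f1^2 = -discriminant
binary_quadratic = [3, 5, 2]
minus_disc = top_transvectant(binary_quadratic, binary_quadratic)
```

For quartics the pairing works out to $4(12ae - 3bd + c^2)$, so `values` should be `[0, 0, -48]`: the two tetrahedral quartics pass the test and the dihedral one does not.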

Monday, Nov. 8: Finite subgroups of ${\rm GL}_2({\bf C})$ and their invariants, cont’d: octahedral and icosahedral details

Wednesday, Nov. 10: Generators of the invariants of $W(F_4)$ and $W(E_6)$

The invariant ring of $W(F_4)$. Let $I_2, I_4, I_6, J_4$ be the usual $D_4$ invariants: $I_{2k}\,(k=1,2,3)$ is the $k$-th elementary symmetric function of $x_1^2, x_2^2, x_3^2, x_4^2,$ and $J_4 = x_1 x_2 x_3 x_4.$ Thus the subscript of each invariant is its degree, and the polynomial $\prod_{i=1}^4 (X^2 - x_i^2)$ with roots $\pm x_1, \pm x_2, \pm x_3, \pm x_4$ is $P_0(X) := X^8 - I_2 X^6 + I_4 X^4 - I_6 X^2 + J_4^2.$ We next construct the degree-$8$ polynomial $P_1(X)$ whose roots are the sums $s = \sum_{i=1}^4 \epsilon_i x_i$ with each $\epsilon_i \in \{1,-1\}$ and $\prod_{i=1}^4 \epsilon_i = +1.$ Replacing $J_4$ by $-J_4$ will then give the degree-$8$ polynomial $P_2(X)$ with roots $s = \sum_{i=1}^4 \epsilon_i x_i$ with each $\epsilon_i \in \{1,-1\}$ and $\prod_{i=1}^4 \epsilon_i = -1.$ Then $P(X) := 2^8 P_0(X/2) P_1(X) P_2(X)$ is a monic polynomial with roots at the linear combinations $\langle\vec{v}, \vec{x}\rangle$ whose coefficient vectors $\vec{v}$ range over the $24$ minimal vectors of the $D_4$ lattice; since the automorphism group of these vectors is $W(F_4)$, the coefficients of $P$ will be in the ring of $W(F_4)$ invariants, and barring bad luck we can use the $X^{24-d}$ coefficients for $d=2,6,8,12$ as generators of the invariant ring.

Now the eight roots $s$ of $P_1$ correspond to quartics $Q(X) = \prod_{i=1}^4 (X - \epsilon_i x_i)$ with constant coefficient $J_4$ such that $P_0(X) = Q(X) Q(-X).$ We write $Q(X) = X^4 - s X^3 + B X^2 - C X + J_4$ and calculate $$ Q(X) Q(-X) = X^8 + (2B-s^2) X^6 + (B^2 - 2 C s + 2 J_4) X^4 + (2 B J_4 - C^2) X^2 + J_4^2. $$ We set this equal to $P_0$ and eliminate $B,C$ to obtain $P_1$. Equating $X^6$ coefficients gives $B = (s^2 - I_2)/2;$ then equating coefficients of $X^4$ and $X^2$ gives equations in $C$ of degrees $1$ and $2$. We eliminate $C$ by taking the resultant of these equations (or simply solving the linear equation, substituting into the quadratic, and clearing denominators). The resultant is an even polynomial in $s$ with leading term $-s^8/16$. Multiplying by $-16$ and substituting $X$ for $s$ gives $$ P_1 = X^8 - 4 I_2 X^6 + (6 I_2^2 - 8(I_4 + 6 J_4)) X^4 + (-4 I_2^3 + 16(I_4 + 2 J_4) I_2 - 64 I_6) X^2 + (I_2^2 - 4 I_4 + 8 J_4)^2. $$ [...] $$ i_2 = I_2, \quad i_6 = I_2 I_4 - 6 I_6, \quad i_8 = I_4^2 + 12 J_4^2 - 3 I_2 I_6, \quad i_{12} = 27 (I_2^2 J_4^2 + I_6^2) + 2 I_4^3 - 9 I_4 (I_2 I_6 + 8 J_4^2). $$ We also give the polynomial whose roots are the other $24$-point orbit, consisting of $\pm x_i \pm x_j$ with $i \neq j.$ We might expect to form the resultant w.r.t. $Y$ of $P_0(Y)$ with $P_0(X-Y).$ But this yields a polynomial of degree $64$ that has each of our desired roots twice and also unwanted roots at $\pm 2 x_i$ and a spurious multiplicity-$8$ root at zero, so we would have to divide by $(2X)^8 P_0(X/2)$ and extract a square root. Better to proceed as before: compute the remainder of $P_0(X)$ modulo $X^2 - AX + B$ and compute the resultant w.r.t. $B$ of the $X^1$ and $X^0$ coefficients to obtain a polynomial of degree $28$ in $A$ with a root at each $\pm x_i \pm x_j$ $(i\neq j)$ and also a quadruple root at $A=0$. Substituting $A=X$ and dividing by $X^4$ yields the desired polynomial. [...]
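As a numerical sanity check on the displayed formula for $P_1$, one can pick integer values of $x_1,\ldots,x_4$ and test that the eight even-sign sums are roots, and that replacing $J_4$ by $-J_4$ gives the odd-sign sums. This is only a sketch with exact integer arithmetic; the helper names (`d4_invariants`, `P1`, `signed_sums`) and the sample point are ours.

```python
from itertools import product
from math import prod

def d4_invariants(x):
    """Return (I2, I4, I6, J4) for x = (x1, x2, x3, x4)."""
    q = [xi * xi for xi in x]
    I2 = sum(q)
    I4 = sum(q[i] * q[j] for i in range(4) for j in range(i + 1, 4))
    I6 = sum(q[i] * q[j] * q[k]
             for i in range(4) for j in range(i + 1, 4) for k in range(j + 1, 4))
    J4 = x[0] * x[1] * x[2] * x[3]
    return I2, I4, I6, J4

def P1(X, I2, I4, I6, J4):
    """The degree-8 polynomial P_1 from the text, evaluated at X."""
    return (X**8 - 4 * I2 * X**6
            + (6 * I2**2 - 8 * (I4 + 6 * J4)) * X**4
            + (-4 * I2**3 + 16 * (I4 + 2 * J4) * I2 - 64 * I6) * X**2
            + (I2**2 - 4 * I4 + 8 * J4) ** 2)

def signed_sums(x, sign):
    """All sums eps_1 x_1 + ... + eps_4 x_4 with eps_i = +-1 and product of the eps_i equal to sign."""
    return [sum(e * xi for e, xi in zip(eps, x))
            for eps in product((1, -1), repeat=4) if prod(eps) == sign]

x = (1, 2, 5, 11)                       # arbitrary sample point
I2, I4, I6, J4 = d4_invariants(x)
assert all(P1(s, I2, I4, I6, J4) == 0 for s in signed_sums(x, +1))
assert all(P1(s, I2, I4, I6, -J4) == 0 for s in signed_sums(x, -1))   # this is P_2
# For this x the odd-sign sums are not roots of P_1 itself.
assert all(P1(s, I2, I4, I6, J4) != 0 for s in signed_sums(x, -1))
```

Since $P_1$ is monic of degree $8$, vanishing at eight distinct sums identifies it with $\prod (X - s)$ at this sample point; a symbolic check would need a computer algebra system.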

The invariant ring of $W(E_6)$. There are (at least) two natural routes, starting from the stabilizer of a root (minimal vector) of the $E_6$ lattice or the stabilizer of a “dual root” (minimal dual vector). We take the former, for which the stabilizer is $W(A_1 \oplus A_5),$ with the $A_1$ generated by the chosen root and the $A_5$ formed by its orthogonal complement. This subgroup has order $2 \cdot 6!,$ and its index in $W(E_6)$ is $36,$ the number of pairs $\pm r$ of roots. Comparing discriminants, we see that the $E_6$ lattice contains $A_1 \oplus A_5$ with index $\sqrt{(2\cdot6) \, / \, 3} = 2;$ a representative of the nontrivial coset is $r/2 + (1,1,1,-1,-1,-1)/2$ (note that $r/2$ and $(1,1,1,-1,-1,-1)/2$ represent the order-$2$ cosets in $A_1^*/A_1^{\phantom*}$ and $A_5^*/A_5^{\phantom*}$ respectively).
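The discriminant comparison can be reproduced from Cartan matrices. The sketch below computes the three determinants exactly and recovers the lattice index $2$ and the subgroup index $36$; the node numbering of the $E_6$ diagram and the value $51840$ for the order of $W(E_6)$ are standard inputs that we supply, not something stated in the notes.

```python
from fractions import Fraction
from math import factorial, isqrt

def cartan_matrix(n, edges):
    """Cartan matrix of a simply laced Dynkin diagram on nodes 0..n-1."""
    M = [[2 if i == j else 0 for j in range(n)] for i in range(n)]
    for i, j in edges:
        M[i][j] = M[j][i] = -1
    return M

def det(M):
    """Exact determinant by Gaussian elimination over the rationals."""
    A = [[Fraction(v) for v in row] for row in M]
    n = len(A)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if A[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            A[col], A[pivot] = A[pivot], A[col]
            result = -result
        result *= A[col][col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= factor * A[col][c]
    return result

disc_A1 = det(cartan_matrix(1, []))
disc_A5 = det(cartan_matrix(5, [(0, 1), (1, 2), (2, 3), (3, 4)]))
# E_6: a chain of five nodes with a sixth node attached to the middle one.
disc_E6 = det(cartan_matrix(6, [(0, 1), (1, 2), (2, 3), (3, 4), (2, 5)]))

lattice_index = isqrt(int(disc_A1 * disc_A5 / disc_E6))
ORDER_W_E6 = 51840                      # standard value, assumed here
subgroup_index = ORDER_W_E6 // (2 * factorial(6))
print(disc_A1, disc_A5, disc_E6, lattice_index, subgroup_index)
```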

There are two polynomials to compute: the products of $X - \langle\vec{v}, \vec{x}\rangle$ with $\vec v$ ranging over either the $27$ dual roots in one of the nontrivial cosets of $E_6$ in $E_6^*,$ or over the $72$ roots. We call these polynomials $P_{27}$ and $P_{72}$ respectively. As we did for $W(F_4)$, we start with the coordinates of these vectors $\vec v$. The roots are easy: they are $\pm r$, the $30$ roots of $A_5$ (permutations of $(1,-1,0,0,0,0)$), and the $2 \cdot {6 \choose 3} = 40$ vectors $(v \pm r)/2$ where $v \in A_5$ is a permutation of $(1,1,1,-1,-1,-1).$ The dual roots, though fewer, are trickier, in part because we must make sure that all are in the same coset. One of the two choices consists of the $2 \cdot 6 = 12$ vectors $v \pm \frac12 r$ where $v \in A_5^*$ is a permutation of $(5,-1,-1,-1,-1,-1)/6,$ together with the ${6 \choose 2} = 15$ permutations of $(1,1,1,1,-2,-2)/3$ in $A_5^*$ [check: in both cases the norm is $\frac12 + \frac{1 \cdot 5}{6} = \frac{2 \cdot 4}{6} = \frac43$.] [...]
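The coordinate lists above are easy to get wrong, so it may help to enumerate them and test the counts, the norms, and the "same coset" condition by machine. In this sketch a vector $w + c\,r$ is stored as a pair `(w, c)` with $w$ in the sum-zero hyperplane of ${\bf Q}^6$ and $\langle r,r\rangle = 2$; that encoding, the membership test `in_E6` (based on the glue vector $r/2 + (1,1,1,-1,-1,-1)/2$ from the previous paragraph), and all names are our own choices.

```python
from fractions import Fraction as F
from itertools import combinations, permutations

def perms(vec, scale=1):
    """Distinct permutations of vec, with every entry divided by scale."""
    return sorted({tuple(F(c, scale) for c in p) for p in permutations(vec)})

def dot(u, v):
    """Inner product of w + c*r and w' + c'*r, using <r, r> = 2 and r orthogonal to A_5."""
    return sum(a * b for a, b in zip(u[0], v[0])) + 2 * u[1] * v[1]

def sub(u, v):
    return (tuple(a - b for a, b in zip(u[0], v[0])), u[1] - v[1])

def in_E6(u):
    """True if u lies in A_1 + A_5 or in its coset by r/2 + (1,1,1,-1,-1,-1)/2."""
    w, c = u
    if sum(w) != 0:
        return False
    integral = all(a.denominator == 1 for a in w) and c.denominator == 1
    half_odd = all(a.denominator == 2 for a in w) and c.denominator == 2
    return integral or half_odd

zero = (F(0),) * 6
roots = [(zero, F(1)), (zero, F(-1))]
roots += [(w, F(0)) for w in perms((1, -1, 0, 0, 0, 0))]
roots += [(w, c) for w in perms((1, 1, 1, -1, -1, -1), 2) for c in (F(1, 2), F(-1, 2))]

dual_roots = [(w, c) for w in perms((5, -1, -1, -1, -1, -1), 6)
              for c in (F(1, 2), F(-1, 2))]
dual_roots += [(w, F(0)) for w in perms((1, 1, 1, 1, -2, -2), 3)]

assert len(roots) == 72 and all(dot(v, v) == 2 and in_E6(v) for v in roots)
assert len(dual_roots) == 27 and all(dot(d, d) == F(4, 3) for d in dual_roots)
# Dual roots pair integrally with every root, so they lie in the dual lattice ...
assert all(dot(d, v).denominator == 1 for d in dual_roots for v in roots)
# ... they are not in E_6 itself, and any two differ by an element of E_6.
assert not any(in_E6(d) for d in dual_roots)
assert all(in_E6(sub(d, e)) for d, e in combinations(dual_roots, 2))
```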

Exercise: Recall that we introduced this calculation with the choice between building the $W(E_6)$ invariants from the stabilizer of a root or of a “dual root”, and chose the former. Carry out such an analysis starting with one of the $27$ dual roots $r^*.$ Here $\langle r^*, r^* \rangle = 4/3$ and the orthogonal complement of $r^*$ intersects the $E_6$ lattice in a root lattice of type $D_5$, so we start from invariants of an index-$27$ subgroup $W(D_5)$ of $W(E_6)$, which have generators of degrees $2,4,6,8,5,$ together with the linear form $\langle r^*, x \rangle.$ Construct $P_{27}$ and $P_{72}$ in terms of these invariants of $W(D_5).$ The relevant short vectors are as follows. $27$ dual roots: $r^*$ itself; $10$ vectors such as $\pm e_i - \frac12 r^*$ where $e_i\,(i=1,2,3,4,5)$ are a basis for the index-$2$ superlattice ${\bf Z}^5$ of $D_5$; and $16$ vectors $\frac12 \sum_{i=1}^5 \epsilon_i e_i + \frac14 r^*$ where each $\epsilon_i \in \{1,-1\}$ and $\prod_{i=1}^5 \epsilon_i = +1$ — we have seen above how to deal with such combinations. [Check: the norms $\frac43, \, 1 + \frac14 \frac43, \, \frac54 + \frac1{16}\frac43$ are all equal.] $72$ roots: the $40$ roots $\pm e_i \pm e_j \, (1 \leq i < j \leq 5)$ of $D_5$, and $32$ vectors $\frac12 \sum_{i=1}^5 \epsilon_i e_i + \frac34 \epsilon_0 r^*$ where each $\epsilon_i \in \{1,-1\}$ and $\prod_{i=0}^5 \epsilon_i = -1.$ [Again the norms check out: $\frac54 + \frac9{16} \frac43 = 2.$ To get the sign of $\prod_{i=0}^5 \epsilon_i,$ multiply our dual root $\frac12 \sum_{i=1}^5 \epsilon_i e_i + \frac14 r^*$ by $3$ and subtract $2 \sum_{i=1}^5 \epsilon_i e_i$ to obtain an $E_6$ root.]
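Before starting the exercise one may want to confirm that the short vectors listed in it have the stated counts and norms. The following sketch does only that (it does not construct $P_{27}$ or $P_{72}$); a vector $w + c\,r^*$ is stored as `(w, c)` with $w \in {\bf Q}^5$ and $\langle r^*, r^*\rangle = 4/3$, an encoding and naming of our own.

```python
from fractions import Fraction as F
from itertools import combinations, product
from math import prod

NORM_RSTAR = F(4, 3)

def dot(u, v):
    """Inner product of w + c*r^* and w' + c'*r^*, with r^* orthogonal to the e_i."""
    return sum(a * b for a, b in zip(u[0], v[0])) + NORM_RSTAR * u[1] * v[1]

def e(i, s):
    """The vector s * e_i in Q^5."""
    return tuple(F(s) if k == i else F(0) for k in range(5))

def half_sign_vectors(parity):
    """All (eps_1, ..., eps_5)/2 with eps_i = +-1 and product of the eps_i equal to parity."""
    return [tuple(F(x, 2) for x in eps)
            for eps in product((1, -1), repeat=5) if prod(eps) == parity]

dual_roots = [((F(0),) * 5, F(1))]
dual_roots += [(e(i, s), F(-1, 2)) for i in range(5) for s in (1, -1)]
dual_roots += [(w, F(1, 4)) for w in half_sign_vectors(+1)]

roots = []
for i, j in combinations(range(5), 2):
    for si, sj in product((1, -1), repeat=2):
        roots.append((tuple(a + b for a, b in zip(e(i, si), e(j, sj))), F(0)))
for eps0 in (1, -1):
    # prod_{i=0}^5 eps_i = -1 forces prod_{i=1}^5 eps_i = -eps0.
    roots += [(w, F(3 * eps0, 4)) for w in half_sign_vectors(-eps0)]

assert len(dual_roots) == 27 and all(dot(d, d) == F(4, 3) for d in dual_roots)
assert len(roots) == 72 and all(dot(v, v) == 2 for v in roots)
assert all(dot(d, v).denominator == 1 for d in dual_roots for v in roots)
# Distinct dual roots in one coset have inner product 1/3 or -2/3.
assert all(dot(d, f) in (F(1, 3), F(-2, 3)) for d, f in combinations(dual_roots, 2))
```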

Monday, Nov. 15: Introduction to Shioda’s “excellent families” of rational elliptic surfaces with an additive fiber at $t=\infty$

See Chapters 8, 9, and especially 10 of the recent book

Matthias Schütt and Tetsuji Shioda: Mordell-Weil Lattices, Springer 2019, #70 in “Ergebnisse der Mathematik und ihrer Grenzgebiete (A Series of Modern Surveys in Mathematics)”.

and Shioda’s series of papers in 1989–1991 (cited in the book’s bibliography) on which these chapters are largely based. We gave an overview leading up to the cases of elliptic surfaces with an additive fiber of type II, III, or IV at $t=\infty$, giving rise to a surface with Mordell-Weil lattice $E_8^{\phantom.}$, $E_7^*$, $E_6^*$ respectively. In each case the invariant degrees of $W(E_n)$ appear as the weights of $n$ homogeneous parameters $a_i, b_j$ of the family of surfaces $y^2 = x^3 + a(t) x + b(t)$. Namely:

$n=8$: here $a(t) = \sum_{i=0}^3 a_i t^i$ and $b(t) = t^5 + \sum_{j=0}^3 b_j t^j$, with $t,x,y$ of weights $6,10,15$ respectively, so $a_i$ and $b_j$ have weights $20-6i = 2,8,14,20$ and $30-6j = 12,18,24,30;$
$n=7$: here $a(t) = t^3 + \sum_{i=0}^1 a_i t^i$ and $b(t) = \sum_{j=0}^4 b_j t^j$, with $t,x,y$ of weights $4,6,9$ respectively, so $a_i$ and $b_j$ have weights $12-4i = 8,12$ and $18-4j = 2,6,10,14,18;$ and finally
$n=6$: here $a(t) = \sum_{i=0}^2 a_i t^i$ and $b(t) = t^4 + \sum_{j=0}^2 b_j t^j$, with $t,x,y$ of weights $3,4,6$ respectively, so $a_i$ and $b_j$ have weights $8-3i = 2,5,8$ and $12-3j = 6,9,12.$

In each case the weight of the Weierstrass equation is the Coxeter number $h$ ($=1/n$ times the number of roots) of $E_n$, namely $30, 18, 12$ for $n=8,7,6$ respectively.
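The bookkeeping in the three cases above can be tabulated in a few lines. The sketch below recomputes the weights of the $a_i$ and $b_j$ from the weights of $t,x,y$ and compares them with the invariant degrees and root counts of $E_n$; those degrees and root counts are standard values that we enter by hand, and the dictionary layout is ours.

```python
# Weights of t, x, y and the index ranges of the free coefficients, as in the text.
families = {
    8: dict(t=6, x=10, y=15, a_idx=range(4), b_idx=range(4)),
    7: dict(t=4, x=6, y=9, a_idx=range(2), b_idx=range(5)),
    6: dict(t=3, x=4, y=6, a_idx=range(3), b_idx=range(3)),
}
# Standard data for W(E_n), assumed here rather than derived.
invariant_degrees = {
    8: [2, 8, 12, 14, 18, 20, 24, 30],
    7: [2, 6, 8, 10, 12, 14, 18],
    6: [2, 5, 6, 8, 9, 12],
}
number_of_roots = {8: 240, 7: 126, 6: 72}

def parameter_weights(n):
    """Return (weight of the Weierstrass equation, sorted weights of the a_i and b_j)."""
    fam = families[n]
    h = 2 * fam["y"]                                          # weight of y^2
    a_w = [h - fam["x"] - fam["t"] * i for i in fam["a_idx"]]  # a_i t^i x has weight h
    b_w = [h - fam["t"] * j for j in fam["b_idx"]]             # b_j t^j has weight h
    return h, sorted(a_w + b_w)

for n in (8, 7, 6):
    h, weights = parameter_weights(n)
    assert 3 * families[n]["x"] == h             # x^3 has the same weight as y^2
    assert weights == invariant_degrees[n]
    assert h * n == number_of_roots[n]           # Coxeter number = (number of roots) / n
    print(n, h, weights)
```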

[...]

Each of these three families contains more curves of rank at least $n$ than one might expect to exist, even though it has positive codimension in the moduli space of elliptic curves with $n$ points (the moduli space has dimension $n+1$, while the $E_n$ family has dimension $n$, counting $1$ for the choice of $t$ and $n-1$ for the complement of hyperplanes in ${\bf P}^{n-1}$). Indeed there are $\sim H^{10}$ elliptic curves $y^2 = x^3 + a x + b$ with $a,b \in {\bf Z}$ such that $a \ll H^4$ and $b \ll H^6$; of these, we expect $\sim H^9$ to have a small point (e.g. for an integral point we choose $x,y,a$ in one of $\sim H^{2+3+4}$ ways and then solve for $b$), and more generally about $H^{10-r+\epsilon}$ to have $r$ independent small points. This heuristic must fail for some $r$, because it predicts $H^\epsilon$ curves with $10$ independent small points, and none (more honestly: finitely many) with $11$; but already in 1954 Néron had constructed nonconstant elliptic surfaces of rank at least $10$ over ${\bf Q}(t)$, and infinite families of curves of rank at least $11$ (see Shioda’s nice account of this construction in Invent. Math. 109 (1991), 109–120, which cites Néron’s paper as [N1]). Still, one might not expect to see a counterexample with $r$ as small as $6$, which the $E_6$ family provides: we get $a,b \ll H^4,H^6$ by choosing a point of height at most $H^{1/2}$ in ${\bf P}^5$ and an integer $t \ll H^{3/2}$, for a total of $\sim H^{6/2} H^{3/2} = H^{9/2}$ choices, more than $H^{10-r} = H^4$. A similar accounting gives $H^{7/3} H^{4/3} = H^{11/3} > H^3$ for the $E_7$ family, and $H^{8/5} H^{6/5} = H^{14/5} > H^2$ for $E_8$.
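The exponent accounting in the last two sentences follows one pattern: if $a_0$ has weight $w$, a quantity of weight $1$ may be as large as $H^{4/w}$, and we count $n$ such coordinates plus the parameter $t$. A small sketch of that arithmetic, with a function name and argument layout of our own choosing:

```python
from fractions import Fraction as F

def family_exponent(n, weight_t, weight_a0):
    """Exponent e such that the E_n family supplies about H^e curves with a << H^4, b << H^6."""
    unit = F(4, weight_a0)          # a weight-1 quantity is bounded by H^unit
    return n * unit + weight_t * unit

for n, weight_t, weight_a0 in ((6, 3, 8), (7, 4, 12), (8, 6, 20)):
    exponent = family_exponent(n, weight_t, weight_a0)
    heuristic = 10 - n              # the naive prediction H^(10 - r) with r = n
    print(n, exponent, heuristic, exponent > heuristic)
```

In all three cases the family exponent ($9/2$, $11/3$, $14/5$) exceeds the heuristic one ($4$, $3$, $2$).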

Wednesday, November 17: Shioda’s “excellent families” cont’d: the case of $E_6$; variations: complex reflection groups from $E_8$ and $E_6$, and a Shioda-Usui family for $W(A_5)$

The case of $E_6$. We give some more details of the $E_6$ family and the $27$ minimal sections in one of the nontrivial cosets of $E_6$ in $E_6^*$; this example is simple enough to require only a single resultant but rich enough to give a flavor of the method. [...]

Complex reflection groups from $E_8$ and $E_6$. By specializing the $E_8$ and $E_6$ families we obtain “excellent families” of rational elliptic surfaces related to exceptional unitary reflection groups: two over the third cyclotomic field ${\bf Q}(\,\boldsymbol{\mu}_3) = {\bf Q}(\sqrt{-3})$, namely Shephard-Todd groups 25 and 32, of dimensions $3$ and $4$; and one over the fourth cyclotomic field ${\bf Q}(\,\boldsymbol{\mu}_4) = {\bf Q}(i)$, which is Shephard-Todd group 31, of dimension $4$.

Consider first the specializations $a=0$, which give surfaces $y^2 = x^3 + t^4 + \sum_{j=0}^2 b_j t^j$ (for $E_6$) and $y^2 = x^3 + t^5 + \sum_{j=0}^3 b_j t^j$ (for $E_8$). These are “potentially constant” elliptic surfaces: the $j$-invariant $j(E_t)$ is a constant function, here $j=0$ which is the $j$-invariant of a curve with endomorphisms by ${\bf Z}[\boldsymbol{\mu}_3]$. This gives the Mordell-Weil lattice the structure of module over ${\bf Z}[\boldsymbol{\mu}_3]$. It is known that in each case there is a unique such structure up to isomorphism; that is, each of the Euclidean reflection groups $W(E_6), W(E_8)$ has a unique conjugacy class of $3$-cycles $g$ whose fixed sublattice is $\{0\}$. The centralizer of $g$ is then the group of automorphisms of $E_6$ or $E_8$ as a ${\bf Z}[\boldsymbol{\mu}_3]$-lattice. In each case this is itself a unitary reflection group, of rank $3$ or $4$ respectively, and its invariant degrees are precisely the weights of the associated $b_j$. Thus we easily recover two “excellent families” of elliptic surfaces with $j=0$ by specializing the $E_6$ and $E_8$ families to linear ${\bf Q}(\,\boldsymbol{\mu}_3)$-subspaces.
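One quick consistency check on the claim about invariant degrees: the product of the degrees of a reflection group equals its order. The sketch below lists the weights of the $b_j$ that survive at $a=0$ and compares their products with the orders of Shephard-Todd groups 25 and 32; the orders $648$ and $155520$ are standard table values that we supply, not data from the notes.

```python
from math import prod

# Weight of b_j is (weight of y^2) - j * (weight of t), as in the E_6 and E_8 families.
b_weights = {
    "E6 family, a=0": [12 - 3 * j for j in range(3)],
    "E8 family, a=0": [30 - 6 * j for j in range(4)],
}
# Orders of Shephard-Todd groups 25 and 32 (assumed standard values).
group_order = {"E6 family, a=0": 648, "E8 family, a=0": 155520}

for name, weights in b_weights.items():
    print(name, sorted(weights), prod(weights), prod(weights) == group_order[name])
```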

[more about these two families]

The reflection group $W(E_8)$ also contains a unique conjugacy class of $4$-cycles $g$ such that $g^2 = -1$, giving the $E_8$ lattice the structure of a $4$-dimensional lattice over ${\bf Z}[i]$. The automorphism group is then the centralizer of $g$, which again is a unitary reflection group, with invariant degrees $8,12,20,24$, which is the subset of the invariant degrees of $W(E_8)$ that are divisible by $4$. One might guess that we could construct an “excellent family” for this group by specializing $b(t)$ to zero, obtaining potentially constant elliptic surfaces with $j=1728$; but this cannot work because we would lose the main term $t^5$. (We shall see that there is an “excellent family” of $j=1728$ surfaces corresponding to an exceptional unitary reflection group, but these surfaces have ${\bf Z}[i]$-rank $2$, not $4$.) Instead we have $g$ act by $(t,x,y) \mapsto (-t,-x,iy)$, which removes the coefficients $a_3,a_1$ and $b_2,b_0$, leaving $$ y^2 = x^3 + (a_2 t^2 + a_0) x + (t^5 + b_3 t^3 + b_1 t) $$ with $a_2,a_0$ of weights $8,20$ and $b_3,b_1$ of weights $12,24$. [...]
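The selection of surviving coefficients is a parity count: $g$ multiplies both $y^2$ and $x^3$ by $-1$, so a term of the $E_8$ family is preserved exactly when $g$ multiplies its monomial by $-1$ as well. A short sketch of that count (the function name and the dictionary output are ours):

```python
def surviving_coefficients():
    """Coefficients of the E_8 family compatible with g: (t, x, y) -> (-t, -x, i*y), with their weights."""
    keep = {}
    for i in range(4):
        # a_i t^i x is multiplied by (-1)^(i+1); it must pick up the same sign -1 as y^2.
        if (-1) ** (i + 1) == -1:
            keep[f"a{i}"] = 20 - 6 * i
    for j in range(4):
        # b_j t^j is multiplied by (-1)^j.
        if (-1) ** j == -1:
            keep[f"b{j}"] = 30 - 6 * j
    return keep

kept = surviving_coefficients()
print(kept)                                   # a0, a2, b1, b3 with their weights
assert (-1) ** 5 == -1                        # the main term t^5 also survives
e8_degrees = [2, 8, 12, 14, 18, 20, 24, 30]   # standard invariant degrees of W(E_8)
assert sorted(kept.values()) == [d for d in e8_degrees if d % 4 == 0]
```

The product $8 \cdot 12 \cdot 20 \cdot 24 = 46080$ agrees with the standard order of Shephard-Todd group 31, a value we quote from the usual tables rather than from these notes.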

A Shioda-Usui family for $W(A_5)$.

T. Shioda and H. Usui: Fundamental invariants of Weyl groups and excellent families of elliptic curves, Comment. Math. Univ. St. Pauli 41 (1992) #2, 169–217.

Monday, November 22: Another variation on a theme of Shioda: an “excellent family” of rational elliptic surfaces of rank $4$ with a $2$-torsion section and an action of $W(F_4).2$