Math 229: Introduction to Analytic Number Theory (Fall 2022)

Warning: MathJax requires JavaScript to process the mathematics on this page.
If your browser supports JavaScript, be sure it is enabled.

Lecture notes for Math 229: Introduction to Analytic Number Theory (Fall 2022)

If you find a mistake, omission, etc., please let me know by e-mail.

The orange ball marks our current location in the course.

For an explanation of the background pattern, skip ahead to the end of the page.

September 1 and 6: plan.pdf and intro.pdf: administrivia and “philosophy”/examples;
elem.pdf: Elementary methods I: Variations on Euclid;
Exercise 7(ii) corrected 12 Sep 2022: “of course” the exercise is to bound $\sum_{q=1}^x \phi(q)$, not $\sum_{q=1}^x \phi(x)$ !
euler.pdf: Elementary methods II: The Euler product for $s \geq 1$ and consequences

[also: which if any of 8675309, 6060842, 6654321, and 7184981043 is prime, and how surprised might you be if one of them is prime, or half of a prime pair? Thanks to Jordan Ellenberg and David Farmer for noting the prime and prime pair respectively. (Turns out that I had already seen the twin-prime observation in the mouse-over text for xkcd comic #1047: Approximations.]

The CA for Math 229 is Hari Iyer. If you’re on my mailing list then you already have his e-mail address.
His office hours will take place Mondays, 8:00–10:00 PM at Math Night in the Leverett Dining Hall.

“faculty legislation requires all instructors to include a statement outlining their policies regarding collaboration on their syllabi” [but apparently does not require us to be all that careful about singular/plural grammar …] — as stated in plan.pdf: for homework, “as usual in our department, you are allowed — indeed encouraged — to collaborate on solving homework problems, but must write up your own solutions.” For the final project or presentation, work on your own even if another student has chosen the same topic. (As with theses etc. it is still OK to ask peers to read drafts of your paper, or see dry runs of your presentation, and make comments.) In all cases, acknowledge sources as usual, including peers in your homework group.

For many more examples of and references for elementary approaches to the distribution of primes and related topics, see Paul Pollack’s book Not Always Buried Deep: A Second Course in Elementary Number Theory.

About $4, 25, 168, \ldots$: OEIS Sequence A006880 gives $\pi(10^n)$ for $n \leq 29$. Computing the last few terms (the last two of which are only a few years old) required both nontrivial algorithms and significant computational resources.

Here is XKCD’s haiku rendition of Euclid’s proof. (CW: b-word.) It’s not quite right — can you remove the last word and use those two syllables to fix the proof? — “But, hey, it’s a hallucination.”

To make $\ll$ and $\gg$ in TeX, write \ll and \gg respectively. Please do not write << and >> (which will produce $<<$ and $>>$ )!

Homework = elem Exercises 2,6,7 and euler Exercises 3, 4, 6. (In the past I’ve assigned some subset of those and elem:2,5, euler:2,7.)

September 13 and 15: dirichlet.pdf: Dirichlet characters and L-series; Dirichlet’s theorem under the hypothesis that L-series do not vanish at $s=1$

Homework = dirichlet Exercises 1, 3, 4, 7, 10, and 11. (For the second part of 11 you’ll probably want to use QR [= Quadratic Reciprocity] for the Kronecker or Jacobi symbol — or classical QR plus Dirichlet’s theorem [just infinitude, not logarithmic density]…)

The introduction of powers of $i$ for $r=5$ (starting at the bottom of page 3) is similar to the trick of summing every fourth entry in the $n$th row of Pascal’s triangle by averaging $(1+1)^n$, $(1-1)^n$, $(1+i)^n$, and $(1-i)^n$. Generalizing this to summing every $k$th entry again leads naturally to roots of unity and Pontrjagin duality for finite groups.

What we call the “inverse [discrete] Fourier transform” in the middle of p.5 is sometimes called the plain “[discrete] Fourier transform”, possibly with the factor of $|G|^{-1}$ omitted or replaced by $|G|^{-1/2}$. (There is a similar diversity of terminology for Fourier series and Fourier transforms.) I don’t think that any choice is “correct”, or even so dominant in the literature as to make all others look “wrong”; I regard any such choice as acceptable as long as it is introduced explicitly and used consistently.

One (overly) fancy way to identify the dual of ${\bf Z}/m{\bf Z}$ with μ$_m$ is to extend the quotient map ${\bf Z} \to {\bf Z}/m{\bf Z}$ to a short exact sequence $0 \to {\bf Z} \stackrel{m}{\to} {\bf Z} \to {\bf Z}/m{\bf Z} \to 0$, with the second arrow being multiplication by $m$, and then dualize to identify the dual of ${\bf Z}/m{\bf Z}$ with the kernel of multiplication by $m$ on the dual of $\bf Z$; but the dual of $\bf Z$ is just the circle group, so the kernel in question is its $m$-torsion subgroup, which is indeed μ$_m$. But to justify that we would have to extend Pontrjagin duality beyond finite groups to infinite groups like $\bf Z$ (discrete but not compact) and the circle (compact but not discrete).

September 20: chebi.pdf: Cebysev’s method; introduction of Stirling’s approximation, and of the von Mangoldt function $\Lambda(n)$ and its sum $\psi(x)$

Edis Memis notes that the inequality ${2x \choose x} \geq \prod_{x \lt p\leq 2x} p$ that I mentioned in class does not quite give $\psi(2x) - \psi(x) \lt x \log 2$ for $x \in \bf Z$, because it does account for the contributions to $\psi(2x) - \psi(x)$ of terms $\Lambda(n)$ with $n = p^k$ and $k \geq 2$. But our formula for $\log \lfloor x \rfloor !$ does give $\log {2x \choose x} \geq \psi(2x) - \psi(x)$, because each term $\Lambda(n) \, (\lfloor 2x/n \rfloor - 2 \lfloor x/n \rfloor)$ is nonnegative, and those with $n \lt x \leq 2n$ have $(\lfloor 2x/n \rfloor - 2 \lfloor x/n \rfloor) = 1$ and thus sum to $\psi(2x) - \psi(x)$.

Homework = chebi Exercises 1, 2${}^*\!$, 4, 6
* I could not reconstruct my argument for $\psi(x) < C x + O(\log^2 x)$ with $C = \frac14 \log 108 = 1.1705+$. I can find a simple proof for $C = \frac15 \log 432 = 1.2137-$, and a reasonably simple further improvement to $C = 1.0824-$. For this homework problem I’ll be satisfied with a proof (still starting from (1)) of $\psi(x) < C x + O(\log^2 x)$ for any explicit constant $C$ that is strictly smaller than $\log 4 = 1.3863-$ (and thus sufficient to prove Bertrand’s postulate when combined with a short chain of primes $p_k$ with each $p_k < 2 p_{k-1}$).
By comparison: already in 1892 Sylvester announced bounds within 4.4% of the PNT using nothing more than log tables, extensive hand computation with factors of $30030 = 2 \cdot 3 \cdot 5 \cdot 7 \cdot 11 \cdot 13,$ and (the crucial ingredient) more detailed analysis of the errors left over when comparing a sum of $c_d \log \lfloor x/d \rfloor!$ with $\psi(x)$ or $\psi(x) - \psi(x/m).$ His paper is “On arithmetical series, II”, Messenger of Math. (2) 21 (1892), 87–120, and fortunately this volume can be found in the Harvard libraries. The bounds $[0.95694, 1.04423]$ are claimed towards the end (page 119), after deriving the bounds $[0.922, 1.0765]$ (page 99) and $[0.946,1.0552]$ (page 106) from similar analysis mod 210 and 2310.
Almost a century later N. Costa Pereira combined such ideas with digital computation (again without linear programming) to report a series of bounds culminating with $|(\psi(x)/x)-1| < 1/2976$ (!) for large enough $x,$ giving the explicit bound $10^{11}$ for “large enough” (page 327 of his paper “Elementary estimates for the Chebyshev function $\psi(x)$ and the Möbius function $M(x)$”, Acta Arithmetica 52 (1989), 307–337). One could come arbitrarily close to 1 via the analytic formula for $\psi(x)$ that we’ll develop in the next few weeks, but the closest explicit bounds I’ve seen using that approach are still a bit above 1/1000 (L.Schoenfeld, “Sharper bounds for the Chebyshev functions $\theta(x)$ and $\psi(x)$. II”, Math. Comp. 30 #134 (1976), 337–360; MR0457375 (56 #15581c)).

Keith Conrad (who unlike me can read Russian, not just sound it out) writes that the trick of factoring $2n \choose n$ “was Erdos’s simplification to the proof of bounds on $\psi(x)$ […] Chebyshev never used binomial coefficients.” It still seems plausible that Chebyshev started with $2n \choose n$ but didn’t mention it in his 1852 paper (in French) because it not only gives worse bounds than he obtained using another combination of values of $\log (c_i n)!$ but also does not quite prove Bertrand’s postulate directly (though 80 years later Erdös did give a proof starting from the factorization of $2n \choose n$; this Wikipedia page gives a proof along these lines).

Here is a scan of Ralph B. Boas’ “spelling lesson”, from the College Math J. 15 #3 (June 1984), page 217.

Starting September 22:
psi.pdf: Complex analysis enters the picture via the contour integral formula for $\psi(x)$ and similar sums (a.k.a. Perron’s formula);
Homework from psi = Exercise 3
zeta1.pdf: The functional equation for the Riemann zeta function using Poisson summation on theta series; basic facts about Γ(s) as a function of a complex variable s
Homework from zeta1 = Exercises 1, 8, 9

The contour-integral formula can be regarded as an instance of the formula for inverting the Laplace transform, in the Laplace transform’s guise as the Mellin transform. Indeed partial summation of the Dirichlet series for $-\zeta' / \zeta$ writes $-\frac{\zeta'}{\zeta}(s) / s$ as the Mellin transform of $\psi(x)$ (evaluated at $-s$ if we go by Wikipedia’s normalization), and the inversion formula for recovering $\psi(x)$ would be precisely our contour integral if we were allowed to integrate all the way from $c - i\infty$ to $c + i\infty$ without worrying about convergence and error estimates. The Laplace inversion formula is in turn closely related with the more familiar formula for inverting the Fourier transform.

Euler already guessed the functional equation in some sense: you’ve probably run across the “identity” 1 + 2 + 3 + 4 + … = −1/12 (that he may have derived from 1 − 2 + 3 − 4 + … = 1/4, though according to the 1 + 2 + 3 + 4 + … page it is not clear whether Euler ever stated the −1/12 version), and −1/12 agrees with the value of $\zeta(-1)$; he likewise “computed” alternating sums that correspond to $(1 - 2^{1+n}) \zeta(-n)$ for other small integers $n \geq 0,$ finding numbers equivalent to $\zeta(-n) = -1/2, -1/12, 0, 1/120, 0, -1/252, 0, 1/240, 0, -1/132, 0, 691/32760, 0$ for $n = 0, 1, 2, 3, \ldots$ — and the appearance of 691 surely suggested (if he didn’t notice this earlier) a connection with the values of $\zeta(n+1)$ that Euler had already obtained. Euler then proved this connection; he even chose $\cos(\pi n/2)$ for the 4-periodic fudge factor, which turns out to be right for all complex $n$ ! But Riemann was still the first to give a formula for $\zeta(s)$ for complex $s$ and to prove the functional equation in this setting.
To spell out the argument suggested on p.2 that $\Gamma$ has no zeros: since $s \Gamma(s) = \Gamma(s+1)$ for all $s$, it is enough to show $\Gamma(s) \neq 0$ for ${\rm Re}(s) \gt 0;$ if there were such a zero, we could conclude from the formula for $B(s_1,s_2)$ that if $s_1 + s_2 = s$ (with both $s_1, s_2$ in the right half-plane) then either $\Gamma(s_1) = 0$ or $\Gamma(s_2) = 0,$ and that would make $\Gamma$ identically zero by analytic continuation — contradiction. (Come to think of it, we already get $\Gamma(s) \neq 0$ from Euler’s reflection formula $\left[B(s,1-s) = \right] \Gamma(s) \Gamma(1-s) = \pi / \sin (\pi s)$, whose right-hand side surely has no zeros. Alternatively, use Bohr-Mollerup to establish the product formula for $\Gamma(s)$ on the positive real axis, and then invoke analytic continuation.)
You can find David Wilkins’ transcription and English translation of Riemann’s fundamental 1859 paper here. A conjecture equivalent to what we now call the Riemann Hypothesis appears in the middle of page 4 of the translation (numbered 5 in the PDF file because the title appears on page 0); note that in the previous page Riemann set $s=(1/2)+ti.$
Riemann’s proof of $\xi(s) = \xi(1-s)$ (and generalizations such as those we shall develop for Dirichlet L-functions) illustrate the versatile technique of studying an infinite sum by converting it to a definite integral. A few more exotic examples from my own contributions to Mathoverflow and Math Stack Exchange include the evaluation of $\sum_{n=0}^\infty \frac{n! \, (2n)!}{(3n+2)!}$ (MSE 516263) and the asymptotic analysis of $\sum_{k=0}^n {n \choose k} (-1)^k \sqrt{k}$ as $n \to \infty$ (MSE 368212) and $\sum_{n=0}^\infty (-x)^n / n^n$ as $x \to \infty$ (MO 109160).
If you like “distributions”, you can express Poisson summation as “a row of deltas $\sum_{m \in {\bf Z}} \delta_m$ is its own Fourier transform”: the inner product of any function $\,f$ with the “row of deltas” is the sum of the values of $\,f$ over the integers; now use Parseval for Fourier transforms.
Our special case of the Gaussian works also for complex $u$, giving rise to the value at $iu$ of a modular form of weight $1/2$. The “sanity check” for the resulting transformation formulas turns out to be basically equivalent with Quadratic Reciprocity!
Even for real u there’s more to be said. The functional equation for $\theta$ says nothing about $\theta(1)$ [which remarkably turns out to be $\pi^{1/4} / \Gamma(3/4)$] but does yield the ratio $\theta'(1) / \theta(1) = -1/4,$ which in turn gives the approximation $e^\pi \approx 8 \pi - 2$ by ignoring all terms of size $\exp(-3\pi)$ and smaller. This, together with the Archimedean approximation $\pi \approx 22/7$, explains the first run of 9s in $e^\pi - \pi = 19.999099979\ldots$ (see Exercise 10 and XKCD 217).
Another neat example of Poisson summation: if $f(x) = 1 / (x^2+c^2)$ (some constant $c \gt 0$), then the Fourier transform of $\,f$ is $(\pi/c) \exp(-2\pi c |y|)$ (a standard exercise in contour integration, which can also be done by Fourier inversion since the Fourier transform of $\exp(-2\pi c|y|)$ is an elementary integral); hence $\sum_{n\in\bf Z} 1/(n^2+c^2)$ is readily evaluated as $(\pi/c) (e^{2\pi c} + 1) \, / \, (e^{2\pi c} - 1)$ using the formula for summing a convergent geometric series. Subtracting the $n=0$ term $f(0) = 1/c^2$ and letting $c \to 0$, we recover Euler’s formula $\zeta(2) = \pi^2/6$. [This is Exercise 11 in zeta1.] The $c^2, c^4, c^6$, etc. terms in the Laurent expansion of $(\pi/c) (e^{2\pi c} + 1) \, / \, (e^{2\pi c} - 1)$ about $c=0$ then yield the values of $\zeta(s)$ for $s=4, 6, 8,$ etc. as rational multiples of $\pi^s$.
Poisson summation works in the general setting of locally-compact abelian groups: if $\,f$ is “any” function on such a group $G$, and $H \subseteq G$ is a closed subgroup, then $\int_H f$ is (appropriately scaled) the integral of the Fourier transform $\hat f$ of $\,f$ over the annihilator of $H$. In standard Poisson, $G = \bf R$, $H = \bf Z$ (which is its own annihilator), and the “integrals” over $\bf Z$ are just sums. Tate’s thesis gives a proof of the functional equation of the zeta function of any number field $K$ by using $K$ as the subgroup of the adèles of $K$. We won’t go there in Math 229x, but will note the generalization from $(G,H) = ({\bf R}, {\bf Z})$ to $(G,H) = ({\bf R}^n, L)$ where $L$ is some lattice in ${\bf R}^n$; the annihilator of L is then its dual lattice (a.k.a. the “reciprocal lattice” in crystallography).

September 29: gamma.pdf and prod.pdf: review of the complex Gamma function, and of functions of finite order (Hadamard’s product formula and its logarithmic derivative)
Homework: gamma Exercise 4, prod Exercises 1 and 2.

October 4 and 6: zeta2.pdf: the Hadamard products for $\xi(s)$ and $\zeta(s)$, vertical distribution of the zeros of $\zeta(s)$ (and an interlude on the Lindelöf hypothesis, on which see also the “further remarks” on page 4); free.pdf: the nonvanishing of $\zeta(s)$ on the edge $\sigma=1$ of the critical strip, and the classical zero-free region.
Homework = zeta2 Exercises 2 and 3; free Exercises 1 and 2
zeta2 Corrected October 7: in Exercise 2, $\xi(s)$ (with the definition we’re using) has poles, and so is not an entire function! Corrected to $(s^2-s) \xi(s)$.

The notes use (but possibly could state more explicitly) the following facts about the real part, call it $r = x / (x^2+y^2),$ of the (multiplicative) inverse of a complex number $x + iy$ with $x \gt 0:$
• $r$ is always positive;
• $r$ is bounded away from zero if $y$ is bounded and $x$ is in a fixed interval such as $[1,2]$ with both endpoints positive; and
• for large $y$ (say $|y| \geq 1$), $r$ decays as $1/y^2$ as $|y| \to\infty$ (again assuming that $x$ is in say $[1,2]$).
This picture used to appear without explanation on the web page for John Derbyshire’s Prime Obsession. It is a plot of the Riemann zeta function on the boundary of the rectangle $[0.4,0.6]+[0,14.5]i$ in the complex plane. Since the contour winds around the origin once (and does not contain the point $s=1,$ which is the unique pole of $\zeta(s)$), the zeta function has a unique zero inside this rectangle. Since the complex zeros are known to be symmetric about the line ${\rm Re}(s) = 1/2,$ this zero must have real part exactly equal $1/2$, in accordance with the Riemann hypothesis.
It is known that this first “nontrivial zero” of $\zeta(s)$ occurs at $s = 1/2 + it$ for $t = 14.13472514\ldots.$ The pole at $s=1$ accounts for the wide swath in the third quadrant, which corresponds to s of imaginary part less than 1.
Here’s a similar picture for $L(s,\chi_4)$ on $[0.4,0.6]+[0,11]i.$ With no pole in the vicinity, this picture is not as visually interesting. We see the first two nontrivial zeros, with imaginary parts 6.0209489... and 10.2437703...
For more pictures along these lines, see Juan Arias de Reyna’s manuscript “X-Ray of Riemann’s Zeta function”.
The coefficients $3, 4, 1$ of the inequality $3 + 4 \cos(\theta) + \cos(2\theta) \geq 0$ may seem “random”/mysterious, but they’re basically just the fourth row $1, 4, 6, 4, 1$ of Pascal’s triangle, as you can see by writing $3 + 4 \cos(\theta) + \cos(2\theta)$ as a linear combination of $\exp(in\theta)$ with $n = -2, -1, 0, 1, 2.$
The coefficient $\sqrt{2\pi}$ in Stirling’s approximation can also be obtained from Euler’s reflection formula $\Gamma(s) \Gamma(1-s) = B(s,1-s) = \pi / \sin(\pi s)$. [I was worried about $s$ and $1-s$ both being in $R_\epsilon$, but that’s easily arranged by fixing $\sigma \in \bf R$ and considering $s = \sigma + it$ with $|t| \to \infty$.]

October 11: pnt.pdf: Conclusion of the proof of the Prime Number Theorem with error bound; the Riemann Hypothesis, and some of its consequences and equivalent statements.

Here’s an expository paper by B. Conrey on the Riemann Hypothesis, which includes a number of further suggestive pictures involving the Riemann zeta function, its zeros, and the distribution of primes.
Here’s the Rubinstein-Sarnak paper “Chebyshev’s Bias” (Experimental Mathematics, 1994).
Here’s a bibliography of fast computations of $\pi(x)$.

October 11 and 13: lsx.pdf: $L(s,\chi)$ as an entire function (where $\chi$ is a nontrivial primitive character mod $q$); Gauss sums, and the functional equation relating $L(s,\chi)$ with $L(1-s,\bar\chi)$

It is not actually wrong to state the Proposition on page 1 with “positive” rather than “nonnegative” coefficients $a_n$, because if any $a_n=0$ then we can simply remove the term $a_n e^{-\lambda_n s}$ from the sum (and if doing that for all such $n$ results in a finite or empty sum then it is automatically convergent on all of $\bf C$).
“Recall” that Exercises 8 and 9 in zeta1 introduced the even and odd cases of the functional equation for $L(s,\chi)$ by the examples of $\chi = \chi_8^{\phantom0}$ and $\chi = \chi_4^{\phantom0}$ respectively. Since in each case $\chi$ is real (i.e. $\chi = \overline\chi)$ we didn’t see at that point that in general the functional equation must relate $L(s,\chi)$ with $L(1-s,\overline\chi)$.

October 18: pnt_q.pdf: Product formula for $L(s,\chi),$ and ensuing partial-fraction decomposition of its logarithmic derivative; a (bad!) zero-free region for $L(s,\chi),$ and resulting estimates on $\psi(x,\chi)$ and thus on $\psi(x, a \bmod q)$ and $\pi(x, a \bmod q).$ The Extended Riemann Hypothesis and consequences.
Corrected 18 Oct 2022 to fix a grammar typo on page 2 (“and thus each factor”, not “and thus each factors”)
Corrected 22 Oct 2022 to fix a typo in a formula near the bottom of page 3: “…because the contribution of $\sum_{n^2 < x} \chi(n^2) \Lambda(n^2)$ to $\psi(x,\chi)$” [with $\sum_{n^2 < x}$, not $\sum_{n < x}$].
Homework = Exercises 1 and 5

Thanks to William Chang ‘19 for alerting me to a more elementary proof of the step that shows that for any $t \geq 2$ there are $O(\log|qt|)$ zeros with imaginary part in $[t-1,t+1]$ (see page 3 of zeta2 for the proof we gave in the case $q=1$): apply Jensen’s inequality (which we already used in prod, see (3)) to $L(s,\chi)$ on a disc such as $|s-(2+it)| \leq 3$. The central absolute value $|L(2+it,\chi)|$ is bounded away from zero (recall dirichlet Exercise 11), while the maximum is bounded by some power of $q|t|$ (this doesn’t even require the functional equation, only analytic continuation via Euler-Maclaurin). But if ${\rm Im}\,\rho \in [t-1,t+1]$ then $|\rho-(2+it)| \leq \sqrt 5 \lt 3$, so if there are $N$ such $\rho$ then $(3/\sqrt 5)^N \ll (q|t|)^{O(1)}$, whence $N \ll \log (q|t|)$ as claimed.

Apropos Exercise 5 for the Γ(s) handout: here’s a graph of $$ S(x) = x - x^2 + x^4 - x^8 + x^{16} - x^{32} + - \cdots $$ for $x \in [0,0.9995]$. (Look near $(1,0.5)$ in the maximum available magnification to see the first few oscillations.) The fact that $S(0.995) = 0.50088\ldots \gt 1/2$, together with the functional equation $S(x) = x - s(x^2)$ (from which $S(x) = x - x^2 + S(x^4)$ follows), suffices to refute the guess that $S(x) \to 1/2$ as $x \to 1-$.

October 20 and 25: free_q.pdf; The classical region $1 - \sigma \ll 1 / \log(q|t|+2)$ free of zeros of $L(s,\chi)$ with at most one exception $\beta$; the resulting asymptotics for $\psi(x, a \bmod q)$ etc.; lower bounds on $1 - \beta$ and $L(1,\chi),$ culminating with Siegel’s theorem. [The fancy script L “ℒ ” is {\mathscr L}, with \mathscr defined in the mathrsfs package.]
Homework = Exercises 1 and 3

If you were expecting the complex conjugate of $\chi(a)$ in formulas 7, 8 (top of page 5 of free_q), rather than $\chi(a)$ itself, remember that we just showed that if $\beta$ exists at all then $\chi$ is real.
It turns out that Exercise 4 is far from the strongest result of the form “small $L(1,\chi)$ implies close Siegel zero”. In fact if $L(1,\chi)$ is small enough (for a primitive real character $\chi \bmod q$) then one can show that there exists a Siegel zero $\beta$ with $1 - \beta \ll L(1,\chi)$; here we can show that $L(1,\chi) < c \, / \log^3 \! q$ is “small enough” using just elementary estimates (no need for positivity of the convolution $1 * \chi$, nor the Hadamard product formula), and Terry Tao improves this to $L(1,\chi) < c \, / \log q$ in Lemma 59 of these online lecture notes on “Complex-analytic multiplicative number theory”. Even $c \, / \log^3 \! q$ decreases more slowly than $q^{-\epsilon}$ for any $\epsilon \gt 0$, so suffices to prove that Siegel’s theorem is equivalent to the statement $$ \forall \epsilon \gt 0 \; \exists c_\epsilon \gt 0: \forall \chi \bmod q, \, L(1,\chi) \gt c_\epsilon q^{-\epsilon}, $$ and if either of this statement and Siegel’s theorem can be made effective then the other is automatically effective too.
To get the $c \, / \log^3 \! q$ result we can argue as follows. Recall that for $\sigma \gt 1$ we proved $$ L(1,\chi) \gt \prod_p (1 + p^{-\sigma})^{-1} = \zeta(2\sigma) \, / \, \zeta(\sigma) \gg \sigma - 1. $$ Thus there exists $C \gt 0$ such that $L(1,\chi) = \delta \lt 1$ implies $L(1 + C\delta,\chi) \gt 2 \delta$; by the mean value theorem (MVT) we deduce $L'(\sigma) \gt 1/C$ for some $\sigma \in (1, 1+C\delta)$. But we can show $L''(\sigma,\chi) \ll \log^3 \! q$ for $\sigma \gt 1 - (1 \, / \log q)$. (This is similar to the proof of $L'(\sigma,\chi) \ll \log^2 \! q$ on the same interval; see Exercise 3 in free_q.) By MVT again, $L'(\sigma,\chi)$ changes by at most $O(\delta \log^3 \! q)$ on the interval $[1 - 2C\delta, 1 + C\delta]$. Therefore, if $\delta$ is less than a sufficiently small multiple of $1 \, / \log^3 \! q$ then $L'(\sigma,\chi) \gt 1/(2C)$ for $\sigma \gt 1 - 2C\delta$; hence (by MVT yet again) $L'(1 - 2C\delta,\chi) \lt 0$, and we conclude that $L(\beta,\chi) = 0$ for some $\beta \gt 1 - 2C\delta$. Note that $C$, as well as the constant implicit in “a sufficiently small multiple of $1 \, / \log^3 \! q$”, are effective.

October 27: l1x.pdf: Closed formulas for $L(1, \chi)$ and their relationship with cyclotomic units, class numbers, and the distribution of quadratic residues.
Homework = Exercises 1, 2, 3 (extra credit: 4)

A purely algebraic proof of Dirichlet’'s formula for $L(1,\chi)$ for an odd character $\chi$, and thus of the positivity of $S_\chi\bigl(\frac12(q-1)\bigr)$, was finally obtained in
Orde, H.: On Dirichlet's class number formula, J. London Math. Society (2) #18 (1978), 409–420.
In Exercise 2, the character $\chi$ must be assumed real, else both the Gauss-sum factor and the irrational character values usually (always?) put $q^{1/2} \pi^{-n} L(n,\chi)$ in some proper cyclotomic extension of $\bf Q$.
To prove the formula for $\sum_{n=1}^\infty \frac1n e^{2\pi i an/q}$: The Taylor series $\sum_{n=1}^\infty z^n / n = -\log(1-z)$ converges absolutely for $|z|<1$; if $|\zeta|=1$ and $\zeta \neq 1$ then $-\log(1-z)$ extends to a continuous function on the closed segment $\{t\zeta : 0 \leq t \leq 1\}$, and $\sum_{n=1}^\infty \zeta^n / n$ still converges conditionally. It follows from Abel’s theorem that $\sum_{n=1}^\infty \zeta^n / n = -\log(1-\zeta).$ Now take $\zeta = e^{2\pi i a / q}.$ (The Wikipedia page currently states the theorem for real power series $\sum_{k=0}^\infty a_k x^k,$ but the coefficients $a_k$ can just as well be complex.)
The formula for the real and imaginary parts of $\log (1 - e^{2\pi i a/q})$ can be seen geometrically in the isosceles triangle with vertices $0,1,e^{2\pi i a/q}$. The real part is $\log |1 - e^{2\pi i a/q}|$, and $|1 - e^{2\pi i a/q}|$ is the length of the side opposite $0$, which is $2 \sin a \pi/q$ (note that no absolute value sign is needed as long as $0 < a < q$, because then $0 < a \pi/q < \pi$). The imaginary part $(\pi/2) (\frac{2a}{q}-1)$ is the angle between $1 - e^{2\pi i a/q}$ and the horizontal, which can be computed starting from the vertex angle $2\pi a/q$ of the triangle (or $2\pi(1-\frac{a}{q})$ if $a > q/2$).
The formulas for $L(1,\chi)$ include the familiar Leibniz-Madhava formula $1 - \frac13 + \frac15 - \frac17 + - \cdots = \frac\pi 4$ (with $\chi = \chi_4$), and nice but less familiar examples such as $L(1,\chi_5) = \frac2{\sqrt{5}} \log \varphi$ where $\varphi = \frac12(1+\sqrt5)$ (the “golden ratio”) is the fundamental unit of ${\bf Q}(\sqrt 5)$ (with cyclotomic representations $(\sin 2\pi/5) \, / \, (\sin \pi/5) = 2 \cos \pi/5 = (2\cos 2\pi/5)^{-1}$ — you might have seen $\sin 18^\circ = \frac14(\sqrt{5}-1)$ in connection with the regular pentagon or high-school contest math, and it also follows from the evaluation of the Gauss sum $\tau(\,\chi_5) = \sqrt{5}$).
Once you know the Fourier series for the “sawtooth wave” (imaginary part of $\log |1 - e^{2\pi i x}|$), you can recover the series for the “square wave” (top of page 3) by subtracting a sawtooth of double the frequency and half the amplitude. [In effect this also accounts for the factor $2 - \chi(2)$ in the formula for $S_\chi((q-1)/2)$.] Multiplicatively, this leads us to the identity $(1-z)^2 / (1-z^2) = (1-z)/(1+z)$ (where $z = e^{2\pi i x}$), which is pure imaginary for $|z|=1$, with sign opposite to the sign of ${\rm Im}(z)$; the fact that $(1-z)/(1+z) \in {\bf R}i$ for $|z| = 1$ corresponds to one of the earliest theorems of Greek geometry, “generally attributed to Thales of Miletus”.

November 8 (and 10): sieve.pdf: The Selberg (a.k.a. quadratic) sieve and some applications
Concerning the note to Exercise 4: the $k=1$ case of Schinzel’s conjecture was stated explicitly by Bunyakovsky as early as 1857.

For more on sieve techniques and inequalities, especially since Selberg’s Lectures on Sieves (1969), see the more recent and very extensive treatment in Opera de Cribro by Friedlander and Iwaniec (Providence: American Math. Society, 2010).

Homework = Exercises 1, 2, 3

November 10: weyl.pdf: Introduction to exponential sums; Weyl’s equidistribution theorem and the application to equidistribution $\bmod 1$ of arithmetic progressions with irrational step. [interlude on “little $o$” notation]
Homework = Exercises 1, 3, 4

November 15 and 17: more analytic estimates on exponential sums (kmv.pdf and vdc.pdf): Mean-square estimates via Beurling’s function (we won’t cover the Montgomery-Vaughan generalization of Hilbert’s inequality); the Kuzmin and van der Corput bounds
Homework = kmv Exercises 1, 2, 3 and vdc Exercises 1, 2, 3

Functions $\phi : {\bf R} \to {\bf C}$ whose Fourier transforms $\,\widehat{\!\phi}$ are supported on some interval $[-\delta,\delta]$ (i.e. satisfy the condition $|r| > \delta \Rightarrow \,\widehat{\!\phi}(r) = 0$) are often said to be “band-limited”, as in the title of [Logan 1977], because they are (limits of) linear combinations of exponentials whose frequencies are supported on the “band” $|r| \leq \delta$.
In dimension $\gt1$, good analogs of $\beta_\pm$ that minorize or majorize the characteristic function of a box and have Fourier supports in another box are still quite mysterious. Another application, to upper bounds on sphere-packing densities, requires only inequalities on both $\beta$ and $\hat\beta$; in that setting a new construction of an optimal test function (but still reminiscent of Beurling’s use of the factor $(\sin \pi z)^2$) was found very recently by Viazovska to solve the sphere-packing problem in dimension 8, and was adapted shortly thereafter to solve it in dimension 24.
Here’s Terry Tao’s take on the van der Corput inequalities, and on the Vinogradov estimates which improve on van der Corput for large $k$ (with an exponent of $c/k^2$ instead of $1/(2^k-2)$).

November 22: short.pdf: The Davenport-Erdös bound and the distribution of short character sums, with some applications

Homework = short Exercises 2 and 3.

This term we will not cover the Burgess bounds on short character sums, which yield nontrivial estimates on $L(1/2 + it, \chi)$ and the least “quadratic nonresidue” $\bmod p$.

There are several Davenport-Erdös results, so one must specify which one is intended. According to this list at the Erdös Number Project, Davenport published 7 papers with Erdös, which only ties him for 41st among the 202 co-authors of more than one joint paper with Erdös. (The maximum is 62 !)
Re Proposition 1: the formula $\sum_{X \lt l \leq N} 1/l = \log (\log(N) / \log(X)) + o(1)$ in the proof follows from $\sum_{l \leq y} 1/l = \log(\log(y)) + M + o(1)$ where $M$ is the Meissel-Mertens_constant. This constant drops out from the formula for $\sum_{X \lt l \leq N} 1/l.$
The size of the smallest quadratic residue is also of interest in theoretical computer science: finding a “quadratic non-residue” of a given large prime $p$ is a standard example of an easily “RP” (randomized polynomial-time) problem not known to have a deterministic polynomial-time algorithm. [Since it takes only $\log p$ bits to specify $p$, “polynomial” here means polynomial in $\log p$. It is easy to check whether $(a/p) = -1$ given $a,p$: Euler’s criterion in terms of $a^{(p-1)/2} \bmod p$ is polynomial-time thanks to the repeated-squaring trick. So choosing $a$ at random works with probability $1/2$. Under ERH, trying the first $O(\log^2 p)$ residues is a polynomial-time algorithm, but it is not known unconditionally to succeed for all $p$.] This problem is equivalent to extracting square roots $\bmod p$ when they do exist (and thus solving arbitrary quadratic equations $\bmod p$) in polynomial time: if we could to that, we would start from $-1$ and repeatedly take square roots until we reach some $a \bmod p$ that does not have a square root; conversely, given $a \bmod p$ with $(a/p) = -1$ it is easy to extract a square root of any $c \bmod p$ with $(c/p) = +1$, by using Euler’s criterion $O(\log p)$ times to write a square root as $c^m a^{(n+1)/2}$ for some $m$, where $(p-1) = 2^e n$ (and $e=v_2(p-1)$ is the number of iterations). NB finding a square root of $-1$ (when $p \equiv +1 \bmod 4$) is polynomial time, thanks to Schoof’s algorithm; and indeed for each $k$ it is now known that a $2^k$-th root can be found in polynomial time if it exists (i.e., when $p \equiv 1 \bmod 2^{k+1}$). But the degree of this polynomial grows too rapidly with $k$ for this to give an algorithm that is polynomial time for all $p$.
For the characterization of the Gaussian distribution by its moments: suppose more generally that $\mu$ is a probability measure on ${\bf R}^n$. We say that $\mu$ has “exponential decay” if there exists $c \gt 0$ such that for all $R \gt 0$ the complement of the $R$-ball about the origin has measure $O(\exp(-cR))$. Then the Fourier transform of $\mu$ [which takes any vector $y$ to $\int_{{\bf R}^n} \exp(2\pi i \langle x,y \rangle) \, d\mu(x) = \int_{{\bf R}^n} e(\langle x,y \rangle) \, d\mu(x)$] extends to an analytic function on a neighborhood of ${\bf R}^n$ in ${\bf C}^n$. Hence if $\mu$ and $\mu'$ are two probability measures with exponential decay, all of whose moments agree, then the Fourier transform of $\mu - \mu'$ is an analytic function that vanishes together with all its derivatives at the origin, and is thus identically zero — whence $\mu - \mu' = 0$ so $\mu = \mu'.$
In fact it is sufficient for just $\mu$ to have exponential decay, because exponential decay can be detected from the moments. Assume for simplicity $n = 1.$ Then for even $r$ the $r$th moment is $\ll \int_{-\infty}^\infty x^r \exp(-c|x|) \, dx,$ and thus $\ll r! / c^r.$ Conversely, if this bound holds for all even $r$ then the complement of the $R$-ball about the origin has measure $\ll r! / (cR)^r$ for all even $r$; taking $r = cR + O(1)$ we find that this measure is $\ll R^{1/2} \exp(-cR),$ which yields the desired exponential decay with any coefficient less than $c.$ The same argument works in dimension $n$ (with $R^{n/2} \exp(-cR)$ in place of $R^{1/2} \exp(-cR)$).
In particular, Gaussian distributions on $\bf R$ and $\bf C$ are characterized by their moments, as claimed.
WARNING Without the condition of exponential decay, it is not true that every probability distribution with finite moments is characterized by those moments. Indeed there are real-valued smooth functions $f,$ not identically zero, such that $\int_{-\infty}^\infty x^r f(x) \, dx = 0$ for each $r = 0, 1, 2, \ldots$; we can thus write some multiple of $f$ as $f_+ - f_-$ with $f_\pm \geq 0$ and $\int_{-\infty}^\infty f_\pm (x) \, dx = 1,$ and then $f_+ \, dx$ and $f_- \, dx$ are two different distributions on $\bf R$ with the same moments.

November 29 and December 1: kloos.pdf: An application of Weil’s bound on Kloosterman sums;
many_pts.pdf: How many points can a curve of genus $g$ have over the finite field of $q$ elements? The zeta function of a curve over a finite field; the Weil and Drinfeld-Vladut bounds, and related matters.

Here are the plots of $x y \equiv 256 \bmod 691$ and $x y \equiv 691 \bmod 5077$, illustrating the asymptotically uniform distributions studied in this chapter (and also illustrating a bit of mathematical PostScript trickery, as you can see from the source files for the plots for p = 691 and p = 5077).
many_pts_0.pdf: Some explanation for the possibly mysterious argument on the bottom of p.3
Here are some tables of curves of given genus over finite fields with many rational points.
Let $M_k(g)$ be the maximal number of rational points of a genus-$g$ curve over the finite field $k$. The many_pts handout cites [Serre 1982-1984] for the theorem that for each $k$ the limsup over $g$ of $M_k(g)/g$ is positive. The result that for each $k$ the liminf of $M_k(g)/g$ is also positive is contained in the paper:
N.D.Elkies, E.W.Howe, A.Kresch, B.Poonen, J.L.Wetherell, and M.E.Zieve: Curves of every genus with many points, II: Asymptotically good families, Duke Math. J. 122 #2 (2004), 399-422 (math.NT/0208060 on the arXiv).
(This is actually easier than the limsup proof, though it assumes the limsup result as almost a black box — “almost” because we need a somewhat stronger result that the limsup can be attained by a sequence of curves $C_n$ whose genera are spaced closely enough that there exists some finite $R$ with $g(C_{n+1}) / g(C_n) < R$ for each $n$.)

*****************************************************************************************

So what’s with the whorls in the background pattern? They’re a visual illustration of an exponential sum, that is, $\sum_{n=1}^N \exp (i f(n))$. Even simple functions $f$ can give rise to interesting behavior and/or important open problems as we vary N. What function $f$ produced the background for this page? See here for more information.