If you find a mistake, omission, etc., please let me know by e-mail.

Andrei's Math 55 page

Q & A: Questions that arose concerning lectures, problem sets, etc., and my replies

The orange balls mark our current location in the course, and the current problem set.

We start with

You may have already seen ``little oh'' and ``big Oh'' notations. For functionsWe next start onf,gon the same space, ``f=O(g)'' means thatgis a nonnegative real-valued function,ftakes values in a normed vector space, and there exists a real constantMsuch that |f(x)|<=Mg(x) for allx. The notation ``f=o(g)'' is used in connection with a limit; for instance, ``f(x)=o(g(x)) asxapproachesx_{0}'' indicates thatf,gare vector- and real-valued functions as above on some neighborhood ofx_{0}, and that for eachc>0 there is a neighborhood ofx_{0}such that |f(x)|<=cg(x) for allxin the neighborhood. Thusf'(x_{0})=ameans the same as ``f(x)=f(x_{0})+a(x-x_{0})+o(|x-x_{0}|) asxapproachesx_{0}'', with no need to exclude the casex=x_{0}. Rudin in effect uses this approach when proving the Chain Rule (5.5).In Rudin's proof of L'Hôpital's Rule (5.13), why can he assume that

g(x) does not vanish for anyxin (a,b), and that the denominatorg(x)-g(y) in equation (18) is never zero?The derivative of

f/gcan be obtained from the product rule, together with the derivative of 1/g-- which in turn can be obtained from the Chain Rule together with the the derivative of the single function 1/x. Once we do multivariate differential calculus, we'll see that the derivatives off+g,f-g,fg,f/gcould also be obtained in much the same way that we showed the continuity of those functions, by combining the multivariate Chain Rule with the derivatives of the specific functionsx+y,x-y,xy,x/yof two variablesx,y.As Rudin notes at the end of this chapter, differentiation can also be defined for vector-valued functions of one real variable. As Rudin does not note, the vector space can even be infinite-dimensional, provided that it is normed; and the basic algebraic properties of the derivative listed in Thm. 5.3 (p.104) can be adapted to this generality, e.g., the formula

(fg)'=f'g+fg'still holds iff,gtake values in normed vector spacesU,Vand multiplication is interpreted as a continuous bilinear map fromU x Vto some other normed vector spaceW.NB The norm does not have to come from an inner product structure. Often this does not matter because we work in finite dimensional vector spaces, where all norms are equivalent, and changing to an equivalent norm does not affect the definition of the derivative. The one exception to this is Thm. 5.19 (p.113) where one needs the norm exactly rather than up to a constant factor. This theorem still holds for a general norm but requires an additional argument. The key ingredient of the proof is this: given a nonzero vector

zin a vector spaceV, we want a continuous functionalwonVsuch that ||w||=1 andw(z)=|z|. IfVis an inner product space (finite-dimensional or not), the inner product withz/|z| provides such a functionalw. But this approach does not work in general. The existence of suchwis usually proved as a corollary of the Hahn-Banach theorem. WhenVis finite dimensional,wcan be constructed by induction on the dimension ofV. To deal with the general case one must also invoke the Axiom of Choice in its usual guise of Zorn's Lemma.

The Riemann-sum approach to integration goes back to the ``method of exhaustion'' of classical Greek geometry, in which the area of a plane figure (or the volume of a region in space) is bounded below and above by finding subsets and supersets that are finite unions of disjoint rectangles (or boxes). The lower and upper Riemann sums adapt this idea to the integrals of functions which may be negative as well as positive (one of the weaknesses of geometric Greek mathematics is that the ancient Greeks had no concept of negative quantities -- nor, for that matter, of zero). You may have encountered the quaint technical term ``quadrature'', used in some contexts as a synonym for ``integration''. This too is an echo of the geometrical origins of integration. ``Quadrature'' literally means ``squaring'', meaning not ``multiplying by itself'' but ``constructing a square of the same size as''; this in turn is equivalent to ``finding the area of'', as in the phrase ``squaring the circle''. For instance, Greek geometry contains a theorem equivalent to the integration ofMost ofx^{2}dx, a result called the ``quadrature of the parabola''. The proof is tantamount to the evaluation of lower and upper Riemann sums for the integral ofx^{2}dx.An alternative explanation of the upper and lower Riemann sums is that they arise by repeated application of the following two axioms describing the integral:

The latter axiom is a consequence of the following two: the integral of a constant function from

- For any
a,b,c(witha < b < c), the integral of a function fromatocis the sum of its integrals fromatoband frombtoc;- If a function
fon [a,b] takes values in [m,M] then its integral fromatobis in [m(b-a),M(b-a)] (again assuminga <b).atobis that constant times the lengthb-aof the interval [a,b]; and iff<=gon some interval then the integral offover that interval does not exceed the integral ofg. Note that again all these axioms arise naturally from an interpretation of the integral as a ``signed area''.Recall the ``integration by parts'' identity:

fgis an integral off dg + g df. The Stieltjes integral is a way of making sense of this identity even whenfand/orgis not continuously differentiable. To be sure, some hypotheses onfandgmust still be made for the Stieltjes integral off dgto make sense. Rudin specifies one suitable system of such hypotheses.Here's a version of Riemann-Stieltjes integrals that works cleanly for integrating bounded functions from [

a,b] to any complete normed vector space: PS, PDF'.

Riemann-Stieltjes integration by parts: Suppose bothfandgare increasing functions on [a,b]. For any partitiona = x_{0}< ... <xof the interval, write_{n}= bf(b)g(b) -f(a)g(a) as the telescoping sum off(x)_{i}g(x) -_{i}f(x_{i-1})g(x_{i-1}) fromi=1 ton. Now rewrite thei-th summand as[Naturally it is no accident that this identity resembles the one used in the familiar proof of the formula for the derivative of f(x) (_{i}g(x)-_{i}g(x_{i-1})) +g(x_{i-1}) (f(x)-_{i}f(x_{i-1})).fg!] Summing this overiyields the upper Riemann-Stieltjes sum for the integral off dgplus the lower R.-S. sum for the integral ofg df. Therefore: if one of these integrals exists, so does the other, and their sum isf(b)g(b) -f(a)g(a). [Cf. Rudin, page 141, Exercise 17.]

After covering a few odds and ends from Chapter 7, we'll proceed to
**power series** and the exponential and logarithmic
functions in **Chapter 8**; we'll use these to derive the
binomial theorem for arbitrary exponents. This will give us a way to
simplify a key ingredient in the
**Stone-Weierstrass theorem**,
which is the one major result of Chapter 7 we haven't seen yet.
We postpone discussion of Euler's Beta and Gamma integrals
(also in Chapter 8) so that we can use multivariate integration to
give a more direct proof of the formula relating them.

The result concerning the convergence of alternating series is stated and proved on pages 70-71 of Rudin (Theorem 3.42).

The original Weierstrass approximation theorem (7.26 in Rudin)
can be reduced to the uniform approximation of the single function
|*x*| on [-1,1]. From this function we can construct an
arbirtrary piecewise linear continuous function, and such P.L.
functions uniformly approximate any continuous function on a
closed interval. To get at |*x*|, we'll rewrite it as
[1-(1-*x*^{2})]^{1/2}, and use the
power series for (1-*X*)^{1/2}. See the first part of
exercise 22 for chapter 8, on p.201, for this power series (and
more generally for the power series for
(1-*X*)^{a}).
We need (1-*X*)^{1/2} to be approximated by its
power series uniformly on the *closed* interval [-1,1]
(or at least [0,1]); but fortunately this too follows from the proof
of Abel's theorem (8.2, pages 174-5). Actually this is a subtler
result than we need, since the *X ^{n}* coefficient
of the power series for (1-

An explicit upper bound of 4 on Pi can be obtained by calculating
cos(2) < 1 - 2^2/2! + 2^4/4! = -1/3 < 0.
For much better estimates, integrate
(*x-x*^{2})^{4}
*dx*/(*x*^{2}+1) from 0 to 1 and note that
0 < 1/(*x*^{2}+1) <= 1. :-)

Rudin uses a fact about convex functions that is only presented
as an exercise earlier in the book (p.100, #23). Namely:
let *f* be a convex function on some interval *I*,
and consider *s*(*x*, *y*) :=
(*f*(*x*)-*f*(*y*)) / (*x-y*)
as a function on the set of (*x*, *y*) in
*I* x *I* with *x* > *y*;
then *s* is an increasing function of both variables.
The proof is fortunately not hard. For instance, to prove that if
*x* > *y'* > *y* then
*s*(*x*, *y'*) > *s*(*x*,
*y*),
write *y'* as *px*+*qy* and calculate that
*s*(*x*, *y'*) > *s*(*x*,
*y*)
is equivalent to the usual convexity condition. The case
*x* > *x'* > *y* works in exactly the same way.

The rather obscure integration by parts in Rudin, p.194 is not necessary. A straightfoward choice of ``parts'' yields

This may seem to go in a useless direction, but the elementary observation that

recovers the recursion (97).

In addition to the trigonometric definite integrals noted by Rudin
(formula 98), Beta functions also turn up in the evaluation of
the definite integral of
*u ^{ a} du* /
(1+

The limit formula for Gamma(*x*) readily yields
the product formula:

Gamma(wherex) =x^{-1}e^{-Cx}Prod(exp(x/k) / (1+x/k),k=1,2,3,...)

We next begin **multivariate differential calculus**,
starting at the middle of Rudin Chapter 9 (since the first part
of that chapter is for us a review of linear algebra). Again,
Rudin works with functions from open subsets of
**R**^{n}
to **R**^{m},
but must of the discussion works equally well with the target space
**R**^{m} replaced by an arbitrary
normed vector space *V*. If we want to allow arbitrary
normed vector spaces for the domain of *f*, we'll usually
have to require that the derivative *f*' be
a *continuous* linear map, or equivalently that its norm
||*f*'||=sup_{|v|<=1}|*f*'(*v*)|
be finite.

As in the
univariate case,
proving the Mean Value Theorem in the multivariate context
(Theorem 9.19) requires either that *V* have an inner-product
norm, or the use of the Hahn-Banach theorem
to construct suitable functionals on *V*. Once this is done,
the key Theorem 9.21 can also be proved for functions to *V*,
and without first doing the case *m*=1. To do this,
first prove the result in the special case when each
*D _{j} f*(

The *Inverse function theorem* (9.24) is a special case
of the *Implicit function theorem* (9.28), and its
proof amounts to specializing the proof of the implicit function
theorem. But Rudin proves the Implicit theorem as a special
case of the Inverse theorem, so we have to do Inverse first.
(NB for these two theorems we will assume
that our target space is finite-dimensional;
how far can you generalize to infinite-dimensional spaces?)
Note that Rudin's statement of the contraction principle
(Theorem 9.23 on p.220) is missing the crucial hypothesis
that *X* be nonempty!

The proof of the second part of the implicit function theorem,
which asserts that the implicit function **g** not only
exists but is also continuously differentiable with derivative
at **b** given by formula (58) (p.225), can be done
more easily using the chain rule, since **g** has been
constructed as the composition of the following three functions:
first, send **y** to (**0,y**); then,
apply the inverse function **F**^{-1};
finally, project the resulting vector (**x,y**)
to **x**. The first and last of these three functions
are linear, so certainly *C*^{1}; and the continuous
differentiability of **F**^{-1} comes from
the inverse function theorem.

Here's an approach to *D _{ij}=D_{ji}*
that works for a

Next topic, and last one from Rudin, is
**multivariate integral calculus** (Chapter 10).
Having defined multivariate integrals and obtained
the formula for change of variables, we're setting up the machinery
of differential forms that is the framework for the higher-dimensional
generalization of the Fundamental Theorem of Calculus that comprises
the divergence, Stokes, and Green theorems and much else besides.

The basic formula (92) that yields generalized Stokes (formula 91)
is even more easily proved for oriented *k*-cells instead of
*k*-simplices, since each term of *d*(omega) on the
left-hand side of the identity yields the sum of 2 of the 2*k*
terms of the boundary occurring on the right-hand side. Naturally the
signs that enter when one does it for a *k*-simplex are the
reason we put the (-1)^{j} into the definition of
the boundary (formula 85). One interpretation of generalized Stokes
is that the operators *d* on differential forms, and boundary
on chains, are adjoints relative to the pairing given by integration.
In fact just as *d*^{ 2}=0 we have
(boundary)^{2}=0; this is easy to show, and further
bolsters our confidence in our choice of signs in formula 85.
Just as we defined the *k*-th cohomology of *E*
as the quotient of the closed *k*-forms on *E*
by the exact *k*-forms, we can define the *k*-th
*homology H _{k}*(

The following remarks may help explain how come a function from
an open set in **R**^{3}
to **R**^{3} yields both a 1-form and a 2-form
(formulas 124, 125 on page 281).
In general, once we choose a generator for the *n*-th
exterior power of an *n*-dimensional vector space *V*
over some field (remember that this top exterior power is
1-dimensional), the exterior product yields for each *k*
a pairing between the *k*-th and (*n-k*)-th exterior
powers of *V*, which identifies each of the two spaces with
the other's dual. In our case *n*=3 and *k*=1, so
we identify the dual of **R**^{3} with its
exterior square. But, once we have chosen an inner product structure
on **R**^{3}, we have identified it with its
own dual; and once we also give **R**^{3} an
orientation, we get a generator for its top exterior power, namely
*e*_{1}^*e*_{2}^*e*_{3}
for any positively oriented orthonormal basis
(*e*_{1}, *e*_{2},
*e*_{3}).
So, we have an identification of **R**^{3} with
its exterior square, and thus can naturally identify the 1- and 2-forms
on any open set in **R**^{3} with functions from
that set to **R**^{3}.

At a more down-to-earth level, we can also use these ideas to explain the ``cross product'' on an oriented 3-dimensional real inner product spaceA generator for theV: the cross productvxwis the image ofv^wunder the map that identifies the exterior square ofVwithV.

We're about to start on **Fourier analysis**, using
Körner's book (a bit of this material can also be found in
Chapter 8 of Rudin). The natural context for Fourier analysis
is the notion of a **Hilbert space**. The one strange
choice Körner makes is to suppress this notion entirely
(Hilbert's name does not even appear in the index!), as if to
prove that all of Fourier analysis can be done without Hilbert
spaces, as Axler set out to do all of linear algebra without
determinants. But doing Fourier without Hilbert seems perverse.
We shall thus begin with an introduction to the structure of
Hilbert spaces, in particular separable ones. The first handout
(hilbert1.pdf,
hilbert1.ps)
gives the definition, basic examples,
and the fundamental notion of an orthonormal topological basis (ontb).
The second handout
(hilbert2.pdf,
hilbert2.ps)
introduces separability, and the behavior of orthogonal complements,
projections, and duality for a Hilbert space.

With what we know already, we can give another kind of proof that
the complex exponentials are an ontb for L_{2}([0,1]):
their linear span is a **C**-subalgebra of
*C*([0,1]) which separates points and is ``self-adjoint''
(closed under complex conjugation); therefore, by the complex
Stone-Weierstrass theorem (Thm. 7.33 on page 165 of Rudin), it
is dense in *C*([0,1]) in the sup metric. [In fact
Rudin's introduction to the theory of Fourier series in Chapter 8
uses this argument; see Thm. 8.15 on page 190.]
Thus *a fortiori* that linear span is dense in *C*([0,1])
also relative to the L_{2} metric, and therefore
is dense in L_{2}([0,1]), as desired. Rudin only deals
with Fourier series in one dimension, but the same method also
works for Fourier series on **T**^{n}
for *n*>1.

We'll proceed by developing Fourier analysis starting with the following chapters of Körner: 1 (Introduction), 2 (Statement and proof of Fejér's theorem), 3 (Weyl equidistribution), 9,15,16,18 (some results on convergence and divergence of Fourier series). Chapters 4 through 6 contain material most of which we have seen already, but the second part of Chapter 6 may still surprise you. You might also be interested in Chapter 19, which states without proof several results much more precise (though necessarily much harder) than is done in Chapter 18.

Re Chapter 3: Weyl showed more generally the following theorem:
Let
*t*_{1}, *t*_{2}, *t*_{3},...
be real numbers mod 2*pi, i.e., elements of **T**. Then
the following are equivalent:

- For all
*a,b*with 0 <*a*<*b*< 1, the number of*t*with_{r}*r*<= n lying in 2*pi[*a,b*] is asymptotic to (*b-a)**n*; - For any continuous function
*f*from**T**to**C**, as*n*grows to infinity the average of*f*(*t*) over_{r}*r*<= n approaches the average of*f*over**T**; - For each nonzero integer
*s*, the average of*e*approaches 0 as^{istr}*n*grows to infinity.

Re Chapter 18: As noted in class, what's really going on here is the
following result: Let *V,W* be normed vector spaces, with
*V* complete. Then any weakly convergent sequence
{*T _{n}*} in

To prove this result: Assume on the contrary that
{*T _{n}*} is unbounded. Without loss of generality
(i.e. by passing to a subsequence), assume that
||

Next on the menu: Chapters 40 and 41, on orthogonal polynomials and
Gaussian quadrature. Unaccountably missing
from the end of Chapter 40 is the following result, which
I'll call ``Theorem 40.9'': let *a,b,r,u _{n}*
be as in Theorem 40.8; then there is a ``three-term recurrence''

All the results of Chapter 41 generalize directly to integration
with respect to an arbitrary weight on a finite interval, using
zeros of polynomials orthogonal with respect to that weight.
Note too that Lemma 41.1 can also be obtained as a consequence
of the fact that, over any field *F*, evaluation at
*n* distinct field elements is a linear bijection from
the polynomials of degree <*n* to *F ^{n}*.
This can be proved either computationally, as in Körner,
or from general linear-algebra facts together with the observation
that the kernel of the map is zero. The left-hand side (even with
an arbitrary weight function) is a linear functional on that space,
etc.

We're starting on discrete Fourier analysis, from chapters 97ff.
in Körner. Fourier series, discrete Fourier analysis,
and the Fourier transform (which we'll proceed to next) are
special cases of *Pontrjagin duality*. We shall not be
able to develop Pontrjagin duality in anything like its full scope
in 55b; even the basic theorems require detailed study of things like
Haar measure that are normally not established until well into a
graduate analysis course like Math 212. But can still state some of
the definitions and key theorems to suggest the framework into which
the various Fourier variants fit. Here's an
outline.
For another glimpse into the Pontrjagin framework,
in the special case of finite abelian groups where
everything can be stated and proved using only linear algebra,
see Chapters 103-4 of the textbook.

In each of our Fourier settings, to pointwise multiply two
functions on a group *G* we *convolve* their
Fourier transforms, and vice versa (up to scalar multiplication).
See Chapters 51, 52 for *convolution* on **R**
and **T**. Convolution on
**Z**/*n***Z**
is defined in 97.7 and described in 97.8 on page 495.
In general the convolution of two functions *f,g*
is the function *f*g* whose value at any *t*
is the sum/average/integral over *x* of
*f*(*t-x*)*g*(*x*). We encountered
such things already in connection with the Dirichlet and
Fejér kernels; they will also be the key to our use
of the fast Fourier transform (FFT, Chapter 98) to fast multiplication
(Chapter 99). Note that the behavior of Pontrjagin duality on
subgroups is also an important ingerdient in the FFT, which
separates even and odd integers mod 2*N*.

Our final topic is *Fourier transforms*, Section IV of the
text. We'll concentrate on the material in Chapters 46 (introduction,
and Fejér part I), 49 (conclusion of Fejér, and some
applications), 51 (convolution on **R**), and 60
(the inversion formula). Chapters 47 and 48 contain warnings
and justifications of changing the order of integration that
should be familiar and routine by now; you may want to read them
for review purposes, but be aware that at least one typo is lurking
(In Lemma 48.6 on page 234, *f* is a function not on
**R** but on **R**^{2}).
For that matter there are at least two typos in Ch.46:
in Lemma 46.3 we need *f* to satisfy the hypotheses of Lemma 46.2
(integrability and absolute convergence), not 41.2;
and on the top line of p.224, it's the continuity of
*f^* that we need, not of *f*.
[Also, there's only one `m' in ``imitate''!]
The end of Chapter 49 recommends attention to Lemma 50.2, which
contains two important integral formulas. Unfortunately
Körner proves them with complex analysis, which most of you
probably don't know yet. Fortunately 50.2(i) is also proved in Rudin
(Example 9.43, pages 237-8), and 50.2(ii) will be easy
once we have the inversion theorem of Chapter 60.
As a natural extension of the results of Chapter 51 we have
Parseval's identity and an application to the behavior at infinity
of Fourier transforms of integrable functions:
parseval.pdf,
parseval.ps.

First problem set: Univariate differential calculus (PS, PDF')

Second problem set: Univariate integral calculus (PS, PDF')

Third problem set: Univariate calculus cont'd, and Stone-Weierstrass (PS, PDF')

If Problems 6, 7, 8 are too easy or too familiar, see if you can evaluate the integral of cos^{n}(x) cos(cx) from 0 to Pi/2, and deduce further product formulas.

Fourth problem set: Multivariate differential calculus (PS, PDF')

Fifth problem set: Multivariate integral calculus (PS, PDF')

Sixth problem set: Interlude on convexity; introduction to differential forms (PS, PDF')

Seventh problem set: Differential forms, chains, integration, and more exterior algebra (PS, PDF')

Eighth problem set: Life in Hilbert space (PS, PDF')

Ninth problem set: Fourier series I [summability methods, Weyl equidistribution, and differentiation] (PS, PDF')

Tenth problem set: Fourier series, cont'd; orthogonal polynomials (PS, PDF')

Eleventh and last problem set: Fourier series, the grand finale: Gauss, Jacobi, Poisson, and Müntz (PS, PDF')