Lecture notes for
Math 55a: Honors Abstract Algebra
(Fall 2016)
If you find a mistake, omission, etc., please
let me know
by e-mail.
The orange balls
mark our current location in the course,
and the current problem set.
Ceci n’est pas un Math 55a syllabus
[No, you don’t have to know French to take Math 55a.
Googling ceci+n'est suffices to turn up
the explanation, such as it is.]
Update 9/9: We now have two CA’s for the class:
Wyatt Mackey (wmackey@college) and Andrew (Andy) Gordon
(andrewgordon@college)
[if writing from outside the Harvard network, append
.harvard.edu to ...@college].
If you are coming to class but not
officially registered for Math 55 (e.g. you are auditing,
or still undecided between 25a and 55a but officially signed up for 25a),
send me your e-mail address
so that I and the CA's can include you in class announcements.
The Sep.13 office hours will be an hour later than usual,
that is, 8:30 to 10:00 PM in the Lowell House Dining Hall,
due to Lowell’s Sophomore Dinner happening earlier that evening.
CA office hours are Wednesdays 8 to 10 PM in the Leverett House
Dining Hall (same location as Math Nights), starting September 14.
Wyatt will hold section Wednesdays 2-3 PM in Science Center 304.
Andy will hold section Fridays 2-3 PM in Science Center 310.
First Quiz Friday, October 7 in class.
Andy will hold a review section
Thursday, October 6, at 8PM in Science Center Room 304
in place of his usual Friday section.
Also, no class Monday, Oct. 10 (University holiday);
I will lecture Wednesday the 12th, but on a scenic detour from the
main sequence of 55a topics so as not to disadvantage students observing
the Yom Kippur fast. (The talk was on linear error-correcting codes,
which were the hidden(?) agenda of some of the recent homework problems.
While I’m at it, here is
the interlude on 5777, Jewish leap years, etc., with probably too much
additional information.) Thus the due date for Problem Set 5
is postponed from Friday the 7th (the quiz date) to Wednesday the 12th.
Office Hours moved to Thursday starting the week of October 31
(paralleling the shift in the due dates of problem sets from
Friday to Monday). So November 3 instead of November 1, etc.
Second Quiz Friday, December 2 in class.
Andy will hold a review section
Thursday, December 1, at 7PM in Science Center Room 304
in place of his usual Friday section.
I'll hold office hours 8:00-9:30 at Lowell House (not the usual 7:30-9:00,
to avoid conflict with Andy's section), in addition to Tuesday's
office hours.
At least in the beginning of the linear algebra
unit, we’ll be following the Axler textbook closely enough that
supplementary lecture notes should not be needed. Some important
extensions/modifications to the treatment in Axler:
- [see Axler, page 5]
Pace the boxed note on that page,
virtually all mathematicians say and write
“n-tuple”
(more fully, “ordered n-tuple”),
while I cannot recall another instance of “list” used
for this as Axler does.
(One sometimes sees “tuple” for an
n-tuple of unspecified length n,
and “ordered pair” and perhaps “ordered triple”,
“ordered quadruple”,
etc. for n = 2, 3, 4, …)
- [cf. Axler, Notation 1.6 on page 4, and the
“Digression on Fields” on page 10]
Unless noted otherwise, F may be an arbitrary
field, not only R or C. The
most important fields other than those of real and complex numbers
are the field Q of rational numbers, and the
finite fields Z/pZ
(p prime). Other examples are: the field
Q(i) of complex numbers with rational
real and imaginary parts; more generally,
Q(d^(1/2)) for any nonsquare
rational number d; the “p-adic numbers”
Qp (p prime),
of which we’ll say more when we study topology next term;
and more exotic finite fields such as the 9-element field
(Z/3Z)(i).
Here’s a review of the axioms for
fields, vector spaces, and related mathematical structures.
- [cf. Axler, p.28 ff.] We define the span of an arbitrary subset
S of (or tuple in) a vector space V as follows:
it is the set of all (finite) linear combinations
a_1 v_1 + … + a_n v_n
with each v_i in S and each
a_i in F. This is still the smallest
vector subspace of V containing S. In particular,
if S is empty, its span is by definition {0}. We do not require that S be finite.
- Warning: in general the space F[X]
(a.k.a. P(F))
of polynomials in X, and its subspaces
P_n(F)
of polynomials of degree at most n,
might not be naturally identified
with a subspace of the space F^F of
functions from F to itself. The problem is that
two different polynomials may yield the same function.
For example if F is the field of 2 elements
the polynomial X^2 − X
gives rise to the zero function. In general different polynomials
can represent the same function if and only if
F is finite — do you see why?
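Here is a quick computational check of that example (a minimal Python sketch; the prime p = 5 is just an illustrative choice): by Fermat's little theorem, X^p − X and the zero polynomial define the same function on Z/pZ.

```python
# Minimal sketch: over Z/pZ the nonzero polynomial X^p - X defines the zero function,
# so two different polynomials give the same function.  (p = 5 is an illustrative choice.)
p = 5

def f(x):
    """Evaluate X^p - X at x, working mod p."""
    return (pow(x, p, p) - x) % p

print(all(f(x) == 0 for x in range(p)))   # True, by Fermat's little theorem
```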
- (See also Exercise 11 in Axler 1.C, assigned as part of the first
problem set)
If Ui are any subspaces of a vector space V
then so is their intersection
∩i
Ui. Note that this doesn’t specify
a finite intersection: i could range over an
“index set” I of any cardinality
(so we would write the intersection as
∩i∈I
Ui ).
We don’t usually want to intersect an empty family of sets
(do you see why?),
but for subsets of a given set V we can declare that
∩i∈∅
Ui = V.
- For any field (or even any ring) F there is a canonical
ring homomorphism, call it h, from Z to F.
“Ring homomorphism” means:
h(0) = 0,
h(1) = 1,
and for any integers m, n we have
h(m+n)
= h(m) + h(n) and
h(mn)
= h(m) h(n)
(and h(−n)
= −h(n),
though this follows from the other properties).
But this doesn’t quite mean that we get an isomorphic copy of
Z in F, because h might not be injective.
Equivalently, the kernel (that is, the preimage
h−1({0})
= {n : h(n) = 0})
might be larger than just {0}. In general this kernel, call it I, must be an
ideal, i.e. an additive subgroup of Z that is
closed under multiplication by arbitrary integers
(whether in I or not — this mimics the definition of
a subspace, though as it happens for ideals in Z
it’s automatic). Now ideals in Z are either {0} or
(n) := {cn | c in Z}
for some n (namely the least positive element of the ideal).
For rings, any n may arise, most easily for the ring
Z/nZ of integers mod n.
But if F is a field and I=(n) then
n must be either zero or prime, lest F have
zero divisors (elements a and b,
neither zero, for which ab=0).
This n is then called the characteristic of
the field F. The familiar fields
Q, R, C all have characteristic zero.
For any prime p, there are fields of characteristic p,
notably the “prime field”
Z/pZ
(mentioned above; the key fact here, from elementary (but nontrivial)
number theory, is that any nonzero element of
Z/pZ has a multiplicative inverse!).
This field Z/pZ
and other prime fields have important uses in number theory, combinatorics,
computer science, and elsewhere, often using the linear algebra that
we develop in Math 55a.
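Here's a small Python sketch of the above (the moduli n = 12 and p = 7 are illustrative, and pow(a, -1, p) needs Python 3.8 or later): reduction mod n is the canonical homomorphism Z → Z/nZ, its kernel is the ideal (n), and for a prime modulus every nonzero residue is invertible, which is what makes Z/pZ a field of characteristic p.

```python
# Minimal sketch (illustrative moduli; pow(a, -1, p) needs Python >= 3.8):
# the canonical homomorphism h: Z -> Z/nZ is reduction mod n, its kernel is the ideal (n),
# and for p prime every nonzero residue class is invertible, so Z/pZ is a field.
n, p = 12, 7

h = lambda m: m % n
print([m for m in range(-24, 25) if h(m) == 0])            # multiples of 12: the ideal (12)

inverses = {a: pow(a, -1, p) for a in range(1, p)}         # inverses mod the prime p
print(all(a * inverses[a] % p == 1 for a in range(1, p)))  # True
```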
- [cf. the boxed note on page 42 of Axler] It is natural to wonder whether
every vector space, finite-dimensional or not, has a basis.
The polynomial ring F[z], considered as a vector space
over F (and denoted by a fancy script P in Axler),
does have a basis (powers of z), as does a polynomial ring in
several variables, or even infinitely many (see the next item);
but does F∞? The answer is yes —
but only under the Axiom of Choice (equivalently,
Zorn’s Lemma)!
[“but only under” because it is known that Choice/Zorn is
equivalent to the claim that every vector space has a basis.
Don’t spend too much time trying to find an explicit basis for
F∞, or for R as a vector space
over Q (a “Hamel basis”)…]
Using the same tool one can prove analogues of some other results in
Chapter 2, such as 2.33 (p.41: every linearly independent set
extends to a basis), and thus 2.34 (p.42: every subspace is a
direct summand; again, don’t spend too much time trying to do this
explicitly for Q as a subspace of the Q-vector space R,
or for ⊕n≥1F
as a subspace of F∞!).
NB some other results clearly fail in infinite dimensions,
even when we have an explicit basis; e.g. the even powers of z
form a linearly independent subset of F[z]
that has the same cardinality as a basis but is not a basis.
- However, 2.31 (p.40: every spanning list contains a basis)
still holds with no further axioms for spanning sets S
of arbitrary size, as long as V is finite dimensional.
The reason is that V has a finite spanning set,
say S0, and every element of
S0 is a linear combination of elements of S,
and since linear combinations are of necessity finite it takes only
a finite subset of S to span S0
and thus V. Now apply the proof of 2.31 to this
finite subset. We may call this generalization “2.31+”.
- [from the Sep.12 lecture] Here’s an extreme example of how
basic theorems about finite-dimensional vector spaces can become
utterly false for finitely-generated modules:
a module generated by just one element
can have a submodule that is not finitely generated. Indeed,
for any field F let A be the ring of polynomials
in infinitely many variables Xj.
[The letter “A” is a common name for a ring,
from French anneau, cognate with English “annulus”.]
As usual
we can regard A as a module over itself, with a single
generator 1. Then a submodule is just an ideal of the ring.
Choose the ideal I generated by all the Xj,
which consists of all polynomials with constant coefficient equal to 0.
If there are infinitely many indices j, then I
is not finitely generated; indeed any generating set must be
at least as large as the index set of j’s,
so for every cardinal ℵ we can make a ring A with a
singly-generated module (namely A itself) and with a submodule
that cannot be generated by fewer than ℵ elements.
For a subtler example, consider the ring we might call
“F[X^(1/2^∞)]”,
consisting of F-linear combinations of monomials
X^(n/2^k)
for arbitrary nonnegative integers n and k.
Again let I be the ideal generated by the nonconstant monomials,
which is not finitely generated, though there are generating sets
that are “only” countably infinite.
The new behavior involves the countable generating set
{X^(1/2^k) | k≥0}:
there is no minimal generating subset, because each
X^(1/2^k)
is a multiple of X^(1/2^k')
for any k'>k. Likewise for the ring generated
by all monomials X^r
with r any nonnegative rational number (or even all
X^r
with r any nonnegative real number).
(When A is Noetherian, submodules of finitely-generated modules
are finitely-generated, but might still require more generators;
for example, there are Noetherian rings A with
“non-principal ideals” I, which give examples of
a 1-generator module with a submodule that requires at least 2 generators.)
- As with the notions of span and linear combination, the definition of
a linear transformation makes sense for modules over any ring A
(whether commutative or not), and in that generality is called an
A-module homomorphism (so you now know the
“morphisms” in the “category” of
A-modules);
when A is a skew field, we still call this a
linear transformation, and the “rank-nullity theorem”
(3.22, page 63) still holds for finite-dimensional vector spaces
in that context.
- Suppose T : V → W
is a linear transformation.
Axler’s notation for the image of T
was already becoming rather old-fashioned when he wrote the first edition of
his book; these days simply T(V) is common
(and likewise for any function at all).
The terminology “null space” (whether one or two words)
for T −1({0}) is also somewhat quaint;
we usually say “kernel” and write
“ker(T)”
[and (LA)TeX already provides the command
\ker to typeset this properly].
While I’m at it, best to avoid the use of
“one-to-one” to mean
“injective” (see boxed note on page 60),
because it is also sometimes used for
“bijective”.
- Please avoid Axler’s notation “product” and
“V × W ” (p.91, 3.71 ff.).
I understand the motivation for this notation: it is formally correct,
and avoids the need to distinguish between “external direct sum”
(the usual name for that vector space) and “internal direct sum”
(a vector space sum [within some larger vector space] that happens to be direct).
The problem with this is that in Math 55 (and ubiquitously in the literature)
we shall introduce before long a “tensor product”
V ⊗ W of vector spaces,
whose dimension is the product of the dimensions of
V and W when those two dimensions are finite; and
it would be a much bigger source of confusion to have that notation
coexist with “V × W ”
where the dimensions add. So please stick with
“V ⊕ W ”
and the name “external direct sum” — or if you must,
“Cartesian product” to avoid confusion with tensor products.
For a possibly infinite Cartesian product, which is not
the same as a direct sum
(because an element of the direct sum must have only finitely many
nonzero components), we still have the notation
Πi∈IVi
to distinguish the Cartesian product from the direct sum
⊕i∈IVi.
- More notes on notation: I understand why Axler wants to distinguish
V' and T' (dual space and transformation)
from V* and T*, and
U0 (annihilator) from
U⊥.
(See the boxed note on page 104.)
I’ll try to stick with U0
in this class. But for the duals, using “ ' ” this way
incurs a steep price of the very useful construction exemplified by
“let V and V' be vector spaces”:
we already have few enough good letters to name mathematical structures
that even π is pressed into double duty (not just 3.14159… but
also the quotient πrojection from V to V/U).
I’ll stick with the common V* and T* here.
- An equivalent statement of the identity (ST)* = T*S*
(third part of 3.101, page 104 of Axler),
together with (I_V)* = I_(V*)
(which Axler might not even bother stating explicitly),
is that duality of vector spaces and linear transformations constitutes a
“contravariant
functor” from the category of
F-vector spaces and linear transformations to itself.
- The results about quotient spaces and duality in sections E and F of
Chapter 3 are often described in terms of
exact sequences.
A sequence … → L → M
→ N → …
of linear transformations (or A-module homomorphisms,
“etc.”) is said to be “exact at M ”
if the kernel of the map M → N
is the image of the map L → M
(that is, if the elements of M that go to zero in N
are precisely those that come from L). The sequence is
“exact” if it is exact at each step with both an incoming
and an outgoing map. In particular a map
M → N is injective iff
it extends to a sequence 0 → M → N
that is exact at M, and surjective iff
it extends to a sequence M → N → 0
that is exact at N. (Note that in each case there is no
choice about the function from or to the trivial vector space 0,
and likewise at least for modules.) Thus the map is an isomorphism
iff 0 → M → N → 0
is exact (at both M and N). Even more easily,
0 → M → 0 is exact iff M=0.
A short exact sequence is the next case, with three modules
other than the initial and final 0. The standard example is
0 → L → M
→ M/L → 0
where the map 0 → L → M
is an inclusion map (thus an injection) and the map
M → M/L → 0
is the quotient map (thus a surjection). In general if
0 → L → M → N → 0
is a short exact sequence then the injection
L → M identifies L with
a submodule of M, and then the surjection
M → N is identified with the
quotient map. More generally, any homomorphism
L → M extends (uniquely up to equivalence)
to an exact sequence with four modules between the outer zeros:
0 → K → L
→ M → N → 0,
where K is the kernel of the map
L → M, and N is its
“cokernel ”, that is, the quotient of
M by the image of L.
Now consider the case of vector spaces. Then to each linear transformation
V → W we associate the dual transformation
V* ← W*, with the dual of a composition
V → W → X
being the composition of the dual transformations
V* ← W* ← X*
in reverse order; this makes duality a “contravariant functor”
on the category of F-vector spaces. The key fact is
that for finite-dimensional vector spaces, duality preserves exactness
of sequences of linear transformations. Thus starting from any linear
V → W, we can extend to an exact sequence
0 → U → V
→ W → X → 0
with U the kernel and X the cokernel,
and dualize to deduce the exactness of
0 ← U* ← V* ←
W* ← X* ← 0
with V* ← W* the dual map.
This immediately encodes Axler 3.108 (page 107): the map
V → W is surjective
iff X is zero iff X* is zero
iff the dual map is injective. Likewise for 3.110 (p.108)
via the vanishing of U and U*.
With a bit more work we can get the general relations
ker(T*) =
(im(T))0
(3.107, p.106) and
im(T*) =
(ker(T))0
(3.109, p.107) between the kernels and images of T and its dual,
again assuming that T is a linear map between finite-dimensional
vector spaces. Conversely, the fact that duality preserves exactness
(for sequences of linear maps between finite-dimensional vector spaces)
can be deduced as a special case of 3.107 and 3.109.
You can now understand this joke
(such as it is).
- Another way to think about the eigen-basics: “Lemma 5.0”:
If T is an operator on any vector space V,
and λ any scalar, then U is an invariant subspace
for T iff it is an invariant subspace for
T − λI_V.
So, for instance, since ker T is an invariant subspace,
so is
ker(T − λI_V),
a.k.a. the λ-eigenspace.
- Yet another note on notation: Axler’s name
“T/U ” (for the operator on V/U
induced from the action of T on a vector space V
with an invariant subspace U, see 5.14 on p.137)
is a nice notation, but (unlike
T |U
for the restriction of T to U)
is seen rarely if at all in the research literature.
Normally it will be called plain T, or possibly
\bar{T} (since it is constructed by descending to V/U
the composition of T with the quotient map
V → V/U).
- Let T be a linear operator on V. The algebraic
properties of polynomial evaluation at T can be summarized
by saying that the map from F[X]
to End(V) that takes any polynomial P
to P(T) is not just linear but a
ring homomorphism. [Since F[X]
is a commutative ring, so is the image of this homomorphism, even though
End(V) is not commutative once
dim(V) > 1.] In particular the kernel
is an ideal in F[X]; when V
is finite dimensional, this ideal must be nonzero, and its generator is
what we shall call the “minimal polynomial” of T.
Special case: if V is F itself, then we
naturally identify End(V) with F,
and we get for any field element x
the evaluation homomorphism from F[X]
to F that takes any polynomial to its value at x.
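Here's a numerical sketch of that kernel-of-evaluation idea (Python with numpy; the example matrix and the floating-point tolerance are illustrative choices, and this is a numerical search, not an exact algorithm): the minimal polynomial appears as the first monic linear dependence among I, T, T^2, ….

```python
# Minimal numerical sketch (numpy assumed; the matrix and tolerance are illustrative):
# the minimal polynomial of T shows up as the first monic linear dependence
# among I, T, T^2, ..., i.e. as a generator of the kernel of P |-> P(T).
import numpy as np

def minimal_polynomial(T, tol=1e-8):
    n = T.shape[0]
    powers = [np.eye(n)]
    for k in range(1, n + 1):
        powers.append(powers[-1] @ T)                       # T^k
        A = np.column_stack([P.ravel() for P in powers[:-1]])
        b = powers[-1].ravel()
        c = np.linalg.lstsq(A, b, rcond=None)[0]            # best expression of T^k in lower powers
        if np.linalg.norm(A @ c - b) < tol:                 # dependence found (numerically exact)
            return np.append(-c, 1.0)                       # coefficients, constant term first
    raise RuntimeError("no dependence by degree dim V -- should not happen")

T = np.array([[2., 1., 0.],
              [0., 2., 0.],
              [0., 0., 3.]])
print(minimal_polynomial(T))    # [-12. 16. -7. 1.], i.e. (x-2)^2 (x-3)
```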
- Axler proves the Fundamental Theorem of Algebra using
complex analysis, which cannot be assumed in Math 55a
(we’ll get to it at the end of 55b).
Here’s a proof using the topological tools
we’ll develop at the start of 55b.
(Axler gives one standard complex-analytic proof in 4.13 on page 124.)
- Triangular matrices are intimately related with “flags”.
A (complete) flag in a finite dimensional vector space V
is a sequence of subspaces {0}=V0,
V1, V2, …,
Vn=V, with each
Vi of dimension i and containing
Vi−1. A basis
v1, v2, …,
vn
determines a flag: Vi is the span of the first
i basis vectors. Another basis
w1, w2, …,
wn
determines the same flag if and only if each
wi is a linear combination of
v1, v2, …,
vi
(necessarily with nonzero vi coefficient).
The standard flag in Fn is the flag
obtained in this way from the standard basis of unit vectors
e1, e2, …,
en.
The punchline is that, just as a diagonal matrix is one that respects
the standard basis (equivalently, the associated decomposition of
V as a direct sum of 1-dimensional subspaces),
an upper-triangular matrix is one that respects the standard
flag.
Note that the i-th diagonal entry of a triangular matrix
gives the action on the one-dimensional quotient space
Vi/Vi−1
(each i=1,…,n).
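Here's a tiny numerical illustration of that punchline (a Python/numpy sketch with an illustrative matrix): the images of e_1, …, e_i under an upper-triangular matrix have no components past the i-th coordinate, so each V_i of the standard flag is mapped into itself.

```python
# Minimal sketch (numpy assumed): an upper-triangular matrix preserves the standard flag.
import numpy as np

U = np.triu(np.arange(1., 17.).reshape(4, 4))    # illustrative upper-triangular matrix
for i in range(1, 5):
    images = U[:, :i]                            # images of e_1, ..., e_i
    print(i, np.all(images[i:, :] == 0))         # True: they lie in V_i = span(e_1, ..., e_i)
```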
While the third edition of Axler includes quotients and duality,
it still lacks tensor algebra.
This is no surprise, but it will not stop us in Math 55!
Here’s an introduction
[As you might guess from \oplus,
the TeXism for the tensor-product symbol is \otimes.]
- One of many applications is the trace of
an operator on a finite dimensional F-vector space V.
This is a linear map from Hom(V,V) to F.
We can define it simply as the composition of two maps:
our identification of Hom(V,V) with
the tensor product of V* and V,
and the natural map from this tensor product to F
coming from the bilinear map taking
(v*,v)
to v*(v).
We shall see that this is the same as the classical definition:
the trace of T is the sum of the diagonal entries of
the matrix of T with respect to any basis.
The coordinate-independent construction via tensor algebra
explains why the trace does not change under change of basis.
(The invariance can also be proved by checking explicitly that
AB and BA have the same trace for any square matrices
A, B of the same size.)
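Here's a quick numerical check of both statements (a Python/numpy sketch; the random matrices are just illustrative input):

```python
# Minimal numerical check (numpy assumed; random matrices are illustrative input):
# the trace is unchanged under change of basis, and tr(AB) = tr(BA).
import numpy as np

rng = np.random.default_rng(0)
T = rng.standard_normal((4, 4))
P = rng.standard_normal((4, 4))            # generic, hence invertible

print(np.isclose(np.trace(T), np.trace(np.linalg.inv(P) @ T @ P)))   # True
A, B = rng.standard_normal((4, 4)), rng.standard_normal((4, 4))
print(np.isclose(np.trace(A @ B), np.trace(B @ A)))                  # True
```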
- Here are some basic definitions and facts about general
norms
on real and complex vector spaces.
- Just as we can study bilinear symmetric forms
on a vector space over any field, not just R,
we can study sesquilinear conjugate-symmetric forms
on a vector space over any field with a conjugation,
not just C.
Here a “conjugation” on a field F is a field automorphism
σ:F→F
such that σ is not the identity
but σ^2 is (that is, σ is an involution).
Given a basis {v_i} for V,
a sesquilinear form 〈.,.〉 on V
is determined by the field elements
a_{i,j} = 〈v_i, v_j〉,
and is conjugate-symmetric if and only if
a_{j,i} = σ(a_{i,j})
for all i,j.
Note that the “diagonal entries” a_{i,i}
— and more generally 〈v,v〉
for any v in V — must be elements
of the subfield of F fixed by σ.
- “Sylvester’s Law of Inertia” states that
for a nondegenerate pairing on a finite-dimensional vector space
V/F, where either
F=R and the pairing is
bilinear and symmetric, or
F=C and the pairing is
sesquilinear and conjugate-symmetric,
the counts of positive and negative inner products
for an orthogonal basis constitute an invariant of the pairing
and do not depend on the choice of orthogonal basis.
(This invariant is known as the “signature” of the pairing.)
The key trick in proving this result is as follows.
Suppose V is the orthogonal direct sum of subspaces
U1, U2
for which the pairing is positive definite on U1
and negative definite on U2.
Then any subspace W of V
on which the pairing is positive definite
has dimension no greater than dim(U1).
Proof: On the intersection of W with U2,
the pairing is both positive and negative definite;
hence that subspace is {0}. The claim follows by a dimension count,
and we quickly deduce Sylvester’s Law.
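Here's a small numerical illustration of Sylvester's Law (a Python/numpy sketch; the diagonal matrix and the random change of basis are illustrative): the counts of positive and negative eigenvalues are unchanged when A is replaced by B^T A B with B invertible.

```python
# Minimal numerical illustration (numpy assumed; matrix and change of basis are illustrative):
# the signature of a real symmetric matrix, read off from eigenvalue signs,
# survives any change of basis A -> B^T A B with B invertible.
import numpy as np

def signature(A, tol=1e-9):
    eigs = np.linalg.eigvalsh(A)
    return int(np.sum(eigs > tol)), int(np.sum(eigs < -tol))

A = np.diag([3., 1., -2., -5.])                          # signature (2, 2)
B = np.random.default_rng(1).standard_normal((4, 4))     # generic, hence invertible
print(signature(A), signature(B.T @ A @ B))              # (2, 2) (2, 2)
```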
- Over any field not of characteristic 2,
we know that for any non-degenerate
symmetric pairing on a finite-dimensional vector space
there is an orthogonal basis, or equivalently
a choice of basis such that the pairing is
(x,y) = Σ_i a_i x_i y_i
for some nonzero scalars a_i.
But in general it can be quite hard to decide whether
two different collections of ai
yield isomorphic pairings. Even over Q
the answer is already tricky in dimensions 2 and 3,
and I don’t think it’s known in a vector space of arbitrary dimension.
Over a finite field of odd size there are always exactly two
possibilities, as I think we’ll see in a few weeks.
- If U is a subspace of an inner-product space V,
but not necessarily finite dimensional, then U need not have an orthogonal complement:
one can still define U⊥, but the direct orthogonal
sum U ⊕ U⊥ might be
strictly smaller than V. What then happens to
6.56 (p.198 in 6.C), which describes the orthogonal projection
PU(v) as the vector
in U closest to v
(i.e., minimizing the norm ||v−u||)?
Well, if such a u exists then indeed
v−u is orthogonal to U,
but in general the minimum need not be attained: at best we can construct
a sequence of vectors u_n in U such that
||v−u_n|| approaches
inf_{u∈U} ||v−u||.
It then follows from
Apollonius’ theorem
(see the front cover of Axler! and also Exercise 31 of 6.A, page 179)
that the u_n constitute a
Cauchy sequence
in U (else (u_m+u_n)/2
is too close to v).
So if U is complete with respect to the norm distance
then there is a nearest vector and we can proceed as before.
But in general infinite-dimensional inner product spaces are not complete
(the complete ones are
Hilbert spaces, and that is a very special case).
We shall say a lot more about completeness and related notions
at the start of Math 55b.
- A regular graph of degree d is a
Moore graph of girth 5
if any two different vertices are linked by a unique path of length
at most 2. Such a graph necessarily has
n = 1 + d + d(d−1) = d^2 + 1 vertices.
Let A be the adjacency matrix, and
1 the all-ones vector. Then (because each vertex
has degree d) 1 is a
d-eigenvector of A.
We have
(I + A + A^2) v = d v + 〈v, 1〉 1
for all v
(proof: check on unit vectors and use linearity).
Thus A takes the orthogonal complement
of R·1
to itself and satisfies
I + A + A^2 = dI
on that space. Since this quadratic equation has distinct roots
m and −1−m
for some m > 0 (namely the positive root of
1 + m + m^2 = d),
it follows that the orthogonal complement
of R·1
is the direct sum of the corresponding eigenspaces.
Let d_1 and d_2
be their dimensions. These sum to
n−1 = d^2, and satisfy
m d_1 + (−1−m) d_2 + d = 0
because the matrix A has trace zero. This lets us solve
for d_1 and d_2;
in particular we find that their difference is
(2d − d^2) / (2m+1).
Since that's an integer, either d=2
(giving the pentagon graph) or m is an integer.
Substituting m^2 + m + 1
for d, we find that
16(d_1 − d_2)
is an integer plus 15/(2m+1), whence
m is one of 0, 1, 2, or 7. The first of these is impossible,
and the others give d = 3, 7, or 57
as claimed.
It is clear that the pentagon is the unique Moore graph of girth 5,
and degree 2, and it is not hard to check that the Petersen graph
is the unique example with d = 3. For
d = 7 there is again a unique graph up to
an interesting group of automorphisms; see this
proof outline, which also lets one count
the automorphisms of this graph.
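For d = 3 one can check all of this by machine; here's a Python/numpy sketch (using the standard construction of the Petersen graph as a Kneser graph, an illustrative choice of model): it verifies the identity I + A + A^2 = dI + J and exhibits the eigenvalues 3, 1, −2 with multiplicities 1, 5, 4.

```python
# Minimal sketch (numpy assumed): the Petersen graph as the d = 3 Moore graph of girth 5.
# Vertices: 2-element subsets of {0,...,4}; two are adjacent iff they are disjoint.
import itertools
import numpy as np

verts = list(itertools.combinations(range(5), 2))
n, d = len(verts), 3                                        # n = 10 = d^2 + 1
A = np.array([[1.0 if set(u).isdisjoint(v) else 0.0 for v in verts] for u in verts])

I, J = np.eye(n), np.ones((n, n))
print(np.allclose(I + A + A @ A, d * I + J))                # the girth-5 Moore identity

eigs = np.round(np.linalg.eigvalsh(A), 6)
vals = sorted(set(eigs))
print(vals, [int(np.sum(eigs == v)) for v in vals])         # [-2, 1, 3] with multiplicities [4, 5, 1]
# so here m = 1, d_1 = 5, d_2 = 4
```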
- Why the name “spectral theorem”? The set (or sometimes the
“multiset”) of eigenvalues
of a linear operator on a vector space V
is often called its
“spectrum”, especially when
V is a real or complex vector space, either
finite or infinite dimensional. This is related with the
visual (and by extension the electromagnetic) spectrum, for reasons that
would take us much too far into wave and quantum mechanics,
so we shall say little more of that here (but you may encounter it again
in your physics class(es)).
We’ll define the determinant of an operator T
on a finite dimensional space V as follows:
T induces a linear operator
∧n(T)
on the top exterior power
∧n(V) of V
(where n = dim V);
this exterior power is one-dimensional,
so an operator on it is multiplication by some scalar;
det(T) is by definition the scalar corresponding to
∧n(T).
The “top exterior power”
is a subspace of the “exterior algebra”
∧•(V)
of V,
which is the quotient of the tensor algebra by
the ideal generated by
{v ⊗ v: v in V}.
We’ll still have to construct the sign homomorphism from the
symmetric group on dim(V) letters to {1,−1}
to make sure that this exterior algebra is as large
as we expect it to be, and in particular that
the (dim(V))-th exterior power has dimension 1
rather than zero.
Interlude: normal subgroups;
short exact sequences in the context of groups:
A subgroup H of G is normal
(satisfies H = g−1Hg
for all g in G) iff
H is the kernel of some group homomorphism from G
iff the injection H → G
fits into a short exact sequence
{1}
→ H
→ G
→ Q
→ {1},
in which case Q is the quotient group
G/H.
[The notation {1} for the one-element (“trivial”) group
is usually abbreviated to plain 1, as in
1
→ H
→ G
→ Q
→ 1.]
This is not in Axler but can be found in any introductory text in
abstract algebra; see for instance Artin, Chapter 2, section 10.
Examples:
1
→ An
→ Sn
→ {±1}
→ 1;
also, the determinant homomorphism
GLn(F) → F*
gives the short exact sequence
1
→ SLn(F)
→ GLn(F)
→ F*
→ 1,
and this works even if F is just a commutative ring with unit
as long as F* is understood as
the group of invertible elements of F — for example,
Z* = {±1}.
Some more tidbits about exterior algebra:
- If w, w' are elements of the
m-th and m'-th exterior powers of V,
then w w' = (−1)^(mm') w' w;
that is, w and w' commute
unless m and m' are both odd
in which case they anticommute.
- If m+m'=n=dim(V)
then the natural pairing from the
m-th and m'-th exterior powers
to the n-th is nondegenerate,
and so identifies the m'-th exterior power
canonically with the dual of the m-th
tensored with the top (n-th) exterior power.
- In particular, if m=1,
and T is any invertible operator on V,
then we find that the induced action of T
on the (n−1)st exterior power
is the same as its action on V*
multiplied by det(T).
This yields the formula connecting the inverse and cofactor matrix
of an invertible matrix (a formula which you may also know
in the guise of “Cramer’s rule”).
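Here's a small numerical check of that cofactor formula (a Python/numpy sketch; the 3×3 matrix is illustrative): compute the adjugate from cofactors and verify A·adj(A) = det(A)·I.

```python
# Minimal numerical check (numpy assumed; the 3x3 matrix is illustrative) of the
# cofactor identity A * adj(A) = det(A) * I that lies behind Cramer's rule.
import numpy as np

def adjugate(A):
    n = A.shape[0]
    C = np.empty_like(A)
    for i in range(n):
        for j in range(n):
            minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
            C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return C.T                                   # adjugate = transpose of the cofactor matrix

A = np.array([[2., 1., 0.],
              [1., 3., 1.],
              [0., 1., 2.]])
print(np.allclose(A @ adjugate(A), np.linalg.det(A) * np.eye(3)))   # True
```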
- For each m there is a natural non-degenerate pairing
between the m-th exterior powers of
V and V*,
which identifies these exterior powers with each other’s dual.
More will be said about exterior algebra when differential forms
appear in Math 55b.
We’ll also show that a symmetric (or Hermitian) matrix
is positive definite iff all its eigenvalues are positive
iff it has positive principal minors
(the “principal minors” are the determinants
of the square submatrices of all orders containing the (1,1) entry).
More generally we’ll show that the eigenvalue signs determine
the signature, as does the sequence of signs of principal minors
if they are all nonzero. More precisely: an invertible
symmetric/Hermitian matrix has signature
(r,s) where
r is the number of positive eigenvalues and
s is the number of negative eigenvalues;
if its principal minors are all nonzero then
r is the number of j in
{1,2,…,n} such that the
j-th and (j−1)-st minors
have the same sign, and
s is the number of j in that range such that the
j-th and (j−1)-st minors
have opposite sign [for j=1 we always count
the “zeroth minor” as being the positive number 1].
This follows inductively from the fact that the determinant has sign
(−1)^s and the signature
(r',s') of the restriction of a pairing
to a subspace has
r' ≤ r and
s' ≤ s.
For positive definiteness, we have the two further
equivalent conditions: the symmetric (or Hermitian) matrix
A = (a_{ij}) is positive definite
iff there is a basis (v_i)
of F^n such that
a_{ij} = 〈v_i, v_j〉
for all i,j, and
iff there is an invertible matrix B such that
A = B*B.
For example, the matrix with entries 1/(i+j−1)
(“Hilbert matrix”) is positive-definite, because it is
the matrix of inner products (integrals on [0,1]) of the basis
1, x, x^2, …, x^(n−1)
for the polynomials of degree < n.
See the 10th problem set for a calculus-free proof of the
positivity of the Hilbert matrix, and an evaluation of its determinant.
For any matrix B (not necessarily invertible or even square)
with columns vi,
the matrix B*B with entries
a_{ij} = 〈v_i, v_j〉
is known as the Gram matrix of the columns of B.
It is invertible iff those columns are linearly independent.
If we add an (n+1)-st vector, the determinant of
the Gram matrix increases by a factor equal to the squared distance
between this vector and the span of the columns of B.
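Here's a quick Python/numpy sketch of the Hilbert-matrix example (the size n = 5 is illustrative; numpy's Cholesky factorization succeeds exactly for positive-definite matrices, so it doubles as a numerical test):

```python
# Minimal sketch (numpy assumed; n = 5 is illustrative): the Hilbert matrix as a Gram matrix.
# <x^i, x^j> = integral_0^1 x^(i+j) dx = 1/(i+j+1)  (0-indexed; entries 1/(i+j-1) if 1-indexed),
# so it is positive definite; numpy's Cholesky factorization succeeds exactly in that case.
import numpy as np

n = 5
H = np.array([[1.0 / (i + j + 1) for j in range(n)] for i in range(n)])

np.linalg.cholesky(H)                       # no LinAlgError: H is positive definite
print(np.all(np.linalg.eigvalsh(H) > 0))    # True (all eigenvalues positive)
```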
All of Chapter 8 works over an arbitrary algebraically closed field,
not only over C
(except for the minor point about extracting square roots,
which breaks down in characteristic 2); and
the first section (“Generalized Eigenvalues”) works over any field.
More about nilpotent operators: let T be any operator
on a vector space V over a field F, not assumed
algebraically closed. If V is finite-dimensional, then
The Following Are Equivalent:
(1) There exists a nonnegative integer k
such that T^k = 0;
(2) For any vector v, there exists a nonnegative integer
k such that T^k v = 0;
(3) T^n = 0, where n = dim(V).
Note that (1) and (2) make no mention of the dimension,
but are still not equivalent for operators on infinite-dimensional spaces.
(For example, consider differentiation on the
R-vector space R[x].)
We readily deduce the further equivalent conditions:
(4) There exists a basis for V for which
T has an upper-triangular matrix
with every diagonal entry equal to zero;
(5) Every upper-triangular matrix for T
has zeros on the diagonal,
and there exists at least one upper-triangular matrix for T.
Recall that the second part of (5) is automatic
if F is algebraically closed.
The space of generalized 0-eigenvectors
(the maximal subspace on which T is nilpotent)
is sometimes called the nilspace of T.
It is an invariant subspace. When V is finite dimensional,
V is the direct sum of the nilspace and another invariant
subspace V', consisting of the intersection of the subspaces
T^k(V) as k ranges over all
positive integers (8.5). This can be used to prove Cayley-Hamilton
using the standard definition of the characteristic polynomial as
det(xI−T).
An example in infinite dimension when (8.5) fails:
V is the real vector space of continuous functions from
R to R, and T is multiplication
by x. [That is a useful counterexample for many other
aspects of “eigenstuff” when we try to go beyond
finite dimension; for example, there are no eigenvectors, but
for every real number λ the operator
λI−T is not invertible!]
The dimension of the space of generalized c-eigenvectors
(i.e., of the nilspace of T−cI) is usually called the
algebraic multiplicity of c
(since it’s the multiplicity of c
as a root of the characteristic polynomial of T),
to distinguish it from the “geometric multiplicity”
which is the dimension of ker(T−cI).
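Here's a small Python/numpy sketch of the two multiplicities (the matrix, a single Jordan block for the eigenvalue 2 together with a 1×1 block, is an illustrative choice):

```python
# Minimal sketch (numpy assumed; the matrix, a 3x3 Jordan block for 2 plus a 1x1 block
# for 5, is an illustrative choice): geometric vs. algebraic multiplicity of c = 2.
import numpy as np

T = np.array([[2., 1., 0., 0.],
              [0., 2., 1., 0.],
              [0., 0., 2., 0.],
              [0., 0., 0., 5.]])
c, n = 2.0, T.shape[0]
N = T - c * np.eye(n)

geometric = n - np.linalg.matrix_rank(N)                              # dim ker(T - cI)
algebraic = n - np.linalg.matrix_rank(np.linalg.matrix_power(N, n))   # dim of the nilspace of T - cI
print(geometric, algebraic)                                           # 1 3
```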
Our source for representation theory of finite groups
(on finite-dimensional vector spaces over C)
will be Artin’s Algebra, Chapter 9. We’ll omit
sections 3 and 10 (which require not just topology and
calculus but also, at least for §3, some material beyond 55b
to do properly, namely the construction of Haar measures);
also we won’t spend much time on §7, which works out in detail
the representation theory of a specific group that Artin
calls I (the icosahedral group,
a.k.a. A5).
There are many other sources for this material,
some of which take a somewhat different point of view via the
“group algebra” C[G]
of a finite group G (a.k.a. the algebra of
functions on G under convolution).
See for instance Chapter 1 of Representation Theory
by Fulton and Harris (mentioned in class). A canonical treatment
of representations of finite groups is
Serre’s Linear Representations of Finite Groups,
which is the only entry for this chapter in the list of
“Suggestions for Further Reading” at the end of Artin’s book
(see p.604).
While we’ll work almost exclusively over C,
most of the results work equally well (though with somewhat different
proofs) over any field F that contains the roots of unity
of order #(G), as long as the characteristic
of F is not a factor of #(G).
Without roots of unity, many more results are different,
but there is still a reasonably satisfactory theory available.
Dropping the characteristic condition leads to much trickier territory,
e.g. even Maschke’s theorem (every finite-dimensional
representation is a direct sum of irreducibles) fails;
some natural problems are still unsolved!
Here’s an alternative viewpoint on representations of a finite group
G (not in Artin, though you can find it elsewhere, e.g.
Fulton-Harris pages 36ff.): a representation of G over
a field F is equivalent to a module for the
group ring
F[G].
The group ring is an associative F-algebra
(commutative iff G is commutative) that consists of the
formal F-linear combinations of group elements.
This means that F[G]
is F^G as an
F-vector space,
and the algebra structure is defined by setting
e_{g_1} e_{g_2}
= e_{g_1 g_2}
for all g_1, g_2
in G, together with the
F-bilinearity of the product.
This means that if we identify elements of the group ring with
functions G→F then the
multiplication rule is
(f_1 * f_2)(g)
= Σ_{g_1 g_2 = g}
f_1(g_1) f_2(g_2)
— yes, it’s convolution again. To identify an
F[G]-module with a representation,
use the action of F to define the vector space structure,
and let ρ(g) act by multiplication by
the unit vector e_g. In particular,
the regular representation is F[G]
regarded in the usual way as a module over itself. If we identify
the image of this representation with certain permutation matrices
of order #(G), we get an explicit model of
F[G] as a subalgebra of the
algebra of square matrices of the same order. For example, if
G = Z/nZ
we recover the algebra of circulant matrices of order n.
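Here's a Python/numpy sketch of that last identification (n = 4 and the test vectors are illustrative): the regular representation of Z/nZ consists of circulant matrices, and matrix multiplication there matches convolution of functions on the group.

```python
# Minimal sketch (numpy assumed; n = 4 and the test vectors are illustrative):
# the regular representation of G = Z/nZ gives circulant matrices, and the
# matrix product there is convolution of functions on G.
import numpy as np

n = 4

def regular(f):
    """Matrix of multiplication by f (a function on Z/nZ) on the basis e_0, ..., e_{n-1}."""
    return np.array([[f[(i - j) % n] for j in range(n)] for i in range(n)], dtype=float)

f1 = np.array([1., 2., 0., 0.])
f2 = np.array([0., 1., 3., 0.])
conv = np.array([sum(f1[a] * f2[(g - a) % n] for a in range(n)) for g in range(n)])

print(regular(f1))                                              # a circulant matrix
print(np.allclose(regular(f1) @ regular(f2), regular(conv)))    # True: multiplication = convolution
```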
First problem set / Linear Algebra I:
vector space basics; an introduction to convolution rings
Clarifications (2.ix.16):
•
“Which if any of these basic results would fail if
F were replaced by Z?” —
but don’t worry about this for problems 7 and 24,
which specify R.
•
Problem 12: If you see how to compute this efficiently but not
what this has to do with Problem 8, please keep looking for
the connection.
Here’s
the “Proof of Concept” mini-crossword with links concerning
the ∎ symbol.
Here’s
an excessively annotated solution.
Second problem set / Linear Algebra II:
dimension of vector spaces; torsion groups/modules and divisible groups
About Problem 5: You may wonder: if not determinants,
what can you use? See Axler, Chapter 4, namely 4.8 through 4.12
(pages 121–123), and note that the proof of 4.8
(using techniques we won’t cover till next week) can be replaced by
the ordinary algorithm for polynomial long division, which you probably
learned with real coefficients but works over any field.
While I’m at it, 4.7 (page 120) works over any infinite
field; Axler’s proof is special to the real and complex numbers,
but 4.12 yields the result in general. (We already remarked that
this result does not hold for finite fields.)
Third problem set / Linear Algebra III:
Countable vs. uncountable dimension of vector spaces;
linear transformations and duality
Fourth problem set / Linear Algebra IV:
Duality; projective spaces; more connections with polynomials
Fifth problem set / Linear Algebra V:
“Eigenstuff”
(with a prelude on exact sequences and more duality)
corrected 4.x.16:
Problem 2: F^N / V*,
not V^N / V* (noted by many students);
Problem 5ii: all α in F,
not all α in V
(noted by Hikari Sorensen).
Due date extended from Oct.7 (Fri.) to Oct.12 (Wed.) in class
Sixth problem set / Linear Algebra VI:
Tensors, eigenstuff cont’d, and a bit on inner products
Seventh problem set / Linear Algebra VII:
Pairings and inner products, cont’d
Eighth problem set / Linear Algebra VIII:
The spectral theorem; spectral graph theory; symplectic structures
Ninth problem set / Linear Algebra IX:
Trace, determinant, and more exterior algebra
Tenth problem set:
Linear Algebra X (determinants and distances);
representations of finite abelian groups (Discrete Fourier transform)
Eleventh and final problem set:
Representations of finite abelian groups