Lecture notes for
Math 55a: Honors Abstract Algebra
(Fall 2016)
If you find a mistake, omission, etc., please
let me know
by e-mail.
The orange balls
mark our current location in the course,
and the current problem set.
Ceci n’est pas un Math 55a syllabus
[No, you don’t have to know French to take Math 55a.
Googling ceci+n'est suffices to turn up
the explanation, such as it is.]
Update 9/9: We now have two CA’s for the class:
Wyatt Mackey (wmackey@college) and Andrew (Andy) Gordon
(andrewgordon@college)
[if writing from outside the Harvard network, append
.harvard.edu to ...@college].
If you are coming to class but not
officially registered for Math 55 (e.g. you are auditing,
or still undecided between 25a and 55a but officially signed up for 25a),
send me your e-mail address
so that I and the CA's can include you in class announcements.
The Sep.13 office hours will be an hour later than usual,
that is, 8:30 to 10:00 PM in the Lowell House Dining Hall,
due to Lowell’s Sophomore Dinner happening earlier that evening.
CA office hours are Wednesdays 8 to 10 PM in the Leverett House
Dining Hall (same location as Math Nights), starting September 14.
Wyatt will hold section Wednesdays 2-3 PM in Science Center 304.
Andy will hold section Fridays 2-3 PM in Science Center 310.
First Quiz Friday, October 7 in class.
Andy will hold a review section
Thursday, October 6, at 8PM in Science Center Room 304
in place of his usual Friday section.
Also, no class Monday, Oct. 10 (University holiday);
I will lecture Wednesday the 12th, but on a scenic detour from the
main sequence of 55a topics so as not to disadvantage students observing
the Yom Kippur fast. (The talk was on linear error-correcting codes,
which were the hidden(?) agenda of some of the recent homework problems.
While I’m at it, here is
the interlude on 5777, Jewish leap years, etc., with probably too much
additional information.) Thus the due date for Problem Set 5
is postponed from Friday the 7th (the quiz date) to Wednesday the 12th.
Office Hours moved to Thursday starting the week of October 31
(paralleling the shift in the due dates of problem sets from
Friday to Monday). So November 3 instead of November 1, etc.
Second Quiz Friday, December 2 in class.
Andy will hold a review section
Thursday, December 1, at 7PM in Science Center Room 304
in place of his usual Friday section.
I'll hold office hours 8:00-9:30 at Lowell House (not the usual 7:30-9:00,
to avoid conflict with Andy's section), in addition to Tuesday's
office hours.
At least in the beginning of the linear algebra
unit, we’ll be following the Axler textbook closely enough that
supplementary lecture notes should not be needed. Some important
extensions/modifications to the treatment in Axler:
- [see Axler, page 5]
Pace the boxed note on that page,
virtually all mathematicians say and write
“n-tuple”
(more fully, “ordered n-tuple”),
while I cannot recall another instance of “list” used
for this as Axler does.
(One sometimes sees “tuple” for an
n-tuple of unspecified length n,
and “ordered pair” and perhaps “ordered triple”,
“ordered quadruple”,
etc. for n = 2, 3, 4, …)
- [cf. Axler, Notation 1.6 on page 4, and the
“Digression on Fields” on page 10]
Unless noted otherwise, F may be an arbitrary
field, not only R or C. The
most important fields other than those of real and complex numbers
are the field Q of rational numbers, and the
finite fields Z/pZ
(p prime). Other examples are: the field
Q(i) of complex numbers with rational
real and imaginary parts; more generally,
Q(d^(1/2)) for any nonsquare
rational number d; the “p-adic numbers”
Qp (p prime),
of which we’ll say more when we study topology next term;
and more exotic finite fields such as the 9-element field
(Z/3Z)(i).
Here’s a review of the axioms for
fields, vector spaces, and related mathematical structures.
- [cf. Axler, p.28 ff.] We define the span of an arbitrary subset
S of (or tuple in) a vector space V as follows:
it is the set of all (finite) linear combinations
a_1 v_1 + … + a_n v_n
with each v_i in S and each
a_i in F. This is still the smallest
vector subspace of V containing S. In particular,
if S is empty, its span is by definition {0}. We do not require that S be finite.
- Warning: in general the space F[X]
(a.k.a. P(F))
of polynomials in X, and its subspaces
P_n(F)
of polynomials of degree at most n,
might not be naturally identified
with a subspace of the space F^F of
functions from F to itself. The problem is that
two different polynomials may yield the same function.
For example if F is the field of 2 elements
the polynomial X^2 − X
gives rise to the zero function. In general different polynomials
can represent the same function if and only if
F is finite — do you see why?
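Here is a quick computational check of that example (a minimal Python sketch; the prime p = 5 is just an illustrative choice): by Fermat's little theorem, X^p − X and the zero polynomial define the same function on Z/pZ.

```python
# Minimal sketch: over Z/pZ the nonzero polynomial X^p - X defines the zero function,
# so two different polynomials give the same function.  (p = 5 is an illustrative choice.)
p = 5

def f(x):
    """Evaluate X^p - X at x, working mod p."""
    return (pow(x, p, p) - x) % p

print(all(f(x) == 0 for x in range(p)))   # True, by Fermat's little theorem
```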
- (See also Exercise 11 in Axler 1.C, assigned as part of the first
problem set)
If Ui are any subspaces of a vector space V
then so is their intersection
∩i
Ui. Note that this doesn’t specify
a finite intersection: i could range over an
“index set” I of any cardinality
(so we would write the intersection as
∩i∈I
Ui ).
We don’t usually want to intersect an empty family of sets
(do you see why?),
but for subsets of a given set V we can declare that
∩i∈∅
Ui = V.
- For any field (or even any ring) F there is a canonical
ring homomorphism, call it h, from Z to F.
“Ring homomorphism” means:
h(0) = 0,
h(1) = 1,
and for any integers m, n we have
h(m+n)
= h(m) + h(n) and
h(mn)
= h(m) h(n)
(and h(−n)
= −h(n),
though this follows from the other properties).
But this doesn’t quite mean that we get an isomorphic copy of
Z in F, because h might not be injective.
Equivalently, the kernel (that is, the preimage
h−1({0})
= {n : h(n) = 0})
might be larger than just {0}. In general this kernel, call it I, must be an
ideal, i.e. an additive subgroup of Z that is
closed under multiplication by arbitrary integers
(whether in I or not — this mimics the definition of
a subspace, though as it happens for ideals in Z
it’s automatic). Now ideals in Z are either {0} or
(n) := {cn | c in Z}
for some n (namely the least positive element of the ideal).
For rings, any n may arise, most easily for the ring
Z/nZ of integers mod n.
But if F is a field and I=(n) then
n must be either zero or prime, lest F have
zero divisors (elements a and b,
neither zero, for which ab=0).
This n is then called the characteristic of
the field F. The familiar fields
Q, R, C all have characteristic zero.
For any prime p, there are fields of characteristic p,
notably the “prime field”
Z/pZ
(mentioned above; the key fact here, from elementary (but nontrivial)
number theory, is that any nonzero element of
Z/pZ has a multiplicative inverse!).
This field Z/pZ
and other prime fields have important uses in number theory, combinatorics,
computer science, and elsewhere, often using the linear algebra that
we develop in Math 55a.
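Here's a small Python sketch of the above (the moduli n = 12 and p = 7 are illustrative, and pow(a, -1, p) needs Python 3.8 or later): reduction mod n is the canonical homomorphism Z → Z/nZ, its kernel is the ideal (n), and for a prime modulus every nonzero residue is invertible, which is what makes Z/pZ a field of characteristic p.

```python
# Minimal sketch (illustrative moduli; pow(a, -1, p) needs Python >= 3.8):
# the canonical homomorphism h: Z -> Z/nZ is reduction mod n, its kernel is the ideal (n),
# and for p prime every nonzero residue class is invertible, so Z/pZ is a field.
n, p = 12, 7

h = lambda m: m % n
print([m for m in range(-24, 25) if h(m) == 0])            # multiples of 12: the ideal (12)

inverses = {a: pow(a, -1, p) for a in range(1, p)}         # inverses mod the prime p
print(all(a * inverses[a] % p == 1 for a in range(1, p)))  # True
```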
- [cf. the boxed note on page 42 of Axler] It is natural to wonder whether
every vector space, finite-dimensional or not, has a basis.
The polynomial ring F[z], considered as a vector space
over F (and denoted by a fancy script P in Axler),
does have a basis (powers of z), as does a polynomial ring in
several variables, or even infinitely many (see the next item);
but does F∞? The answer is yes —
but only under the Axiom of Choice (equivalently,
Zorn’s Lemma)!
[“but only under” because it is known that Choice/Zorn is
equivalent to the claim that every vector space has a basis.
Don’t spend too much time trying to find an explicit basis for
F∞, or for R as a vector space
over Q (a “Hamel basis”)…]
Using the same tool one can prove analogues of some other results in
Chapter 2, such as 2.33 (p.41: every linearly independent set
extends to a basis), and thus 2.34 (p.42: every subspace is a
direct summand; again, don’t spend too much time trying to do this
explicitly for Q as a subspace of the Q-vector space R,
or for ⊕n≥1F
as a subspace of F∞!).
NB some other results clearly fail in infinite dimensions,
even when we have an explicit basis; e.g. the even powers of z
form a linearly independent subset of F[z]
that has the same cardinality as a basis but is not a basis.
- However, 2.31 (p.40: every spanning list contains a basis)
still holds with no further axioms for spanning sets S
of arbitrary size, as long as V is finite dimensional.
The reason is that V has a finite spanning set,
say S0, and every element of
S0 is a linear combination of elements of S,
and since linear combinations are of necessity finite it takes only
a finite subset of S to span S0
and thus V. Now apply the proof of 2.31 to this
finite subset. We may call this generalization “2.31+”.
- [from the Sep.12 lecture] Here’s an extreme example of how
basic theorems about finite-dimensional vector spaces can become
utterly false for finitely-generated modules:
a module generated by just one element
can have a submodule that is not finitely generated. Indeed,
for any field F let A be the ring of polynomials
in infinitely many variables Xj.
[The letter “A” is a common name for a ring,
from French anneau, cognate with English “annulus”.]
As usual
we can regard A as a module over itself, with a single
generator 1. Then a submodule is just an ideal of the ring.
Choose the ideal I generated by all the Xj,
which consists of all polynomials with constant coefficient equal to 0.
If there are infinitely many indices j, then I
is not finitely generated; indeed any generating set must be
at least as large as the index set of j’s,
so for every cardinal ℵ we can make a ring A with a
singly-generated module (namely A itself) and with a submodule
that cannot be generated by fewer than ℵ elements.
For a subtler example, consider the ring we might call
“F[X^(1/2^∞)]”,
consisting of F-linear combinations of monomials
X^(n/2^k)
for arbitrary nonnegative integers n and k.
Again let I be the ideal generated by the nonconstant monomials,
which is not finitely generated, though there are generating sets
that are “only” countably infinite.
The new behavior involves the countable generating set
{X^(1/2^k) | k≥0}:
there is no minimal generating subset, because each
X^(1/2^k)
is a multiple of X^(1/2^k')
for any k'>k. Likewise for the ring generated
by all monomials X^r
with r any nonnegative rational number (or even all
X^r
with r any nonnegative real number).
(When A is Noetherian, submodules of finitely-generated modules
are finitely-generated, but might still require more generators;
for example, there are Noetherian rings A with
“non-principal ideals” I, which give examples of
a 1-generator module with a submodule that requires at least 2 generators.)
- As with the notions of span and linear combination, the definition of
a linear transformation makes sense for modules over any ring A
(whether commutative or not), and in that generality is called an
A-module homomorphism (so you now know the
“morphisms” in the “category” of
A-modules);
when A is a skew field, we still call this a
linear transformation, and the “rank-nullity theorem”
(3.22, page 63) still holds for finite-dimensional vector spaces
in that context.
- Suppose T : V → W
is a linear transformation.
Axler’s notation for the image of T
was already becoming rather old-fashioned when he wrote the first edition of
his book; these days simply T(V) is common
(and likewise for any function at all).
The terminology “null space” (whether one or two words)
for T −1({0}) is also somewhat quaint;
we usually say “kernel” and write
“ker(T)”
[and (LA)TeX already provides the command
\ker to typeset this properly].
While I’m at it, best to avoid the use of
“one-to-one” to mean
“injective” (see boxed note on page 60),
because it is also sometimes used for
“bijective”.
- Please avoid Axler’s notation “product” and
“V × W ” (p.91, 3.71 ff.).
I understand the motivation for this notation: it is formally correct,
and avoids the need to distinguish between “external direct sum”
(the usual name for that vector space) and “internal direct sum”
(a vector space sum [within some larger vector space] that happens to be direct).
The problem with this is that in Math 55 (and ubiquitously in the literature)
we shall introduce before long a “tensor product”
V ⊗ W of vector spaces,
whose dimension is the product of the dimensions of
V and W when those two dimensions are finite; and
it would be a much bigger source of confusion to have that notation
coexist with “V × W ”
where the dimensions add. So please stick with
“V ⊕ W ”
and the name “external direct sum” — or if you must,
“Cartesian product” to avoid confusion with tensor products.
For a possibly infinite Cartesian product, which is not
the same as a direct sum
(because an element of the direct sum must have only finitely many
nonzero components), we still have the notation
Πi∈IVi
to distinguish the Cartesian product from the direct sum
⊕i∈IVi.
- More notes on notation: I understand why Axler wants to distinguish
V' and T' (dual space and transformation)
from V* and T*, and
U0 (annihilator) from
U⊥.
(See the boxed note on page 104.)
I’ll try to stick with U0
in this class. But for the duals, using “ ' ” this way
incurs a steep price of the very useful construction exemplified by
“let V and V' be vector spaces”:
we already have few enough good letters to name mathematical structures
that even π is pressed into double duty (not just 3.14159… but
also the quotient πrojection from V to V/U).
I’ll stick with the common V* and T* here.
- An equivalent statement of the identity (ST)* = T*S*
(third part of 3.101, page 104 of Axler),
together with (I_V)* = I_(V*)
(which Axler might not even bother stating explicitly),
is that duality of vector spaces and linear transformations constitutes a
“contravariant
functor” from the category of
F-vector spaces and linear transformations to itself.
- The results about quotient spaces and duality in sections E and F of
Chapter 3 are often described in terms of
exact sequences.
A sequence … → L → M
→ N → …
of linear transformations (or A-module homomorphisms,
“etc.”) is said to be “exact at M ”
if the kernel of the map M → N
is the image of the map L → M
(that is, if the elements of M that go to zero in N
are precisely those that come from L). The sequence is
“exact” if it is exact at each step with both an incoming
and an outgoing map. In particular a map
M → N is injective iff
it extends to a sequence 0 → M → N
that is exact at M, and surjective iff
it extends to a sequence M → N → 0
that is exact at N. (Note that in each case there is no
choice about the function from or to the trivial vector space 0,
and likewise at least for modules.) Thus the map is an isomorphism
iff 0 → M → N → 0
is exact (at both M and N). Even more easily,
0 → M → 0 is exact iff M=0.
A short exact sequence is the next case, with three modules
other than the initial and final 0. The standard example is
0 → L → M
→ M/L → 0
where the map 0 → L → M
is an inclusion map (thus an injection) and the map
M → M/L → 0
is the quotient map (thus a surjection). In general if
0 → L → M → N → 0
is a short exact sequence then the injection
L → M identifies L with
a submodule of M, and then the surjection
M → N is identified with the
quotient map. More generally, any homomorphism
L → M extends (uniquely up to equivalence)
to an exact sequence with four modules between the outer zeros:
0 → K → L
→ M → N → 0,
where K is the kernel of the map
L → M, and N is its
“cokernel ”, that is, the quotient of
M by the image of L.
Now consider the case of vector spaces. Then to each linear transformation
V → W we associate the dual transformation
V* ← W*, with the dual of a composition
V → W → X
being the composition of the dual transformations
V* ← W* ← X*
in reverse order; this makes duality a “contravariant functor”
on the category of F-vector spaces. The key fact is
that for finite-dimensional vector spaces, duality preserves exactness
of sequences of linear transformations. Thus starting from any linear
V → W, we can extend to an exact sequence
0 → U → V
→ W → X → 0
with U the kernel and X the cokernel,
and dualize to deduce the exactness of
0 ← U* ← V* ←
W* ← X* ← 0
with V* ← W* the dual map.
This immediately encodes Axler 3.108 (page 107): the map
V → W is surjective
iff X is zero iff X* is zero
iff the dual map is injective. Likewise for 3.110 (p.108)
via the vanishing of U and U*.
With a bit more work we can get the general relations
ker(T*) =
(im(T))0
(3.107, p.106) and
im(T*) =
(ker(T))0
(3.109, p.107) between the kernels and images of T and its dual,
again assuming that T is a linear map between finite-dimensional
vector spaces. Conversely, the fact that duality preserves exactness
(for sequences of linear maps between finite-dimensional vector spaces)
can be deduced as a special case of 3.107 and 3.109.
You can now understand this joke
(such as it is).
- Another way to think about the eigen-basics: “Lemma 5.0”:
If T is an operator on any vector space V,
and λ any scalar, then U is an invariant subspace
for T iff it is an invariant subspace for
T − λI_V.
So, for instance, since ker T is an invariant subspace,
so is
ker(T − λI_V),
a.k.a. the λ-eigenspace.
- Yet another note on notation: Axler’s name
“T/U ” (for the operator on V/U
induced from the action of T on a vector space V
with an invariant subspace U, see 5.14 on p.137)
is a nice notation, but (unlike
T |U
for the restriction of T to U)
is seen rarely if at all in the research literature.
Normally it will be called plain T, or possibly
\bar{T} (since it is constructed by descending to V/U
the composition of T with the quotient map
V → V/U).
- Let T be a linear operator on V. The algebraic
properties of polynomial evaluation at T can be summarized
by saying that the map from F[X]
to End(V) that takes any polynomial P
to P(T) is not just linear but a
ring homomorphism. [Since F[X]
is a commutative ring, so is the image of this homomorphism, even though
End(V) is not commutative once
dim(V) > 1.] In particular the kernel
is an ideal in F[X]; when V
is finite dimensional, this ideal must be nonzero, and its generator is
what we shall call the “minimal polynomial” of T.
Special case: if V is F itself, then we
naturally identify End(V) with F,
and we get for any field element x
the evaluation homomorphism from F[X]
to F that takes any polynomial to its value at x.
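Here's a numerical sketch of that kernel-of-evaluation idea (Python with numpy; the example matrix and the floating-point tolerance are illustrative choices, and this is a numerical search, not an exact algorithm): the minimal polynomial appears as the first monic linear dependence among I, T, T^2, ….

```python
# Minimal numerical sketch (numpy assumed; the matrix and tolerance are illustrative):
# the minimal polynomial of T shows up as the first monic linear dependence
# among I, T, T^2, ..., i.e. as a generator of the kernel of P |-> P(T).
import numpy as np

def minimal_polynomial(T, tol=1e-8):
    n = T.shape[0]
    powers = [np.eye(n)]
    for k in range(1, n + 1):
        powers.append(powers[-1] @ T)                       # T^k
        A = np.column_stack([P.ravel() for P in powers[:-1]])
        b = powers[-1].ravel()
        c = np.linalg.lstsq(A, b, rcond=None)[0]            # best expression of T^k in lower powers
        if np.linalg.norm(A @ c - b) < tol:                 # dependence found (numerically exact)
            return np.append(-c, 1.0)                       # coefficients, constant term first
    raise RuntimeError("no dependence by degree dim V -- should not happen")

T = np.array([[2., 1., 0.],
              [0., 2., 0.],
              [0., 0., 3.]])
print(minimal_polynomial(T))    # [-12. 16. -7. 1.], i.e. (x-2)^2 (x-3)
```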
- Axler proves the Fundamental Theorem of Algebra using
complex analysis, which cannot be assumed in Math 55a
(we’ll get to it at the end of 55b).
Here’s a proof using the topological tools
we’ll develop at the start of 55b.
(Axler gives one standard complex-analytic proof in 4.13 on page 124.)
- Triangular matrices are intimately related with “flags”.
A (complete) flag in a finite dimensional vector space V
is a sequence of subspaces {0}=V0,
V1, V2, …,
Vn=V, with each
Vi of dimension i and containing
Vi−1. A basis
v1, v2, …,
vn
determines a flag: Vi is the span of the first
i basis vectors. Another basis
w1, w2, …,
wn
determines the same flag if and only if each
wi is a linear combination of
v1, v2, …,
vi
(necessarily with nonzero vi coefficient).
The standard flag in Fn is the flag
obtained in this way from the standard basis of unit vectors
e1, e2, …,
en.
The punchline is that, just as a diagonal matrix is one that respects
the standard basis (equivalently, the associated decomposition of
V as a direct sum of 1-dimensional subspaces),
an upper-triangular matrix is one that respects the standard
flag.
Note that the i-th diagonal entry of a triangular matrix
gives the action on the one-dimensional quotient space
Vi/Vi−1
(each i=1,…,n).
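Here's a tiny numerical illustration of that punchline (a Python/numpy sketch with an illustrative matrix): the images of e_1, …, e_i under an upper-triangular matrix have no components past the i-th coordinate, so each V_i of the standard flag is mapped into itself.

```python
# Minimal sketch (numpy assumed): an upper-triangular matrix preserves the standard flag.
import numpy as np

U = np.triu(np.arange(1., 17.).reshape(4, 4))    # illustrative upper-triangular matrix
for i in range(1, 5):
    images = U[:, :i]                            # images of e_1, ..., e_i
    print(i, np.all(images[i:, :] == 0))         # True: they lie in V_i = span(e_1, ..., e_i)
```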
While the third edition of Axler includes quotients and duality,
it still lacks tensor algebra.
This is no surprise, but it will not stop us in Math 55!
Here’s an introduction
[As you might guess from \oplus,
the TeXism for the tensor-product symbol is \otimes.]
- One of many applications is the trace of
an operator on a finite dimensional F-vector space V.
This is a linear map from Hom(V,V) to F.
We can define it simply as the composition of two maps:
our identification of Hom(V,V) with
the tensor product of V* and V,
and the natural map from this tensor product to F
coming from the bilinear map taking
(v*,v)
to v*(v).
We shall see that this is the same as the classical definition:
the trace of T is the sum of the diagonal entries of
the matrix of T with respect to any basis.
The coordinate-independent construction via tensor algebra
explains why the trace does not change under change of basis.
(The invariance can also be proved by checking explicitly that
AB and BA have the same trace for any square matrices
A, B of the same size.)
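Here's a quick numerical check of both statements (a Python/numpy sketch; the random matrices are just illustrative input):

```python
# Minimal numerical check (numpy assumed; random matrices are illustrative input):
# the trace is unchanged under change of basis, and tr(AB) = tr(BA).
import numpy as np

rng = np.random.default_rng(0)
T = rng.standard_normal((4, 4))
P = rng.standard_normal((4, 4))            # generic, hence invertible

print(np.isclose(np.trace(T), np.trace(np.linalg.inv(P) @ T @ P)))   # True
A, B = rng.standard_normal((4, 4)), rng.standard_normal((4, 4))
print(np.isclose(np.trace(A @ B), np.trace(B @ A)))                  # True
```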
- Here are some basic definitions and facts about general
norms
on real and complex vector spaces.
- Just as we can study bilinear symmetric forms
on a vector space over any field, not just R,
we can study sesquilinear conjugate-symmetric forms
on a vector space over any field with a conjugation,
not just C.
Here a “conjugation” on a field F is a field automorphism
σ:F→F
such that σ is not the identity
but σ^2 is (that is, σ is an involution).
Given a basis {v_i} for V,
a sesquilinear form 〈.,.〉 on V
is determined by the field elements
a_{i,j} = 〈v_i, v_j〉,
and is conjugate-symmetric if and only if
a_{j,i} = σ(a_{i,j})
for all i,j.
Note that the “diagonal entries” a_{i,i}
— and more generally 〈v,v〉
for any v in V — must be elements
of the subfield of F fixed by σ.
- “Sylvester’s Law of Inertia” states that
for a nondegenerate pairing on a finite-dimensional vector space
V/F, where either
F=R and the pairing is
bilinear and symmetric, or
F=C and the pairing is
sesquilinear and conjugate-symmetric,
the counts of positive and negative inner products
for an orthogonal basis constitute an invariant of the pairing
and do not depend on the choice of orthogonal basis.
(This invariant is known as the “signature” of the pairing.)
The key trick in proving this result is as follows.
Suppose V is the orthogonal direct sum of subspaces
U1, U2
for which the pairing is positive definite on U1
and negative definite on U2.
Then any subspace W of V
on which the pairing is positive definite
has dimension no greater than dim(U1).
Proof: On the intersection of W with U2,
the pairing is both positive and negative definite;
hence that subspace is {0}. The claim follows by a dimension count,
and we quickly deduce Sylvester’s Law.
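Here's a small numerical illustration of Sylvester's Law (a Python/numpy sketch; the diagonal matrix and the random change of basis are illustrative): the counts of positive and negative eigenvalues are unchanged when A is replaced by B^T A B with B invertible.

```python
# Minimal numerical illustration (numpy assumed; matrix and change of basis are illustrative):
# the signature of a real symmetric matrix, read off from eigenvalue signs,
# survives any change of basis A -> B^T A B with B invertible.
import numpy as np

def signature(A, tol=1e-9):
    eigs = np.linalg.eigvalsh(A)
    return int(np.sum(eigs > tol)), int(np.sum(eigs < -tol))

A = np.diag([3., 1., -2., -5.])                          # signature (2, 2)
B = np.random.default_rng(1).standard_normal((4, 4))     # generic, hence invertible
print(signature(A), signature(B.T @ A @ B))              # (2, 2) (2, 2)
```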
- Over any field not of characteristic 2,
we know that for any non-degenerate
symmetric pairing on a finite-dimensional vector space
there is an orthogonal basis, or equivalently
a choice of basis such that the pairing is
(x,y) = Σ_i a_i x_i y_i
for some nonzero scalars a_i.
But in general it can be quite hard to decide whether
two different collections of ai
yield isomorphic pairings. Even over Q
the answer is already tricky in dimensions 2 and 3,
and I don’t think it’s known in a vector space of arbitrary dimension.
Over a finite field of odd size there are always exactly two
possibilities, as I think we’ll see in a few weeks.
- If U is a subspace of an inner-product space V,
but not necessarily finite dimensional, then U need not have an orthogonal complement:
one can still define U⊥, but the direct orthogonal
sum U ⊕ U⊥ might be
strictly smaller than V. What then happens to
6.56 (p.198 in 6.C), which describes the orthogonal projection
PU(v) as the vector
in U closest to v
(i.e., minimizing the norm ||v−u||)?
Well, if such a u exists then indeed
v−u is orthogonal to U,
but in general the minimum need not be attained: at best we can construct
a sequence of vectors u_n in U such that
||v−u_n|| approaches
inf_{u∈U} ||v−u||.
It then follows from
Apollonius’ theorem
(see the front cover of Axler! and also Exercise 31 of 6.A, page 179)
that the u_n constitute a
Cauchy sequence
in U (else (u_m+u_n)/2
is too close to v).
So if U is complete with respect to the norm distance
then there is a nearest vector and we can proceed as before.
But in general infinite-dimensional inner product spaces are not complete
(the complete ones are
Hilbert spaces, and that is a very special case).
We shall say a lot more about completeness and related notions
at the start of Math 55b.
- A regular graph of degree d is a
Moore graph of girth 5
if any two different vertices are linked by a unique path of length
at most 2. Such a graph necessarily has
n = 1 + d + d(d−1) = d^2 + 1 vertices.
Let A be the adjacency matrix, and
1 the all-ones vector. Then (because each vertex
has degree d) 1 is a
d-eigenvector of A.
We have
(I + A + A^2) v = d v + 〈v, 1〉 1
for all v
(proof: check on unit vectors and use linearity).
Thus A takes the orthogonal complement
of R·1
to itself and satisfies
I + A + A^2 = dI
on that space. Since this quadratic equation has distinct roots
m and −1−m
for some m > 0 (namely the positive root of
1 + m + m^2 = d),
it follows that the orthogonal complement
of R·1
is the direct sum of the corresponding eigenspaces.
Let d_1 and d_2
be their dimensions. These sum to
n−1 = d^2, and satisfy
m d_1 + (−1−m) d_2 + d = 0
because the matrix A has trace zero. This lets us solve
for d_1 and d_2;
in particular we find that their difference is
(2d − d^2) / (2m+1).
Since that's an integer, either d=2
(giving the pentagon graph) or m is an integer.
Substituting m^2 + m + 1
for d, we find that
16(d_1 − d_2)
is an integer plus 15/(2m+1), whence
m is one of 0, 1, 2, or 7. The first of these is impossible,
and the others give d = 3, 7, or 57
as claimed.
It is clear that the pentagon is the unique Moore graph of girth 5,
and degree 2, and it is not hard to check that the Petersen graph
is the unique example with d = 3. For
d = 7 there is again a unique graph up to
an interesting group of automorphisms; see this
proof outline, which also lets one count
the automorphisms of this graph.
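For d = 3 one can check all of this by machine; here's a Python/numpy sketch (using the standard construction of the Petersen graph as a Kneser graph, an illustrative choice of model): it verifies the identity I + A + A^2 = dI + J and exhibits the eigenvalues 3, 1, −2 with multiplicities 1, 5, 4.

```python
# Minimal sketch (numpy assumed): the Petersen graph as the d = 3 Moore graph of girth 5.
# Vertices: 2-element subsets of {0,...,4}; two are adjacent iff they are disjoint.
import itertools
import numpy as np

verts = list(itertools.combinations(range(5), 2))
n, d = len(verts), 3                                        # n = 10 = d^2 + 1
A = np.array([[1.0 if set(u).isdisjoint(v) else 0.0 for v in verts] for u in verts])

I, J = np.eye(n), np.ones((n, n))
print(np.allclose(I + A + A @ A, d * I + J))                # the girth-5 Moore identity

eigs = np.round(np.linalg.eigvalsh(A), 6)
vals = sorted(set(eigs))
print(vals, [int(np.sum(eigs == v)) for v in vals])         # [-2, 1, 3] with multiplicities [4, 5, 1]
# so here m = 1, d_1 = 5, d_2 = 4
```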
- Why the name “spectral theorem”? The set (or sometimes the
“multiset”) of eigenvalues
of a linear operator on a vector space V
is often called its
“spectrum”, especially when
V is a real or complex vector space, either
finite or infinite dimensional. This is related with the
visual (and by extension the electromagnetic) spectrum, for reasons that
would take us much too far into wave and quantum mechanics,
so we shall say little more of that here (but you may encounter it again
in your physics class(es)).
We’ll define the determinant of an operator T
on a finite dimensional space V as follows:
T induces a linear operator
∧n(T)
on the top exterior power
∧n(V) of V
(where n = dim V);
this exterior power is one-dimensional,
so an operator on it is multiplication by some scalar;
det(T) is by definition the scalar corresponding to
∧n(T).
The “top exterior power”
is a subspace of the “exterior algebra”
∧•(V)
of V,
which is the quotient of the tensor algebra by
the ideal generated by
{v ⊗ v: v in V}.
We’ll still have to construct the sign homomorphism from the
symmetric group on dim(V) letters to {1,−1}
to make sure that this exterior algebra is as large
as we expect it to be, and in particular that
the (dim(V))-th exterior power has dimension 1
rather than zero.
Interlude: normal subgroups;
short exact sequences in the context of groups:
A subgroup H of G is normal
(satisfies H = g−1Hg
for all g in G) iff
H is the kernel of some group homomorphism from G
iff the injection H → G
fits into a short exact sequence
{1}
→ H
→ G
→ Q
→ {1},
in which case Q is the quotient group
G/H.
[The notation {1} for the one-element (“trivial”) group
is usually abbreviated to plain 1, as in
1
→ H
→ G
→ Q
→ 1.]
This is not in Axler but can be found in any introductory text in
abstract algebra; see for instance Artin, Chapter 2, section 10.
Examples:
1
→ An
→ Sn
→ {±1}
→ 1;
also, the determinant homomorphism
GLn(F) → F*
gives the short exact sequence
1
→ SLn(F)
→ GLn(F)
→ F*
→ 1,
and this works even if F is just a commutative ring with unit
as long as F* is understood as
the group of invertible elements of F — for example,
Z* = {±1}.
Some more tidbits about exterior algebra:
- If w, w' are elements of the
m-th and m'-th exterior powers of V,
then w w' = (−1)^(mm') w' w;
that is, w and w' commute
unless m and m' are both odd
in which case they anticommute.
- If m+m'=n=dim(V)
then the natural pairing from the
m-th and m'-th exterior powers
to the n-th is nondegenerate,
and so identifies the m'-th exterior power
canonically with the dual of the m-th
tensored with the top (n-th) exterior power.
- In particular, if m=1,
and T is any invertible operator on V,
then we find that the induced action of T
on the (n−1)st exterior power
is the same as its action on V*
multiplied by det(T).
This yields the formula connecting the inverse and cofactor matrix
of an invertible matrix (a formula which you may also know
in the guise of “Cramer’s rule”).
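Here's a small numerical check of that cofactor formula (a Python/numpy sketch; the 3×3 matrix is illustrative): compute the adjugate from cofactors and verify A·adj(A) = det(A)·I.

```python
# Minimal numerical check (numpy assumed; the 3x3 matrix is illustrative) of the
# cofactor identity A * adj(A) = det(A) * I that lies behind Cramer's rule.
import numpy as np

def adjugate(A):
    n = A.shape[0]
    C = np.empty_like(A)
    for i in range(n):
        for j in range(n):
            minor = np.delete(np.delete(A, i, axis=0), j, axis=1)
            C[i, j] = (-1) ** (i + j) * np.linalg.det(minor)
    return C.T                                   # adjugate = transpose of the cofactor matrix

A = np.array([[2., 1., 0.],
              [1., 3., 1.],
              [0., 1., 2.]])
print(np.allclose(A @ adjugate(A), np.linalg.det(A) * np.eye(3)))   # True
```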
- For each m there is a natural non-degenerate pairing
between the m-th exterior powers of
V and V*,
which identifies these exterior powers with each other’s dual.
More will be said about exterior algebra when differential forms
appear in Math 55b.
We’ll also show that a symmetric (or Hermitian) matrix
is positive definite iff all its eigenvalues are positive
iff it has positive principal minors
(the “principal minors” are the determinants
of the square submatrices of all orders containing the (1,1) entry).
More generally we’ll show that the eigenvalue signs determine
the signature, as does the sequence of signs of principal minors
if they are all nonzero. More precisely: an invertible
symmetric/Hermitian matrix has signature
(r,s) where
r is the number of positive eigenvalues and
s is the number of negative eigenvalues;
if its principal minors are all nonzero then
r is the number of j in
{1,2,…,n} such that the
j-th and (j−1)-st minors
have the same sign, and
s is the number of j in that range such that the
j-th and (j−1)-st minors
have opposite sign [for j=1 we always count
the “zeroth minor” as being the positive number 1].
This follows inductively from the fact that the determinant has sign
(−1)^s and the signature
(r',s') of the restriction of a pairing
to a subspace has
r' ≤ r and
s' ≤ s.
For positive definiteness, we have the two further
equivalent conditions: the symmetric (or Hermitian) matrix
A = (a_{ij}) is positive definite
iff there is a basis (v_i)
of F^n such that
a_{ij} = 〈v_i, v_j〉
for all i,j, and
iff there is an invertible matrix B such that
A = B*B.
For example, the matrix with entries 1/(i+j−1)
(“Hilbert matrix”) is positive-definite, because it is
the matrix of inner products (integrals on [0,1]) of the basis
1, x, x^2, …, x^(n−1)
for the polynomials of degree < n.
See the 10th problem set for a calculus-free proof of the
positivity of the Hilbert matrix, and an evaluation of its determinant.
For any matrix B (not necessarily invertible or even square)
with columns vi,
the matrix B*B with entries
a_{ij} = 〈v_i, v_j〉
is known as the Gram matrix of the columns of B.
It is invertible iff those columns are linearly independent.
If we add an (n+1)-st vector, the determinant of
the Gram matrix increases by a factor equal to the squared distance
between this vector and the span of the columns of B.
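Here's a quick Python/numpy sketch of the Hilbert-matrix example (the size n = 5 is illustrative; numpy's Cholesky factorization succeeds exactly for positive-definite matrices, so it doubles as a numerical test):

```python
# Minimal sketch (numpy assumed; n = 5 is illustrative): the Hilbert matrix as a Gram matrix.
# <x^i, x^j> = integral_0^1 x^(i+j) dx = 1/(i+j+1)  (0-indexed; entries 1/(i+j-1) if 1-indexed),
# so it is positive definite; numpy's Cholesky factorization succeeds exactly in that case.
import numpy as np

n = 5
H = np.array([[1.0 / (i + j + 1) for j in range(n)] for i in range(n)])

np.linalg.cholesky(H)                       # no LinAlgError: H is positive definite
print(np.all(np.linalg.eigvalsh(H) > 0))    # True (all eigenvalues positive)
```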
All of Chapter 8 works over an arbitrary algebraically closed field,
not only over C
(except for the minor point about extracting square roots,
which breaks down in characteristic 2); and
the first section (“Generalized Eigenvalues”) works over any field.
More about nilpotent operators: let T be any operator
on a vector space V over a field F, not assumed
algebraically closed. If V is finite-dimensional, then
The Following Are Equivalent:
(1) There exists a nonnegative integer k
such that T^k = 0;
(2) For any vector v, there exists a nonnegative integer
k such that T^k v = 0;
(3) T^n = 0, where n = dim(V).
Note that (1) and (2) make no mention of the dimension,
but are still not equivalent for operators on infinite-dimensional spaces.
(For example, consider differentiation on the
R-vector space R[x].)
We readily deduce the further equivalent conditions:
(4) There exists a basis for V for which
T has an upper-triangular matrix
with every diagonal entry equal to zero;
(5) Every upper-triangular matrix for T
has zeros on the diagonal,
and there exists at least one upper-triangular matrix for T.
Recall that the second part of (5) is automatic
if F is algebraically closed.
The space of generalized 0-eigenvectors
(the maximal subspace on which T is nilpotent)
is sometimes called the nilspace of T.
It is an invariant subspace. When V is finite dimensional,
V is the direct sum of the nilspace and another invariant
subspace V', consisting of the intersection of the subspaces
T^k(V) as k ranges over all
positive integers (8.5). This can be used to prove Cayley-Hamilton
using the standard definition of the characteristic polynomial as
det(xI−T).
An example in infinite dimension when (8.5) fails:
V is the real vector space of continuous functions from
R to R, and T is multiplication
by x. [That is a useful counterexample for many other
aspects of “eigenstuff” when we try to go beyond
finite dimension; for example, there are no eigenvectors, but
for every real number λ the operator
λI−T is not invertible!]
The dimension of the space of generalized c-eigenvectors
(i.e., of the nilspace of T−cI) is usually called the
algebraic multiplicity of c
(since it’s the multiplicity of c
as a root of the characteristic polynomial of T),
to distinguish it from the “geometric multiplicity”
which is the dimension of ker(T−cI).
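Here's a small Python/numpy sketch of the two multiplicities (the matrix, a single Jordan block for the eigenvalue 2 together with a 1×1 block, is an illustrative choice):

```python
# Minimal sketch (numpy assumed; the matrix, a 3x3 Jordan block for 2 plus a 1x1 block
# for 5, is an illustrative choice): geometric vs. algebraic multiplicity of c = 2.
import numpy as np

T = np.array([[2., 1., 0., 0.],
              [0., 2., 1., 0.],
              [0., 0., 2., 0.],
              [0., 0., 0., 5.]])
c, n = 2.0, T.shape[0]
N = T - c * np.eye(n)

geometric = n - np.linalg.matrix_rank(N)                              # dim ker(T - cI)
algebraic = n - np.linalg.matrix_rank(np.linalg.matrix_power(N, n))   # dim of the nilspace of T - cI
print(geometric, algebraic)                                           # 1 3
```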
Our source for representation theory of finite groups
(on finite-dimensional vector spaces over C)
will be Artin’s Algebra, Chapter 9. We’ll omit
sections 3 and 10 (which require not just topology and
calculus but also, at least for §3, some material beyond 55b
to do properly, namely the construction of Haar measures);
also we won’t spend much time on §7, which works out in detail
the representation theory of a specific group that Artin
calls I (the icosahedral group,
a.k.a. A5).
There are many other sources for this material,
some of which take a somewhat different point of view via the
“group algebra” C[G]
of a finite group G (a.k.a. the algebra of
functions on G under convolution).
See for instance Chapter 1 of Representation Theory
by Fulton and Harris (mentioned in class). A canonical treatment
of representations of finite groups is
Serre’s Linear Representations of Finite Groups,
which is the only entry for this chapter in the list of
“Suggestions for Further Reading” at the end of Artin’s book
(see p.604).
While we’ll work almost exclusively over C,
most of the results work equally well (though with somewhat different
proofs) over any field F that contains the roots of unity
of order #(G), as long as the characteristic
of F is not a factor of #(G).
Without roots of unity, many more results are different,
but there is still a reasonably satisfactory theory available.
Dropping the characteristic condition leads to much trickier territory,
e.g. even Maschke’s theorem (every finite-dimensional
representation is a direct sum of irreducibles) fails;
some natural problems are still unsolved!
Here’s an alternative viewpoint on representations of a finite group
G (not in Artin, though you can find it elsewhere, e.g.
Fulton-Harris pages 36ff.): a representation of G over
a field F is equivalent to a module for the
group ring
F[G].
The group ring is an associative F-algebra
(commutative iff G is commutative) that consists of the
formal F-linear combinations of group elements.
This means that F[G]
is F^G as an
F-vector space,
and the algebra structure is defined by setting
e_{g_1} e_{g_2}
= e_{g_1 g_2}
for all g_1, g_2
in G, together with the
F-bilinearity of the product.
This means that if we identify elements of the group ring with
functions G→F then the
multiplication rule is
(f_1 * f_2)(g)
= Σ_{g_1 g_2 = g}
f_1(g_1) f_2(g_2)
— yes, it’s convolution again. To identify an
F[G]-module with a representation,
use the action of F to define the vector space structure,
and let ρ(g) act by multiplication by
the unit vector e_g. In particular,
the regular representation is F[G]
regarded in the usual way as a module over itself. If we identify
the image of this representation with certain permutation matrices
of order #(G), we get an explicit model of
F[G] as a subalgebra of the
algebra of square matrices of the same order. For example, if
G = Z/nZ
we recover the algebra of circulant matrices of order n.
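Here's a Python/numpy sketch of that last identification (n = 4 and the test vectors are illustrative): the regular representation of Z/nZ consists of circulant matrices, and matrix multiplication there matches convolution of functions on the group.

```python
# Minimal sketch (numpy assumed; n = 4 and the test vectors are illustrative):
# the regular representation of G = Z/nZ gives circulant matrices, and the
# matrix product there is convolution of functions on G.
import numpy as np

n = 4

def regular(f):
    """Matrix of multiplication by f (a function on Z/nZ) on the basis e_0, ..., e_{n-1}."""
    return np.array([[f[(i - j) % n] for j in range(n)] for i in range(n)], dtype=float)

f1 = np.array([1., 2., 0., 0.])
f2 = np.array([0., 1., 3., 0.])
conv = np.array([sum(f1[a] * f2[(g - a) % n] for a in range(n)) for g in range(n)])

print(regular(f1))                                              # a circulant matrix
print(np.allclose(regular(f1) @ regular(f2), regular(conv)))    # True: multiplication = convolution
```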
First problem set / Linear Algebra I:
vector space basics; an introduction to convolution rings
Clarifications (2.ix.16):
•
“Which if any of these basic results would fail if
F were replaced by Z?” —
but don’t worry about this for problems 7 and 24,
which specify R.
•
Problem 12: If you see how to compute this efficiently but not
what this has to do with Problem 8, please keep looking for
the connection.
Here’s
the “Proof of Concept” mini-crossword with links concerning
the ∎ symbol.
Here’s
an excessively annotated solution.
Second problem set / Linear Algebra II:
dimension of vector spaces; torsion groups/modules and divisible groups
About Problem 5: You may wonder: if not determinants,
what can you use? See Axler, Chapter 4, namely 4.8 through 4.12
(pages 121–123), and note that the proof of 4.8
(using techniques we won’t cover till next week) can be replaced by
the ordinary algorithm for polynomial long division, which you probably
learned with real coefficients but works over any field.
While I’m at it, 4.7 (page 120) works over any infinite
field; Axler’s proof is special to the real and complex numbers,
but 4.12 yields the result in general. (We already remarked that
this result does not hold for finite fields.)
Third problem set / Linear Algebra III:
Countable vs. uncountable dimension of vector spaces;
linear transformations and duality
Fourth problem set / Linear Algebra IV:
Duality; projective spaces; more connections with polynomials
Fifth problem set / Linear Algebra V:
“Eigenstuff”
(with a prelude on exact sequences and more duality)
corrected 4.x.16:
Problem 2: F^N / V*,
not V^N / V* (noted by many students);
Problem 5ii: all α in F,
not all α in V
(noted by Hikari Sorensen).
Due date extended from Oct.7 (Fri.) to Oct.12 (Wed.) in class
Sixth problem set / Linear Algebra VI:
Tensors, eigenstuff cont’d, and a bit on inner products
Seventh problem set / Linear Algebra VII:
Pairings and inner products, cont’d
Eighth problem set / Linear Algebra VIII:
The spectral theorem; spectral graph theory; symplectic structures
Ninth problem set / Linear Algebra IX:
Trace, determinant, and more exterior algebra
Tenth problem set:
Linear Algebra X (determinants and distances);
representations of finite abelian groups (Discrete Fourier transform)
Eleventh and final problem set:
Representations of finite abelian groups