Lecture notes for Math 55a: Honors Abstract Algebra (Fall 2010)

If you find a mistake, omission, etc., please let me know by e-mail.



Ceci n'est pas un Math 55a syllabus (“This is not a Math 55a syllabus”)

Tony Feng's section and office hours:
Section: Wednesdays (starting September 8), 4-5 PM, Science Center 103b
Office hours: Thursdays (starting September 9), 7-8 PM, Science Center 310
revised: now Thursdays 8-9 and 9-10, both in Science Center 411


At least in the beginning of the linear algebra unit, we'll be following the Axler textbook closely enough that supplementary lecture notes should not be needed. Some important extensions/modifications to the treatment in Axler: less surprising than the absence of quotients and duality in Axler is the lack of tensor algebra. That won't stop us in Math 55, though. Here's an introduction. [As you might guess from \oplus, the TeXism for the tensor-product symbol is \otimes.]
corrected 30.ix.10 and 1.x.10 to fix a couple of minor typos and improve the presentation of V ⊗F F′. We'll not cover Chapter 9, which seems to exist mainly to prove that it is possible to construct the characteristic polynomial over R without introducing determinants. We'll proceed to Chapter 10, which covers determinants, but…

We'll define the determinant of an operator T on a finite dimensional space V as follows: T induces a linear operator T′ on the top exterior power of V; this exterior power is one-dimensional, so an operator on it is multiplication by some scalar; det(T) is by definition the scalar corresponding to T′. The “top exterior power” is a subspace of the “exterior algebra” of V, which is the quotient of the tensor algebra by the ideal generated by {v⊗v : v in V}. We'll still have to construct the sign homomorphism from the symmetric group on dim(V) letters to {1,−1} to make sure that this exterior algebra is as large as we expect it to be, and in particular that the (dim(V))-th exterior power has dimension 1 rather than zero.
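
In coordinates, this definition unwinds to the familiar Leibniz sum over permutations weighted by the sign homomorphism. Here's a throwaway Python sanity check (not part of the course materials; all names are mine):

```python
# Leibniz formula det(T) = sum over permutations p of sign(p) * prod_i T[p(i)][i],
# with the sign homomorphism computed by counting inversions.  Illustrative only.
from itertools import permutations

def sign(p):
    """Sign homomorphism S_n -> {1, -1}: (-1)^(number of inversions)."""
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inv % 2 else 1

def prod(xs):
    out = 1
    for x in xs:
        out *= x
    return out

def det(A):
    n = len(A)
    return sum(sign(p) * prod(A[p[i]][i] for i in range(n))
               for p in permutations(range(n)))

assert sign((1, 0, 2)) == -1          # a transposition has sign -1
assert det([[1, 2], [3, 4]]) == -2
```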

Interlude: normal subgroups; short exact sequences in the context of groups: A subgroup H of G is normal (satisfies H = g⁻¹Hg for all g in G) iff H is the kernel of some group homomorphism from G iff the injection H → G fits into a short exact sequence {1} → H → G → Q → {1}, in which case Q is the quotient group G/H. [The notation {1} for the one-element (“trivial”) group is usually abbreviated to plain 1, as in 1 → H → G → Q → 1.] This is not in Axler but can be found in any introductory text in abstract algebra; see for instance Artin, Chapter 2, section 10. Examples: 1 → An → Sn → {±1} → 1; also, the determinant homomorphism GLn(F) → F* gives the short exact sequence 1 → SLn(F) → GLn(F) → F* → 1, and this works even if F is just a commutative ring with unit as long as F* is understood as the group of invertible elements of F — for example, Z* = {±1}.

Some more tidbits about exterior algebra:

More will be said about exterior algebra when differential forms appear in Math 55b.

We'll also show that a symmetric (or Hermitian) matrix is positive definite iff all its eigenvalues are positive iff it has positive principal minors (the “principal minors” are the determinants of the square submatrices of all orders containing the (1,1) entry). More generally we'll show that the eigenvalue signs determine the signature, as does the sequence of signs of principal minors if they are all nonzero. More precisely: an invertible symmetric/Hermitian matrix has signature (r,s) where r is the number of positive eigenvalues and s is the number of negative eigenvalues; if its principal minors are all nonzero then r is the number of j in {1,2,…,n} such that the j-th and (j−1)-st minors have the same sign, and s is the number of j in that range such that the j-th and (j−1)-st minors have opposite sign [for j=1 we always count the “zeroth minor” as being the positive number 1]. This follows inductively from the fact that the determinant has sign (−1)^s and the signature (r′,s′) of the restriction of a pairing to a subspace has r′ ≤ r and s′ ≤ s.
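
Here is a quick numerical illustration of the minor-sign rule (a sketch; the example matrix is an ad hoc choice of mine):

```python
# Compare the signature read off from signs of consecutive principal minors
# with the count of positive/negative eigenvalues.
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, -1.0, 1.0],
              [0.0, 1.0, 3.0]])
n = A.shape[0]

minors = [1.0] + [np.linalg.det(A[:j, :j]) for j in range(1, n + 1)]  # "zeroth minor" = 1
r = sum(1 for j in range(1, n + 1) if minors[j] * minors[j - 1] > 0)
s = sum(1 for j in range(1, n + 1) if minors[j] * minors[j - 1] < 0)

eig = np.linalg.eigvalsh(A)
assert (r, s) == (int((eig > 0).sum()), int((eig < 0).sum()))  # signature (2, 1) here
```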

For positive definiteness, we have the two further equivalent conditions: the symmetric (or Hermitian) matrix A=(aij) is positive definite iff there is a basis (vi) of Fn such that aij = ⟨vi,vj⟩ for all i,j, and iff there is an invertible matrix B such that A=B*B. For example, the matrix with entries 1/(i+j−1) (“Hilbert matrix”) is positive-definite, because it is the matrix of inner products (integrals on [0,1]) of the basis 1, x, x^2, …, x^(n−1) for the polynomials of degree <n. See the 10th problem set for a calculus-free proof of the positivity of the Hilbert matrix, and an evaluation of its determinant.
For any matrix B (not necessarily invertible or even square) with columns vi, the matrix B*B with entries aij = ⟨vi,vj⟩ is known as the Gram matrix of the columns of B. It is invertible iff those columns are linearly independent. If we add an (n+1)-st vector, the determinant of the Gram matrix increases by a factor equal to the squared distance between this vector and the span of the columns of B. You'll use this too in PS10.
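
Both facts are easy to test numerically. A hedged numpy sketch (the random matrix and vector are placeholders), plus a floating-point check that the Hilbert matrix is positive definite:

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((5, 3))
v = rng.standard_normal(5)

gram = lambda M: M.T @ M
d0 = np.linalg.det(gram(B))
d1 = np.linalg.det(gram(np.column_stack([B, v])))

# squared distance from v to the column span of B, via least squares
coef, *_ = np.linalg.lstsq(B, v, rcond=None)
dist2 = np.sum((v - B @ coef) ** 2)
assert np.isclose(d1, d0 * dist2)   # appending v scales det(Gram) by dist^2

# Hilbert matrix: all eigenvalues positive
n = 6
H = np.array([[1.0 / (i + j - 1) for j in range(1, n + 1)] for i in range(1, n + 1)])
assert np.all(np.linalg.eigvalsh(H) > 0)
```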

Here's a brief introduction to field algebra and Galois theory.

Our source for representation theory of finite groups (on finite-dimensional vector spaces over C) will be Artin's Algebra, Chapter 9. We'll omit sections 3 and 10 (which require not just topology and calculus but also, at least for §3, some material beyond 55b to do properly, namely the construction of Haar measures); also we won't spend much time on §7, which works out in detail the representation theory of a specific group that Artin calls I (the icosahedral group, a.k.a. A5). There are many other sources for this material, some of which take a somewhat different point of view via the “group algebra” C[G] of a finite group G (a.k.a. the algebra of functions on G under convolution). See for instance Chapter 1 of Representation Theory by Fulton and Harris (mentioned in class). A canonical treatment of representations of finite groups is Serre's Linear Representations of Finite Groups, which is the only entry for this chapter in the list of “Suggestions for Further Reading” at the end of Artin's book (see p.604).

While we'll work almost exclusively over C, most of the results work equally well (though with somewhat different proofs) over any field F that contains the roots of unity of order #(G), as long as the characteristic of F is not a factor of #(G). Without roots of unity, many more results are different, but there is still a reasonably satisfactory theory available. Dropping the characteristic condition leads to much trickier territory, e.g. even Maschke's theorem (every finite-dimensional representation is a direct sum of irreducibles) fails; some natural problems are still unsolved!

Here's an alternative viewpoint on representations of a finite group G (not in Artin, though you can find it elsewhere, e.g. Fulton-Harris pages 36ff.): a representation of G over a field F is equivalent to a module for the group ring F[G]. The group ring is an associative F-algebra (commutative iff G is commutative) that consists of the formal F-linear combinations of group elements. This means that F[G] is F^G as an F-vector space, and the algebra structure is defined by setting e_{g1} e_{g2} = e_{g1g2} for all g1, g2 in G, together with the F-bilinearity of the product. This means that if we identify elements of the group ring with functions G → F then the multiplication rule is (f1 * f2)(g) = Σ_{g1g2=g} f1(g1) f2(g2) — yes, it's convolution again. To identify an F[G]-module with a representation, use the action of F to define the vector space structure, and let ρ(g) act by multiplication by the unit vector e_g. In particular, the regular representation is F[G] regarded in the usual way as a module over itself. If we identify the image of this representation with certain permutation matrices of order #(G), we get an explicit model of F[G] as a subalgebra of the algebra of square matrices of the same order. For example, if G = Z/nZ we recover the algebra of circulant matrices of order n.
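
For G = Z/nZ this is easy to see in code; the following numpy sketch (function names mine) checks that multiplying the corresponding circulant matrices is the same as convolving the underlying functions:

```python
import numpy as np

def circulant(f):
    """Matrix of multiplication by f = sum_g f[g] e_g in the regular rep of Z/nZ."""
    n = len(f)
    return np.array([[f[(i - j) % n] for j in range(n)] for i in range(n)])

def convolve(f1, f2):
    n = len(f1)
    return np.array([sum(f1[g1] * f2[(g - g1) % n] for g1 in range(n))
                     for g in range(n)])

f1 = np.array([1.0, 2.0, 0.0, -1.0])
f2 = np.array([0.5, 0.0, 3.0, 1.0])
assert np.allclose(circulant(f1) @ circulant(f2), circulant(convolve(f1, f2)))
```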

Just as the regular representation is F[G] as a module over itself, a subrepresentation of the regular representation is an ideal of F[G]. The fundamental theorem of representation theory can be stated as an isomorphism between F[G] and the direct sum of algebras End(V) with V ranging over all the irreducible representations of G. Comparing dimensions gives the formula #(G) = Σ_V dim(V)^2. The F-linear isomorphism from F[G] to the direct sum of the End(V) is a generalization of the discrete Fourier transform (DFT) to noncommutative groups [check that if G is commutative we recover the DFT of Problem Set 10].

In case we haven't officially said it yet: the unitary group Un that figures prominently in Artin's treatment (see the start of 9.2, p.310) is the subgroup of GLn(C) that acts by isometries on Cn with its usual inner-product norm; that is, matrices whose inverse equals their own Hermitian transpose. The unitary group U(V) = U(V, ⟨.,.⟩) of a complex inner-product space V = (V, ⟨.,.⟩) is defined in much the same way; if V is finite-dimensional then any choice of orthonormal basis identifies U(V) with Un.

Artin derives the diagonalizability of ρ(g) as Corollary 2.3 (p.310) of unitarity plus the Spectral Theorem for normal operators. This is not necessary because we already know that a linear transformation that satisfies a polynomial equation with distinct roots is diagonalizable, and ρ(g) satisfies such an equation, namely X^n − 1 = 0 where n is the order of g. Note that this argument applies more generally to a representation over any field in which X^n − 1 = 0 has n distinct roots. Again this leads naturally to the condition that the characteristic not be a factor of #(G), i.e. that #(G) is nonzero in F: this condition guarantees that also n ≠ 0 in F, because n is always a factor of #(G) [a corollary of Lagrange's theorem that the order of any subgroup of G is a factor of #(G), see e.g. Artin, (6.11) and (6.12) in Chapter 2, p.58].

Let T be the character table of a group of N elements, and let T* be its Hermitian transpose. The first two parts of Theorem 5.9 in Chapter 9 state in effect the identity TDT* = NI where D is the diagonal matrix whose diagonal entries are the sizes of the conjugacy classes of G. But then NI is also DT*T. This means that the columns of T are also orthogonal, and the column indexed by conjugacy class C has inner product N/#(C) with itself. [This is an adaptation of the familiar but still remarkable fact that the rows of a square matrix are orthonormal iff the columns are orthonormal.] That is, for any group elements g, h we have Σχ χ(g)χ*(h) = 0 (the sum extending over the irreducible characters χ, with χ* denoting the complex conjugate of χ), unless g and h are conjugate, in which case χ(g) = χ(h) for all χ and we get Σχ |χ(g)|^2 = N/#(C) where C is the conjugacy class of g. In particular, taking g = id we get C = {id} and thus recover the fact that N is the sum of the squares of the dimensions of the irreducible representations.
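
Here's a numerical check for G = S3 (a throwaway sketch; the character table and class sizes are entered by hand):

```python
# Rows: trivial, sign, standard (2-dim) characters of S_3.
# Columns: conjugacy classes (identity, transpositions, 3-cycles), sizes 1, 3, 2.
import numpy as np

T = np.array([[1, 1, 1],
              [1, -1, 1],
              [2, 0, -1]])
D = np.diag([1, 3, 2])   # conjugacy class sizes
N = 6                    # #(S_3)

assert np.allclose(T @ D @ T.conj().T, N * np.eye(3))  # row orthogonality
assert np.allclose(D @ T.conj().T @ T, N * np.eye(3))  # hence column orthogonality
```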

Let G be a group of N elements, and (V,ρ) be any finite-dimensional representation over a field in which N is nonzero. Let P be the operator N⁻¹ Σ_{g∈G} ρ(g). This is a G-endomorphism of V, i.e. it commutes with each ρ(g); moreover ρ(h)P = P for each h in G, whence it follows that P is an idempotent: it satisfies P^2 = P. We already know (PS9 1(i)) that this makes V the direct sum of the eigenspaces V0 and V1 of P, which are respectively the kernel and image of P. It follows that P has trace dim(V1). This is what lets us begin to link characters with irreducibles: on the one hand this trace is N⁻¹ Σ_{g∈G} χ(g) = ⟨χ, 1⟩ where χ is the character of V. On the other hand V1 is precisely the G-invariant subspace V^G of V (which we can also canonically identify with HomG(F,V) where F is regarded as the trivial representation), so dim(V1) is the dimension of that invariant subspace.
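
A small sanity check in numpy (the permutation representation of S3 on C^3 is my own choice of example):

```python
# Averaging operator P = (1/N) sum_g rho(g) for S_3 permuting coordinates of C^3.
import numpy as np
from itertools import permutations

def perm_matrix(p):
    M = np.zeros((len(p), len(p)))
    for i, j in enumerate(p):
        M[j, i] = 1.0     # sends basis vector e_i to e_{p(i)}
    return M

mats = [perm_matrix(p) for p in permutations(range(3))]
P = sum(mats) / len(mats)

assert np.allclose(P @ P, P)                  # idempotent
assert np.isclose(np.trace(P), 1.0)           # dim of invariant subspace = 1
assert np.allclose(P, np.full((3, 3), 1/3))   # P projects onto the constant vectors
```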

To get from this to the dimension of HomG(V,V′) for any representations V, V′, recall the identification of Hom(V,V′) (ignoring the G-action) with the tensor product V* ⊗ V′. Artin doesn't describe the action of G on V*. We know how to get from each ρ(g) a linear operator (ρ(g))* on V*, but these do not compose in the right order (unless G is abelian): we have (ρ(g))*(ρ(h))* = (ρ(hg))*. So to get a representation ρ* of G on V* we must define ρ*(g) = (ρ(g⁻¹))* for all g. This yields the dual representation (sometimes called the “contragredient representation”) (V*, ρ*) of (V, ρ); its character χ* is the complex conjugate of χ. This gives Hom(V,V′) = V* ⊗ V′ the structure of a representation of G, and we can check that the invariant subspace is precisely HomG(V,V′) [which also justifies Artin's term “G-invariant homomorphism” for an element of HomG(V,V′)]. We then deduce that the dimension of this space is ⟨χ*χ′, 1⟩ = ⟨χ′, χ⟩, and since that's a nonnegative integer — and thus a fortiori real — it is also equal to ⟨χ, χ′⟩ as claimed.

NB Artin's definition of the inner product on class functions, or indeed on any functions on G (Chapter 9, (5.8), page 318), makes this inner product semilinear in the first variable. This is consistent with his treatment elsewhere in the book (see Chapter 4, (4.4), page 250), and is occasionally more convenient — e.g. here we wouldn't have had to finish by observing that the inner product is real and thus its own conjugate — but is rarer than the convention that we (following Axler) have been using. We have to tweak our development accordingly.

The operators introduced by Artin towards the end of his proof of (most of) Theorem 5.9 have further uses when φ is the character of an irreducible representation V. Indeed consider Pφ := N⁻¹ Σ_{g∈G} φ*(g) ρ(g) acting on any representation W. Since φ is a class function we again find that Pφ is a G-endomorphism of W. Thus (by Schur) if W is irreducible then Pφ is a multiple of the identity. The correct multiple can then be identified by computing the trace of Pφ, and that's ⟨χ, φ⟩ where χ is the character of W (whether or not W is irreducible). Using the orthogonality relations we deduce that if W is irreducible then Pφ = 0, unless W is isomorphic with V, in which case Pφ is multiplication by 1/dim(V). It follows that if W is the direct sum of irreducibles Vi then dim(V) · Pφ is the projection to the direct sub-sum of those Vi that are isomorphic with V. In particular, this sub-sum is independent of the decomposition; it is often known as the V-isotypic subspace of W. Examples: if V is the trivial representation then the V-isotypic subspace is the fixed subspace we called W^G, and indeed in this case Pφ is the average of ρ(g) over group elements g. If G is the group of two elements and V its nontrivial representation then the isotypic subspace is the (−1)-eigenspace for the non-identity element g of G, and indeed the associated projection Pφ is the familiar (1−ρ(g))/2, just as projection to the (+1)-eigenspace is the averaging operator (1+ρ(g))/2.
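
Continuing the S3 example from above (my own illustration, not Artin's): take φ to be the character of the 2-dimensional irreducible V; then dim(V)·Pφ should project the permutation representation C^3 onto its sum-zero subspace, the V-isotypic part.

```python
import numpy as np
from itertools import permutations

def perm_matrix(p):
    M = np.zeros((len(p), len(p)))
    for i, j in enumerate(p):
        M[j, i] = 1.0
    return M

G = list(permutations(range(3)))
phi = {p: sum(1 for i in range(3) if p[i] == i) - 1 for p in G}  # standard character
P = sum(phi[p] * perm_matrix(p) for p in G) / len(G)             # phi is real here
proj = 2 * P                                                     # dim(V) = 2

expected = np.eye(3) - np.full((3, 3), 1/3)  # projection onto sum-zero vectors
assert np.allclose(proj, expected)
assert np.allclose(proj @ proj, proj)
```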

If you know about integral closures and related ideas (see for instance the beginning of the summary of field algebra) then you can also use Pφ to demonstrate the last part of Theorem 5.9, which Artin punts on proving: the dimension of every irreducible representation V is a factor of N = #(G). Proof: N/dim(V) is an eigenvalue of N·Pφ = Σ_{g∈G} φ*(g) ρ(g) acting on the regular representation; since each of the numbers φ*(g) is an algebraic integer (it's a sum of roots of unity), it follows that so is N/dim(V), and since that quotient is rational we conclude that N/dim(V) ∈ Z as claimed.

Our final topic for Math 55a is the Sylow theorems, covered by Artin in Chapter 6, Section 4 (pages 205–208). Corollary 4.3 (a finite group G contains an element of order p for every prime factor p of |G|) is a theorem of Cauchy that predates Sylow; we'll start by giving its combinatorial proof, which will introduce the technique that we (following Artin) will use to prove the Sylow theorems. Note that Artin's statement of the “second Sylow theorem” ((4.6) on page 206) is not the usual terminology, which regards Corollary 4.7 as the second Sylow theorem and (4.6) as a step in one proof of this theorem.

The proof Artin gives for Sylow I can be extended to give the “1 mod p” part of Sylow III. Artin shows that the binomial coefficient C(p^e m, p^e) he calls N is not a multiple of p; extending the same argument shows that N is congruent to m mod p (the first factor is m mod p, and each of the other factors is 1 mod p). [Simpler yet: work in characteristic p, and apply e times the identity (1+X)^p = 1 + X^p to get (1+X)^(p^e m) = (1 + X^(p^e))^m; now look at the X^(p^e) coefficient.] Since m is also the number of orbits of each Sylow subgroup on the p^e-element subsets of G, it follows that the number of Sylow subgroups is congruent mod p to m/m = 1.
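
The congruence is easy to spot-check (a throwaway Python snippet using math.comb; the ranges are arbitrary):

```python
from math import comb

for p in (2, 3, 5):
    for e in (1, 2):
        for m in range(1, 7):
            assert comb(p**e * m, p**e) % p == m % p   # C(p^e m, p^e) ≡ m (mod p)
```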

Artin's treatment of Sylow II may be easier to read if you just replace each instance of s by H (recall that s was introduced as the coset 1H of H).


First problem set / Linear Algebra I: vector space basics; an introduction to convolution rings
corrected 1.ix.10 (a for an in 10(i))

Second problem set / Linear Algebra II: dimension of vector spaces; torsion groups/modules and divisible groups

Third problem set / Linear Algebra III: Countable vs. uncountable dimension of vector spaces; linear transformations and duality
NB: The “trickiness” of 2ii has nothing to do with subtleties of infinite cardinals, the Axiom of Choice, etc.; all you should need about countability (beyond the definition) is that a countable union of countable sets is countable (as in Cantor's proof of the countability of the rationals).

Fourth problem set / Linear Algebra IV: Eigenstuff, and a projective overture (in which “U” denotes the annihilator of U in PV*)

Fifth problem set / Linear Algebra V: Tensors, eigenstuff cont'd, and a bit on inner products
Problem 5i can be less confusing if you start with compositions of maps U → V → W involving three different vector spaces, and only then set all of them equal to V. Thus if A is in Hom(V,W) then the map taking X to AX goes from Hom(U,V) to Hom(U,W), i.e. from U* ⊗ V to U* ⊗ W, so it had better be I ⊗ A. Likewise for B in Hom(U,V) the map taking X to XB goes from V* ⊗ W to U* ⊗ W, so it ought to be B* ⊗ I — not B ⊗ I, because we need a map from V* to U*. (Fortunately B and its adjoint operator B* have the same spectrum, so even if you made this mistake you could still get the rest of the problem right.) Of course, once you've surmised the answer I ⊗ A + B* ⊗ I, you still have to prove it; by linearity, it's enough to check that the formulas agree on pure tensors, or in convenient coordinates if you prefer — which is indeed a special case because the usual coordinates on U* ⊗ V come from a basis of pure tensors, namely tensors of vectors in the dual basis for U* with the basis for V.
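
In coordinates the answer is a Kronecker sum, which is easy to check numerically. A sketch (the matrices are random placeholders; with bases fixed, the transpose plays the role of the dual operator B*):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))

# With column-stacking vec: vec(AX + XB) = (I ⊗ A + B^T ⊗ I) vec(X).
L = np.kron(np.eye(3), A) + np.kron(B.T, np.eye(3))

# Spectrum of the Kronecker sum = all sums of an eigenvalue of A and one of B.
sums = np.sort(np.add.outer(np.linalg.eigvals(A), np.linalg.eigvals(B)).ravel())
assert np.allclose(sums, np.sort(np.linalg.eigvals(L)))
```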

Sixth problem set / Linear Algebra VI: Pairings and inner products, cont'd

Seventh problem set / Linear Algebra VII: More about the spectrum, including spectral graph theory; symplectic structures
For 2i, changing each coordinate of a λ-eigenvector v to its absolute value preserves ⟨v, v⟩ and does not decrease ⟨Av, v⟩. Thus by problem 1 the new vector is also a λ-eigenvector, and we get such an eigenvector with nonnegative entries. If μ is a negative eigenvalue, the same construction shows that the largest eigenvalue is at least |μ|.
• For 2ii, show that if vi > 0 but vj = 0 and aij > 0 then we can increase the “Rayleigh quotient” ⟨Av, v⟩/⟨v, v⟩ by making vj positive but sufficiently small. So if not all vi are positive we can let I be the set of indices i of positive coordinates, and J its complement, giving a disconnection of A (note that I cannot be empty because then v would be the zero vector). In the connected case, dimension 1 follows because otherwise some eigenvector would have both positive and negative entries.
• Finally, for 2iii if there is a (−λ)-eigenvector v then changing each coordinate to its absolute value produces a (+λ)-eigenvector; then if we let I be the set of indices of positive coordinates of v then for every positive aij exactly one of i,j is in I, and conversely if there is such a subset I of {1,2,…,n} then changing each coordinate indexed by I of a (+λ)-eigenvector to its negative produces a (−λ)-eigenvector.
• NB The term “connected”, and the connection with Problem 5 (see the next paragraph), should suggest forming a graph with n vertices and an edge for each i,j such that aij > 0. The graph is connected iff the matrix is “connected”, and then the condition of part iii is that this graph be bipartite. This generalizes the corresponding results for the adjacency matrix of the same graph. Note that indeed the hypercube graph of Problem 5 is bipartite and its adjacency matrix has maximal eigenvalue n and minimal eigenvalue −n.
• There is an analogous result for matrices with nonnegative entries that are not required to be symmetric: there is a real eigenvalue λ such that |μ| ≤ λ for all eigenvalues μ, even complex ones (which may exist in this generality). There are also equality conditions in terms of connectedness of an associated graph, which in this setting must be taken to be a directed graph. I don't know as slick a proof of this result, at least not without using more of the topological tools we'll develop in Math 55b.
For 5ii, let x be the vector whose v coordinate is +1 if v is in S and −1 if not. Then x is orthogonal to the all-1 vector, which generates the eigenspace for eigenvalue n. Since the second-largest eigenvalue is n−2, it follows that ⟨Ax, x⟩ ≤ (n−2)⟨x, x⟩ with equality iff Ax = (n−2)x. This yields the desired bound, with equality iff each vertex in S has exactly one neighbor outside S, which in turn is soon seen to be equivalent to the condition that S is one of the 2n subsets on which one of the n coordinates is constant.
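
For what it's worth, here's a numpy sketch confirming the hypercube spectrum quoted above (building Q_n's adjacency matrix by a Kronecker-sum recursion is a standard trick, not something from the problem set):

```python
import numpy as np
from math import comb

n = 4
A = np.zeros((1, 1))                         # Q_0: a single vertex
for _ in range(n):
    E = np.array([[0.0, 1.0], [1.0, 0.0]])   # one new coordinate = a perfect matching
    A = np.kron(np.eye(2), A) + np.kron(E, np.eye(A.shape[0]))

eig = np.sort(np.linalg.eigvalsh(A))
assert np.isclose(eig[-1], n) and np.isclose(eig[0], -n)   # bipartite: -n occurs
for k in range(n + 1):                       # eigenvalue n - 2k with multiplicity C(n,k)
    assert np.sum(np.isclose(eig, n - 2 * k)) == comb(n, k)
```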

Eighth problem set / Linear Algebra VIII: Nilpotent operators and related topics; a natural inner product structure on operators on a finite-dimensional space; exterior algebra; recreational applications of the sign homomorphism on Sn
corrected 29.x.10 (unipotent operator in 3, not “unipotent vector”)

Ninth problem set / Linear Algebra IX: More on exterior algebra and determinants
corrected to replace duplicated Problem 4 (ouch) and fix the Axler quote
corrected again (ouch-prime) to fix Problem 2: ⟨ω,ω⟩ = 0, not ⟨ω,ω'⟩ = 0
For problem 2, certainly if ω = v1 ^ v2 then ω ^ ω = v1 ^ v2 ^ v1 ^ v2 = − v1 ^ v1 ^ v2 ^ v2 = 0. The reverse implication can be proved by direct calculation, but it's easier to use the fact that if ω is not a pure tensor then it is of the form v1 ^ v2 + v3 ^ v4 for some basis {v1, v2, v3, v4}, and then ω ^ ω = 2 v1 ^ v2 ^ v3 ^ v4 is nonzero.
For problem 3, choose any basis B for V; let ψ be the wedge product of all 4n basis vectors, and use for W a basis of pure exterior products of 2n-element subsets of B. Then ⟨ω,ω′⟩ = 0 for all basis elements ω, ω′ unless ω and ω′ come from complementary subsets of B, in which case ⟨ω,ω′⟩ = ±1. So W is the orthogonal direct sum of r := dim(W)/2 two-dimensional subspaces, each with signature (1,1). Hence W has signature (r, r). In each subspace, ω + ω′ and ω − ω′ constitute an orthogonal basis, and the basis vectors have inner products 2 and −2 with themselves. So we can take for W+ the span of the r positive basis vectors, and for W− the span of the r negative ones.

Tenth problem set / Linear Algebra X: Signatures and real polynomials; determinants and distances; Pontrjagin duality
corrected: in 3(ii), each xi and yj must be positive, not just greater than −1/2 [I was thinking of the application to Müntz for which the matrix entries are of the form ⟨t^x, t^y⟩ = 1/(x+y+1), not 1/(x+y)]
also, corrected 11.xi: in 1(i) the Pj must be pairwise coprime, not merely distinct (and then there's no need to require P and each Pj to be monic)
and, corrected 23.xi (sorry!): in problem 3, the xi must already be assumed distinct in part (ii), else the matrix is only positive semidefinite.
For 1(iii), the point is to find a polynomial you can compute easily from the coefficients of P; it's no fair to just write something like “Π_{i≤r}(x−xi) · Π_{j≤s}(x+yj), where P has r positive roots xi and s negative roots −yj”: that's begging the question of determining r and s. Hint: the polynomial you construct need not have the same degree as P. For 8(ii), you're looking for a way to efficiently combine black boxes that compute DFT's on H and G/H into a technique for computing DFT's on G (so that one can recursively construct such a box if G is (say) a cyclic group of order 2^r from DFT's for tiny groups like the 2-element group). This should be straightforward if G is simply the direct product of H and G/H. But what if (as suggested by the introductory paragraph) G is a cyclic group of order 2^r and H is its subgroup of order 2, or the subgroup of order 2^(r−1) and index 2?
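
For the cyclic case the intended recursion is essentially the classical Cooley–Tukey FFT. Here is a minimal sketch (function name mine; not optimized), compared against the defining sum:

```python
# DFT on Z/nZ (n a power of 2) from two DFTs on the index-2 subgroup H
# (the even residues) plus "twiddle factor" corrections.
import cmath

def dft_pow2(f):
    n = len(f)
    if n == 1:
        return list(f)
    even = dft_pow2(f[0::2])   # DFT on H, of order n/2
    odd = dft_pow2(f[1::2])    # DFT of the restriction to the other H-coset
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(-2j * cmath.pi * k / n)   # twiddle factor
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out

f = [1, 2j, -1, 0.5, 0, 3, -2j, 1]
naive = [sum(f[j] * cmath.exp(-2j * cmath.pi * j * k / len(f)) for j in range(len(f)))
         for k in range(len(f))]
assert all(abs(a - b) < 1e-9 for a, b in zip(dft_pow2(f), naive))
```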

Eleventh and final problem set: Representations of finite groups
corrected overnight: in problem 3, Sk is not the same as kS
Problem 3 shows one approach and a few typical applications for a result often known as “Burnside's Lemma” [though it was already well-known in Burnside's day, going back at least to Cauchy (1845) according to this Wikipage]. I thank Lewis Stiller for pointing out the connection with representation theory some years back.
The construction illustrated by Problem 4(iii) is the “Molien series” or “Hilbert-Molien series” of the ring of invariants of a representation of a finite group. (See for example this Wikipage; as of this writing, the example there happens to be the same as the one in this problem.) In general the denominator can always be taken to be a product of factors 1 − t^d, but it's rare for the numerator to be simple, let alone as simple as just 1 when we're dealing with the very special case of a complex reflection group.
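
Here's a quick sympy check of the Molien average (1/#G) Σ_g 1/det(I − t·ρ(g)) — that formula is standard — for a mini-example of my own, the reflection representation of S2 on C^2 by permuting coordinates:

```python
# Invariants of S_2 acting on C^2 by swapping coordinates are freely generated
# by e1 = x + y (degree 1) and e2 = xy (degree 2), so the Molien series should
# be 1/((1 - t)(1 - t^2)), with numerator 1 as for any complex reflection group.
import sympy as sp

t = sp.symbols('t')
mats = [sp.eye(2), sp.Matrix([[0, 1], [1, 0]])]
molien = sp.Rational(1, 2) * sum(1 / (sp.eye(2) - t * M).det() for M in mats)
assert sp.simplify(molien - 1 / ((1 - t) * (1 - t**2))) == 0
```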
For problem 7(iii), there are some alternative solutions for this particular case, but the intended solution was to show more generally: if V is a nonzero finite-dimensional representation, over some subfield F of C, of a finite group G, then V is irreducible if and only if EndG(V) is a division algebra. Note that this is easy for F=C by Schur's Lemma: if V is the direct sum over j of isotypic components Vj^(mj), then EndG(V) is the direct sum of matrix algebras M_(mj)(C), which is a division algebra if and only if mj=1 for one j and mj=0 for all other j. To do it in general, note that if some operator in EndG(V) is invertible then its inverse is also a G-endomorphism of V. Hence if EndG(V) is not a division algebra, there is some nonzero non-invertible G-endomorphism T of V, and then the kernel and image of T are subrepresentations of V other than {0} and V itself. Conversely, if V is not irreducible then it is a direct sum V1 ⊕ V2 of two nonzero representations, and the projection to either factor is a nonzero G-endomorphism of V that is not invertible.


The FINAL EXAM is now on my office door.
Correction (argh...): The n-by-n matrix A of the introductory paragraph of the first problem should have been called A_n. Then in part (i) take n=2m to get A_{2m} [NB not A_{2n} as I wrote earlier].
Another typo correction: in 3(iii), “G has a representation (V,ρ) whose associated character ρ takes the values…” should of course be “whose associated character χ takes” etc.