Lecture notes for Math 55a: Honors Abstract Algebra (Fall 2010)

If you find a mistake, omission, etc., please let me know by e-mail.



Ceci n'est pas un Math 55a syllabus (“This is not a Math 55a syllabus”)

Tony Feng's section and office hours:
Section: Wednesdays (starting September 8), 4-5 PM, Science Center 103b
Office hours: Thursdays (starting September 9), 7-8 PM, Science Center 310
revised: now Thursdays 8-9 and 9-10, both in Science Center 411


At least in the beginning of the linear algebra unit, we'll be following the Axler textbook closely enough that supplementary lecture notes should not be needed. Some important extensions/modifications to the treatment in Axler: less surprising than the absence of quotients and duality in Axler is the lack of tensor algebra. That won't stop us in Math 55, though. Here's an introduction. [As you might guess from \oplus, the TeXism for the tensor-product symbol is \otimes.]
corrected 30.ix.10 and 1.x.10 to fix a couple of minor typos and improve the presentation of V ⊗F F′. We'll not cover Chapter 9, which seems to exist mainly to prove that it is possible to construct the characteristic polynomial over R without introducing determinants. We'll proceed to Chapter 10, which covers determinants, but…

We'll define the determinant of an operator T on a finite dimensional space V as follows: T induces a linear operator T′ on the top exterior power of V; this exterior power is one-dimensional, so an operator on it is multiplication by some scalar; det(T) is by definition the scalar corresponding to T′. The “top exterior power” is a subspace of the “exterior algebra” of V, which is the quotient of the tensor algebra by the ideal generated by {v⊗v : v in V}. We'll still have to construct the sign homomorphism from the symmetric group on dim(V) letters to {1,−1} to make sure that this exterior algebra is as large as we expect it to be, and in particular that the (dim(V))-th exterior power has dimension 1 rather than zero.
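
In coordinates, this definition unwinds to the familiar Leibniz sum over permutations weighted by the sign homomorphism. Here's a throwaway Python sanity check (not part of the course materials; all names are mine):

```python
# Leibniz formula det(T) = sum over permutations p of sign(p) * prod_i T[p(i)][i],
# with the sign homomorphism computed by counting inversions.  Illustrative only.
from itertools import permutations

def sign(p):
    """Sign homomorphism S_n -> {1, -1}: (-1)^(number of inversions)."""
    inv = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inv % 2 else 1

def prod(xs):
    out = 1
    for x in xs:
        out *= x
    return out

def det(A):
    n = len(A)
    return sum(sign(p) * prod(A[p[i]][i] for i in range(n))
               for p in permutations(range(n)))

assert sign((1, 0, 2)) == -1          # a transposition has sign -1
assert det([[1, 2], [3, 4]]) == -2
```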

Interlude: normal subgroups; short exact sequences in the context of groups: A subgroup H of G is normal (satisfies H = g⁻¹Hg for all g in G) iff H is the kernel of some group homomorphism from G iff the injection H → G fits into a short exact sequence {1} → H → G → Q → {1}, in which case Q is the quotient group G/H. [The notation {1} for the one-element (“trivial”) group is usually abbreviated to plain 1, as in 1 → H → G → Q → 1.] This is not in Axler but can be found in any introductory text in abstract algebra; see for instance Artin, Chapter 2, section 10. Examples: 1 → An → Sn → {±1} → 1; also, the determinant homomorphism GLn(F) → F* gives the short exact sequence 1 → SLn(F) → GLn(F) → F* → 1, and this works even if F is just a commutative ring with unit as long as F* is understood as the group of invertible elements of F — for example, Z* = {±1}.

Some more tidbits about exterior algebra:

More will be said about exterior algebra when differential forms appear in Math 55b.

We'll also show that a symmetric (or Hermitian) matrix is positive definite iff all its eigenvalues are positive iff it has positive principal minors (the “principal minors” are the determinants of the square submatrices of all orders containing the (1,1) entry). More generally we'll show that the eigenvalue signs determine the signature, as does the sequence of signs of principal minors if they are all nonzero. More precisely: an invertible symmetric/Hermitian matrix has signature (r,s) where r is the number of positive eigenvalues and s is the number of negative eigenvalues; if its principal minors are all nonzero then r is the number of j in {1,2,…,n} such that the j-th and (j−1)-st minors have the same sign, and s is the number of j in that range such that the j-th and (j−1)-st minors have opposite sign [for j=1 we always count the “zeroth minor” as being the positive number 1]. This follows inductively from the fact that the determinant has sign (−1)^s and the signature (r′,s′) of the restriction of a pairing to a subspace has r′ ≤ r and s′ ≤ s.
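
Here is a quick numerical illustration of the minor-sign rule (a sketch; the example matrix is an ad hoc choice of mine):

```python
# Compare the signature read off from signs of consecutive principal minors
# with the count of positive/negative eigenvalues.
import numpy as np

A = np.array([[2.0, 1.0, 0.0],
              [1.0, -1.0, 1.0],
              [0.0, 1.0, 3.0]])
n = A.shape[0]

minors = [1.0] + [np.linalg.det(A[:j, :j]) for j in range(1, n + 1)]  # "zeroth minor" = 1
r = sum(1 for j in range(1, n + 1) if minors[j] * minors[j - 1] > 0)
s = sum(1 for j in range(1, n + 1) if minors[j] * minors[j - 1] < 0)

eig = np.linalg.eigvalsh(A)
assert (r, s) == (int((eig > 0).sum()), int((eig < 0).sum()))  # signature (2, 1) here
```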

For positive definiteness, we have the two further equivalent conditions: the symmetric (or Hermitian) matrix A=(aij) is positive definite iff there is a basis (vi) of Fn such that aij = ⟨vi,vj⟩ for all i,j, and iff there is an invertible matrix B such that A=B*B. For example, the matrix with entries 1/(i+j−1) (“Hilbert matrix”) is positive-definite, because it is the matrix of inner products (integrals on [0,1]) of the basis 1, x, x^2, …, x^(n−1) for the polynomials of degree <n. See the 10th problem set for a calculus-free proof of the positivity of the Hilbert matrix, and an evaluation of its determinant.
For any matrix B (not necessarily invertible or even square) with columns vi, the matrix B*B with entries aij = ⟨vi,vj⟩ is known as the Gram matrix of the columns of B. It is invertible iff those columns are linearly independent. If we add an (n+1)-st vector, the determinant of the Gram matrix increases by a factor equal to the squared distance between this vector and the span of the columns of B. You'll use this too in PS10.
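
Both facts are easy to test numerically. A hedged numpy sketch (the random matrix and vector are placeholders), plus a floating-point check that the Hilbert matrix is positive definite:

```python
import numpy as np

rng = np.random.default_rng(0)
B = rng.standard_normal((5, 3))
v = rng.standard_normal(5)

gram = lambda M: M.T @ M
d0 = np.linalg.det(gram(B))
d1 = np.linalg.det(gram(np.column_stack([B, v])))

# squared distance from v to the column span of B, via least squares
coef, *_ = np.linalg.lstsq(B, v, rcond=None)
dist2 = np.sum((v - B @ coef) ** 2)
assert np.isclose(d1, d0 * dist2)   # appending v scales det(Gram) by dist^2

# Hilbert matrix: all eigenvalues positive
n = 6
H = np.array([[1.0 / (i + j - 1) for j in range(1, n + 1)] for i in range(1, n + 1)])
assert np.all(np.linalg.eigvalsh(H) > 0)
```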

Here's a brief introduction to field algebra and Galois theory.

Our source for representation theory of finite groups (on finite-dimensional vector spaces over C) will be Artin's Algebra, Chapter 9. We'll omit sections 3 and 10 (which require not just topology and calculus but also, at least for §3, some material beyond 55b to do properly, namely the construction of Haar measures); also we won't spend much time on §7, which works out in detail the representation theory of a specific group that Artin calls I (the icosahedral group, a.k.a. A5). There are many other sources for this material, some of which take a somewhat different point of view via the “group algebra” C[G] of a finite group G (a.k.a. the algebra of functions on G under convolution). See for instance Chapter 1 of Representation Theory by Fulton and Harris (mentioned in class). A canonical treatment of representations of finite groups is Serre's Linear Representations of Finite Groups, which is the only entry for this chapter in the list of “Suggestions for Further Reading” at the end of Artin's book (see p.604).

While we'll work almost exclusively over C, most of the results work equally well (though with somewhat different proofs) over any field F that contains the roots of unity of order #(G), as long as the characteristic of F is not a factor of #(G). Without roots of unity, many more results are different, but there is still a reasonably satisfactory theory available. Dropping the characteristic condition leads to much trickier territory, e.g. even Maschke's theorem (every finite-dimensional representation is a direct sum of irreducibles) fails; some natural problems are still unsolved!

Here's an alternative viewpoint on representations of a finite group G (not in Artin, though you can find it elsewhere, e.g. Fulton-Harris pages 36ff.): a representation of G over a field F is equivalent to a module for the group ring F[G]. The group ring is an associative F-algebra (commutative iff G is commutative) that consists of the formal F-linear combinations of group elements. This means that F[G] is F^G as an F-vector space, and the algebra structure is defined by setting e_{g1} e_{g2} = e_{g1g2} for all g1, g2 in G, together with the F-bilinearity of the product. This means that if we identify elements of the group ring with functions G → F then the multiplication rule is (f1 * f2)(g) = Σ_{g1g2=g} f1(g1) f2(g2) — yes, it's convolution again. To identify an F[G]-module with a representation, use the action of F to define the vector space structure, and let ρ(g) act by multiplication by the unit vector e_g. In particular, the regular representation is F[G] regarded in the usual way as a module over itself. If we identify the image of this representation with certain permutation matrices of order #(G), we get an explicit model of F[G] as a subalgebra of the algebra of square matrices of the same order. For example, if G = Z/nZ we recover the algebra of circulant matrices of order n.
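
For G = Z/nZ this is easy to see in code; the following numpy sketch (function names mine) checks that multiplying the corresponding circulant matrices is the same as convolving the underlying functions:

```python
import numpy as np

def circulant(f):
    """Matrix of multiplication by f = sum_g f[g] e_g in the regular rep of Z/nZ."""
    n = len(f)
    return np.array([[f[(i - j) % n] for j in range(n)] for i in range(n)])

def convolve(f1, f2):
    n = len(f1)
    return np.array([sum(f1[g1] * f2[(g - g1) % n] for g1 in range(n))
                     for g in range(n)])

f1 = np.array([1.0, 2.0, 0.0, -1.0])
f2 = np.array([0.5, 0.0, 3.0, 1.0])
assert np.allclose(circulant(f1) @ circulant(f2), circulant(convolve(f1, f2)))
```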

Just as the regular representation is F[G] as a module over itself, a subrepresentation of the regular representation is an ideal of F[G]. The fundamental theorem of representation theory can be stated as an isomorphism between F[G] and the direct sum of algebras End(V) with V ranging over all the irreducible representations of G. Comparing dimensions gives the formula #(G) = Σ_V dim(V)^2. The F-linear isomorphism from F[G] to the direct sum of the End(V) is a generalization of the discrete Fourier transform (DFT) to noncommutative groups [check that if G is commutative we recover the DFT of Problem Set 10].

In case we haven't officially said it yet: the unitary group Un that figures prominently in Artin's treatment (see the start of 9.2, p.310) is the subgroup of GLn(C) that acts by isometries on Cn with its usual inner-product norm; that is, matrices whose inverse equals their own Hermitian transpose. The unitary group U(V) = U(V, ⟨.,.⟩) of a complex inner-product space V = (V, ⟨.,.⟩) is defined in much the same way; if V is finite-dimensional then any choice of orthonormal basis identifies U(V) with Un.

Artin derives the diagonalizability of ρ(g) as Corollary 2.3 (p.310) of unitarity plus the Spectral Theorem for normal operators. This is not necessary because we already know that a linear transformation that satisfies a polynomial equation with distinct roots is diagonalizable, and ρ(g) satisfies such an equation, namely X^n − 1 = 0 where n is the order of g. Note that this argument applies more generally to a representation over any field in which X^n − 1 = 0 has n distinct roots. Again this leads naturally to the condition that the characteristic not be a factor of #(G), i.e. that #(G) is nonzero in F: this condition guarantees that also n ≠ 0 in F, because n is always a factor of #(G) [a corollary of Lagrange's theorem that the order of any subgroup of G is a factor of #(G), see e.g. Artin, (6.11) and (6.12) in Chapter 2, p.58].

Let T be the character table of a group of N elements, and let T* be its Hermitian transpose. The first two parts of Theorem 5.9 in Chapter 9 state in effect the identity TDT* = NI where D is the diagonal matrix whose diagonal entries are the sizes of the conjugacy classes of G. But then NI is also DT*T. This means that the columns of T are also orthogonal, and the column indexed by conjugacy class C has inner product N/#(C) with itself. [This is an adaptation of the familiar but still remarkable fact that the rows of a square matrix are orthonormal iff the columns are orthonormal.] That is, for any group elements g, h we have Σχ χ(g)χ*(h) = 0 (the sum extending over the irreducible characters χ, with χ* denoting the complex conjugate of χ), unless g and h are conjugate, in which case χ(g) = χ(h) for all χ and we get Σχ |χ(g)|^2 = N/#(C) where C is the conjugacy class of g. In particular, taking g = id we get C = {id} and thus recover the fact that N is the sum of the squares of the dimensions of the irreducible representations.
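
Here's a numerical check for G = S3 (a throwaway sketch; the character table and class sizes are entered by hand):

```python
# Rows: trivial, sign, standard (2-dim) characters of S_3.
# Columns: conjugacy classes (identity, transpositions, 3-cycles), sizes 1, 3, 2.
import numpy as np

T = np.array([[1, 1, 1],
              [1, -1, 1],
              [2, 0, -1]])
D = np.diag([1, 3, 2])   # conjugacy class sizes
N = 6                    # #(S_3)

assert np.allclose(T @ D @ T.conj().T, N * np.eye(3))  # row orthogonality
assert np.allclose(D @ T.conj().T @ T, N * np.eye(3))  # hence column orthogonality
```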

Let G be a group of N elements, and (V,ρ) be any finite-dimensional representation over a field in which N is nonzero. Let P be the operator N⁻¹ Σ_{g∈G} ρ(g). This is a G-endomorphism of V, i.e. it commutes with each ρ(g); moreover ρ(h)P = P for each h in G, whence it follows that P is an idempotent: it satisfies P^2 = P. We already know (PS9 1(i)) that this makes V the direct sum of the eigenspaces V0 and V1 of P, which are respectively the kernel and image of P. It follows that P has trace dim(V1). This is what lets us begin to link characters with irreducibles: on the one hand this trace is N⁻¹ Σ_{g∈G} χ(g) = ⟨χ, 1⟩ where χ is the character of V. On the other hand V1 is precisely the G-invariant subspace V^G of V (which we can also canonically identify with HomG(F,V) where F is regarded as the trivial representation), so dim(V1) is the dimension of that invariant subspace.
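
A small sanity check in numpy (the permutation representation of S3 on C^3 is my own choice of example):

```python
# Averaging operator P = (1/N) sum_g rho(g) for S_3 permuting coordinates of C^3.
import numpy as np
from itertools import permutations

def perm_matrix(p):
    M = np.zeros((len(p), len(p)))
    for i, j in enumerate(p):
        M[j, i] = 1.0     # sends basis vector e_i to e_{p(i)}
    return M

mats = [perm_matrix(p) for p in permutations(range(3))]
P = sum(mats) / len(mats)

assert np.allclose(P @ P, P)                  # idempotent
assert np.isclose(np.trace(P), 1.0)           # dim of invariant subspace = 1
assert np.allclose(P, np.full((3, 3), 1/3))   # P projects onto the constant vectors
```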

To get from this to the dimension of HomG(V,V′) for any representations V, V′, recall the identification of Hom(V,V′) (ignoring the G-action) with the tensor product V* ⊗ V′. Artin doesn't describe the action of G on V*. We know how to get from each ρ(g) a linear operator (ρ(g))* on V*, but these do not compose in the right order (unless G is abelian): we have (ρ(g))*(ρ(h))* = (ρ(hg))*. So to get a representation ρ* of G on V* we must define ρ*(g) = (ρ(g⁻¹))* for all g. This yields the dual representation (sometimes called the “contragredient representation”) (V*, ρ*) of (V, ρ); its character χ* is the complex conjugate of χ. This gives Hom(V,V′) = V* ⊗ V′ the structure of a representation of G, and we can check that the invariant subspace is precisely HomG(V,V′) [which also justifies Artin's term “G-invariant homomorphism” for an element of HomG(V,V′)]. We then deduce that the dimension of this space is ⟨χ*χ′, 1⟩ = ⟨χ′, χ⟩, and since that's a nonnegative integer — and thus a fortiori real — it is also equal to ⟨χ, χ′⟩ as claimed.

NB Artin's definition of the inner product on class functions, or indeed on any functions on G (Chapter 9, (5.8), page 318), makes this inner product semilinear in the first variable. This is consistent with his treatment elsewhere in the book (see Chapter 4, (4.4), page 250), and is occasionally more convenient — e.g. here we wouldn't have had to finish by observing that the inner product is real and thus its own conjugate — but is rarer than the convention that we (following Axler) have been using. We have to tweak our development accordingly.

The operators introduced by Artin towards the end of his proof of (most of) Theorem 5.9 have further uses when φ is the character of an irreducible representation V. Indeed consider Pφ := N⁻¹ Σ_{g∈G} φ*(g) ρ(g) acting on any representation W. Since φ is a class function we again find that Pφ is a G-endomorphism of W. Thus (by Schur) if W is irreducible then Pφ is a multiple of the identity. The correct multiple can then be identified by computing the trace of Pφ, and that's ⟨χ, φ⟩ where χ is the character of W (whether or not W is irreducible). Using the orthogonality relations we deduce that if W is irreducible then Pφ = 0, unless W is isomorphic with V, in which case Pφ is multiplication by 1/dim(V). It follows that if W is the direct sum of irreducibles Vi then dim(V) · Pφ is the projection to the direct sub-sum of those Vi that are isomorphic with V. In particular, this sub-sum is independent of the decomposition; it is often known as the V-isotypic subspace of W. Examples: if V is the trivial representation then the V-isotypic subspace is the fixed subspace we called W^G, and indeed in this case Pφ is the average of ρ(g) over group elements g. If G is the group of two elements and V its nontrivial representation then the isotypic subspace is the (−1)-eigenspace for the non-identity element g of G, and indeed the associated projection Pφ is the familiar (1−ρ(g))/2, just as projection to the (+1)-eigenspace is the averaging operator (1+ρ(g))/2.
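
Continuing the S3 example from above (my own illustration, not Artin's): take φ to be the character of the 2-dimensional irreducible V; then dim(V)·Pφ should project the permutation representation C^3 onto its sum-zero subspace, the V-isotypic part.

```python
import numpy as np
from itertools import permutations

def perm_matrix(p):
    M = np.zeros((len(p), len(p)))
    for i, j in enumerate(p):
        M[j, i] = 1.0
    return M

G = list(permutations(range(3)))
phi = {p: sum(1 for i in range(3) if p[i] == i) - 1 for p in G}  # standard character
P = sum(phi[p] * perm_matrix(p) for p in G) / len(G)             # phi is real here
proj = 2 * P                                                     # dim(V) = 2

expected = np.eye(3) - np.full((3, 3), 1/3)  # projection onto sum-zero vectors
assert np.allclose(proj, expected)
assert np.allclose(proj @ proj, proj)
```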

If you know about integral closures and related ideas (see for instance the beginning of the summary of field algebra) then you can also use Pφ to demonstrate the last part of Theorem 5.9, which Artin punts on proving: the dimension of every irreducible representation V is a factor of N = #(G). Proof: N/dim(V) is an eigenvalue of N·Pφ = Σ_{g∈G} φ*(g) ρ(g) acting on the regular representation; since each of the numbers φ*(g) is an algebraic integer (it's a sum of roots of unity), it follows that so is N/dim(V), and since that quotient is rational we conclude that N/dim(V) ∈ Z as claimed.

Our final topic for Math 55a is the Sylow theorems, covered by Artin in Chapter 6, Section 4 (pages 205–208). Corollary 4.3 (a finite group G contains an element of order p for every prime factor p of |G|) is a theorem of Cauchy that predates Sylow; we'll start by giving its combinatorial proof, which will introduce the technique that we (following Artin) will use to prove the Sylow theorems. Note that Artin's statement of the “second Sylow theorem” ((4.6) on page 206) is not the usual terminology, which regards Corollary 4.7 as the second Sylow theorem and (4.6) as a step in one proof of this theorem.

The proof Artin gives for Sylow I can be extended to give the “1 mod p” part of Sylow III. Artin shows that the binomial coefficient C(p^e m, p^e) he calls N is not a multiple of p; extending the same argument shows that N is congruent to m mod p (the first factor is m mod p, and each of the other factors is 1 mod p). [Simpler yet: work in characteristic p, and apply e times the identity (1+X)^p = 1 + X^p to get (1+X)^(p^e m) = (1 + X^(p^e))^m; now look at the X^(p^e) coefficient.] Since m is also the number of orbits of each Sylow subgroup on the p^e-element subsets of G, it follows that the number of Sylow subgroups is congruent mod p to m/m = 1.
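
The congruence is easy to spot-check (a throwaway Python snippet using math.comb; the ranges are arbitrary):

```python
from math import comb

for p in (2, 3, 5):
    for e in (1, 2):
        for m in range(1, 7):
            assert comb(p**e * m, p**e) % p == m % p   # C(p^e m, p^e) ≡ m (mod p)
```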

Artin's treatment of Sylow II may be easier to read if you just replace each instance of s by H (recall that s was introduced as the coset 1H of H).


First problem set / Linear Algebra I: vector space basics; an introduction to convolution rings
corrected 1.ix.10 (a for an in 10(i))

Second problem set / Linear Algebra II: dimension of vector spaces; torsion groups/modules and divisible groups

Third problem set / Linear Algebra III: Countable vs. uncountable dimension of vector spaces; linear transformations and duality
NB: The “trickiness” of 2ii has nothing to do with subtleties of infinite cardinals, the Axiom of Choice, etc.; all you should need about countability (beyond the definition) is that a countable union of countable sets is countable (as in Cantor's proof of the countability of the rationals).

Fourth problem set / Linear Algebra IV: Eigenstuff, and a projective overture (in which “U” denotes the annihilator of U in PV*)

Fifth problem set / Linear Algebra V: Tensors, eigenstuff cont'd, and a bit on inner products
Problem 5i can be less confusing if you start with compositions of maps U → V → W involving three different vector spaces, and only then set all of them equal to V. Thus if A is in Hom(V,W) then the map taking X to AX goes from Hom(U,V) to Hom(U,W), i.e. from U* ⊗ V to U* ⊗ W, so it had better be I ⊗ A. Likewise for B in Hom(U,V) the map taking X to XB goes from V* ⊗ W to U* ⊗ W, so it ought to be B* ⊗ I — not B ⊗ I, because we need a map from V* to U*. (Fortunately B and its adjoint operator B* have the same spectrum, so even if you made this mistake you could still get the rest of the problem right.) Of course, once you've surmised the answer I ⊗ A + B* ⊗ I, you still have to prove it; by linearity, it's enough to check that the formulas agree on pure tensors, or in convenient coordinates if you prefer — which is indeed a special case because the usual coordinates on U* ⊗ V come from a basis of pure tensors, namely tensors of vectors in the dual basis for U* with the basis for V.
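
In coordinates the answer is a Kronecker sum, which is easy to check numerically. A sketch (the matrices are random placeholders; with bases fixed, the transpose plays the role of the dual operator B*):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((3, 3))

# With column-stacking vec: vec(AX + XB) = (I ⊗ A + B^T ⊗ I) vec(X).
L = np.kron(np.eye(3), A) + np.kron(B.T, np.eye(3))

# Spectrum of the Kronecker sum = all sums of an eigenvalue of A and one of B.
sums = np.sort(np.add.outer(np.linalg.eigvals(A), np.linalg.eigvals(B)).ravel())
assert np.allclose(sums, np.sort(np.linalg.eigvals(L)))
```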

Sixth problem set / Linear Algebra VI: Pairings and inner products, cont'd

Seventh problem set / Linear Algebra VII: More about the spectrum, including spectral graph theory; symplectic structures
For 2i, changing each coordinate of a λ-eigenvector v to its absolute value preserves ⟨v, v⟩ and does not decrease ⟨Av, v⟩. Thus by problem 1 the new vector is also a λ-eigenvector, and we get such an eigenvector with nonnegative entries. If μ is a negative eigenvalue, the same construction shows that the largest eigenvalue is at least |μ|.
• For 2ii, show that if vi > 0 but vj = 0 and aij > 0 then we can increase the “Rayleigh quotient” ⟨Av, v⟩/⟨v, v⟩ by making vj positive but sufficiently small. So if not all vi are positive we can let I be the set of indices i of positive coordinates, and J its complement, giving a disconnection of A (note that I cannot be empty because then v would be the zero vector). In the connected case, dimension 1 follows because otherwise some eigenvector would have both positive and negative entries.
• Finally, for 2iii if there is a (−λ)-eigenvector v then changing each coordinate to its absolute value produces a (+λ)-eigenvector; then if we let I be the set of indices of positive coordinates of v then for every positive aij exactly one of i,j is in I, and conversely if there is such a subset I of {1,2,…,n} then changing each coordinate indexed by I of a (+λ)-eigenvector to its negative produces a (−λ)-eigenvector.
• NB The term “connected”, and the connection with Problem 5 (see the next paragraph), should suggest forming a graph with n vertices and an edge for each i,j such that aij > 0. The graph is connected iff the matrix is “connected”, and then the condition of part iii is that this graph be bipartite. This generalizes the corresponding results for the adjacency matrix of the same graph. Note that indeed the hypercube graph of Problem 5 is bipartite and its adjacency matrix has maximal eigenvalue n and minimal eigenvalue −n.
• There is an analogous result for matrices with nonnegative entries that are not required to be symmetric: there is a real eigenvalue λ such that |μ| ≤ λ for all eigenvalues μ, even complex ones (which may exist in this generality). There are also equality conditions in terms of connectedness of an associated graph, which in this setting must be taken to be a directed graph. I don't know as slick a proof of this result, at least not without using more of the topological tools we'll develop in Math 55b.
For 5ii, let x be the vector whose v coordinate is +1 if v is in S and −1 if not. Then x is orthogonal to the all-1 vector, which generates the eigenspace for eigenvalue n. Since the second-largest eigenvalue is n−2, it follows that ⟨Ax, x⟩ ≤ (n−2)⟨x, x⟩ with equality iff Ax = (n−2)x. This yields the desired bound, with equality iff each vertex in S has exactly one neighbor outside S, which in turn is soon seen to be equivalent to the condition that S is one of the 2n subsets on which one of the n coordinates is constant.
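
For what it's worth, here's a numpy sketch confirming the hypercube spectrum quoted above (building Q_n's adjacency matrix by a Kronecker-sum recursion is a standard trick, not something from the problem set):

```python
import numpy as np
from math import comb

n = 4
A = np.zeros((1, 1))                         # Q_0: a single vertex
for _ in range(n):
    E = np.array([[0.0, 1.0], [1.0, 0.0]])   # one new coordinate = a perfect matching
    A = np.kron(np.eye(2), A) + np.kron(E, np.eye(A.shape[0]))

eig = np.sort(np.linalg.eigvalsh(A))
assert np.isclose(eig[-1], n) and np.isclose(eig[0], -n)   # bipartite: -n occurs
for k in range(n + 1):                       # eigenvalue n - 2k with multiplicity C(n,k)
    assert np.sum(np.isclose(eig, n - 2 * k)) == comb(n, k)
```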

Eighth problem set / Linear Algebra VIII: Nilpotent operators and related topics; a natural inner product structure on operators on a finite-dimensional space; exterior algebra; recreational applications of the sign homomorphism on Sn
corrected 29.x.10 (unipotent operator in 3, not “unipotent vector”)

Ninth problem set / Linear Algebra IX: More on exterior algebra and determinants
corrected to replace duplicated Problem 4 (ouch) and fix the Axler quote
corrected again (ouch-prime) to fix Problem 2: ⟨ω,ω⟩ = 0, not ⟨ω,ω'⟩ = 0
For problem 2, certainly if ω = v1 ^ v2 then ω ^ ω = v1 ^ v2 ^ v1 ^ v2 = − v1 ^ v1 ^ v2 ^ v2 = 0. The reverse implication can be proved by direct calculation, but it's easier to use the fact that if ω is not a pure tensor then it is of the form v1 ^ v2 + v3 ^ v4 for some basis {v1, v2, v3, v4}, and then ω ^ ω = 2 v1 ^ v2 ^ v3 ^ v4 is nonzero.
For problem 3, choose any basis B for V; let ψ be the wedge product of all 4n basis vectors, and use for W a basis of pure exterior products of 2n-element subsets of B. Then ⟨ω,ω′⟩ = 0 for all basis elements ω, ω′ unless ω and ω′ come from complementary subsets of B, in which case ⟨ω,ω′⟩ = ±1. So W is the orthogonal direct sum of r := dim(W)/2 two-dimensional subspaces, each with signature (1,1). Hence W has signature (r, r). In each subspace, ω + ω′ and ω − ω′ constitute an orthogonal basis, and the basis vectors have inner products 2 and −2 with themselves. So we can take for W+ the span of the r positive basis vectors, and for W− the span of the r negative ones.

Tenth problem set / Linear Algebra X: Signatures and real polynomials; determinants and distances; Pontrjagin duality
corrected: in 3(ii), each xi and yj must be positive, not just greater than −1/2 [I was thinking of the application to Müntz for which the matrix entries are of the form ⟨t^x, t^y⟩ = 1/(x+y+1), not 1/(x+y)]
also, corrected 11.xi: in 1(i) the Pj must be pairwise coprime, not merely distinct (and then there's no need to require P and each Pj to be monic)
and, corrected 23.xi (sorry!): in problem 3, the xi must already be assumed distinct in part (ii), else the matrix is only positive semidefinite.
For 1(iii), the point is to find a polynomial you can compute easily from the coefficients of P; it's no fair to just write something like “Π_{i≤r}(x−xi) · Π_{j≤s}(x+yj), where P has r positive roots xi and s negative roots −yj”: that's begging the question of determining r and s. Hint: the polynomial you construct need not have the same degree as P. For 8(ii), you're looking for a way to efficiently combine black boxes that compute DFT's on H and G/H into a technique for computing DFT's on G (so that one can recursively construct such a box if G is (say) a cyclic group of order 2^r from DFT's for tiny groups like the 2-element group). This should be straightforward if G is simply the direct product of H and G/H. But what if (as suggested by the introductory paragraph) G is a cyclic group of order 2^r and H is its subgroup of order 2, or the subgroup of order 2^(r−1) and index 2?
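
For the cyclic case the intended recursion is essentially the classical Cooley–Tukey FFT. Here is a minimal sketch (function name mine; not optimized), compared against the defining sum:

```python
# DFT on Z/nZ (n a power of 2) from two DFTs on the index-2 subgroup H
# (the even residues) plus "twiddle factor" corrections.
import cmath

def dft_pow2(f):
    n = len(f)
    if n == 1:
        return list(f)
    even = dft_pow2(f[0::2])   # DFT on H, of order n/2
    odd = dft_pow2(f[1::2])    # DFT of the restriction to the other H-coset
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(-2j * cmath.pi * k / n)   # twiddle factor
        out[k] = even[k] + w * odd[k]
        out[k + n // 2] = even[k] - w * odd[k]
    return out

f = [1, 2j, -1, 0.5, 0, 3, -2j, 1]
naive = [sum(f[j] * cmath.exp(-2j * cmath.pi * j * k / len(f)) for j in range(len(f)))
         for k in range(len(f))]
assert all(abs(a - b) < 1e-9 for a, b in zip(dft_pow2(f), naive))
```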

Eleventh and final problem set: Representations of finite groups
corrected overnight: in problem 3, Sk is not the same as kS
Problem 3 shows one approach and a few typical applications for a result often known as “Burnside's Lemma” [though it was already well-known in Burnside's day, going back at least to Cauchy (1845) according to this Wikipage]. I thank Lewis Stiller for pointing out the connection with representation theory some years back.
The construction illustrated by Problem 4(iii) is the “Molien series” or “Hilbert-Molien series” of the ring of invariants of a representation of a finite group. (See for example this Wikipage; as of this writing, the example there happens to be the same as the one in this problem.) In general the denominator can always be taken to be a product of factors 1 − t^d, but it's rare for the numerator to be simple, let alone as simple as just 1 when we're dealing with the very special case of a complex reflection group.
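
Here's a quick sympy check of the Molien average (1/#G) Σ_g 1/det(I − t·ρ(g)) — that formula is standard — for a mini-example of my own, the reflection representation of S2 on C^2 by permuting coordinates:

```python
# Invariants of S_2 acting on C^2 by swapping coordinates are freely generated
# by e1 = x + y (degree 1) and e2 = xy (degree 2), so the Molien series should
# be 1/((1 - t)(1 - t^2)), with numerator 1 as for any complex reflection group.
import sympy as sp

t = sp.symbols('t')
mats = [sp.eye(2), sp.Matrix([[0, 1], [1, 0]])]
molien = sp.Rational(1, 2) * sum(1 / (sp.eye(2) - t * M).det() for M in mats)
assert sp.simplify(molien - 1 / ((1 - t) * (1 - t**2))) == 0
```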
For problem 7(iii), there are some alternative solutions for this particular case, but the intended solution was to show more generally: if V is a nonzero finite-dimensional representation, over some subfield F of C, of a finite group G, then V is irreducible if and only if EndG(V) is a division algebra. Note that this is easy for F=C by Schur's Lemma: if V is the direct sum over j of isotypic components Vj^(mj), then EndG(V) is the direct sum of matrix algebras M_(mj)(C), which is a division algebra if and only if mj=1 for one j and mj=0 for all other j. To do it in general, note that if some operator in EndG(V) is invertible then its inverse is also a G-endomorphism of V. Hence if EndG(V) is not a division algebra, there is some nonzero non-invertible G-endomorphism T of V, and then the kernel and image of T are subrepresentations of V other than {0} and V itself. Conversely, if V is not irreducible then it is a direct sum V1 ⊕ V2 of two nonzero representations, and the projection to either factor is a nonzero G-endomorphism of V that is not invertible.


The FINAL EXAM is now on my office door.
Correction (argh...): The n-by-n matrix A of the introductory paragraph of the first problem should have been called A_n. Then in part (i) take n=2m to get A_{2m} [NB not A_{2n} as I wrote earlier].
Another typo correction: in 3(iii), “G has a representation (V,ρ) whose associated character ρ takes the values…” should of course be “whose associated character χ takes” etc.