Math 55a: Honors Abstract Algebra (Fall 2017)


Lecture notes for Math 55a: Honors Abstract Algebra (Fall 2017)

If you find a mistake, omission, etc., please let me know by e-mail.

The orange balls mark our current location in the course, and the current problem set.

Ceci n’est pas un Math 55a syllabus
[No, you don’t have to know French to take Math 55a. Googling ceci+n'est suffices to turn up the explanation, such as it is.]

The CAs for Math 55a are Vikram Sundar (vikramsundar@college) and Rohil Prasad (prasad01@college)
[if writing from outside the Harvard network, append .harvard.edu to ...@college].

CA office hours are Monday 8-10 PM in the Leverett Dining Hall, starting September 4 (same place and time that Math Night will start the week following).

Thanks to Vikram for setting up this Dropbox link for the CAs’ notes from class.

Section times:
Vikram Sundar: Monday 1-2 PM; Science Center room 112 on Sep.11, and room 222 from Sep.18 on.
Rohil Prasad: Thursday 4-5 PM, Science Center room 411
! If you are coming to class but not officially registered for Math 55 (e.g. you are auditing, or still undecided between 25a and 55a but officially signed up for 25a), send me your e-mail address so that the CAs and I can include you in class announcements.
My office hours for the week of 18-22 September will be Wednesday (Sep.20), not the usual Tuesday. (Still 7:30 to 9:00 PM in the Lowell House Dining Hall.)
Here is some more information from last year on the number 5777 etc. (converted to MathJax and with the added remark on $5779 = L_{27}/L_9$); as noted in the Sep.20 lecture, the fact that the palindrome 5775 factors so smoothly ($3 \cdot 5^2 \cdot 7 \cdot 11$) is also due in part to the fact that $5776 = 76^2$. Shanah Tovah!
! The diagnostic quiz will be given Wednesday, September 27 in class (11:07 AM to 12:00 noon). It will cover only material from the first three problem sets.

August 30: “Math blackboard” ($\rm\TeX$’s \mathbb font), such as $\mathbb R$, is a printed representation of a handwritten representation of ordinary boldface such as $\bf R$. When using $\rm\TeX$ (or $\rm\LaTeX$ etc.), you might as well use normal boldface. Either $\mathbf R$ or $\mathbb R$ means the set of real numbers, whether considered as a field, abelian group, metric space (more on this in Math 55b), or whatever other structure is relevant. Likewise $\mathbf C$ = $\mathbb C$ = the set of complex numbers; $\mathbf Q$ = $\mathbb Q$ = the set of rational numbers (quotients of integers — since the initial letter of “rational(s)” is preempted by the use of $\bf R$ for the reals); $\mathbf Z$ = $\mathbb Z$ = the set of integers (from German Zahlen); and in Axler, $\mathbf F$ = $\mathbb F$ = the field $\bf R$ or $\bf C$.

At least in the beginning of the linear algebra unit, we’ll be following the Axler textbook closely enough that supplementary lecture notes should not be needed. Some important extensions/modifications to the treatment in Axler:

[see Axler, page 5] Pace the boxed note on that page, virtually all mathematicians say and write “$n$-tuple” (more fully, “ordered $n$-tuple”), while I cannot recall another instance of “list” used for this as Axler does. (One sometimes sees “tuple” for an $n$-tuple of unspecified length $n$, and “ordered pair” and perhaps “ordered triple”, “ordered quadruple”, etc. for $n = 2, 3, 4, \ldots$ .)
[cf. Axler, Notation 1.6 on page 4, and the “Digression on Fields” on page 10]
Unless noted otherwise, $\bf F$ may be an arbitrary field, not only $\bf R$ or $\bf C$. The most important fields other than those of real and complex numbers are the field $\bf Q$ of rational numbers, and the finite fields ${\bf Z} / p {\bf Z}$ ($p$ prime). Other examples are: the field ${\bf Q}(i)$ of complex numbers with rational real and imaginary parts; more generally, ${\bf Q}(d^{1/2})$ for any non-square rational number $d$; the “$p$-adic numbers” ${\bf Q}_p$ ($p$ prime), of which we’ll say more when we study topology next term; and more exotic finite fields such as the 9-element field $({\bf Z}/3{\bf Z})(i)$. Here’s a review of the axioms for fields, vector spaces, and related mathematical structures.
[cf. Axler, p.28 ff.] We define the span of an arbitrary subset $S$ of (or tuple in) a vector space $V$ as follows: it is the set of all (finite) linear combinations $a_1 v_1 + \cdots + a_n v_n$ with each $v_i$ in $S$ and each $a_i$ in $F\!$. This is still the smallest vector subspace of $V$ containing $S$. In particular, if $S$ is empty, its span is by definition $\{0\}$. We do not require that $S$ be finite.
Warning: in general the space $F[X]$ (a.k.a. ${\cal P}(F)$) of polynomials in $X$, and its subspaces ${\cal P}_n(F)$ of polynomials of degree at most $n$, might not be naturally identified with a subspace of the space $F^F$ of functions from $F$ to itself. The problem is that two different polynomials may yield the same function. For example, if $F$ is the field of $2$ elements then the polynomial $X^2-X$ gives rise to the zero function. In general, different polynomials can represent the same function from the field $F$ to itself if and only if $F$ is finite — do you see why?
(See also Exercise 11 in Axler 1.C, assigned as part of the first problem set)
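The warning about polynomials versus functions can be checked by brute force over small prime fields. The following is only a sketch: the helper names `evaluate`, `as_function`, and `x_to_the_p_minus_x` are made up for this illustration, and ${\bf Z}/p{\bf Z}$ is modeled by integers reduced mod $p$.

```python
from itertools import product

def evaluate(coeffs, x, p):
    """Evaluate the polynomial with coeffs[i] the coefficient of X^i at x, mod p (Horner's rule)."""
    value = 0
    for c in reversed(coeffs):
        value = (value * x + c) % p
    return value

def as_function(coeffs, p):
    """The tuple of values (P(0), P(1), ..., P(p-1)) of P on Z/pZ."""
    return tuple(evaluate(coeffs, x, p) for x in range(p))

def x_to_the_p_minus_x(p):
    """Coefficient list of the nonzero polynomial X^p - X over Z/pZ."""
    coeffs = [0] * (p + 1)
    coeffs[1] = (-1) % p
    coeffs[p] = 1
    return coeffs

for p in (2, 3, 5, 7):
    # X^p - X is a nonzero polynomial, yet it gives the zero function on Z/pZ.
    assert as_function(x_to_the_p_minus_x(p), p) == (0,) * p

# Over Z/2Z there are 8 polynomials of degree at most 2 but only 4 functions.
p = 2
functions = {as_function(list(c), p) for c in product(range(p), repeat=3)}
print(len(functions))  # prints 4
```

For $p = 2$ the polynomial $X^p - X$ is exactly the $X^2 - X$ of the example above.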
If $U_i$ are any subspaces of a vector space $V\!$, then so is their intersection $\cap_i U_i$. Note that this is not limited to finite intersections: $i$ could range over an “index set” $I$ of any cardinality (so we would write the intersection as $\cap_{i \in I} U_i$). We don’t usually want to intersect an empty family of sets (do you see why not?), but for subsets of a given set $V$ we can declare that $\cap_{i\in\emptyset} U_i = V$.
For any field (or even any ring) $F$ there is a canonical ring homomorphism, call it $h$, from $\bf Z$ to $F\!$. “Ring homomorphism” means: $h(0) = 0$, $h(1) = 1$, and for any integers $m,n$ we have $h(m+n) = h(m) + h(n)$ and $h(mn) = h(m) \, h(n)$ (and $h(m-n) = h(m) - h(n)$, but this already follows from the other properties, as indeed does $h(0)=0$). But this doesn’t quite mean that we get an isomorphic copy of $\bf Z$ in $F\!$, because $h$ might not be injective. Equivalently, the kernel $I := \ker(h)$ (that is, the preimage $h^{-1}(\{0\}) = \{n : h(n) = 0\}$) might be larger than just {0}. In general, $I$ must be an ideal, i.e. an additive subgroup of $\bf Z$ that is closed under multiplication by arbitrary integers (whether in $I$ or not; this mimics the definition of a subspace, though as it happens for ideals in $\bf Z$ it’s automatic). Now every ideal in $\bf Z$ is either the zero ideal {0} or $(n) := \{ cn \mid c \in {\bf Z}\}$ for some integer $n > 0$ (namely the least positive element of the ideal), called the (positive) generator of the ideal. When $F$ is a ring, any $n$ may arise as the generator of $\ker(h)$, most easily for the ring ${\bf Z} / n {\bf Z}$ of integers $\bmod n$. But if $F$ is a field and $\ker h = (n)$ then $n$ must be either zero or prime, lest $F$ have zero divisors (elements $a$ and $b$, neither zero, for which $ab=0$). This $n$ is then called the characteristic of the field $F\!$. The familiar fields $\bf Q$, $\bf R$, $\bf C$ all have characteristic zero. For any prime $p$, there are fields of characteristic $p$, notably the “prime field” ${\bf Z} / p {\bf Z}$ (mentioned above; that it is a field is the key fact from elementary (but nontrivial) number theory that any nonzero element of ${\bf Z} / p {\bf Z}$ has a multiplicative inverse!). This field ${\bf Z} / p {\bf Z}$ and other finite fields have important uses in number theory, combinatorics, computer science, and elsewhere, often using the linear algebra that we develop in Math 55a.
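As a small experiment with the ring ${\bf Z}/n{\bf Z}$, one can compute the generator of $\ker(h)$ by repeatedly adding 1, list zero divisors, and test for multiplicative inverses. This is a brute-force sketch only; the function names `characteristic`, `zero_divisors`, and `is_field` are hypothetical, chosen for this illustration.

```python
def characteristic(n):
    """Positive generator of ker(h) for h: Z -> Z/nZ, found by adding 1 to itself."""
    total, count = 1 % n, 1
    while total != 0:
        total = (total + 1) % n
        count += 1
    return count

def zero_divisors(n):
    """Pairs (a, b), a <= b, of nonzero residues mod n with a*b = 0 in Z/nZ."""
    return [(a, b) for a in range(1, n) for b in range(a, n) if (a * b) % n == 0]

def is_field(n):
    """Z/nZ is a field iff n > 1 and every nonzero residue has a multiplicative inverse."""
    return n > 1 and all(
        any((a * b) % n == 1 for b in range(1, n)) for a in range(1, n)
    )

for n in range(2, 30):
    assert characteristic(n) == n
    # No zero divisors exactly when Z/nZ is a field, i.e. exactly when n is prime.
    assert (zero_divisors(n) == []) == is_field(n)

print(zero_divisors(6))  # prints [(2, 3), (3, 4)]
```

The loop confirms, for small $n$, that a composite generator forces zero divisors, which is why a field must have characteristic zero or prime.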
[cf. the boxed note on page 42 of Axler] It is natural to wonder whether every vector space, finite-dimensional or not, has a basis. The polynomial ring $F[z]$, considered as a vector space over $F$ (and denoted by a fancy script $\mathcal P$ in Axler), does have a basis (powers of $z$), as does a polynomial ring in several variables, or even infinitely many (see the next item); but does $F^\infty$? The answer is yes, but only under the Axiom of Choice (equivalently, Zorn’s Lemma)! [I can write “But only under” because it is known that Choice/Zorn is equivalent to the claim that every vector space has a basis. Don’t spend too much time trying to find an explicit basis for $F^\infty$, or for $\bf R$ as a vector space over $\bf Q$ (a “Hamel basis”)…] Using the same tool one can prove analogues of some other results in Chapter 2, such as 2.33 (p.41: every linearly independent set extends to a basis), and thus 2.34 (p.42: every subspace is a direct summand; again, don’t spend too much time trying to do this explicitly for $\bf Q$ as a subspace of the $\bf Q$-vector space $\bf R$, or for $\oplus_{n\geq1} F$ as a subspace of $F^\infty$!). NB some other results clearly fail in infinite dimensions, even when we have an explicit basis; e.g. the even powers of $z$ form a linearly independent subset of $F[z]$ that has the same cardinality as a basis but is not a basis.
However, 2.31 (p.40: every spanning list contains a basis) still holds with no further axioms for spanning sets $S$ of arbitrary size, as long as $V$ is finite dimensional. The reason is that $V$ has a finite spanning set, say $S_0$, and every element of $S_0$ is a linear combination of elements of $S$, and since linear combinations are of necessity finite it takes only a finite subset of $S$ to span $S_0$ and thus $V$. Now apply the proof of 2.31 to this finite subset. We may call this generalization “2.31+”.
Here’s an extreme example of how basic theorems about finite-dimensional vector spaces can become utterly false for finitely-generated modules: a module generated by just one element can have a submodule that is not finitely generated. Indeed, for any field $F$, let $A$ be the ring of polynomials in infinitely many variables $X_j$. [The letter $A$ is a common name for a ring, from French anneau, cognate with English “annulus”.] As usual we can regard $A$ as a module over itself, with a single generator 1. Then a submodule is just an ideal of the ring. Choose the ideal $I$ generated by all the $X_j$, which consists of all polynomials with constant coefficient equal to 0. If there are infinitely many indices $j$ then $I$ is not finitely generated; indeed any generating set must be at least as large as the index set of $j$’s, so for every cardinal $\aleph$ we can make a ring $A$ with a singly-generated module (namely $A$ itself) and with a submodule that cannot be generated by fewer than $\aleph$ elements.
For a subtler example, consider the ring we might call “$F[X^{1/2^\infty}]$”, consisting of $F$-linear combinations of monomials $X^{n/2^k}$ for arbitrary nonnegative integers $n$ and $k$. Again let $I$ be the ideal generated by the nonconstant monomials, which is not finitely generated, though there are generating sets that are “only” countably infinite. The new behavior involves the countable generating set $\{ X^{1/2^k} \mid k \geq 0 \}$: there is no minimal generating subset, because each $X^{1/2^k}$ is a multiple of $X^{1/2^{k'}}$ for any $k' \gt k$. Likewise for the ring generated by all monomials $X^r$ with $r$ any nonnegative rational number (or even all $X^r$ with $r$ any nonnegative real number).
(When $A$ is Noetherian, submodules of finitely-generated modules are finitely-generated, but might still require more generators; for example, there are Noetherian rings $A$ with “non-principal ideals” $I$, which give examples of a 1-generator module with a submodule that requires at least 2 generators.)
Please avoid Axler’s notation “product” and “$V \times W$” (p.91, 3.71 ff.). I understand the motivation for this notation: it is formally correct, and avoids the need to distinguish between “external direct sum” (the usual name for that vector space) and “internal direct sum” (a vector space sum [within some larger vector space] that happens to be direct). The problem with this is that in Math 55 (and ubiquitously in the literature) we shall introduce before long a “tensor product” $V \otimes W$ of vector spaces, whose dimension is the product of the dimensions of $V$ and $W$ when those two dimensions are finite; and it would be a much bigger source of confusion to have that notation coexist with “$V \times W$” where the dimensions add. So please stick with “$V \oplus W$” and the name “external direct sum” — or if you must, “Cartesian product” to avoid confusion with tensor products. For a possibly infinite Cartesian product, which is not the same as a direct sum (because an element of the direct sum must have only finitely many nonzero components), we still have the notation $\Pi_{i \in I} V_i$ to distinguish the Cartesian product from the direct sum $\oplus_{i \in I} V_i$.
Apropos Axler 2.43 (page 47), a warning: the formula $\dim(U+W) = \dim(U)+\dim(W) - \dim(U\cap W)$, and its analogy with the inclusion-exclusion principle, may lead you to expect a similar formula for $\dim(U_1+U_2+U_3)$ for any three subspaces $U_1,U_2,U_3$ of a vector space; but that expected generalization is (in)famously false in general (and likewise for four or more subspaces)!
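A standard counterexample is three distinct lines through the origin in a plane. The sketch below checks it over $\bf Q$ with exact arithmetic; the helpers `rank`, `dim_sum`, and `dim_intersection` are written just for this illustration, and the “naive” formula is the inclusion-exclusion expression one might be tempted to write down.

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of vectors over Q, by Gaussian elimination with exact fractions."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    ncols = len(rows[0]) if rows else 0
    r = 0
    for col in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                factor = rows[i][col] / rows[r][col]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

def dim_sum(*subspaces):
    """dim(U_1 + ... + U_k), each U_i given by a spanning list of vectors."""
    return rank([v for U in subspaces for v in U])

def dim_intersection(U, W):
    """dim(U ∩ W), computed from the two-subspace formula 2.43."""
    return rank(U) + rank(W) - dim_sum(U, W)

# Three distinct lines in Q^2.
U1, U2, U3 = [(1, 0)], [(0, 1)], [(1, 1)]
pairwise = (dim_intersection(U1, U2) + dim_intersection(U1, U3)
            + dim_intersection(U2, U3))
triple = 0  # U1 ∩ U2 ∩ U3 is contained in U1 ∩ U2 = {0}
naive = rank(U1) + rank(U2) + rank(U3) - pairwise + triple

print(dim_sum(U1, U2, U3), naive)  # prints 2 3
```

The true dimension is 2, while the inclusion-exclusion guess gives 3.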
As with the notions of span and linear combination, the definition of a linear transformation makes sense for modules over any ring $A$ (whether commutative or not), and in that generality is called an $A$-module homomorphism (so you now know the “morphisms” in the “category” of $A$-modules); when $A$ is a skew field, we still call this a linear transformation, and the “rank-nullity theorem” (3.22, page 63) still holds for finite-dimensional vector spaces in that context.
Suppose $T: V \to W$ is a linear transformation. Axler’s notation for the image of $T$ was already becoming rather old-fashioned when he wrote the first edition of his book; these days simply $T(V)$ is common (and likewise for any function at all). The terminology “null space” (whether one or two words) for $T^{-1}(\{0\})$ is also somewhat quaint, and we usually say “kernel” and write “$\ker(T)$” [and $\rm\LaTeX$ already provides the command \ker to typeset this properly]. While I’m at it, best to avoid the use of “one-to-one” to mean “injective” (see boxed note on page 60), because it is also sometimes used for “bijective”. Also, the $\rm\LaTeX$ for ${\cal L}(V,W)$ is {\cal L}(V,W); note the braces around \cal L, without which you would get $\cal L(V,W)$.
Here’s a page of notes on “Lemma 3.?” and some related observations on how $\rm Hom$ connects with finite and infinite direct sums.
More notes on notation: I understand why Axler wants to distinguish $V'$ and $T'$ (dual space and transformation) from $V^*$ and $T^*$, and $U^0$ (annihilator) from $U^\perp$ (see the boxed note on page 104). I’ll try to stick with $U^0$ in this class. But for the duals, using “ $\!\phantom|'\!$ ” this way incurs a steep price: the loss of the very useful construction exemplified by “let $V$ and $V'$ be vector spaces”. We already have few enough good letters to name mathematical structures that even $\pi$ is pressed into double duty (not just $3.14159\ldots$ but also the quotient πrojection from $V$ to $V/U$). I’ll stick with the common $V^*$ and $T^*$ here.
An equivalent statement of the identity $(ST)^* = T^* S^*$ (third part of 3.101, page 104 of Axler), together with $(I_V)^* = I_{V^*}$ (which Axler might not even bother stating explicitly), is that duality of vector spaces and linear transformations constitutes a “contravariant functor” from the category of $F$-vector spaces and linear transformations to itself.
The results about quotient spaces and duality in sections E and F of Chapter 3 are often described in terms of exact sequences. A sequence $\ \cdots \to L \to M \to N \to \cdots\ $ of linear transformations (or $A$-module homomorphisms, “etc.”) is said to be “exact at $M$” if the kernel of the map $M \to N$ is the image of the map $L \to M$ (that is, if the elements of $M$ that go to zero in $N$ are precisely those that come from $L$). The sequence is “exact” if it is exact at each step with both an incoming and an outgoing map. In particular, a map $M \to N$ is injective iff it extends to a sequence $0 \to M \to N$ that is exact at $M$, and surjective iff it extends to a sequence $M \to N \to 0$ that is exact at $N$. [In this context “0” is commonly used for the trivial vector space (or module, etc.) $\{0\}$. Note that in each case there is no choice about the function from or to that trivial vector space 0, and likewise at least for modules. Another notation that signals injectivity is $M \hookrightarrow N$ (${\rm\LaTeX}$: \hookrightarrow, with the extra hook suggesting $\subset$); likewise $M \to\!\!\!\!\to N$ for a surjective map.] Thus the map is an isomorphism iff $0 \to M \to N \to 0$ is exact (at both $M$ and $N$). Even more easily, $0 \to M \to 0$ is exact iff $M=0$. A short exact sequence is the next case, with three modules other than the initial and final 0. The standard example is $0 \to L \to M \to N \to 0$ where the map $L \to M$ is an inclusion map (thus an injection) and the map $M \to N$ is the quotient map $M \to M/L$ (thus a surjection). In general if $0 \to L \to M \to N \to 0$ is a short exact sequence then the injection $L \to M$ identifies $L$ with a submodule of $M$, and then the surjection $M \to N$ is identified with the quotient map. 
More generally, any homomorphism $L \to M$ extends (uniquely up to equivalence) to an exact sequence with four modules between the outer zeros: $0 \to K \to L \to M \to N \to 0$, where $K$ is the kernel of the map $L \to M$, and $N$ is its “cokernel”, that is, the quotient of $M$ by the image of $L$.
Now consider the case of vector spaces. Then to each linear transformation $V \to W$ we associate the dual transformation $V^* \leftarrow W^*$, with the dual of a composition $V \to W \to X$ being the composition of the dual transformations $V^* \leftarrow W^* \leftarrow X^*$ in reverse order; this makes duality a “contravariant functor” on the category of $F$-vector spaces. The key fact is that for finite-dimensional vector spaces, duality preserves exactness of sequences of linear transformations. Thus starting from any linear $V \to W$, we can extend to an exact sequence $0 \to U \to V \to W \to X \to 0$ with $U$ the kernel and $X$ the cokernel, and dualize to deduce the exactness of $0 \leftarrow U^* \leftarrow V^* \leftarrow W^* \leftarrow X^* \leftarrow 0$ with $V^* \leftarrow W^*$ the dual map. This immediately encodes Axler 3.108 (page 107): the map $V \to W$ is surjective iff $X$ is zero iff $X^*$ is zero iff the dual map is injective. Likewise for 3.110 (p.108) via the vanishing of $U$ and $U^*$. With a bit more work we can get the general relations $\ker(T^*) = ({\rm im}(T))^0$ (3.107, p.106) and ${\rm im}(T^*) = (\ker(T))^0$ (3.109, p.107) between the kernels and images of $T$ and its dual, again assuming that $T$ is a linear map between finite-dimensional vector spaces. Conversely, the fact that duality preserves exactness (for sequences of linear maps between finite-dimensional vector spaces) can be deduced as a special case of 3.107 and 3.109.

You can now understand this joke (such as it is).
Being a special case of ${\rm Hom}$, duality makes sense in the more general setting of modules over a ring $A$: the dual of an $A$-module $M$ is $M^* := {\rm Hom}(M,A)$, the $A$-module of $A$-linear homomorphisms from $M$ to $A$. This still gives a “contravariant functor” from the category of $A$-modules to itself: an $A$-module homomorphism $M \to N$ gives rise in the same way to an $A$-module homomorphism $M^* \leftarrow N^*$ with the direction reversed, consistent with identity and composition. But, as you might suspect by now, our theorems about the kernel and image of the dual of a linear transformation can fail in this more general setting, even when applied to finitely-generated modules. We already see this for injections and surjections: For a linear transformation $T : V \to W$, we saw that if $T$ is injective then the dual transformation $T^*$ is surjective, and vice versa. Only one of these two results holds for injections and surjections of $A$-modules; can you see which one it is, and give a counterexample for the other (already for $A = \bf Z$)?
Another way to think about the eigen-basics: “Lemma 5.0”: If $T$ is an operator on any vector space $V$, and $\lambda$ any scalar, then $U$ is an invariant subspace for $T$ iff it is an invariant subspace for $T - \lambda I.$ So, for instance, since $\ker T$ is an invariant subspace, so is $\ker(T-\lambda I),$ a.k.a. the $\lambda$-eigenspace.
Yet another note on notation: Axler’s name “$T/U$” (for the operator on $V/U$ induced from the action of $T$ on a vector space $V$ with an invariant subspace $U,$ see 5.14 on p.137) is a nice notation, but (unlike $T|_U$ for the restriction of $T$ to $U$) is seen rarely if at all in the research literature. Normally it will be called plain $T$, or possibly $\overline T$ (since it is constructed by descending to $V/U$ the composition of $T$ with the quotient map $V \to V/U$).
Let $T$ be a linear operator on $V$. The algebraic properties of polynomial evaluation at $T$ can be summarized by saying that the map from $F[X]$ to End($V$) that takes any polynomial $P$ to $P(T)$ is not just linear but a ring homomorphism. [Since $F[X]$ is a commutative ring, so is the image of this homomorphism, even though End($V$) is not commutative once $\dim(V) > 1$.] In particular the kernel is an ideal in $F[X]$; when $V$ is finite dimensional, this ideal must be nonzero, and its monic generator is what we shall call the “minimal polynomial” of $T$. Special case: if $V$ is $F$ itself, then we naturally identify End($V$) with $F$, and we get for any field element $x$ the evaluation homomorphism from $F[X]$ to $F$ that takes any polynomial to its value at $x$.
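To see the ring-homomorphism property concretely, here is a small sketch with $2 \times 2$ integer matrices. The helpers `matmul`, `poly_at`, and `poly_mul` are hypothetical names for this illustration, and polynomials are stored as coefficient lists with `coeffs[i]` the coefficient of $X^i$.

```python
def matmul(A, B):
    """Product of two square matrices given as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def poly_at(coeffs, T):
    """Evaluate P(T) by Horner's rule, where coeffs[i] is the coefficient of X^i."""
    n = len(T)
    result = [[0] * n for _ in range(n)]
    for c in reversed(coeffs):
        result = matmul(result, T)
        for i in range(n):
            result[i][i] += c  # add c times the identity
    return result

def poly_mul(P, Q):
    """Coefficient list of the product polynomial PQ."""
    out = [0] * (len(P) + len(Q) - 1)
    for i, a in enumerate(P):
        for j, b in enumerate(Q):
            out[i + j] += a * b
    return out

T = [[1, 2], [3, 4]]
P, Q = [1, 2], [3, 0, 1]  # P = 1 + 2X, Q = 3 + X^2

# (PQ)(T) = P(T) Q(T), and the image of F[X] is commutative.
assert poly_at(poly_mul(P, Q), T) == matmul(poly_at(P, T), poly_at(Q, T))
assert matmul(poly_at(P, T), poly_at(Q, T)) == matmul(poly_at(Q, T), poly_at(P, T))

# X^2 + 1 lies in the kernel for a quarter-turn rotation R; since R is not a
# scalar matrix, no polynomial of degree 1 kills it.
R = [[0, 1], [-1, 0]]
assert poly_at([1, 0, 1], R) == [[0, 0], [0, 0]]
```

So $X^2 + 1$ is the minimal polynomial of $R$ over $\bf Q$ in this example.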
Axler proves the Fundamental Theorem of Algebra using complex analysis, which cannot be assumed in Math 55a (we’ll get to it at the end of 55b). Here’s a proof using the topological tools we’ll develop at the start of 55b. (Axler gives one standard complex-analytic proof in 4.13 on page 124.) Here are two other equivalent conditions for algebraic closure, in terms of irreducible polynomials and finite(-dimensional) field extensions.
Triangular matrices are intimately related with “flags”. A (complete) flag in a finite dimensional vector space $V$ is a sequence of subspaces $\{0\} = V_0, V_1, V_2, \ldots, V_n = V$, with each $V_i$ of dimension $i$ and (for $1\leq i\leq n$) containing $V_{i-1}$. A basis $v_1,v_2,\ldots,v_n$ determines a flag: $V_i$ is the span of the first $i$ basis vectors. Another basis $w_1,w_2,\ldots,w_n$ determines the same flag if and only if each $w_i$ is a linear combination of $v_1,v_2,\ldots,v_i$ (necessarily with nonzero $v_i$ coefficient). The standard flag in $F^n$ is the flag obtained in this way from the standard basis of unit vectors $e_1,e_2,\ldots,e_n$. The punchline is that, just as a diagonal matrix is one that respects the standard basis (equivalently, the associated decomposition of $V$ as a direct sum of 1-dimensional subspaces), an upper-triangular matrix is one that respects the standard flag. Note that the $i$-th diagonal entry of a triangular matrix gives the action on the one-dimensional quotient space $V_i / V_{i-1}$ (again for each $i=1,\ldots,n$).

While the third edition of Axler includes quotients and duality, it still lacks tensor algebra. This is no surprise, but it will not stop us in Math 55! Here’s an introduction. [As you might guess from \oplus, the TeXism for the tensor-product symbol is \otimes.]
Corrected 14.x.2017 [Alec Sun]: at the end of the first display on page 2, it’s $w_{ij}$, not $u_i \otimes v_j$.

One of many applications is the trace of an operator on a finite dimensional $F$-vector space $V$. This is a linear map from ${\rm Hom}(V,V)$ to $F$. We can define it simply as the composition of two maps: our identification of ${\rm Hom}(V,V)$ with the tensor product of $V^*$ and $V$, and the natural map from this tensor product to $F$ coming from the bilinear map taking $(v^*,v)$ to $v^*(v)$. We shall see that this is the same as the classical definition: the trace of $T$ is the sum of the diagonal entries of the matrix of $T$ with respect to any basis. The coordinate-independent construction via tensor algebra explains why the trace does not change under change of basis. (The invariance can also be proved by checking explicitly that $AB$ and $BA$ have the same trace for any square matrices $A,B$ of the same size.) Once we’ve constructed the trace, we have a series of invariants ${\rm tr}(T^k)$ ($k=1,2,3,\ldots$; the $k=0$ trace is ${\rm tr}(I_V) = \dim V$). If $T$ has an upper-triangular matrix $(a_{ij})$ then the diagonal entries of $T^k$ are $a_{ii}^k$, so ${\rm tr}(T^k) = \sum_i a_{ii}^k$. In characteristic zero (or characteristic $>\dim V$), that’s enough to construct the characteristic polynomial of $T$ [with apologies for using two mathematical senses of “characteristic” in the same sentence...]; but to do it in general we’ll have to work harder.
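Here is a numerical sketch of these trace facts. It recovers the characteristic polynomial from the power traces via Newton’s identities, which is one standard way to do so (the notes do not say which construction will be used); the function names are made up for this illustration. The division by $k$ is exactly where characteristic zero (or characteristic $> \dim V$) is needed.

```python
from fractions import Fraction

def matmul(A, B):
    """Matrix product, with matrices given as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def trace(A):
    """Sum of the diagonal entries of a square matrix."""
    return sum(A[i][i] for i in range(len(A)))

def power_traces(T):
    """[tr(T), tr(T^2), ..., tr(T^n)] for an n x n matrix T."""
    traces, P = [], T
    for _ in range(len(T)):
        traces.append(trace(P))
        P = matmul(P, T)
    return traces

def char_poly_from_traces(traces):
    """Coefficients of det(xI - T), highest degree first, via Newton's identities."""
    n = len(traces)
    e = [Fraction(1)]  # elementary symmetric functions of the eigenvalues
    for k in range(1, n + 1):
        s = sum((-1) ** (i - 1) * e[k - i] * traces[i - 1] for i in range(1, k + 1))
        e.append(Fraction(s) / k)  # dividing by k requires k to be invertible in F
    return [(-1) ** k * e[k] for k in range(n + 1)]

A, B = [[1, 2], [3, 4]], [[0, 1], [5, -2]]
assert trace(matmul(A, B)) == trace(matmul(B, A))

T = [[2, 1, 0], [0, 3, 5], [0, 0, 7]]  # upper triangular with diagonal 2, 3, 7
print(power_traces(T))                 # prints [12, 62, 378]
assert char_poly_from_traces(power_traces(T)) == [1, -12, 41, -42]
```

The last line matches $(x-2)(x-3)(x-7) = x^3 - 12x^2 + 41x - 42$.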
Here are some basic definitions and facts about general norms on real and complex vector spaces.
Just as we can study bilinear symmetric forms on a vector space over any field, not just $\bf R$, we can study sesquilinear conjugate-symmetric forms on a vector space over any field with a conjugation, not just $\bf C$. Here a “conjugation” on a field $F$ is a field automorphism $\sigma: F \to F$ such that $\sigma$ is not the identity but $\sigma^2$ is the identity (that is, $\sigma$ is an involution). Given a basis $\{v_i\}$ for $V$, a sesquilinear form $\langle \cdot, \cdot \rangle$ on $V$ is determined by the field elements $a_{i,\,j} = \langle v_i, v_j \rangle,$ and is conjugate-symmetric if and only if $a_{j,i} = \sigma(a_{i,\,j})$ for all $i,j$. Note that the “diagonal entries” $a_{i,i} = \langle v_i, v_i \rangle$ (and more generally $\langle v,v \rangle$ for any $v \in V$) must be elements of the subfield of $F$ fixed by $\sigma$.
Over any field not of characteristic 2, we know that for any non-degenerate symmetric pairing on a finite-dimensional vector space there is an orthogonal basis, or equivalently a choice of basis such that the pairing is $(x,y) = \sum_i a_i x_i y_i$ for some nonzero scalars $a_i$. But in general it can be quite hard to decide whether two different collections of $a_i$ yield isomorphic pairings. Even over $\bf Q$ the answer is already tricky in dimensions 2 and 3, and I don’t think it’s known in a vector space of arbitrary dimension. Over a finite field of odd size there are always exactly two possibilities, as we may see in a few weeks.
“Sylvester’s Law of Inertia” states that for a nondegenerate pairing on a finite-dimensional vector space $V/F$, where either $F = \bf R$ and the pairing is bilinear and symmetric, or $F = \bf C$ and the pairing is sesquilinear and conjugate-symmetric, the counts of positive and negative inner products for an orthogonal basis constitute an invariant of the pairing and do not depend on the choice of orthogonal basis. (This invariant is known as the “signature” of the pairing.) The key trick in proving this result is as follows. Suppose $V$ is the orthogonal direct sum of subspaces $U_1, U_2$ for which the pairing is positive definite on $U_1$ and negative definite on $U_2$. [A pairing $(\cdot,\cdot)$ is called “negative definite” if $-(\cdot,\cdot)$ is positive definite.] Then any subspace $W$ of $V$ on which the pairing is positive definite has dimension no greater than $\dim(U_1)$. Proof: On $W \cap U_2,$ the pairing is both positive and negative definite; hence that subspace is $\{0\}$. The claim follows by a dimension count, and we quickly deduce Sylvester’s Law.
If $U$ is a subspace of an inner-product space $V$, but not necessarily finite dimensional, there is not generally a complement: one can still define $U^\perp$, but the direct orthogonal sum $U \oplus U^\perp$ might be strictly smaller than $V$. What then happens to 6.56 (p.198 in 6.C), which describes the orthogonal projection $P_U(v)$ as the vector in $U$ closest to $v$ (i.e., minimizing the norm $\|v - u\|$)? Well, if there exists such $u$ then indeed $v-u$ is orthogonal to $U$, but in general the minimum need not be attained: at best we can construct a sequence of vectors $u_n \in U$ such that $\| v - u_n \|$ approaches $\inf_{u \in U} \| v - u \|.$ It then follows from Apollonius’ theorem (see the front cover of Axler! and also Exercise 31 of 6.A, page 179) that the $u_n$ constitute a Cauchy sequence in $U$ (else $(u_m + u_n)/2$ is too close to $v$). So if $U$ is complete with respect to the norm distance then there is a nearest vector and we can proceed as before. But in general infinite-dimensional inner product spaces are not complete (the complete ones are Hilbert spaces, and that is a very special case). We shall say a lot more about completeness and related notions at the start of Math 55b.
A regular graph of degree $d$ is a Moore graph of girth 5 if any two different vertices are linked by a unique path of length at most 2. Such a graph necessarily has $n = 1 + d + d (d-1) = d^2 + 1$ vertices. Let $A$ be the adjacency matrix, i.e. the symmetric $n \times n$ matrix with $A_{ij} = 1$ if vertices $i,j$ are (distinct and) adjacent on the graph, and $A_{ij}=0$ otherwise; and let $\bf 1$ be the all-ones vector. Then $\bf 1$ is an eigenvector of $A$ with eigenvalue $d$ (because each vertex has degree $d$). We have $(1 + A + A^2) v = dv + \langle v, {\bf 1} \rangle \bf 1$ for all $v$ (proof: check on unit vectors and use linearity). Thus $A$ takes the orthogonal complement of ${\bf R} \cdot \bf1$ to itself, and satisfies $1+A+A^2 = d$ on that orthogonal complement. Since this quadratic equation has distinct roots, say $m$ and $-1-m$ for some $m \ge 0$ (namely the positive root of $1 + m + m^2 = d),$ it follows that the orthogonal complement of ${\bf R} \cdot \bf1$ is the direct sum of the corresponding eigenspaces. Let $d_1$ and $d_2$ be their dimensions. These sum to $n - 1 = d^2$, and satisfy $md_1 + (-1-m)d_2 + d = 0$ because the matrix $A$ has trace zero. This lets us solve for $d_1$ and $d_2$. In particular we find that $d_2 - d_1 = (2d-d^2) \, / \, (2m+1).$ Since that’s an integer [it is a surprisingly powerful constraint that the dimension of any vector space is in $\bf Z$!], either $d = 2$ (giving the pentagon graph) or $m$ is an integer. Substituting $m^2 + m+1$ for $d$, we find that $16(d_2 - d_1)$ is an integer plus $15 /(2m+1)$, whence $m \in \{0,1,2,7\}.$ The first of these is impossible, and the others give $d=3$, 7, or 57 as claimed.
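The argument can be sanity-checked on the Petersen graph, which is the Moore graph with $d = 3$ (a standard fact, not stated in the notes above). The sketch below builds it from 2-element subsets of a 5-element set, verifies the matrix identity, and then searches a finite range of integers $m$ for integral multiplicities; the function names and the search bound 200 are arbitrary choices for this illustration.

```python
from itertools import combinations

def petersen_adjacency():
    """Adjacency matrix of the Petersen graph: vertices are the 2-element
    subsets of {0,...,4}, adjacent exactly when they are disjoint."""
    verts = list(combinations(range(5), 2))
    return [[1 if not set(u) & set(v) else 0 for v in verts] for u in verts]

def satisfies_moore_identity(A, d):
    """Check 1 + A + A^2 = d*I + J entrywise, where J is the all-ones matrix."""
    n = len(A)
    for i in range(n):
        for j in range(n):
            a2 = sum(A[i][k] * A[k][j] for k in range(n))
            lhs = a2 + A[i][j] + (1 if i == j else 0)
            rhs = 1 + (d if i == j else 0)
            if lhs != rhs:
                return False
    return True

def multiplicities(m):
    """For integer m >= 1 return (d, d1, d2), with d = m^2 + m + 1 and d1, d2
    the dimensions of the m- and (-1-m)-eigenspaces, or None if not integral."""
    d = m * m + m + 1
    numerator = d * d - 2 * d  # equals (d1 - d2) * (2m + 1)
    if numerator % (2 * m + 1) != 0:
        return None
    diff = numerator // (2 * m + 1)
    return d, (d * d + diff) // 2, (d * d - diff) // 2

A = petersen_adjacency()
assert all(sum(row) == 3 for row in A)
assert satisfies_moore_identity(A, 3)

# m = 0 (that is, d = 1) is excluded, as in the text.
feasible = [multiplicities(m) for m in range(1, 200) if multiplicities(m)]
print(feasible)  # prints [(3, 5, 4), (7, 28, 21), (57, 1729, 1520)]
```

In each surviving case one can check directly that $m d_1 - (1+m) d_2 + d = 0$, as the trace condition requires.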
Why the name “spectral theorem”? The set (or sometimes the “multiset”) of eigenvalues of a linear operator on a vector space $V$ is often called its “spectrum”, especially when $V$ is a real or complex vector space, either finite or infinite dimensional. This is related with the visual (and by extension the electromagnetic) spectrum, for reasons that would take us much too far into wave and quantum mechanics, so we shall say little more of that here (but you may encounter it again in your physics class(es)).

We’ll define the determinant of an operator $T$ on a finite dimensional space $V$ as follows: $T$ induces a linear operator $\wedge^n T$ on the top exterior power $\wedge^n V$ of $V$ (where $n = \dim V);$ this exterior power is one-dimensional, so an operator on it is multiplication by some scalar; $\det(T)$ is by definition the scalar corresponding to $\wedge^n T$. The “top exterior power” is a subspace of the “exterior algebra” $\wedge^\bullet (V)$ of $V$, which is the quotient of the tensor algebra by the two-sided ideal generated by $\{ v \otimes v : v \in V\}.$ (Recall that this ideal also contains $v \otimes w + w \otimes v$ for all $v,w \in V.)$ We’ll still have to construct the sign homomorphism from the symmetric group of degree $\dim V$ to $\{1, -1\}$ to make sure that this exterior algebra is as large as we expect it to be, and in particular that the $(\dim(V))$-th exterior power has dimension 1 rather than zero.

Interlude: normal subgroups; short exact sequences in the context of groups
A subgroup $H$ of $G$ is normal (satisfies $H = g^{-1} \! H g$ for all $g \in G)$ iff $H$ is the kernel of some group homomorphism from $G$ iff the injection $H \hookrightarrow G$ fits into a short exact sequence $\{1\} \to H \to G \to Q \to \{1\},$ in which case $Q$ is the quotient group $G/H.$ [The notation {1} for the one-element (“trivial”) group is usually abbreviated to plain 1, as in $1 \to H \to G \to Q \to 1.]$ This is not in Axler but can be found in any introductory text in abstract algebra; see for instance Artin, Chapter 2, section 10.
Examples: $1 \to A_n \to S_n \to \{ \pm 1 \} \to 1;$ also, the determinant homomorphism ${\rm GL}_n(F) \to F^*$ gives the short exact sequence $1 \to {\rm SL}_n(F) \to {\rm GL}_n(F) \to F^* \to 1,$ and this works even if $F$ is just a commutative ring with unit as long as $F^*$ is understood as the group of invertible elements of $F$; for example, ${\bf Z}^* = \{\pm 1\}.$

Some more tidbits about exterior algebra:

If $w \in \wedge^m V$ and $w' \in \wedge^{m'} V$ then $ww' = (-1)^{mm'} w'w;$ that is, $w$ and $w'$ commute unless $m,m'$ are both odd in which case $w$ and $w'$ anticommute. (The identity $ww' = (-1)^{mm'} w'w$ is also written in the equivalent form $w \wedge w' = (-1)^{mm'} w' \wedge w.)$
If $m + m' = n = \dim V$ then the natural pairing $\wedge^m V \times \wedge^{m'} V \to \wedge^n V$ is nondegenerate, and so identifies the $m'$-th exterior power canonically with the dual of the $m$-th, tensored with the top ($n$-th) exterior power.
In particular, if $m=1$, and $T$ is any invertible operator on $V$, then we find that the induced action of $T$ on the $(n-1)$st exterior power is the same as its action on $V^*$ multiplied by $\det T$. This yields the formula connecting the inverse and cofactor matrix of an invertible matrix (a formula which you may also know in the guise of “Cramer’s rule”).
For each $m$ there is a natural non-degenerate pairing between $\wedge^m V$ and $\wedge^m V^*$, which identifies these exterior powers with each other’s dual.
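The formula connecting the inverse and the cofactor matrix, mentioned in the list above, can be spot-checked numerically. Here is a minimal Python sketch; the helper names `minor`, `det`, `adjugate` and the integer test matrix `A` are illustrative choices, not from the notes. It verifies $A \cdot {\rm adj}(A) = \det(A)\, I$, which for invertible $A$ is the same as $A^{-1} = {\rm adj}(A) / \det A$.

```python
def minor(a, i, j):
    """Matrix a with row i and column j deleted."""
    return [row[:j] + row[j + 1:] for k, row in enumerate(a) if k != i]

def det(a):
    """Determinant by cofactor expansion along the first row (fine for small matrices)."""
    if not a:
        return 1                      # determinant of the empty (0 x 0) matrix
    return sum((-1) ** j * a[0][j] * det(minor(a, 0, j)) for j in range(len(a)))

def adjugate(a):
    """Transpose of the cofactor matrix: entry (i, j) is the (j, i) cofactor."""
    n = len(a)
    return [[(-1) ** (i + j) * det(minor(a, j, i)) for j in range(n)] for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

A = [[2, 0, 1],
     [1, 3, -1],
     [0, 5, 4]]
d = det(A)
product = matmul(A, adjugate(A))      # should be d times the identity matrix
print(d, product)
```

Cofactor expansion takes time exponential in the matrix size, so this is only meant as an illustration on small examples.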

More will be said about exterior algebra when differential forms appear in Math 55b.

We’ll also show that a symmetric (or Hermitian) matrix is positive definite iff all its eigenvalues are positive iff it has positive principal minors (the “principal minors” are the determinants of the square submatrices of all orders containing the (1,1) entry). More generally we’ll show that the eigenvalue signs determine the signature, as does the sequence of signs of principal minors if they are all nonzero. More precisely: an invertible symmetric/Hermitian matrix has signature $(r,s)$ where $r$ is the number of positive eigenvalues and $s$ is the number of negative eigenvalues; if its principal minors are all nonzero then $r$ is the number of $j \in \{ 1, 2, \ldots, n \}$ such that the $j$-th and ($j-1$)-st minors have the same sign, and $s$ is the number of $j$ in that range such that the $j$-th and ($j-1$)-st minors have opposite sign [for $j=1$ we always count the “zeroth minor” as being the positive number 1]. This follows inductively from the fact that the determinant has sign $(-1)^s$ and the signature $(r',s')$ of the restriction of a pairing to a subspace has $r' \leq r$ and $s' \leq s.$

For positive definiteness, we have the two further equivalent conditions: the symmetric (or Hermitian) matrix $A = (a_{jk})$ is positive definite iff there is a basis $(v_j)$ of $F^n$ such that $a_{j,k} = \langle v_j, v_k \rangle$ for all $j,k$, and iff there is an invertible matrix $B$ such that $A = B^* \! B.$ For example, the matrix with entries $1 / (j+k-1)$ (“Hilbert matrix”) is positive-definite, because it is the matrix of inner products (integrals on [0,1]) of the basis $1, x, x^2, \ldots, x^{n-1}$ for the polynomials of degree $\lt n.$ See the 10th problem set for a calculus-free proof of the positivity of the Hilbert matrix, and an evaluation of its determinant.
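To see the principal-minor criterion at work on the Hilbert matrix, here is a small Python sketch using exact rational arithmetic. The function names `hilbert`, `det` and `leading_minors` are hypothetical, and the elimination routine is a generic one rather than anything from the problem set. It computes the determinants of the upper-left $k \times k$ blocks and checks that they are all positive.

```python
from fractions import Fraction

def hilbert(n):
    """n x n Hilbert matrix with entries 1/(j+k-1), indices starting at 1."""
    return [[Fraction(1, j + k - 1) for k in range(1, n + 1)] for j in range(1, n + 1)]

def det(mat):
    """Determinant by Gaussian elimination over the rationals."""
    a = [row[:] for row in mat]
    n = len(a)
    result = Fraction(1)
    for col in range(n):
        pivot = next((r for r in range(col, n) if a[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            result = -result          # a row swap flips the sign
        result *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return result

def leading_minors(mat):
    """Determinants of the upper-left k x k blocks, k = 1..n."""
    return [det([row[:k] for row in mat[:k]]) for k in range(1, len(mat) + 1)]

minors = leading_minors(hilbert(4))
print(minors)
```

The first three minors come out as $1$, $1/12$, $1/2160$; their rapid decay is a hint of why floating-point computations with Hilbert matrices are notoriously ill-conditioned, which is the reason for using `Fraction` here.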

All of Chapter 8 works over an arbitrary algebraically closed field, not only over $\bf C$ (except for the minor point about extracting square roots, which breaks down in characteristic 2); and the first section (“Generalized Eigenvalues”) works over any field.
More about nilpotent operators: let $T$ be any operator on a vector space $V$ over a field $F,$ not assumed algebraically closed. If $V$ is finite-dimensional, then The Following Are Equivalent:
(1) There exists a nonnegative integer $k$ such that $T^k = 0$;
(2) For any vector $v \in V$, there exists a nonnegative integer $k$ such that $T^k v = 0;$
(3) $T^n = 0$, where $n = \dim V$.
Note that (1) and (2) make no mention of the dimension, but are still not equivalent for operators on infinite-dimensional spaces. (For example, consider differentiation on the $\bf R$-vector space ${\bf R}[x].)$ We readily deduce the further equivalent conditions:
(4) There exists a basis for $V$ for which $T$ has an upper-triangular matrix with every diagonal entry equal to zero;
(5) Every upper-triangular matrix for $T$ has zeros on the diagonal, and there exists at least one upper-triangular matrix for $T$.
Recall that the second part of (5) is automatic if $F$ is algebraically closed.
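A concrete finite-dimensional instance of conditions (1), (3) and (4), sketched in Python: differentiation restricted to polynomials of degree $\lt n$, in the basis $1, x, \ldots, x^{n-1}$. The helper names (`diff_matrix`, `matpow`, and so on) are made up for this illustration. The matrix is strictly upper triangular, its $n$-th power vanishes, and its $(n-1)$-st power does not.

```python
def diff_matrix(n):
    """Matrix of d/dx on polynomials of degree < n; column j holds d/dx x^j = j x^(j-1)."""
    return [[j if i == j - 1 else 0 for j in range(n)] for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def matpow(a, k):
    """k-th power of a square matrix by repeated multiplication."""
    n = len(a)
    result = [[1 if i == j else 0 for j in range(n)] for i in range(n)]
    for _ in range(k):
        result = matmul(result, a)
    return result

def is_zero(a):
    return all(entry == 0 for row in a for entry in row)

n = 5
D = diff_matrix(n)
D_n_minus_1 = matpow(D, n - 1)   # sends x^4 to the constant 4! = 24
D_n = matpow(D, n)               # the zero matrix
print(is_zero(D_n_minus_1), is_zero(D_n))
```

Letting $n$ grow shows why no single exponent $k$ works on all of ${\bf R}[x]$, even though each individual polynomial is killed by some power of differentiation.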
The space of generalized 0-eigenvectors (the maximal subspace on which $T$ is nilpotent) is sometimes called the nilspace of $T\!$. It is an invariant subspace. When $V$ is finite dimensional, $V$ is the direct sum of the nilspace and another invariant subspace $V',$ consisting of the intersection of the subspaces $T^k(V)$ as $k$ ranges over all positive integers (8.5). This can be used to prove Cayley-Hamilton (over an algebraically closed field) using the standard definition of the characteristic polynomial as $\det(xI-T).$
An example in infinite dimension when (8.5) fails: $V$ is the real vector space of continuous functions from $\bf R$ to $\bf R$, and $T$ is multiplication by $x$. [That is a useful counterexample for many other aspects of “eigenstuff” when we try to go beyond finite dimension; for example, there are no eigenvectors, but for every real number $\lambda$ the operator $\lambda I - T$ is not invertible!]
The dimension of the space of generalized $\lambda$-eigenvectors (i.e., of the nilspace of $T-\lambda I)$ is usually called the algebraic multiplicity of $\lambda$ (since it’s the multiplicity of $\lambda$ as a root of the characteristic polynomial of $T$), to distinguish it from the “geometric multiplicity” which is the dimension of $\ker(T-\lambda I)$, a.k.a. the eigenspace $V_\lambda$.

Our source for representation theory of finite groups (on finite-dimensional vector spaces over $\bf C$) will be Artin’s Algebra, Chapter 9. We’ll omit sections 3 and 10 (which require not just topology and calculus but also, at least for §3, some material beyond 55b to do properly, namely the construction of Haar measures); also we won’t spend much time on §7, which works out in detail the representation theory of a specific group that Artin calls $I$ (the icosahedral group, a.k.a. $A_5$). There are many other sources for this material, some of which take a somewhat different point of view via the “group algebra” ${\bf C}[G]$ of a finite group $G$ (a.k.a. the algebra of functions on $G$ under convolution). See for instance Chapter 1 of Representation Theory by Fulton and Harris (mentioned in class); some further introductory remarks in this direction are a couple of paragraphs below. A canonical treatment of representations of finite groups is Serre’s Linear Representations of Finite Groups, which is the only entry for this chapter in the list of “Suggestions for Further Reading” at the end of Artin’s book (see p.604).

While we’ll work almost exclusively over $\bf C$, most of the results work equally well (though with somewhat different proofs) over any field $F$ that contains the roots of unity of order $\#(G)$, as long as the characteristic of $F$ is not a factor of $\#(G)$. [We also use the notation $|G|$ for the cardinality $\#(G)$.] Without roots of unity, many more results are different, but there is still a reasonably satisfactory theory. Dropping the characteristic condition leads to much trickier territory, e.g. even Maschke’s theorem (every finite-dimensional representation is a direct sum of irreducibles) fails; some natural problems are still unsolved a century-plus later!

Here’s an alternative viewpoint on representations of a finite group $G$ (not in Artin, though you can find it elsewhere, e.g. Fulton-Harris pages 36ff.): a representation of $G$ over a field $F$ is equivalent to a module for the group ring $F[G].$ The group ring is an associative $F$-algebra (commutative iff $G$ is commutative) that consists of the formal $F$-linear combinations of group elements. This means that $F[G]$ is $F^G$ as an $F$-vector space, and the algebra structure is defined by setting $e_{g_1} e_{g_2} = e_{g_1 g_2}$ for all $g_1, g_2 \in G$, together with the $F$-bilinearity of the product. This means that if we identify elements of the group ring with functions $G \to F$ then the multiplication rule is $(f_1 * f_2)(g) = \sum_{g_1 g_2 = g} f_1(g_1) \, f_2(g_2)$; yes, it’s convolution again. To identify an $F[G]$-module with a representation, use the action of $F$ to define the vector space structure, and let $\rho(g)$ act by multiplication by the unit vector $e_g$. In particular, the regular representation is $F[G]$ regarded in the usual way as a module over itself. If we identify the image of this representation with certain permutation matrices of order $\#(G)$, we get an explicit model of $F[G]$ as a subalgebra of the algebra of square matrices of the same order. For example, if $G = {\bf Z} / n {\bf Z}$ we recover the algebra of circulant matrices of order $n$.
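The circulant model for $G = {\bf Z}/n{\bf Z}$ is easy to experiment with. The following Python sketch (function names and the sample vectors `f1`, `f2` are illustrative assumptions, with integer coefficients standing in for elements of $F$) represents an element of the group ring as a list of $n$ coefficients and checks that convolution of coefficient lists corresponds to multiplication of circulant matrices.

```python
def convolve(f1, f2):
    """(f1 * f2)(g) = sum over g1 + g2 = g of f1(g1) f2(g2), in the group Z/nZ."""
    n = len(f1)
    return [sum(f1[a] * f2[(g - a) % n] for a in range(n)) for g in range(n)]

def circulant(f):
    """Matrix of multiplication by f in the basis e_0, ..., e_{n-1}: entry (i, j) is f(i - j)."""
    n = len(f)
    return [[f[(i - j) % n] for j in range(n)] for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

f1 = [1, 2, 0, -1]          # 1*e_0 + 2*e_1 - 1*e_3 in Z[Z/4Z]
f2 = [3, 0, 1, 1]
product_matches = circulant(convolve(f1, f2)) == matmul(circulant(f1), circulant(f2))
commutes = convolve(f1, f2) == convolve(f2, f1)   # expected, since Z/nZ is abelian
print(product_matches, commutes)
```

For a nonabelian $G$ the same idea works with the index $i - j$ replaced by a suitable group quotient, but the matrices are then no longer circulant and the algebra is no longer commutative.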

The regular representation of a group $G$ is the special case $G=S$ of the permutation representation associated to an action of $G$ on a set $S$ (which in turn can be defined as a homomorphism, call it $h,$ from $G$ to the group of permutations of $S$; NB as with linear representations there is no requirement that $h$ be injective; if it is injective, the action is said to be “faithful”). The permutation representation associated to $h$, call it $\rho_h,$ has dimension $\#S$, and can be regarded as the vector space ${\bf C}^S$ with basis $\{ e_s : s \in S \}$ indexed by $S$. (Thus we usually assume that $S$ is finite.) Any $g \in G$ takes $e_s$ to $e_{g(s)}$ (more fully, to $e_{(h(g))(s)}$); this and linearity gives the action of $G$ on all of ${\bf C}^S$. Warning: if we write the typical element of ${\bf C}^S$ as $\sum_{s \in S} c_s e_s$ then $\rho_h(g)$ takes it to $\sum_{s \in S} c_s e_{g(s)}$, which in general is not the same thing as $\sum_{s \in S} c_{g(s)} e_s$ as you might expect, but $\sum_{s\in S} c_{g^{-1}(s)} e_s$. Indeed if we tried to define a representation by $(\rho(g)) \left(\sum_{s \in S} c_s e_s\right) = \sum_{s \in S} c_{g(s)} e_s$ then we would find that $\rho(g_1) \rho(g_2)$ is not $\rho(g_1 g_2)$ but $\rho(g_2 g_1)$ [check this!], so we wouldn’t get a representation at all unless $G$ is abelian!

Another way to describe this action: identify ${\bf C}^S$ with the space of maps $S \to {\bf C}$ (so $\sum_{s \in S} c_s e_s$ corresponds to the map $s \mapsto c_s$), and then $\rho_h(g)$ takes the map $f$ to $f \circ h(g^{-1}).$ This all works with $\bf C$ replaced by any ground field, as does the formula for the character of this representation, which states that $\chi_{\rho_h} (g)$ is the number of elements of $S$ fixed by $h(g)$ (though as usual this is not as informative in positive characteristic).
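The warning about $c_{g(s)}$ versus $c_{g^{-1}(s)}$ can be checked by brute force for $G = S_3$ acting on $S = \{0,1,2\}$. This is a minimal Python sketch; the names `rho`, `rho_wrong`, `compose` and the test vector `c` are made up for the illustration, and the check is done on a single coefficient vector with distinct entries (which suffices here because both rules just permute coordinates).

```python
from itertools import permutations

S = range(3)
G = list(permutations(S))            # g is a tuple; g[s] is the image of s

def compose(g1, g2):
    """The product g1 g2: apply g2 first, then g1."""
    return tuple(g1[g2[s]] for s in S)

def rho(g, c):
    """Correct rule: e_s goes to e_{g(s)}, so the coefficient c_s lands at position g(s)."""
    out = [None] * len(c)
    for s in S:
        out[g[s]] = c[s]
    return tuple(out)

def rho_wrong(g, c):
    """Tempting rule: the new coefficient at position s is c_{g(s)}."""
    return tuple(c[g[s]] for s in S)

c = (10, 20, 30)
pairs = [(g1, g2) for g1 in G for g2 in G]
is_hom = all(rho(g1, rho(g2, c)) == rho(compose(g1, g2), c) for g1, g2 in pairs)
reverses = all(rho_wrong(g1, rho_wrong(g2, c)) == rho_wrong(compose(g2, g1), c) for g1, g2 in pairs)
fails_hom = any(rho_wrong(g1, rho_wrong(g2, c)) != rho_wrong(compose(g1, g2), c) for g1, g2 in pairs)

def trace(g):
    """Trace of rho(g): sum over s of the e_s-coefficient of rho(g) e_s."""
    basis = lambda s: tuple(1 if t == s else 0 for t in S)
    return sum(rho(g, basis(s))[s] for s in S)

fixed_points_ok = all(trace(g) == sum(1 for s in S if g[s] == s) for g in G)
print(is_hom, reverses, fails_hom, fixed_points_ok)
```

The last check is the character formula from the paragraph above: the trace of $\rho_h(g)$ counts the elements of $S$ fixed by $g$.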

It seems Artin does not mention the following: if $\phi$ is the character of an irreducible representation $U$, then for any representation $(V,\rho)$ the map $P_\phi = \frac{\dim U}{\#G} \sum_g \overline{\phi(g)} \rho(g)$ is a $G$-endomorphism, and if $V$ is irreducible then (by Schur and the orthogonality formula) $P_\phi$ is the identity if $V \cong U$ and zero otherwise. That is, $P_\phi$ is projection to the “$U$-isotypic subspace” of $V$: if $V = \oplus_i V_i$ is any decomposition into irreducibles then $P_\phi$ is projection to the subsum of those $V_i$ that are isomorphic with $U$. In particular, this isotypic subspace is an invariant of $V$ (whether or not $V$ is finite dimensional). This is a grand generalization of the fact that if $\iota$ is any involution of a vector space $V$ (over a field of characteristic other than 2) then $\frac12(1 \pm \iota)$ is projection to the $(\pm 1)$-eigenspace of $\iota$ (which is the special case $G = \{1, \iota\}).$

(In particular, $\sum_g \overline{\phi(g)} \rho(g)$ acts on $U$ by multiplication by $\#G \, / \dim(U);$ once one knows enough about algebraic integers and related concepts it follows that this is an integer, and thus that the dimension of every irreducible representation of a finite group is a factor of the group order. We alas will not be able to prove this fact in Math 55.)

Finally, apropos of orthogonality of characters: if $T$ is the character table then orthogonality of characters is tantamount to $T D T^* = |G| I,$ where $T^*$ is the Hermitian transpose and $D$ is the diagonal matrix whose diagonal entries are the sizes of the conjugacy classes (in the same order as the columns of the table). If $G$ is abelian, this simplifies to $T T^* = |G| I$, which then implies $T^* T = |G| I$ so the columns of $T$ are orthogonal too, which we have seen already. This remains true in the present case: since $T$ is invertible (with inverse $|G|^{-1} D T^*),$ we may conjugate $T D T^* = |G| I$ by $T$ to deduce $D T^* T = |G| I$ and thus $T^* T = |G| D^{-1}$. This says that the columns of $T$ are orthogonal (with respect to the usual complex inner product), and for any $g \in G$ the column of $T$ corresponding to the conjugacy class $[g]$ has squared norm $|G| / \#[g],$ which is the size of the centralizer $C_g$ of $g$: $$\sum_\chi |\chi(g)|^2 = \# C_g.$$ For example, for the character table $$ \left[ \begin{array}{rrr}1 & 1 & 1 \cr 2 & 0 & -1 \cr 1 & -1 & 1\end{array} \right] $$ of the symmetric group $S_3$, these squared norms are $6, 2, 3$, which are indeed the sizes of the corresponding centralizers. As a special case, taking $g=1$ we again recover the sum-of-squares formula $|G| = \sum_\chi \chi(1)^2.$
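Both orthogonality relations can be verified directly for the $S_3$ table displayed above. In this Python sketch the variable names are illustrative, and the class sizes $1, 3, 2$ (identity, transpositions, 3-cycles) are supplied by hand in the column order of the table. Since all entries are real, complex conjugation is omitted.

```python
T = [[1, 1, 1],
     [2, 0, -1],
     [1, -1, 1]]                      # rows: irreducible characters; columns: conjugacy classes
class_sizes = [1, 3, 2]               # identity, transpositions, 3-cycles
order = sum(class_sizes)              # |G| = 6
k = len(T)

# Entries of T D T^*: weighted inner products of rows.
row_gram = [[sum(class_sizes[c] * T[i][c] * T[j][c] for c in range(k))
             for j in range(k)] for i in range(k)]

# Entries of T^* T: plain inner products of columns.
col_gram = [[sum(T[i][a] * T[i][b] for i in range(k))
             for b in range(k)] for a in range(k)]

centralizer_sizes = [order // size for size in class_sizes]
print(row_gram, col_gram, centralizer_sizes)
```

Here `row_gram` should equal $6I$ and `col_gram` should be diagonal with entries $6, 2, 3$, matching `centralizer_sizes`.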

A few remarks around Artin’s development in Chapter 6 leading up to the Sylow theorems:

In the proof of Cauchy’s theorem (the Wikipedia page’s “Proof 2”), if the group order — call it $n$ — is not a multiple of $p$, we find that $n^{p-1} \equiv 1 \bmod p$ (since in this setting the identity $e$ is the only solution of $g^p=e);$ this gives a combinatorial proof of “Fermat’s little theorem” for $n > 0$ (since there is then always at least one group of $n$ elements, namely the cyclic group ${\bf Z}/n{\bf Z}).$ Replacing $n$ by $-n$ then yields the result for negative integers as well (since $(-1)^{p-1} \equiv 1 \bmod p$ even for $p=2).$
Often Artin’s 4.7 is called Sylow II, with 4.6 an intermediate result; but Artin calls 4.6 “Sylow II” and 4.7 a corollary.
The combinatorial argument for Sylow I also extends to prove the “$1 \bmod p$” part of Sylow III once we show that ${p^e m \choose p^e} \equiv m \bmod p.$ A nice way to see this is to start from the familiar congruence $(1+X)^p \equiv 1+X^p \bmod p$ in ${\bf Z}[X]$ (which follows from ${p \choose k} \equiv 0 \bmod p$ for $0<k<p),$ and deduce inductively that $(1+X)^{p^e} \equiv 1+X^{p^e} \bmod p$ for $e=1,2,3,\ldots.$ Raising to the $m$-th power yields $(1+X)^{p^e m} \equiv (1+X^{p^e})^m \bmod p$, and then comparing $X^{p^e}$ coefficients yields the desired congruence ${p^e m \choose p^e} \equiv m \bmod p.$
One might imagine that since all finite groups can be built up from simple ones, and the Classification Theorem describes all simple finite groups, we can understand all finite groups. Alas(?) this is far from the case. Even $p$-groups, that is groups of prime-power order $p^n,$ are chaotic for large $n$. Indeed for given $p$ the number of groups of order $p^n$ grows as $p^{\frac{2n^3}{27} - O(n^2)}_{\phantom0},$ with most of these groups fitting into a short exact sequence of the form $1 \to ({\bf Z}/p {\bf Z})^d \to G \to ({\bf Z}/p {\bf Z})^e \to 1$ with $d+e = n.$ To see how so many such groups can exist, write the short exact sequence as $1 \to V \to G \to W \to 1$, and construct a map $W \times W \to V$ as follows. Given $w_1,w_2 \in W$, choose preimages $g_1,g_2 \in G$, and map $(w_1,w_2)$ to the commutator $[g_1,g_2],$ which is in the kernel of the map $G \to W$ and can thus be regarded as a vector in $V$. One can check that this commutator is independent of the choice of preimages $g_1,g_2$, depends bilinearly on $w_1,w_2$, and is alternating (the image vanishes if $w_1=w_2$). Thus we have an element of ${\rm Hom}(\wedge^2 W, V)$, a vector space over ${\bf Z} / p{\bf Z}$ of dimension $d \cdot {e \choose 2}$, which is maximized when $d$ and $e$ are within a constant of $n/3$ and $2n/3$ respectively. Somewhat harder, one can show that any alternating map $W \times W \to V$ is realized by some $G$ (and is realized uniquely unless $p=2$). For two such maps to give rise to isomorphic groups, they must be related by elements of ${\rm GL}(V) \times {\rm GL}(W),$ and that group has fewer than $p^{d^2+e^2} < p^{n^2}$ elements. Hence there are at least $p^{\frac{2n^3}{27} - O(n^2)}_{\phantom0}$ isomorphism classes as claimed.
(Similar “chaos” affects the classification of trilinear and higher-order maps on vector spaces, such as alternating trilinear forms on a vector space of high dimension.)
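Returning to the congruence ${p^e m \choose p^e} \equiv m \bmod p$ used above for the “$1 \bmod p$” part of Sylow III: it is easy to spot-check numerically. This is a small Python sketch with a made-up function name and an arbitrary range of test cases; it is a numerical check only and of course no substitute for the polynomial-congruence proof.

```python
from math import comb

def sylow_binomial_ok(p, e, m):
    """Check that C(p^e * m, p^e) is congruent to m modulo p."""
    q = p ** e
    return comb(q * m, q) % p == m % p

cases = [(p, e, m) for p in (2, 3, 5, 7) for e in (1, 2, 3) for m in range(1, 8)]
all_ok = all(sylow_binomial_ok(p, e, m) for p, e, m in cases)
print(all_ok)
```

Note that the test range includes values of $m$ divisible by $p$, where both sides are $0 \bmod p$; in the Sylow setting $m$ is prime to $p$, so the binomial coefficient is then a unit mod $p$.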

Thanks to Vikram for this $\rm\LaTeX$ template for problem-set solutions (here’s what the resulting PDF looks like). They ask that e-mail submissions of problem sets have “Math 55 homework” in the Subject line.

First problem set / Linear Algebra I: vector space basics; an introduction to convolution rings
Clarifications:
• “Which if any of these basic results would fail if $\bf F$ were replaced by $\bf Z$?” — but don’t worry about this for problems 7 and 24, which specify $\bf R$.
• Problem 12: If you see how to compute this efficiently but not what this has to do with Problem 8, please keep looking for the connection.
Here’s the “Proof of Concept” mini-crossword with links concerning the ∎ symbol. Here’s an excessively annotated solution.

Second problem set / Linear Algebra II: dimension of vector spaces; torsion groups/modules and divisible groups
About Problem 5: You may wonder: if not determinants, what can you use? See Axler, Chapter 4, namely 4.8 through 4.12 (pages 121–123), and note that the proof of 4.8 (using techniques we won’t cover till next week) can be replaced by the ordinary algorithm for polynomial long division, which you probably learned with real coefficients but works over any field. While I’m at it, 4.7 (page 120) works over any infinite field; Axler’s proof is special to the real and complex numbers, but 4.12 yields the result in general. (We already remarked that this result does not hold for finite fields.)

Third problem set / Linear Algebra III: Countable vs. uncountable dimension of vector spaces; linear transformations and duality
corrected 18 September (Mark Kong):
• Problem 2: Suppose that for some (finite) $n$ we can extend $B_0$ by $n$ vectors (not “extend $B$ by $n$ vectors” etc.).
• Also, in Problem 1 Mark notes that one already needs a bit of the Axiom of Choice even to prove the fact (which I blithely asserted in class) that a countable union of countable, or even finite, sets is itself countable. (If you can enumerate a countable disjoint union $\bigcup_{i=1}^\infty S_i$ of countable or finite sets, then you can choose an element of $\prod_{i=1}^\infty S_i$ by choosing from each $S_i$ the element that comes earliest in the enumeration.) Go ahead and assume this for Problem 1.
• (And in Problem 10 it’s subsets of fewer than $e$ elements of $F$, not $e$-element subsets — but that’s still not polynomial in $q$.)

Fourth problem set / Linear Algebra IV: Duality, and connections with projective spaces and with vector spaces of polynomials
corrected 27.ix.2017: In problem 2ii, we need nonzero $x \in F$ such that $x^n \neq 1$ (not “$x^n=1$” which always exists);
and the introductory sentence now makes explicit the intention that $F$ is a finite field of $q$ elements also for problem 3.
(Noted by Forrest Flesher)

Fifth problem set / Linear Algebra V: “Eigenstuff” (preceded by prelude: exact sequences and more duality)
corrected 8.x.2017: CJ Dowd is the first to note that in problem 9 (Axler 5A:31) we cannot quite let $\bf F$ be arbitrary:
if it is finite and of size less than $m$ then it cannot contain enough pairwise distinct eigenvalues to accommodate $v_i$
for each $i=1,2,\ldots,m$ ! Fortunately this is the only obstruction, so for this problem assume that $\bf F$ contains
at least $m$ distinct elements.

Sixth problem set / Linear Algebra VI: $\bigotimes$ (and also eigenstuff cont’d, and a bit on inner products)

Seventh problem set / Linear Algebra VII: Inner products etc.

Eighth problem set / Linear Algebra VIII: The spectral theorem; spectral graph theory; symplectic structures
Problems 7 and 8 postponed till Friday, 3 November at noon.

Ninth problem set / Linear Algebra IX: Trace, determinant, and more exterior algebra

Tenth problem set: Linear Algebra X (determinants and distances); representations of finite abelian groups (Discrete Fourier transform)
(Yes, in Problem 9i the equation “$A^4 = N^2$” means $A^4 = N^2 I$ [i.e. $P(A) = 0$ where $P$ is the polynomial $X^4 - N^2 \in {\bf C}[X]$].)
correction to 1i (CJ Dowd): the equality condition is not quite right when $\det A = 0$ (and $n \geq 3)$, when equality holds iff some $v_i = 0$, and then the other $v_j$ are orthogonal to that $v_i$ but need not be orthogonal to each other.

Problem set 10.99557…

Eleventh and final problem set: Representations of finite abelian groups
corrected 28.xi.2017: Fan Zhou notes that in part (i) of Problem 5 $g_1,g_2$ are in $G_1,G_2$ respectively, not $V_1,V_2$.