To-do: compare with representation theory proof.

Link for self: Stanley’s generating function proof (Wikipedia mentions “reverse plane partitions”—the point seems to be that by considering more general labelings than , one can extract enough structure for a nice generating function), Theorem 17 of http://pfeinsil.math.siu.edu/MATH/MatrixTree/gessel-determinants_paths_and_plane_partitions.pdf

Russian notation, with the slopes of the rim edges (up or down) labeled as in the diagram for , clarifies the additive-combinatorial nature of hook lengths: hook lengths are given by the differences between the half-integers indexing the upward and downward “ends” of a hook (see also Section 2 of http://arxiv.org/abs/1507.04290v4 for exposition on application to so-called “core partitions”). From cursory googling it seems considering “negative hook lengths” along these lines can be useful (see http://www.math.ucsd.edu/~fan/mypaps/fanpap/10hook.pdf).

Anyways, given a partition , it is natural in the context of hooks (illuminates factorial-like structure) and HLF (in view of out-corner-removal recursion) to keep track of the “in-corners” and “out-corners” ordered as a sequence (see diagram). For instance, the product of hook lengths is . (Note that this formula holds for any sequence whose contain all out-corners and contain all in-corners.) Removing an out-corner essentially corresponds to replacing some with the sub-sequence . (We have an additional relation from the placement of the vertex of the partition at the origin, but this is unimportant as our problem is invariant under horizontal translation.)

Idle question for self: Is there a nice generating function in these coordinates? Connection to core partition coordinates?

Now, we induct/recurse based on the following observation: any labeling (i.e. standard Young tableau on ) has maximal number on some “out-corner” of . (The out-corners can also be thought of as “rim hooks” of length 1, where “rim hooks” are hooks whose removal leaves a smaller partition remaining.)

Thus, by looking at the set of hook lengths under removal of out-corners, it suffices, after some telescoping cancellation (note that every hook length either changes by 0 or 1, and most change by 0, i.e. stay the same; from the core partitions perspective, think of corners as 1-rim-hooks), to prove that .
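As a sanity check, here is a minimal Python sketch (my own, not from the post) that computes the number of standard Young tableaux both by the out-corner-removal recursion and by the hook length formula, for small partitions:

```python
from functools import lru_cache
from math import factorial

def hook_lengths(shape):
    """Hook lengths h(i, j) = (arm) + (leg) + 1 over all cells of the diagram.

    shape is a weakly decreasing tuple of positive row lengths."""
    conj = [sum(1 for row in shape if row > j) for j in range(shape[0])]
    return [shape[i] - j + conj[j] - i - 1
            for i in range(len(shape)) for j in range(shape[i])]

def hlf_count(shape):
    """Number of standard Young tableaux, via the hook length formula."""
    n, prod = sum(shape), 1
    for h in hook_lengths(shape):
        prod *= h
    return factorial(n) // prod

@lru_cache(maxsize=None)
def recursive_count(shape):
    """Out-corner-removal recursion: the maximal entry of an SYT must sit
    at an out-corner, so f(shape) = sum of f(shape minus an out-corner)."""
    if not shape:
        return 1
    total = 0
    for i, row in enumerate(shape):
        # (i, row - 1) is an out-corner iff the next row is strictly shorter
        if i == len(shape) - 1 or shape[i + 1] < row:
            smaller = list(shape)
            smaller[i] -= 1
            if smaller[i] == 0:
                smaller.pop()
            total += recursive_count(tuple(smaller))
    return total
```

For instance, both give 5 for the shape (3, 2), whose hook lengths are 4, 3, 1, 2, 1.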

However, by decomposing into rectangular cells as in the diagram, one has . (This is quadratic in the position coordinates , as expected.)

It is, however, true algebraically that . This follows from the general identity for any polynomial , where is the remainder (a possibly-zero polynomial of degree at most ), which itself follows from either finite differences with arbitrary (in particular, not necessarily equal) steps—induct on —or Lagrange interpolation—look at the “leading” (i.e. ) coefficient of —or partial fractions—decompose , and compare power series of both sides. (This is, incidentally, related to the theory of the different ideal in algebraic number theory: see Problem 1 at http://math.mit.edu/classes/18.785/2015fa/ProblemSet6.pdf for instance.)
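For concreteness, here is a small Python sketch (exact rational arithmetic, my own illustration) of the Lagrange-interpolation instance of this identity: for n distinct nodes and a polynomial f of degree at most n − 1, the sum of f(x_i)/∏_{j≠i}(x_i − x_j) equals the coefficient of x^(n−1) in f (in particular 0 when deg f < n − 1):

```python
from fractions import Fraction
from math import prod

def lagrange_sum(coeffs, nodes):
    """Exactly compute sum_i f(x_i) / prod_{j != i} (x_i - x_j),
    where coeffs lists the coefficients of f, lowest degree first."""
    total = Fraction(0)
    for i, xi in enumerate(nodes):
        f_xi = sum(Fraction(c) * xi ** k for k, c in enumerate(coeffs))
        denom = prod(xi - xj for j, xj in enumerate(nodes) if j != i)
        total += f_xi / Fraction(denom)
    return total
```

With four nodes, a constant polynomial sums to 0, while a cubic sums to its leading coefficient.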

For the application, just take , whose quotient mod is ; then is easily computed by Vieta’s formulas, as desired.

---

- The maximality-based scenario in Keenan Kidwell’s answer shows the Hilbert (ray) (p-)class field tower is Galois. (You may find Lemmermeyer’s survey, or the “Background” section here helpful.)
- An unramified (at finite places) abelian extension of a quadratic number field is Galois (see also MathOverflow). When the abelian extension in question is cyclic of degree 4, this comes up while constructing the “4-part” of the (narrow) Hilbert 2-class field (see Theorem 2 of Lemmermeyer, “Construction of Hilbert 2-class fields”), going beyond the construction of the “2-part” from classical genus theory. I was reminded of this after seeing Exercise 18.10 of Ireland & Rosen (find an example of a non-abelian CM-field) during a Directed Reading Program meeting today (01/29/16), but it’s not too closely related.

---

In this post (specifically in Section 4) we give a proof of the classification of finitely generated (f.g.) modules over a Dedekind domain avoiding the notion of projective modules. (Instead we rely on pure submodules, which give a slightly more transparent/natural/direct approach to the classification problem at hand, at the expense of the greater generality afforded by the projective module approach. Along the way we also explain why these approaches are really not so different after all.)

We gradually build up from the more familiar special cases when is a principal ideal domain (PID) or even just a field. Along the way we discuss some relevant abstractions, such as the splitting lemma (Section 2).

I would like to thank Profs. Yifeng Liu and Bjorn Poonen for teaching me in 18.705 and 18.785, respectively, this past (Fall 2014) semester at MIT. Thanks also to Alison Miller for catching a typo in Exercise 3.

**1. Preliminaries **

First we recall the simpler case when is a PID. In fact, we start with the simplest case of fields:

Exercise 1 (Structure theorem for f.g. modules over a field (i.e. vector spaces), “right splitting” approach via free modules) Let be the field, the module (i.e. -vector space). Let be a -submodule (i.e. subspace) of , and suppose is free over . Lift a basis of (arbitrarily) to prove that .

(The key point is that if some -linear combination , i.e. , then by linear independence in we must have .)

Conclude by induction that the f.g. modules over are precisely the free modules over , i.e. those isomorphic to for some .

Remark 1 It’s a separate (but obviously important) issue to show that the -dimension is well-defined, i.e. if and only if . For (PIDs and) Dedekind domains there is a similar notion of rank in the torsion-free case, but we won’t focus on these sorts of issues throughout the post.

Exercise 2 (Structure theorem for f.g. modules over a PID, algorithmic generators and relations approach) Let be the PID, the module.

- Take a finite set of generators of , i.e. a surjection , and look at the kernel (so ), which describes the “relations” among the set of generators. Show that is f.g. over . (In other words, is finitely presented.)
- Take a finite presentation (here ), which corresponds to an (possibly “overdetermined”) “relations matrix” . Mimicking Gaussian elimination (but using only -row/column operations), show that can be “diagonalized”, i.e. put in a Smith normal form. This corresponds to a suitable -linear change of generators of , combined with an alternative -linear change of the “description” of the relations among the generators. (This shouldn’t be confused with diagonalization of linear operators. It’s really more or less just solving a system of -linear equations.)
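For the PID Z this is exactly the classification of finitely generated abelian groups; a quick illustration assuming SymPy's `smith_normal_form` over ZZ:

```python
from sympy import Matrix, ZZ
from sympy.matrices.normalforms import smith_normal_form

# Relations matrix: present the abelian group Z^2 / <(2, 4), (6, 8)>.
A = Matrix([[2, 4], [6, 8]])
D = smith_normal_form(A, domain=ZZ)
# D is diagonal with d1 | d2, so the cokernel is Z/d1 + Z/d2.
```

Here the invariant factors are 2 and 4 (the gcd of the entries, and |det|/gcd), so the cokernel is Z/2 ⊕ Z/4.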

Exercise 3 (F.g. torsion-free modules over PIDs are free, “right splitting” approach via free modules) Let be the PID, the torsion-free f.g. module. Mimicking Exercise 1, prove that is a free -module. (Hint: Take a “saturated” submodule of , in the sense that if for some then , or equivalently, that is torsion-free. Now show that .)

Exercise 4 (Torsion part (of f.g. module over PID) breaks off, “right splitting” approach via free modules) Let be the PID, the f.g. module, its torsion submodule. Using and then mimicking Exercise 3, prove that .

Remark 2 Exercise 2 is the first proof on Wikipedia, as well as the proof in Artin’s “Algebra”. Another approach, with key steps outlined in Exercises 3 and 4 (which doesn’t explicitly reduce to or otherwise work with generators/relations), is to (less explicitly) find a special -submodule of and try to prove that is a direct summand of . (The whole point is that this is generally subtle when isn’t a vector space—this is one of the main problems of representation theory.) For instance, see the second proof on Wikipedia, or Emerton’s answer at the discussion on MathOverflow. (These are effectively done using the fact that free modules are projective, which is also the standard approach to the general Dedekind domain problem.)

Nonetheless, these two approaches have the same overall flavor: intuitively, the fully “saturated” submodules are the ones that break off (as direct summands), whether we identify them algorithmically (as in the first approach) or through some characterization (as in the second approach).

**2. Starting from scratch: (canonical) direct sum decomposition **

In general, say we have an -submodule (or slightly more generally, an injection , though importantly, this injection may not be unique), and we wish to prove is a direct summand of , i.e. for some .

Slightly more generally, let’s look at (all the terms of) the whole short exact sequence (sometimes it’s better to think in terms of , so that , and sometimes it’s better to think in terms of , so that ). What does it take to get a “canonical” splitting? (We roughly follow the terminology in the linked Wikipedia article.)

** 2.1. (“Canonical”) left splitting **

Viewing as a submodule of , it suffices to get a “compatible -projection” , i.e. a (surjective) map such that , where is the inclusion map. (Then will be a projection in the usual linear algebra sense.)

For instance, when is a group algebra, one can prove Maschke’s theorem by constructing a -projection (specifically, by “averaging” a -projection into a -projection).

( is called an injective module if and only if we have such a “left splitting” whenever is a module containing .)

** 2.2. (“Canonical”) right splitting **

Let denote the surjection . It suffices to get a “compatible -inclusion” , i.e. an (injective) map such that .

( is called a projective module if and only if we have such a “right splitting” whenever is a module with a surjection onto .

For instance, when is a Dedekind domain, see Exercise 6.)

** 2.3. Briefly, the relation between splittings and the direct sum decomposition **

In the case (for definiteness), the and are related as follows: for any .

Exercise 5 (based on Exercise 3.38 from Pete Clark’s aforementioned commutative algebra notes) Let be a ring. Prove that [every -module is injective] if and only if [every -module is projective]. For example, prove that if for a field and finite group , then both statements hold for the group algebra.

**3. Sketch of (classical?) projective module approach **

We now return to the general case of Dedekind domains . Let be the fraction field of .

Exercise 6 (F.g. torsion-free modules over Dedekind domains are projective, “right splitting” approach via projective modules) Let be the Dedekind domain, the torsion-free module. For a detailed treatment of projective modules, we defer to Pete Clark’s notes. (Incidentally, he also started an interesting MathOverflow discussion on the difficulties of teaching free, projective, and flat modules.)

- The *rank* of is defined as the (clearly finite) dimension of , or equivalently the localization , as a vector space over . (In other words, we’re embedding the -module inside a vector space in the only reasonable way.)
- (Key step.) Prove that rank- modules are just (up to -module isomorphism) the fractional ideals of , and show that these are projective.
- Prove that direct sums of projective modules are projective, and conclude by induction that the f.g. torsion-free modules over are (projective and) precisely (up to isomorphism) the finite direct sums of fractional ideals of .
- Give a direct proof (not using that direct sums of projective modules are projective) that f.g. torsion-free modules over are projective, by mimicking the proof for rank- modules. Of course, by induction, this further induces an alternative proof of the classification of f.g. torsion-free modules.

Remark 3 Exercise 6 is the heart of the (classical?—I believe it’s Steinitz’ original proof) projective module approach to breaking off submodules (as direct summands). The use of localization (or similar Bezout’s identity arguments) seems pretty fundamental in general; we use it as well to verify the pure submodule criterion below (in the alternative approach: see the key Theorem 3). Exercise 7 further clarifies the similarity. Here are some references:

- hilbertthm90‘s sequence of three blog posts, which I believe roughly follow May’s notes on Dedekind domains.
- Chapter III, section 22, page 144 of “Representation theory of finite groups and associative algebras” by C. Curtis and I. Reiner.
- Pete Clark’s commutative algebra notes, Chapter 20.6. Finitely generated modules over a Dedekind domain, page 290.
- The first chapter of Henri Cohen’s “Advanced Topics in Computational Number Theory” gives an interesting proof of the projectivity: , so is a direct summand of a free module, hence projective.
- This was the approach given in 18.785 (MIT Fall 2014, taught by Prof. Bjorn Poonen) Problem Set 4.

Remark 4 It seems difficult to apply the generators and relations approach (from Exercise 2) to the Dedekind domain problem. For instance, it doesn’t seem easy to describe an analogous Smith normal form—in the final decomposition into (torsion) and (torsion-free fractional ideals), the fractional ideals are not principal (though they are generated by at most elements). That said, since , all but one component can be taken principal (in general), so maybe there’s some hope. Also maybe “Hermite and Smith normal form algorithms over Dedekind domains” by Henri Cohen is relevant.

**4. Alternative pure submodule approach **

** 4.1. Pure submodule criterion, in the spirit of generators and relations **

Let be a ring, a submodule. The usefulness of the following definition will soon become clear. Overall the idea is to refine the “right splitting approach” from Exercises 3 and 4 in the direction of the algorithmic generators-and-relations ideas of Exercise 2, rather than the projective module direction from Exercise 6.

Definition 1 We define a submodule (of a fixed -module ) to be *pure* if the following condition holds: if we choose finitely many , and we can write the as -linear combinations of some , then the can in fact be written as corresponding -linear combinations with the taken inside. In other words, we can find such that for all . Of course, we can rephrase this condition concisely in terms of matrices: for and , we have if and only if .

Example 1 If is a direct summand of , i.e. for some (internal direct sum), then the projection allows us to simply take for all in any such scenario. So is a pure submodule of .

Observation 5 (Equivalent definition of purity) is pure if and only if the following condition holds: for any finite choice of and with , there exist with .

Theorem 2 (Pure submodule criterion (for direct summands), 18.705 (MIT Fall 2014, taught by Prof. Yifeng Liu) Midterm, Problem 3.4) Let be a ring, a submodule. Further suppose is finitely presented (e.g. if is noetherian and is f.g.). If is a pure submodule of , then is a direct summand of , i.e.

*internally* for some submodule (in particular, the direct sum here must be canonical). In other words, when is finitely presented, the converse of Example 1 holds.

*Proof:* Everything will be canonical here, i.e. we’ll prove that the short exact sequence splits (in the sense of the splitting lemma from Section 2).

Observation 6 If () generate , then a right splitting (induced by) is just a well-defined map sending each to some lift/pre-image (i.e. the must lie in ). It’s easy to see that a choice of works (i.e. the corresponding construction of is consistent) if and only if we have whenever (because the latter is equivalent to ). (And if is well-defined, then the corresponding projection satisfies .)

This closely resembles the condition for purity given in Observation 5, modulo a final technical point—since is finitely presented, the generators can be chosen such that **we only have to worry about finitely many -linear conditions** (since in a finite presentation, the set of -linear conditions, viewed via corresponding “coefficient vectors” as a submodule of , is f.g.). So we’re done.

Remark 7 Example 1 and Theorem 2 together show that when is finitely presented, purity is more or less a “relative (to )” version of injective modules.

Exercise 7 (Connection between projective module and pure submodule approaches) Prove that a finitely presented -module is projective if and only if the following condition holds: for any -surjection , the kernel is a pure submodule of . Given the “equational” nature of Definition 1, is this connection related to the dual basis lemma, or another “equational” characterization of projective modules? (I haven’t thought through this much myself.)

Remark 8 If we only assume to be f.g., then as long as we tweak Definition 1 to include *infinite* systems of equations, the natural analog of Theorem 2 holds, with essentially the same proof. This is probably still useful; for instance, I believe the proof of the Dedekind module classification below would still go through. Cf. Greg Kuperberg’s related comments in the aforementioned MathOverflow discussion. (I believe this is more or less a “relative (to )” version of algebraically compact (or pure-injective) modules.)

Although we won’t need it for the application below, here’s the rest of Problem 3:

Exercise 8 (Based on 18.705 (MIT Fall 2014, taught by Prof. Yifeng Liu) Midterm, Problem 3) In this exercise we do *not* assume that is finitely presented.

- Show that is pure if and only if the induced -map is injective for every finitely presented -module . (This is the usual first definition.)

Remark 9 It may or may not help to use the equational criterion for vanishing of elements of tensor products; I haven’t thought about this too carefully myself.
- Prove Example 1 using the tensor product formulation.
- When is a PID (e.g. take ), show that a submodule of is pure if and only if for all nonzero . (In other words, the condition in Definition 1 is “enough” over PIDs!)

Remark 10 Hint: Probably the easiest solution is to combine the tensor product formulation with the structure theorem over PIDs. It would be interesting to find a more conceptual way to prove the result without the structure theorem, which might even lead to an alternative proof of the structure theorem.
- Does the previous part still work for Dedekind domains ? If not, what’s the best analog (e.g. how many pairs are necessary in Definition 1)? (I haven’t actually worked this out myself, but I imagine there’s a reasonable answer. Feel free to take the structure theorem over Dedekind domains for granted.)
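To make the purity condition concrete in the simplest setting, here is a brute-force check (my own illustration) of the one-equation version m·x = n over Z, inside a cyclic module M = Z/modulus; the classic non-example is N = {0, 2} inside Z/4:

```python
from itertools import product

def is_pure_cyclic(modulus, sub):
    """One-equation purity test inside M = Z/modulus: whenever n in N is
    divisible by m within M, it must already be divisible by m within N."""
    for m, n in product(range(1, modulus + 1), sub):
        solvable_in_M = any((m * x) % modulus == n for x in range(modulus))
        solvable_in_N = any((m * y) % modulus == n for y in sub)
        if solvable_in_M and not solvable_in_N:
            return False
    return True
```

Indeed `is_pure_cyclic(4, [0, 2])` is False (2 = 2·1 in Z/4, but 2·y = 2 has no solution in N), matching the fact that this copy of Z/2 does not split off Z/4, while the direct summand {0, 2, 4} of Z/6 tests pure.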

** 4.2. Applying pure submodule criterion to original problem **

The following is the key result.

Theorem 3 (Saturated submodules (of f.g. modules over Dedekind domains) break off as direct summands, “right splitting approach” via pure submodule criterion) Let be a Dedekind domain, a f.g. module over . Let be an -saturated submodule of , i.e. such that if for some nonzero and , then , or equivalently, is torsion-free. (For intuition, think about avoiding -submodules like or . One could also phrase this in terms of suitable tensor products, e.g. by in the torsion-free case, or a “semi-localization” of at a finite set of primes in the torsion case.) Then is a pure submodule of , hence a direct summand of by the pure submodule criterion (Theorem 2)—indeed, is f.g. over the noetherian ring , hence finitely presented.

*Proof:* Suppose for some (finitely many) in we have for some matrix . The key idea is localization: if we temporarily “add denominators to ”, then the system of equations will be easier to solve. Then in the spirit of Bezout’s identity, we’ll take a suitable linear combination to get a solution with denominator .

Let be a prime of , and the corresponding multiplicative subset. Then is a -saturated -submodule of , i.e. (via exactness of localization) is torsion-free. But is a DVR, hence a PID, so by the structure theorem over PIDs (specifically Exercise 3), is a free -module. Consequently (following the reasoning from Exercises 3 and 4), is a direct summand of .

Thus (by Example 1) is a pure -submodule of . But , so by Definition 1 (of purity) we have for some .

It follows that for each prime of , there exists with . Finally, it suffices to find an -linear combination of equal to . But this is easy: the -ideal generated by the is not contained in for any prime ideal (because it contains the element ), so , i.e. there exists a finite -linear combination of equal to .

Remark 11 We can avoid localizing at sets containing zerodivisors by instead carefully semi-localizing, i.e. working with of the form (note that ). This actually makes the final paragraph easier; we just need to check is still a PID, which follows from a “pretty strong approximation theorem” (essentially the Chinese remainder theorem (CRT) for Dedekind domains) argument.

Corollary 4 (Torsion part (of f.g. module over Dedekind domain) breaks off, “right splitting approach” via pure submodule criterion) If is f.g. over a Dedekind domain , then is a direct summand of .

*Proof:* is torsion-free, so we may apply Theorem 3.

Remark 12 Note that the proof here doesn’t (at least explicitly) use the classification/characterization of torsion-free modules over . This contrasts with the standard development where one uses Exercise 6 (specifically, that torsion-free modules are projective) to prove the right splitting of , which is in fact also the approach for PIDs given in Exercises 3 and 4.

By the previous corollary, , where is torsion-free, so it remains to (separately) classify decompositions of torsion and torsion-free modules .

Corollary 5 (Decomposition of f.g. torsion-free modules over Dedekind domains, “right splitting approach” via pure submodule criterion) If is f.g. torsion-free over a Dedekind domain , then for (the *rank* of ) fractional ideals , and fractional ideals of are indeed (of rank and hence) indecomposable.

*Proof:* In view of Theorem 3, our strategy is to find -saturated submodules of . We may think of as living inside a f.d. -vector space (equivalently, localize w.r.t. the set ). Then we’ll show that the -coordinates correspond to indecomposable direct summands of , which will be isomorphic to fractional ideals of (which are generally not principal—this is why the Smith normal form approach seems difficult—see Remark 4).

Let in . Saturate the submodule to get (in general it will no longer be principal), where . Then , so for some fractional ideal of , which (since is saturated) breaks off as a direct summand of . The -dimension of the localization/tensored-up -vector space decreases by each time we break a fractional ideal off, so the eventual decomposition is for some fractional ideals , with each indeed indecomposable (as it has corresponding -dimension ).

The hard work is done, but for completeness we include the proof of the torsion classification:

Theorem 6 (Classification of f.g. torsion modules over Dedekind domains) If is torsion, then is an -module (here we’ve used CRT), which induces a decomposition (there are certainly more concrete ways to think about all of this). (One could probably phrase this in terms of primary decompositions, but it doesn’t seem particularly worthwhile.) Each localized is easy to classify, since is a PID. (Actually, without the CRT prime power decomposition, we could’ve directly noted that is a PID (cf. Remark 11)—though the proof happens to use “pretty strong approximation”/CRT (but in a different way).)
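The CRT step can be sketched very concretely (illustration mine), e.g. decomposing the torsion Z-module Z/12 into its primary parts:

```python
def primary_parts(n):
    """Prime-power factors p^e of n, so that Z/n = product of Z/p^e (CRT)."""
    parts, m, p = [], n, 2
    while p * p <= m:
        if m % p == 0:
            q = 1
            while m % p == 0:
                m //= p
                q *= p
            parts.append(q)
        p += 1
    if m > 1:
        parts.append(m)
    return parts

def crt_map(x, parts):
    """The isomorphism Z/n -> prod of the Z/p^e, sending x to its residues."""
    return tuple(x % q for q in parts)
```

For n = 12 this gives Z/12 ≅ Z/4 × Z/3, and one checks directly that the map is a bijection.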

---

Take an arbitrary orthogonal basis with respect to the form , and scale (and optionally reorder) so that there are entries of . (This can also be shown using the spectral theorem, but I think it’s overkill and conceptually messier (we’re “mixing up” linear transformation matrices and bilinear form matrices).)

So first we show are unique (i.e. invariant under change of basis). This is not too hard, since our basis is quite nice. The maximum dimension of a positive-definite set is , or else we would get a linearly independent set of at least vectors. (More precisely, this would force a nontrivial intersection between a dimension-() positive-definite space and a dimension- negative-semi-definite space, which is clearly absurd.) Similarly, the maximum dimension of a negative-definite set is .

Now that we have uniqueness, we move on to the eigenvalues and principal minor determinant interpretations. By the spectral theorem (for real symmetric matrices), the fact that symmetric matrices have real eigenvalues, and uniqueness of Sylvester form, has positive eigenvalues, negative eigenvalues, and zero eigenvalues.

This will allow us to induct for the principal minor determinant thing: if , and **all principal determinants are nonzero (important assumption for “mixed”-definite matrices)**, then there are exactly sign changes among the list of determinants of , where we interpret as .

The base case is pretty clear. Now suppose and the result holds for . The key interpretation of is as follows: let be the set of vectors in with last coordinates zero. Then for any , where denotes projection onto .

By inductive hypothesis on , if we have sign changes, then there are negative eigenvalues and positive eigenvalues **of **, with corresponding eigenvectors (forming an orthonormal basis WLOG by the spectral theorem) (first negative). To convert to information about , we use the vector interpretation—negative eigenvalues with eigenvector (with appended) give ; similar for positive. Of course, for distinct by orthogonality.

So now we have a positive definite set of dimension and a negative definite set of dimension , so has and . Thus either or , according to whether has the same sign as or .
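Here is a quick NumPy check (on an example matrix of my choosing, with all leading principal minors nonzero) that the number of sign changes along the leading principal minors matches the number of negative eigenvalues:

```python
import numpy as np

def signature(A):
    """(n_plus, n_minus) from the eigenvalues of a real symmetric matrix."""
    eig = np.linalg.eigvalsh(A)
    return int((eig > 0).sum()), int((eig < 0).sum())

def minor_sign_changes(A):
    """Sign changes along 1, det A_1, ..., det A_n (all assumed nonzero),
    where A_k is the top-left k-by-k block."""
    dets = [1.0] + [np.linalg.det(A[:k, :k]) for k in range(1, len(A) + 1)]
    return sum(1 for a, b in zip(dets, dets[1:]) if a * b < 0)

A = np.array([[2.0, 1.0, 0.0],
              [1.0, -1.0, 2.0],
              [0.0, 2.0, 3.0]])
```

For this A the minor sequence is 1, 2, −3, −17 (one sign change), and indeed A has signature (2, 1).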

---

We work in matrix form, so we need to prove that there exists an orthonormal basis of consisting of eigenvectors of a normal matrix .

First we show that has no generalized eigenvectors of order (or higher). Fix , and define . Since commutes with every matrix, is normal. It suffices to show that whenever (here is a vector). Suppose the latter holds. Since commute,

from which it similarly follows that

as desired.

Hence has all Jordan blocks of size ( is diagonalizable). Let be the distinct eigenvalues of , with corresponding eigenspaces , so that .

We now prove that the are pairwise orthogonal. Let be two distinct indices and take , . Let , so is normal as before. Then (by definition), so

Therefore

so forces , establishing the desired orthogonality.

Finally, we can just choose arbitrary orthonormal bases for the (nondegenerate) individual , and merge the resulting bases into a single orthonormal basis of eigenvectors for , so we’re done.
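A small NumPy sanity check of the orthogonality statement, on an assumed example (a real skew-symmetric matrix, which is normal, with the two distinct eigenvalues ±i):

```python
import numpy as np

A = np.array([[0.0, 1.0], [-1.0, 0.0]])
# A is normal: it commutes with its conjugate transpose.
assert np.allclose(A @ A.conj().T, A.conj().T @ A)

eigvals, eigvecs = np.linalg.eig(A)
# Eigenvectors for the distinct eigenvalues +i and -i are orthogonal in C^2.
inner = np.vdot(eigvecs[:, 0], eigvecs[:, 1])
```

The Hermitian inner product of the two eigenvectors vanishes (up to floating point), as the argument above predicts.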

---

(so is a splitting field over/Galois over ) equals the degree of the “splitting part” of min poly of .

So naturally, we wonder if we can say anything starting from the perspective of groups, i.e. “can be replaced by an arbitrary (finite) group of automorphisms (of ) ?”

Perhaps not surprisingly, yes!

It’s not unnatural to do this from scratch, but we can run the arguments from before more or less in reverse. We’ll (among other things) show is a finite splitting/Galois extension of .

First observe (and this is the reason is “not too unnatural from scratch”) that acts on (since automorphisms of must be bijective), so for any , we have (product over distinct elements of the orbit of ), since its coefficients are invariant under . **In fact, this is the minimal polynomial of over , since otherwise some subset of the orbit would have to be fixed under .**
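A concrete SymPy check of this (on an assumed example, not one from the post): take K = Q(√2, √3), G its four automorphisms √2 ↦ ±√2, √3 ↦ ±√3, and a = √2 + √3; the product over the G-orbit of a recovers the minimal polynomial of a over Q:

```python
from math import prod
from sympy import expand, minimal_polynomial, sqrt, symbols

x = symbols('x')
# Orbit of a = sqrt(2) + sqrt(3) under the four automorphisms of Q(sqrt2, sqrt3).
orbit = [s2 + s3 for s2 in (sqrt(2), -sqrt(2)) for s3 in (sqrt(3), -sqrt(3))]
orbit_poly = expand(prod(x - b for b in orbit))
minpoly = minimal_polynomial(sqrt(2) + sqrt(3), x)
```

Both come out to x⁴ − 10x² + 1, and the orbit has size 4 = [K : Q], as the orbit-stabilizer remark below predicts.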

We immediately get finiteness of , or else we could get an infinite chain of intermediate fields between , giving elements of larger than by the primitive element theorem, contradiction.

Now let and write by the primitive element theorem, so we can apply the previous explanation’s results. In fact, by the previous paragraphs the minimal polynomial splits completely in , so we already have Galois. But now equals the size of the orbit , which is at most (in fact, although we don’t need this, we can get exactly by the orbit-stabilizer theorem, since the stabilizer is trivial, since generates everything). Yet (since fixes by definition), so **we conclude that and**

**strengthening our previous results.**

**1. Fundamental theorem of Galois theory**

OK now this. For a *Galois* extension (although our proof might only work for characteristic ), we get a correspondence for subgroups and for intermediate fields.

But we just proved that for finite subgroups , so it remains to show that is surjective.

Here it’s important that is Galois over , and hence over any intermediate field . Indeed, we then have , since (alternatively, if fixes something not in , say for some of degree at most , then for all conjugates , so , so is identically constant, which is clearly bad).

(Another way of thinking about the correspondence: for every intermediate field , we associate some set of conjugates (subset of the ) who together form an irreducible polynomial in .)

**1.1. Counterexample when is not Galois extension**

has three intermediate fields . But only has two subgroups.

**Comment.** For many purposes I think we can avoid Galois theory in the intermediate fields problem. For example, we can say a lot by considering the extension field generated by the coefficients of the minimal polynomial (it vaguely resembles the stuff above). See here.

---

(Random side note: in the context of abstract algebra, polynomials are naturally motivated by “minimal (free) ring extensions”: if we have a commutative ring , then is the smallest ring containing with the additional element *satisfying no relations*. On the other hand, any constraints/extra relations would be polynomials, so at least for monic polynomials we get the quotient construction .)
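The quotient construction is easy to sketch in code (illustration mine): represent elements of R[x]/(f), for monic f, by coefficient vectors of length deg f, and reduce products using x^n = −(f_0 + ⋯ + f_{n−1} x^(n−1)):

```python
def mul_mod(u, v, f):
    """Multiply u, v in R[x]/(f) for monic f = f_0 + ... + f_{n-1} x^(n-1) + x^n.
    Elements are coefficient lists of length n = deg f, lowest degree first."""
    n = len(f) - 1
    raw = [0] * (2 * n - 1)
    for i, a in enumerate(u):
        for j, b in enumerate(v):
            raw[i + j] += a * b
    # Reduce high powers: x^k = -x^(k-n) * (f_0 + ... + f_{n-1} x^(n-1)).
    for k in range(2 * n - 2, n - 1, -1):
        c, raw[k] = raw[k], 0
        for i in range(n):
            raw[k - n + i] -= c * f[i]
    return raw[:n]
```

For example, in Z[x]/(x² − 2) ≅ Z[√2] one gets (1 + √2)² = 3 + 2√2.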

Suppose is a singly-generated field extension (by primitive element theorem this is broader than it seems). If is the minimal polynomial of of degree , then let’s look at how it splits/factors in . Suppose has some set of roots lying in (the “splitting part”/linear factors of in ), say .

*Generally, the main object of Galois theory seems to be splitting fields (or if we’re lazy, algebraic closures), but I’m still fighting through the material myself, so I don’t yet fully appreciate/communicate this.* Perhaps the point is just that it’s much easier to work concretely with roots (in algebraic closures) than directly with irreducible polynomials. (For example, we can then work with symmetric sums of the roots, and generally draw lots of intuition from number fields.)

We’ll work in characteristic for convenience, so e.g. the are pairwise distinct.

**1. “Symmetry” of the : crux of Galois theory, and introducing Galois groups**

The key is that (recall by definition), since the share minimal polynomials .

**To make this symmetry precise**, we phrase things in terms of **-automorphisms of **; each -automorphism fixes coefficients of , hence is uniquely determined by sending . Thus they form a Galois group of size , since we easily check the automorphisms to be bijections of .

**2. Looking at the splitting/linear part more closely, and introducing fixed fields**

A priori, we know (by definition) that fixes (all elements of) . But if is small, then it’s reasonable that it might fix much more.

With or without this intuition, it’s natural to play around with these automorphisms. Of course, if for some , then for all (which are -automorphisms). So applying this to , we see that “permutes the factors of ”. Focusing on the linear factors, we have (recall the are distinct by characteristic ).

It follows that has coefficients fixed by , hence in the **fixed field of ** (easily check it’s a field). So is a splitting field over (it’s Galois over ). Furthermore, since covers all the ( acts transitively on the ), * fixes no proper subset of the *; in other words, by irreducibility, ** is the -minimal polynomial of .**

**3. Computational/explicit perspective on fixed field**

Certainly we have $K \subseteq L^G$. Does equality hold?

Well, certainly $p$ is irreducible in $L^G[x]$ (from the previous section), hence $[L : L^G] = \deg p = m$. So because $[L : K] = n$, we get $[L^G : K] = n/m$ by the tower law, and when $m = n$ (i.e., when $f$ splits completely in $L$) this forces $L^G = K$.
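Spelled out as degree bookkeeping (a sketch under my reading of the stripped notation: $f$ of degree $n$, splitting part $p$ of degree $m$ equal to the number of roots in the extension, and $L^G$ the fixed field):

```latex
[L : L^G] = \deg p = m,
\qquad
n = [L : K] = [L : L^G]\,[L^G : K] = m \cdot [L^G : K],
% so [L^G : K] = n/m; in particular L^G = K exactly when m = n,
% i.e. exactly when f splits completely in L.
```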

Also, note that if two intermediate fields $E_1, E_2$ give the same min poly $q$ (of $\alpha$), then the corresponding automorphisms (determined by the roots of $q$ in $L$) are simultaneously $E_1$- and $E_2$-automorphisms. But by a degree argument an element of $L$ is fixed by all of these automorphisms only if it lies in the common fixed field, so we must have $E_1 \subseteq E_2$ and vice versa. Alternatively, note that $q$ lies in $(E_1 \cap E_2)[x]$ and is certainly irreducible there, so by a degree argument we have $E_1 = E_1 \cap E_2 = E_2$.

**4. Summary of results**

Cool, so because the splitting parts in $L$ of the min polys of $\alpha$ over $K$ and over $L^G$ are the same (both equal $p$), the $K$- and $L^G$-automorphisms of $L$ are the same. In other words, $\operatorname{Aut}(L/K) = \operatorname{Aut}(L/L^G) = G$. (Note that this is true for any field $E$ in between, $K \subseteq E \subseteq L^G$—the splitting parts are the same.)

But furthermore, $L^G$ is just large enough for $p$ to lie in $L^G[x]$ **yet be irreducible**. So finally, $p$ is the minimal polynomial of $\alpha$ over the fixed field, and $[L : L^G] = \deg p = m = |G|$.

**As a corollary, since $[L : K] = [L : L^G][L^G : K]$, we have $m \mid n$ and $[L^G : K] = n/m$. I’d be interested in a direct derivation/interpretation of the divisibility $m \mid n$. Also note that $L$ is a splitting field (of $f$ over $K$) if and only if $m = n$, if and only if $L^G = K$.** I’ll probably have more to say later on, perhaps about intermediate fields.

(We can see some “competing goals” between these two paragraphs—in the former we want an intermediate field not too large, so the splitting part of the min poly stays the same; in the latter we want an intermediate field not too small, so the min poly completely splits. It’s quite nice that $L^G$ is just right.)

**5. More on the divisibility $m \mid n$**

**5.1. Good example to play with for the divisibility $m \mid n$**

A good example to play with is $K = \mathbb{Q}$, $L = \mathbb{Q}(\sqrt[3]{2})$, with $f = x^3 - 2$. Then $G$ simply permutes the roots of $f$ lying in $L$—just the one real root $\sqrt[3]{2}$—so it has order $1$, so $L^G = L$ (and $m = 1$ divides $n = 3$). With the above approach, it’s not clear where $\omega\sqrt[3]{2}$, for instance, comes into play, where $\omega$ is just a primitive third root of unity.
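A quick computational check of this example (my own script, assuming SymPy is available; the `extension` option factors over $\mathbb{Q}(\sqrt[3]{2})$):

```python
# Factor f = x^3 - 2 over Q and over Q(2^(1/3)) to see the splitting part directly.
from sympy import symbols, factor_list, cbrt

x = symbols('x')
f = x**3 - 2

# Over Q: f is irreducible (one factor, of degree 3).
_, factors_Q = factor_list(f)
degrees_Q = sorted(g.as_poly(x).degree() for g, _ in factors_Q)

# Over Q(2^(1/3)): exactly one linear factor (m = 1) and an irreducible quadratic.
_, factors_L = factor_list(f, extension=cbrt(2))
degrees_L = sorted(g.as_poly(x).degree() for g, _ in factors_L)

print(degrees_Q, degrees_L)  # [3] and [1, 2]
```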

**5.2. Another perspective on $m \mid n$**

A different approach than the one above, that’s more satisfactory in a way: Take a primitive element $\alpha$ with min poly $f$ of degree $n$, and again factor $f = ph$ in $L[x]$ for $p$ of degree $m$ the splitting part, with $L = K(\alpha)$ for $\alpha$ a primitive element.

Extend $L$ to some $M$ (e.g. a splitting field of $f$ over $L$) so that $f$ splits completely in $M$, with $n$ roots total. Let $\beta_1, \ldots, \beta_n$ be the roots. Now partition these based on their generated subfields $K(\beta_j)$. Suppose $\beta_{j_1}, \ldots, \beta_{j_r}$ give the subfield $E$, and consider the $K$-isomorphism $L \to E$ taking $\alpha \mapsto \beta_{j_1}$. It must map $\{\alpha_1, \ldots, \alpha_m\}$ bijectively to the roots of $f$ lying in $E$, because the splitting part of $f$ over $L$ corresponds to the splitting part of $f$ over $E$ (and each root of $f$ in $E$ generates $E$, by a degree count). So our partition splits the $n$ guys into sets of the same size $m$, and we get $m \mid n$ as desired.
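A concrete instance of the partition (my example; $\omega$ denotes a primitive cube root of unity): take $f = x^3 - 2$ over $\mathbb{Q}$.

```latex
% The n = 3 roots of x^3 - 2 in a splitting field generate the subfields
\mathbb{Q}\bigl(\sqrt[3]{2}\bigr), \qquad
\mathbb{Q}\bigl(\omega\sqrt[3]{2}\bigr), \qquad
\mathbb{Q}\bigl(\omega^2\sqrt[3]{2}\bigr),
% which are pairwise distinct (none contains \omega, so no two roots share
% a subfield).  Hence the partition has three classes of size m = 1,
% consistent with m \mid n, i.e. 1 \mid 3.
```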

**5.3. A similar way to compare $m$ and $n$**

A similar way to compare $m$ and $n$: compare the factorizations of $f$ over $L$ and over $M$. Since the roots of $f$ lying in $L$ (and likewise in any $K(\beta_j)$) are primitive elements (essentially by definition), each $K$-isomorphism $L \to K(\beta_j)$ maps splitting part to splitting part. In this proof it is natural to note that the $\alpha_i$ and $\beta_j$ are the roots of $f$ lying in $L$ and $M$, respectively, or equivalently that $p(\alpha_i) = 0$ and $f(\beta_j) = 0$ for any $i$ and $j$, respectively.


**1. Revisited/filtered thoughts after first trying to prove it myself**

First instinct is to use Lagrange interpolation. Runge’s phenomenon says equally spaced nodes are bad for this. More generally, even smarter choices like Chebyshev nodes can fail for merely continuous functions. See comments here for some intuition: high degree means greater oscillations in between nodes, as we’ve only controlled the nodes perfectly and it’s thus hard to bound stuff between nodes. (On the other hand, I don’t see good intuition a priori why something like Chebyshev nodes shouldn’t work, **it’s just that it’s more plausible that it won’t work** than a “smoother/more-averaged-out” approximation. In fact Wikipedia says all absolutely continuous functions are fine with Chebyshev nodes, so… .)
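To see Runge’s phenomenon concretely, here’s a small self-contained sketch (my own, not from the post; the function $1/(1+25x^2)$ and the barycentric evaluation are the standard textbook setup):

```python
# Sketch: equispaced Lagrange interpolation of the Runge function
# f(x) = 1/(1 + 25 x^2) on [-1, 1], evaluated via the barycentric formula.
# For equispaced nodes the barycentric weights are (-1)^k * C(n, k).
from math import comb

def runge(x):
    return 1.0 / (1.0 + 25.0 * x * x)

def equispaced_interp_error(n, num_samples=1000):
    """Max |p - f| over a sample grid, where p interpolates f at n+1 equispaced nodes."""
    nodes = [-1.0 + 2.0 * k / n for k in range(n + 1)]
    weights = [(-1) ** k * comb(n, k) for k in range(n + 1)]
    fvals = [runge(t) for t in nodes]
    worst = 0.0
    for i in range(num_samples):
        x = -1.0 + 2.0 * (i + 0.5) / num_samples  # offset grid: never hits a node
        num = sum(w * fv / (x - t) for w, fv, t in zip(weights, fvals, nodes))
        den = sum(w / (x - t) for w, t in zip(weights, nodes))
        worst = max(worst, abs(num / den - runge(x)))
    return worst

e10, e20 = equispaced_interp_error(10), equispaced_interp_error(20)
print(e10, e20)  # the degree-20 error is far larger than the degree-10 error
```

The max error blows up near the endpoints as the degree grows, matching the oscillation intuition above.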

Let’s do a continuous/smoother version of Lagrange interpolation instead. Analogous to Lagrange interpolation polynomials ($\ell_i$ with $\ell_i(x_i) = 1$ and $\ell_i(x_j) = 0$ for $j \neq i$), we want polynomial kernels $K_t$ with $K_t(x)$ concentrated near $x = t$ (and quickly decaying as $x$ leaves a neighborhood of $t$); for simplicity we just use a single translated kernel $K_t(x) = K(x - t)$, but perhaps in other contexts we’d need to do something more complicated? We’ll then take $P(x) = \int_0^1 f(t) K(x - t)\,dt$; this is apparently related to blurring; actually, this has some issues with endpoints, but if we first reduce to $f$ supported in, say, $[1/4, 3/4]$ (or write out stuff more carefully) then we’re good to go. Anyway $K(u) = c_n (1 - u^2)^n$ (the Landau kernel) works here, where $n$ is a fixed positive integer and $c_n$ is just a scale factor to make $\int_{-1}^{1} K(u)\,du = 1$, with an easy bound of $c_n \le (n+1)/2$ from $\int_{-1}^{1} (1 - u^2)^n\,du \ge 2\int_0^1 (1 - u)^n\,du = \frac{2}{n+1}$. The rest is not hard, looking at $P(x) - f(x) = \int (f(t) - f(x)) K(x - t)\,dt$ and using a uniform continuity bound to bound $|f(t) - f(x)|$ when $|x - t|$ is small (the kernel mass outside any small window tends to $0$). (We need uniform continuity since the same bound will apply independently of $x$.)
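A numerical sanity check of this convolution scheme (my own sketch; I assume the Landau kernel $c_n(1-u^2)^n$, with the integrals done by a crude midpoint rule):

```python
# Sketch: approximate f on [1/4, 3/4] by P_n(x) = c_n * int_0^1 f(t) (1-(x-t)^2)^n dt.
# P_n is a polynomial in x of degree 2n (expand the kernel); here we only
# evaluate it numerically to watch the error shrink as n grows.

def landau_error(n, f, num=2000):
    h = 2.0 / num
    # c_n normalizes the kernel: c_n * int_{-1}^{1} (1 - u^2)^n du = 1 (midpoint rule).
    mass = sum((1.0 - (-1.0 + (j + 0.5) * h) ** 2) ** n for j in range(num)) * h
    c = 1.0 / mass

    def P(x):
        ht = 1.0 / num
        return c * ht * sum(
            f((j + 0.5) * ht) * (1.0 - (x - (j + 0.5) * ht) ** 2) ** n
            for j in range(num)
        )

    # measure only on [1/4, 3/4], where the kernel's tails past [0, 1] are small
    xs = [0.25 + 0.5 * i / 50 for i in range(51)]
    return max(abs(P(x) - f(x)) for x in xs)

f = lambda t: abs(t - 0.5)  # continuous test function with a kink
e10, e200 = landau_error(10, f), landau_error(200, f)
print(e10, e200)  # the error shrinks as n grows
```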

**2. Probabilistic proofs**

The probabilistic proof with Bernstein polynomials is interesting, and I don’t have great intuition for why this works while Lagrange interpolation doesn’t, except that the term $f(k/n)\binom{n}{k}x^k(1-x)^{n-k}$ has a “smooth/well-behaved contribution” that peaks when $k/n \approx x$. (Compared with Lagrange, where the contribution of node $x_i$ is $f(x_i)$ at $x_i$, $0$ at the other nodes, and perhaps unpredictable elsewhere. That probably makes stuff around the nodes particularly hard to control, given that our main bound will probably come from uniform continuity of $f$ around $x$.)
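The Bernstein polynomials are concrete enough to check directly (a quick script of mine; the test function $|x - 1/2|$ is my choice):

```python
# B_n f(x) = sum_{k=0}^n f(k/n) * C(n, k) * x^k (1-x)^(n-k): the "contribution"
# of the sample f(k/n) is the binomial probability peaking at k/n ~ x.
from math import comb

def bernstein(f, n, x):
    return sum(f(k / n) * comb(n, k) * x**k * (1 - x) ** (n - k) for k in range(n + 1))

f = lambda x: abs(x - 0.5)
grid = [i / 200 for i in range(201)]
err10 = max(abs(bernstein(f, 10, x) - f(x)) for x in grid)
err100 = max(abs(bernstein(f, 100, x) - f(x)) for x in grid)
print(err10, err100)  # shrinks roughly like 1/sqrt(n) for this Lipschitz f
```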

**3. A more careful approach (which extends to other contexts, like trigonometric polynomials; Stone’s generalization)**

Another approach: it suffices to approximate the absolute value function $x \mapsto |x|$ on $[-1, 1]$, by the following two links.

By the polynomial interpolation Wikipedia link above though, we should be able to use Chebyshev nodes to do this. Fix $n$ even. Let $x_k = \cos(k\pi/n)$ for $0 \le k \le n$, so $x_k \ge 0$ for $k \le n/2$ and $x_k \le 0$ for $k \ge n/2$. Then interpolate with the degree-$\le n$ polynomial $P$ satisfying $P(x_k) = |x_k|$, so $P$ agrees with $x$ on the nonnegative nodes and with $-x$ on the nonpositive ones.

But (by differentiating or using roots of unity to evaluate the integrals $\int_0^{2\pi} |\cos\theta| \cos(j\theta)\,d\theta$) we have $\frac{1}{\pi}\int_0^{2\pi} |\cos\theta| \cos(j\theta)\,d\theta = \frac{4}{\pi} \cdot \frac{(-1)^{j/2+1}}{j^2 - 1}$ if $j \ge 2$ is even, and $0$ otherwise. So letting $x = \cos\theta$ and noting that $T_j(\cos\theta) = \cos(j\theta)$, we have

$$|x| = \frac{2}{\pi} + \frac{4}{\pi} \sum_{j \ge 1} \frac{(-1)^{j+1}}{4j^2 - 1}\, T_{2j}(x).$$

For $j \ge 1$ the coefficients alternate in sign with decreasing magnitude, and $|T_{2j}(x)| \le 1$ on $[-1, 1]$; moreover, interpolating at the nodes $\cos(k\pi/n)$ just “aliases” each tail term $T_{2j}$ with $2j > n$ to some lower-degree $T_{j'}$, again bounded by $1$ in absolute value. So the alternating-series structure bounds the error of the degree-$\le n$ interpolant (or truncation) by at most twice the tail sum

$$\frac{4}{\pi} \sum_{2j > n} \frac{1}{4j^2 - 1} = \frac{2}{\pi} \sum_{2j > n} \left( \frac{1}{2j - 1} - \frac{1}{2j + 1} \right) \le \frac{2}{\pi n}$$

for $x \in [0, 1]$, say. Of course by symmetry we get the same bound for $x \in [-1, 0]$, so $|P(x) - |x|| = O(1/n)$ for all $x \in [-1, 1]$, and we’re done.

(Actually note that these are Chebyshev nodes of the *second* kind, but at least intuitively, it shouldn’t make a huge difference…)
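As a numerical check on this section (my own sketch; for these second-kind Chebyshev nodes the barycentric weights are $(-1)^k$, halved at the two endpoints, a standard fact I’m assuming):

```python
# Sketch: interpolate |x| at the Chebyshev (second-kind) nodes cos(k*pi/n),
# 0 <= k <= n, and measure the max error on [-1, 1] via barycentric evaluation.
from math import cos, pi

def cheb_abs_error(n, num_samples=2000):
    nodes = [cos(k * pi / n) for k in range(n + 1)]
    weights = [(-1) ** k * (0.5 if k in (0, n) else 1.0) for k in range(n + 1)]
    fvals = [abs(t) for t in nodes]
    worst = 0.0
    for i in range(num_samples):
        x = -1.0 + 2.0 * (i + 0.5) / num_samples
        if any(abs(x - t) < 1e-12 for t in nodes):
            continue  # (nearly) at a node, where the interpolant is exact anyway
        num = sum(w * fv / (x - t) for w, fv, t in zip(weights, fvals, nodes))
        den = sum(w / (x - t) for w, t in zip(weights, nodes))
        worst = max(worst, abs(num / den - abs(x)))
    return worst

e10c, e40c = cheb_abs_error(10), cheb_abs_error(40)
print(e10c, e40c)  # consistent with an O(1/n)-type decay, unlike equispaced nodes
```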

**4. More related reading**

- Wikipedia on approximation theory, and a MathOverflow question on approximating polynomials related to Hermite interpolation.
- Chebyshev equioscillation theorem, and theorem/reference about how well Chebyshev can do in general: see here.
- Rational functions are way better at approximating: link.


In particular, my personal website has a Dropbox link to some of my handouts, notes, etc. as well as other googled ones.* (The link will break if I accidentally move files around, so let me know if that’s the case. But I’ll try to at least keep it updated on my website.)

I’ll probably add some math notes once I figure out how to do the conversion to WordPress.

*On this note, I wonder how difficult it would be to set up a centralized global repository of (say) learning resources, perhaps organized via labels/tags (similar to Gmail inbox labels/“folders”—actually I think “multiple label” capability in general would be quite helpful; EDIT: apparently Google Drive allows this—just bad UI). Actually, this is perhaps sort of the idea behind the selected papers network (see homepage and Baez’s and Gowers’ blog posts). (I suppose Google already does this to some extent, but…)

This is only one way that better communication and collaboration could help everyone out a lot. For an example about how to improve learning via contests, see my comment under Evan Chen’s recent blog post. (The point is that contests are (ostensibly) heavily “problem-solving/creativity”-based, but I think it would be easy for authors to offer extremely helpful insights behind their problems (story, background/context, interconnections, etc.), which might then spur an overall healthier contest environment, etc.)
