Sylvester’s law of inertia, from the spectral theorem

Here we prove Sylvester's law of inertia: a real symmetric matrix {A} is congruent to a unique diagonal matrix with {n_+} entries equal to {+1}, {n_-} entries equal to {-1}, and {n_0} entries equal to {0}. Throughout, {A} is a real symmetric {n \times n} matrix, and we assume the spectral theorem.

Take an arbitrary orthogonal basis with respect to the form {A(X,Y) = X^t A Y}, and scale (and optionally reorder) it so that the matrix of the form becomes diagonal with {n_+} entries equal to {+1}, {n_-} entries equal to {-1}, and {n_0} entries equal to {0}. (Such a basis exists by the usual inductive argument for symmetric bilinear forms; this can also be shown using the spectral theorem, but I think that's overkill and conceptually messier, since we'd be "mixing up" matrices of linear transformations and matrices of bilinear forms.)
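
As a computational aside, here is a minimal sketch of this diagonalization in Python/NumPy: symmetric Gaussian elimination, performing each row operation together with the matching column operation so that every step is a congruence {A \mapsto E A E^t}. The function name `signature_by_congruence` and the tolerance are my own choices, and this is a demonstration rather than a numerically robust routine.

```python
import numpy as np

def signature_by_congruence(A, tol=1e-9):
    """Diagonalize the symmetric matrix A by congruences E A E^T
    (simultaneous row and column operations) and return the counts
    (n_plus, n_minus, n_zero) of positive, negative, zero diagonal
    entries.  A naive sketch, not a robust numerical routine."""
    A = np.array(A, dtype=float)
    n = A.shape[0]
    for i in range(n):
        if abs(A[i, i]) < tol:
            # Zero pivot: add s * (row j and column j) to make it nonzero.
            for j in range(i + 1, n):
                if abs(A[i, j]) > tol:
                    # The new pivot is 2*s*A[i,j] + A[j,j]; one of s = +1, -1 works.
                    s = 1.0 if abs(2 * A[i, j] + A[j, j]) > tol else -1.0
                    A[i, :] += s * A[j, :]
                    A[:, i] += s * A[:, j]
                    break
        if abs(A[i, i]) < tol:
            continue  # row and column i are (numerically) zero
        for j in range(i + 1, n):
            # Clear A[j,i] and A[i,j] with the same row and column operation.
            c = A[j, i] / A[i, i]
            A[j, :] -= c * A[i, :]
            A[:, j] -= c * A[:, i]
    d = np.diag(A)
    return (int((d > tol).sum()), int((d < -tol).sum()), int((abs(d) <= tol).sum()))

# The form 2xy + 2z^2 has signature (2, 1, 0):
print(signature_by_congruence([[0, 1, 0], [1, 0, 0], [0, 0, 2]]))
```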

First we show that {(n_+, n_-, n_0)} is unique, i.e. invariant under change of basis. This is not too hard, since our basis {v_1,\ldots,v_n} is quite nice. The maximum dimension of a positive-definite subspace is {n_+}: otherwise we would get a linearly independent set of at least {(n_+ + 1) + n_- + n_0 = n+1} vectors. (More precisely, a positive-definite subspace of dimension {\ge n_+ + 1} would meet the negative-semidefinite subspace of dimension {n_- + n_0} spanned by the last {n_- + n_0} basis vectors nontrivially, by dimension counting; but a nonzero vector {v} in the intersection would satisfy both {A(v,v) > 0} and {A(v,v) \le 0}, which is absurd.) Similarly, the maximum dimension of a negative-definite subspace is {n_-}. These two maxima are defined without reference to a basis, so {n_+} and {n_-}, and hence {n_0 = n - n_+ - n_-}, are invariants.

Now that we have uniqueness, we move on to the eigenvalue and leading principal minor interpretations. By the spectral theorem (for real symmetric matrices) we can write {A = Q^t D Q} with {Q} orthogonal and {D} real diagonal; since {Q^t = Q^{-1}}, this decomposition is simultaneously a similarity and a congruence. So by uniqueness of the Sylvester form, {A} has exactly {n_+} positive eigenvalues, {n_-} negative eigenvalues, and {n_0} zero eigenvalues.
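
Numerically, this invariance is easy to watch: a congruence {A = P^t D P} scrambles the eigenvalues but not their sign counts. A small Python check (the invertible matrix {P} below is arbitrary, chosen only for illustration):

```python
import numpy as np

D = np.diag([2.0, -3.0, 5.0])        # eigenvalue signs: (n_+, n_-, n_0) = (2, 1, 0)
P = np.array([[1., 2., 0.],
              [0., 1., 3.],
              [1., 0., 1.]])         # det(P) = 7, so this is a genuine change of basis
A = P.T @ D @ P                      # congruent to D, but not similar to it

eigs = np.linalg.eigvalsh(A)         # real eigenvalues, per the spectral theorem
print(eigs)                                          # not 2, -3, 5 ...
print(int((eigs > 0).sum()), int((eigs < 0).sum()))  # ... but still 2 positive, 1 negative
```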

This will allow us to induct for the leading principal minor criterion: if {n_0 = 0} and all leading principal minors are nonzero (an important assumption for "mixed"-definite matrices), then there are exactly {n_-} sign changes in the list of determinants {\det(A_0), \det(A_1), \ldots, \det(A_n)}, where {A_k} denotes the top-left {k \times k} submatrix of {A} and we interpret {\det(A_0)} as {1}.
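
For instance, with {A = \begin{pmatrix} 1 & 2 \\ 2 & 1 \end{pmatrix}} the list is {1, 1, -3}, which has one sign change; correspondingly the eigenvalues are {3} and {-1}, so {n_- = 1}.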

The base case {n=1} is clear. Now suppose {n \ge 2} and the result holds for {n-1}. The key interpretation of {A_{n-1}} is as follows: let {V_k} be the subspace of vectors in {\mathbb{R}^n} whose last {n-k} coordinates are zero. Then {X^t A Y = \pi(X)^t A_{n-1} \pi(Y)} for any {X, Y \in V_{n-1}}, where {\pi \colon V_{n-1} \to \mathbb{R}^{n-1}} forgets the last (zero) coordinate.
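
This identity just says that {A_{n-1}} is the top-left block of {A}; here is a quick numerical sanity check, with {\pi} implemented as dropping the last coordinate:

```python
import numpy as np

rng = np.random.default_rng(1)
B = rng.standard_normal((4, 4))
A = B + B.T                          # random symmetric 4 x 4
A_top = A[:-1, :-1]                  # this is A_{n-1}

x = rng.standard_normal(3)
y = rng.standard_normal(3)
X = np.append(x, 0.0)                # embed into V_{n-1}: last coordinate zero
Y = np.append(y, 0.0)
assert np.isclose(X @ A @ Y, x @ A_top @ y)   # X^t A Y = pi(X)^t A_{n-1} pi(Y)
```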

By the inductive hypothesis applied to {A_{n-1}} (whose leading principal minors are among those of {A}), if we have {k} sign changes so far, then {A_{n-1}} has {k} negative eigenvalues and {n-1-k} positive eigenvalues, with corresponding eigenvectors {v_1, \ldots, v_k, v_{k+1}, \ldots, v_{n-1} \in \mathbb{R}^{n-1}} (the first {k} negative), which WLOG form an orthonormal basis by the spectral theorem. To convert this into information about {A}, we use the interpretation above: setting {X_i = (v_i, 0) \in V_{n-1}} (i.e. appending a zero coordinate), a negative eigenvalue {\lambda_i} gives {X_i^t A X_i = v_i^t A_{n-1} v_i = \lambda_i |v_i|^2 < 0}, and similarly for the positive ones. Of course, {X_i^t A X_j = 0} for distinct {i,j} by orthogonality.

So now we have a positive-definite subspace of dimension {n-1-k} and a negative-definite subspace of dimension {k} (the spans of the corresponding {X_i}), so {A} has {n_+ \ge n-1-k} and {n_- \ge k}. Since {n_0 = 0} forces {n_+ + n_- = n}, either {(n_+, n_-) = (n-k, k)} or {(n_+, n_-) = (n-1-k, k+1)}, according to whether {\det(A)}, whose sign is {(-1)^{n_-}}, has the same sign as {(-1)^k} or {(-1)^{k+1}}. By the inductive hypothesis {\det(A_{n-1})} has sign {(-1)^k}, so these two cases contribute {k} and {k+1} total sign changes respectively; in both cases the number of sign changes among {\det(A_0), \ldots, \det(A_n)} equals {n_-}, completing the induction.
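
As a final sanity check, here is the full criterion tested numerically on a random symmetric matrix (whose leading principal minors are nonzero with probability {1}); the sign-change count matches the number of negative eigenvalues:

```python
import numpy as np

rng = np.random.default_rng(2)
B = rng.standard_normal((5, 5))
A = B + B.T                          # random symmetric 5 x 5

# det(A_0), det(A_1), ..., det(A_5), with det(A_0) interpreted as 1
minors = [1.0] + [np.linalg.det(A[:k, :k]) for k in range(1, 6)]
assert all(abs(m) > 1e-9 for m in minors)   # hypothesis: all leading minors nonzero

sign_changes = sum(a * b < 0 for a, b in zip(minors, minors[1:]))
n_minus = int((np.linalg.eigvalsh(A) < 0).sum())
print(sign_changes, n_minus)         # the two counts agree
```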

2 thoughts on “Sylvester’s law of inertia, from the spectral theorem”

  1. Hey, apologies that this isn't related to the above, but I was reading your blog and you recommended Stein's analysis lectures. Now I just so happen to own a copy of the first lecture on Fourier Analysis, and I'll be taking a course next semester that covers introductory real analysis (Rudin) and complex analysis (Priestley). When do you recommend I read the Fourier analysis book? Alongside the course, or is there a point at which it will be more beneficial (after the course, maybe)?

    Thanks so much and keep up the awesome work!


    1. If you have time, then it would be great alongside (or even before) the course—whenever there’s something from (baby) Rudin used in the Fourier analysis book, that would be good motivation to then learn that material well in Rudin. 🙂 And there are certainly plenty of connections between complex analysis and Fourier analysis!
