I suffered through determinants.
* Second edition: https://www.amazon.com/dp/0387982582
* Third edition: https://www.amazon.com/dp/3319307657
I wonder whether widespread adoption of his book pushed editors to make it look flashier and more watered down. The contents are the same, though.
For better or worse, the 3rd edition formatting and styling is something I've come to mentally associate with low-quality cash-grab big-lecture-hall tomes designed and written by committee over the course of a dozen editions. I wonder if I would have written off the book when I was a student if I'd seen it in such a form.
However, a nice geometric interpretation does not nice math make. If you remember their definition, the one with the sub-products and the alternating +/- signs, and then imagine trying to prove that it has that geometric interpretation - you've arrived at a huge pain for undergrad math students who are just being introduced to matrices.
Exactly. We leaned heavily on determinants in my freshman linear algebra class but I went at least an additional year before I even heard the interpretation, much less could prove it from the standard terrible definition.
I keep seeing proponents of the Clifford Algebra as a much more elegant type of maths that supersedes linear algebra and also makes it much more intuitive, but I haven't really found a clear source on it yet.
Let's say you've got a linear transformation T:V->V, and suppose V is n-dimensional. Consider exterior powers of V, Λ^k(V); for each one we naturally get a linear transformation Λ^k(T):Λ^k(V)->Λ^k(V). In particular take k=n, so we get a linear transformation Λ^n(T):Λ^n(V)->Λ^n(V). But Λ^n(V) is one-dimensional, so Λ^n(T) must be multiplication by some constant factor. That factor is the determinant.
That's basically the most "natural" definition of the determinant (notice how multiplicativity immediately falls out of it, assuming of course you already know that Λ^k is a functor). You need the idea of a basis in order to (a) make sense of the statement "V is n-dimensional" and (b) prove that Λ^n(V) is 1-dimensional, but that's it, and you certainly never need to choose one in order to define the determinant.
Probably not a great definition for beginning students of linear algebra, but I do have to correct this idea that defining the determinant requires choosing a basis...
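If it helps to make that concrete, the n = 3 case can be sanity-checked numerically: Λ^3(T) acts on the one-dimensional space spanned by e1 ^ e2 ^ e3, and the scalar it multiplies by is the scalar triple product of the columns of T. A sketch (the example matrix is arbitrary):

```python
import numpy as np

# Lambda^3(T) multiplies e1 ^ e2 ^ e3 by the scalar triple product
# of the columns of T -- which should be det(T).
T = np.array([[2., 1., 0.],
              [0., 3., 1.],
              [1., 0., 1.]])
c1, c2, c3 = T[:, 0], T[:, 1], T[:, 2]
factor = np.dot(c1, np.cross(c2, c3))   # T e1 ^ T e2 ^ T e3 = factor * (e1 ^ e2 ^ e3)
assert np.isclose(factor, np.linalg.det(T))
```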
The determinant is uniquely determined by the following two axioms:
1. The determinant of the "multiply by lambda" operation on a one-dimensional vector space is lambda.
2. If you have a linear operator T on a vector space V, and a T-invariant subspace W, the determinant of T is the product of the determinant of T restricted to W and the determinant of the operator that T induces on the quotient space V / W.
This actually kinda intuitively meshes with the volume-stretching property: if you are stretching the volume by the factor lambda_1 along one subspace and by the factor lambda_2 along some complementary subspace, clearly the overall stretch factor is lambda_1 * lambda_2.
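In coordinates, a T-invariant subspace W makes the matrix block upper-triangular, and axiom (2) becomes the familiar block-determinant identity. A quick numerical sketch (the block sizes and RNG seed are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 2))          # T restricted to the invariant subspace W
B = rng.standard_normal((3, 3))          # the operator T induces on V / W
C = rng.standard_normal((2, 3))          # the part of T mapping V/W-directions into W
T = np.block([[A, C],
              [np.zeros((3, 2)), B]])    # block upper-triangular: W is T-invariant
assert np.isclose(np.linalg.det(T), np.linalg.det(A) * np.linalg.det(B))
```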
If you aren't working over an algebraically closed field, you can just tensor with the algebraic closure of whatever field you're working with and take the determinant there. There is also a way of adapting the definition so you don't have to do this, but it makes axiom (1) a bit more complicated.
Another bonus is this definition makes the Cayley-Hamilton theorem completely trivial.
Also, you can give an analogous definition of the trace if you replace multiplication with addition, and of the characteristic polynomial if you replace lambda in axiom 1 with x - lambda.
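The same block-triangular picture illustrates those analogues: traces add across the invariant subspace and the quotient, and characteristic polynomials multiply. (Here np.poly gives the monic characteristic polynomial's coefficients, and np.convolve multiplies polynomials; the random matrices are just stand-ins.)

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((2, 2))       # T restricted to the invariant subspace W
B = rng.standard_normal((3, 3))       # the induced operator on V / W
C = rng.standard_normal((2, 3))
T = np.block([[A, C], [np.zeros((3, 2)), B]])

assert np.isclose(np.trace(T), np.trace(A) + np.trace(B))            # traces add
assert np.allclose(np.poly(T), np.convolve(np.poly(A), np.poly(B)))  # char polys multiply
```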
It's a bit trickier when the dimension is infinite, but again most definitions of dimension require there to be a basis of a particular size; the difficult part is proving that this makes sense (i.e. that the dimension is well-defined).
For those who are wondering, 'every vector space has a basis' is equivalent to the Axiom of Choice.
I mean I suppose you could just restrict the definition of dimension to vector spaces that have bases, and then you wouldn't have to. I guess that's what you're implicitly suggesting, that dimension would just not apply to vector spaces that don't have bases. That would make sense. But I'm used to thinking of dimension as, well, a function of vector spaces, not a partial function, so as I was thinking of it, you have to prove they all have bases before you can use it!
What definition of _dimension_ is there that does not rely upon the existence of a basis?
In the infinite case this does not trivially give you a basis, as 1) the supremum could be strictly larger than the cardinality of every linearly independent set, and 2) adding an extra vector to an infinite linearly independent set doesn't increase its cardinality, hence there is no reason for such a maximal set to span the entire space.
You could also take the infimum of the cardinalities of all sets that span the entire space, but you run into similar problems.
Are there any books that use this definition? I like this style of linear algebra.
(Or, if you like, you could define the tensor algebra, take a quotient of that to get the exterior algebra, and then restrict to the image of the k'th tensor power to get the k'th exterior power.)
Again, obviously you need to use bases to prove how to compute the dimension of an exterior power. But you don't need them just to define it.
It's mostly graphical, and is really helpful in forming and cementing an intuition for linear algebra.
Which is to say, I could not recommend 3blue1brown's videos more highly; they are an invaluable aid to learning linear algebra and actually help you understand what it is you're doing when you're doing these various operations to "solve problems".
As a counterpoint, one place where determinants are incredibly useful is in Hartree-Fock theory, where they effectively encode the Pauli exclusion principle (i.e. the antisymmetry requirement) for electrons in atomic orbitals.
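The Slater-determinant mechanics can be sketched in a few lines: with M[i, j] = (orbital j evaluated at particle i), swapping two particles flips the sign of the determinant, and putting two particles in the same state kills it. (Random numbers stand in for actual orbital values here.)

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((4, 4))       # M[i, j] ~ orbital_j at particle i (stand-in values)

swapped = M[[1, 0, 2, 3]]             # exchange particles 0 and 1
assert np.isclose(np.linalg.det(swapped), -np.linalg.det(M))   # antisymmetry

M[1] = M[0]                           # two particles in the same single-particle state
assert np.isclose(np.linalg.det(M), 0.0)                       # Pauli exclusion
```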
(Also, surface normals in integrals are bivectors, the 'i' of complex analysis is the bivector resulting from wedge product x^y, and e^(i theta) is the exponential map applied to the i operator, and (del wedge vector-function f) is the (bivector-valued) curl while (del wedge bivector-function g) is the (scalar valued) divergence (and that's why del(del(f)) = 0).)
(But differential forms should probably be omitted in a first course, because they get hairy quickly and are hard to wrap one's head around. It's enough to know that dxdy in integrals is actually dx^dy, and therefore the Jacobian appears when changing variables because of the factor that appears from dx'^dy' = dx'(x,y)^dy'(x,y).)
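For example, for polar coordinates x = r cos θ, y = r sin θ, the wedge dx^dy = det(J) dr^dθ reproduces the familiar factor r. A quick numerical check at an arbitrary point:

```python
import numpy as np

r, th = 2.0, 0.7                      # arbitrary point
# Jacobian of (x, y) = (r cos th, r sin th) with respect to (r, th)
J = np.array([[np.cos(th), -r * np.sin(th)],
              [np.sin(th),  r * np.cos(th)]])
# dx ^ dy = det(J) dr ^ dth, and det(J) = r cos^2(th) + r sin^2(th) = r
assert np.isclose(np.linalg.det(J), r)
```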
John Hamal Hubbard, Barbara Burke Hubbard - Vector Calculus, Linear Algebra and Differential Forms: A Unified Approach
EDIT: More precisely: a common way to axiomatize the cross product yields a cross product exactly in dimensions 3 and 7.
Uhm, what's the intuition behind _that_?
Which means that (T - λI)(x - y) = 0, and in general because of linearity every scalar multiple of (x - y) also maps to zero.
Letting v = x - y, we get (T - λI)v = 0 ==> Tv = λv, which is perhaps the more familiar definition.
So, it's a neat way of expressing the concept, but I'm not sure what it buys you in terms of improving one's intuition.
1. λI stretches every vector (the whole space, really) by a factor λ.
2. Saying that the function is not injective means you lose information: when you apply it to some object and get a result, you can't trace back the original object, as there may be several. (There is no inverse function, then.) In linear algebra, this happens precisely because there is some direction of space along which all vectors get collapsed to zero.
In short, T-λI collapses some line of vectors to zero.
So, when you take the effect of λI out of T, you make it a lossy transformation in some direction. This means that _in that direction_ T had the effect of stretching all vectors by a factor of λ.
You gain some geometric understanding of T.
It is sort of intuitive, but the language may obscure it a little if you are not used to it.
> So, when you take the effect of λI out of T,
If I understand right, you're saying that there's an interpretation, in terms of the geometry of the T transformation, of subtracting this diagonal matrix from T. Multiplication of matrices is composition of transformations, I get that, but I'm not so sure what addition/subtraction is.
Ideally you would like to do this for all n directions of space, and that way you completely describe what T does in simpler terms: it just stretches things differently in different directions. It's not always possible though. The matrices that allow this are called diagonalizable and the process of finding the stretch factors (eigenvalues) is called diagonalization.
Just a caveat: if an eigenvalue is complex, the effect is not as simple as a stretch, but the interpretation is very similar.
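A tiny numpy sketch of the whole picture (the matrix is an arbitrary example with real, distinct eigenvalues, so the stretch interpretation applies literally):

```python
import numpy as np

T = np.array([[3., 1.],
              [0., 2.]])              # eigenvalues 3 and 2
lams, V = np.linalg.eig(T)            # stretch factors and their directions

for lam, v in zip(lams, V.T):
    assert np.allclose(T @ v, lam * v)               # along v, T is just "stretch by lam"
    # ... and T - lam*I collapses that direction to zero (non-injective)
    assert np.allclose((T - lam * np.eye(2)) @ v, 0.0)
```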
The question I have is _why_ use the injective-based definition instead of the usual well-known one? Is there some further insight down the road?
Also, for finite-dimensional vector spaces a lot of concepts are equivalent, e.g. "a linear operator L is invertible <=> L is injective", which they are not in infinite-dimensional vector spaces (eg https://math.stackexchange.com/a/2447563/21437). Studying T - λI turns out to be more useful in this case (eg https://en.wikipedia.org/wiki/Decomposition_of_spectrum_(fun...)
Furthermore, some would argue that mathematics has lost its way as it becomes dedicated to abstraction alone.
For most of the classical applications, determinants are computationally terrible compared to factorization methods: for matrix inversion, elimination is O(n^3), while Cramer's rule with cofactor-expanded determinants is something like O(n!).
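For concreteness, here's Cramer's rule next to a factorization-based solve on a small random system; both agree, but each det call here already costs O(n^3) via LU, and expanding the determinants by cofactors instead is what blows up to O(n!):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4
A = rng.standard_normal((n, n))
b = rng.standard_normal(n)

# Cramer's rule: x_i = det(A with column i replaced by b) / det(A)
detA = np.linalg.det(A)
x_cramer = np.array([
    np.linalg.det(np.column_stack([b if j == i else A[:, j] for j in range(n)]))
    for i in range(n)
]) / detA

assert np.allclose(x_cramer, np.linalg.solve(A, b))   # same answer, worse scaling
```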
The determinant intuition for me is the signed volume factor for a change of basis. I've seen the combinatorial lattice path application and I'm sure there are more in other fields.
But not much reason I can see to have them feature so prominently in an intro linear algebra class. Better to spend more time with SVD for instance, which wasn't even covered in the first linear algebra class I took.
In Germany this is the usual style for lectures from the 1st semester on, even for absolute beginners - and it's not uncommon even in first-semester math lectures for students who don't major in mathematics or physics.
Hardly any faculty has a problem with this - they love that the math departments weed out "unsuitable" students in their lectures so they don't have to.
If you don't believe me and know a little German, here are two common German textbooks about linear algebra covering about 1.5 semesters of linear algebra for math majors:
- Gerd Fischer - Lineare Algebra: Eine Einführung für Studienanfänger (note the title "Linear Algebra: An introduction for freshmen" - I am really not kidding)
- Siegfried Bosch - Lineare Algebra
Even more: I know a lecturer from Hungary who had very direct words about how lax he considers the curriculum for math majors in Germany (he is used to a Soviet-Russian-inspired math program).
(T-λ_2) ... (T-λ_m) v_k =
(T-λ_2) ... (T-λ_(m-1)) (λ_k - λ_m) v_k =
(T-λ_2) ... (T-λ_(m-2)) (λ_k - λ_(m-1)) (λ_k - λ_m) v_k =
(λ_k-λ_2) ... (λ_k - λ_(m-1)) (λ_k - λ_m) v_k
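A numerical check of the derivation, using a diagonal matrix so that v_1 = e_1 is an eigenvector for λ_1 and the eigenvalues are distinct by construction:

```python
import numpy as np

lam = np.array([1., 2., 3.])          # distinct eigenvalues
T = np.diag(lam)                      # v1 = e1 is an eigenvector for lam[0]
v1 = np.array([1., 0., 0.])
I = np.eye(3)

lhs = (T - lam[1] * I) @ (T - lam[2] * I) @ v1
rhs = (lam[0] - lam[1]) * (lam[0] - lam[2]) * v1   # product of scalar differences
assert np.allclose(lhs, rhs)
```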
The eigenvalues are all distinct by hypothesis: "Non-zero eigenvectors corresponding to distinct eigenvalues...".
And the "distinct eigenvalues" part is obvious in hindsight. For some reason my brain thought that we were adding them, not subtracting.
They are very useful and intuitive, especially in 2D and 3D, where they represent areas and volumes. For example, they give an intuitive meaning to the notion of linear independence of 3 spatial vectors: they are independent when they span a non-zero volume.
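In numpy terms (the example vectors are chosen arbitrarily): three vectors are independent exactly when the determinant of the matrix with those columns - the signed volume of the parallelepiped they span - is non-zero.

```python
import numpy as np

u = np.array([1., 0., 0.])
v = np.array([0., 1., 0.])
w_dep = u + 2 * v                 # lies in the plane of u and v
w_ind = np.array([0., 0., 1.])    # sticks out of that plane

vol = lambda a, b, c: np.linalg.det(np.column_stack([a, b, c]))
assert np.isclose(vol(u, v, w_dep), 0.0)       # dependent: zero volume
assert not np.isclose(vol(u, v, w_ind), 0.0)   # independent: non-zero volume
```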