
6. Linear Algebra: Vector Spaces and Operators (continued)


Formal Metadata

Title
6. Linear Algebra: Vector Spaces and Operators (continued)
Number of Parts
25
License
CC Attribution - NonCommercial - ShareAlike 4.0 International:
You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.

Content Metadata

Abstract
In this lecture, the professor talked about linear operators and matrices, etc.
Transcript: English (auto-generated)
The following content is provided under a Creative Commons license. Your support will help MIT OpenCourseWare continue to offer high quality educational resources for free. To make a donation or to view additional materials from hundreds of MIT courses, visit MIT OpenCourseWare at ocw.mit.edu.
So let's get started. This week, Professor Zwiebach is away. And I'll be doing today's lecture. And Will Detmold will do the one on Wednesday.
The normal office hours, unfortunately, will not be held today. One of us will cover his hours on Wednesday, though. And you should also just email either me or Professor Detmold if you want to set up an appointment to talk in the next few days. What I'm going to talk about today
will be more about the linear algebra that's behind all of quantum mechanics. And at the end of last time, so last lecture, you heard about vector spaces from a more abstract perspective than the usual vectors are columns of numbers perspective. Today, we're going to look at operators
which act on a vector space, which are linear maps from a vector space to itself. And they're, in a sense, equivalent to the familiar idea of matrices, which are squares or rectangles of numbers,
but work in this more abstract setting of vector spaces, which has a number of advantages. For example, of being able to deal with infinite dimensional vector spaces, and also of being able to talk about basis independent properties. And so I'll tell you all about that today. So we'll talk about how to define operators, some examples,
some of their properties, and then, finally, how to relate them to the familiar idea of matrices. I'll then talk about eigenvectors and eigenvalues from this operator perspective. And depending on time, say a little bit about inner products, which you'll hear more about in the future. And these numbers here correspond to the sections of the notes that these refer to.
So let me first, this is a little bit mathematical and perhaps dry at first. The payoff is more distant than usual for things you'll hear in quantum mechanics. I just want to mention a little bit about the motivation for it.
So operators, of course, are how we define observables. And so if we want to know about the properties of observables, of which a key example are Hamiltonians,
then we need to know about operators. They also, as you will see in the future, are useful for talking about states. Right now, states are described as elements of a vector space.
But in the future, you'll learn a different formalism in which states are also described as operators, what are called density operators or density matrices. And finally, operators are also useful in describing symmetries of quantum systems.
So already in classical mechanics, symmetries have been very important for understanding things like momentum conservation and energy conservation and so on. They'll be even more important in quantum mechanics and will be understood through the formalism of operators. So these are not things that I'll talk about today, but are sort of the motivation
for understanding very well the structure of operators now. OK, so at the end of the last lecture, Professor Zwiebach defined linear maps. So this is the set of linear maps from a vector space
v to a vector space w. And just to remind you what it means for a map to be linear, so T is linear if for all pairs of vectors
in v, the way T acts on their sum is given by just T of u plus T of v.
That's the first property. And second, for all vectors u and for all scalars a, so f is the field that we're working over,
could be reals or complexes, if T acts on a times u, that's equal to a times T acting on u.
So if you put these together, what this means is that T essentially looks like multiplication. The way T acts on vectors is precisely what you would expect from the multiplication map. It has the distributive property, and it commutes with scalars.
So this is sort of informal. I mean, the formal definition is here, but the informal idea is that T acts like multiplication. So the map that squares every entry of a vector
does not act like this, but linear operators do. And for this reason, we often neglect the parentheses. So we just write Tu to mean T of u, which is justified because of this analogy with multiplication.
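As a quick illustration (a minimal sketch, not part of the lecture itself; the particular matrix and vectors are made up), here is a numerical check that a matrix map satisfies the two linearity properties while the entrywise-squaring map does not:

```python
import numpy as np

# A linear map on R^3: multiplication by a fixed matrix A.
A = np.array([[1.0, 2.0, 0.0],
              [0.0, 1.0, 3.0],
              [4.0, 0.0, 1.0]])
T = lambda v: A @ v

# A nonlinear map for contrast: squaring every entry of the vector.
Q = lambda v: v**2

u = np.array([1.0, 2.0, 3.0])
v = np.array([0.5, -1.0, 2.0])
a = 7.0

print(np.allclose(T(u + v), T(u) + T(v)))   # True: T(u + v) = Tu + Tv
print(np.allclose(T(a * u), a * T(u)))      # True: T(au) = a Tu
print(np.allclose(Q(u + v), Q(u) + Q(v)))   # False: squaring is not linear
```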
So an important special case of this is when v is equal to w. And so we just write L of v to denote the maps from v to itself. So you could also write like this.
And these are called operators on v. So when we talk about operators on a vector space v, we mean linear maps from that vector space to itself.
So let me illustrate this with a few examples, starting with some of the examples of vector spaces that you saw from last time.
So one example of a vector space is an example you've seen before, but different notation. This is the vector space of all real polynomials in one variable.
So real polynomials over some variable x. And this is an infinite dimensional vector space. And we can define various operators over it. For example, we can define one operator T
to be like differentiation. So what you might write is d dx hat. And it's defined for any polynomial p to map p to p prime.
So this is certainly a function from polynomials to polynomials. And you can check that it's also linear. If you multiply the polynomial by a scalar, then the derivative multiplies by the same scalar. If I take the derivative of a sum of two polynomials, then I get the sum of the derivatives
of those polynomials. So I won't write that down, but you can check that the properties are true, and this is indeed a linear operator. Another operator which you've seen before is multiplication by x. So this is defined as the map that simply
multiplies a polynomial by x. Of course, this gives you another polynomial. And again, you can check easily that it satisfies these two conditions.
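Here is a minimal sketch of these two operators, representing a polynomial by its list of coefficients, lowest degree first (the coefficient-list representation is my own choice for illustration):

```python
# p(x) = p[0] + p[1]*x + p[2]*x**2 + ... as a list of coefficients.

def deriv(p):
    """T p = p', the differentiation operator."""
    return [j * p[j] for j in range(1, len(p))] or [0]

def mul_x(p):
    """S p = x * p(x), the multiplication-by-x operator."""
    return [0] + list(p)

p = [5, 0, 3]           # 5 + 3x^2
print(deriv(p))         # [0, 6]       -> 6x
print(mul_x(p))         # [0, 5, 0, 3] -> 5x + 3x^3
```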
So this gives you a sense of why things that don't appear to be matrix-like can still be viewed in this operator picture. Another example, which you'll see later shows some of the slightly paradoxical features of infinite dimensional vector spaces, comes from the vector space of infinite sequences. So these are all the infinite sequences of reals or complexes
or whatever f is. One operator we can define is the left shift operator,
which is simply defined by shifting this entire infinite sequence left by one place and throwing away the first position. So you start with x2, x3, and so on.
Still goes to infinity, so it still gives you an infinite sequence. So it is indeed a map. That's the first thing you should check, that this is indeed a map from v to itself. And you can also check that it's linear, that it satisfies these two properties. Another example is right shift.
And here, yeah? That's right. There's no back, really.
It's a good point. So you'd like to not throw out the first one, perhaps. But yeah, there's no canonical place to put it in. This just goes off to infinity. It just falls off the edge. It's a little bit like differentiation. Yeah, I guess there's some information.
It loses some information. That's right. It's a little bit weird, right? Because how many numbers do you have before you apply the left shift? Infinity. How many do you have after you apply the left shift? Infinity. But you lost some information. So you have to be a little careful with the infinities.
OK, the right shift here, it's not so obvious what to do. We've kind of made space for another number. And so we have to put something in that first position, right? So this will be question mark x1, x2, dot dot dot.
Any guesses what should go in the question mark? 0, right. And why should that be 0? What's that?
Otherwise, it wouldn't be linear, right. So imagine what happens if you apply the right shift to the all 0 string. If you were to get something non-zero here, then you would map the 0 vector to a non-zero vector. But by linearity, that's impossible. Because I could take any vector
and multiply it by the scalar 0. I get the vector 0. And that should be equal to the scalar 0 multiplied by the output of it. And so that means that t should always map 0 to 0. t should always map the vector 0 to the vector 0.
And so if we want right shift to be a linear operator, we have to put a 0 in there. And yeah, this one is strange also because it creates more space but still preserves all of the information.
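A small sketch of the two shift operators; since a computer can't hold an infinite sequence, this acts on finite tuples as a stand-in (the truncation is mine, not part of the lecture):

```python
def left_shift(x):
    """L(x1, x2, x3, ...) = (x2, x3, ...): drop the first entry."""
    return x[1:]

def right_shift(x):
    """R(x1, x2, ...) = (0, x1, x2, ...): make room and insert a 0,
    which is forced on us if R is to send the zero vector to zero."""
    return (0,) + x

x = (1, 2, 3, 4)
print(left_shift(x))    # (2, 3, 4)
print(right_shift(x))   # (0, 1, 2, 3, 4)
```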
So two other small examples of linear operators that come up very often. There's, of course, the 0 operator, which takes any vector to the 0 vector.
Here, I'm not distinguishing between here the 0 means an operator, here it means a vector. I guess I can clarify it that way. And this, of course, is linear and sends any vector space to itself. One important thing is that the output doesn't have
to be the entire vector space. The fact that it sends a vector space to itself only means that the output is contained within the vector space. It could be something as boring as 0 that just sends all the vectors to a single point. And finally, one other important operator is the identity operator that sends.
Actually, I won't use the arrows here. We'll get used to the mathematical way of writing it. That sends any vector to itself.
So those are a few examples of operators. I guess you've seen already the more familiar matrix
type of operators, but these show you also the range of what is possible. So this space L of v of all operators, I want to talk now about its properties.
So L of v is the space of all linear maps from v to itself. So this is a space of maps on a vector space, but it itself is also a vector space.
So the set of operators satisfies all the axioms of a vector space. It contains a 0 operator. That's this one right here. It's closed under a linear combination. If I add together two linear operators, I get another linear operator. It's closed under a scalar multiplication.
If I multiply a linear operator by a scalar, I get another linear operator, et cetera. And so everything we can do on a vector space, like finding a basis and so on, we can do for the space of linear operators. However, in addition to having the vector space
structure, it has an additional structure, which is multiplication. And here, we're finally making use of the fact
that we're talking about linear maps from a vector space to itself. If we're talking about maps from v to w, we couldn't necessarily multiply them by other maps from v to w. We could only multiply them by maps from w to something else.
Just like how if you're multiplying rectangular matrices, the multiplication is not always defined if the dimensions don't match up. But since these operators are like square matrices, multiplication is always defined. And this can be used to prove many nice things about them.
So this type of structure, being a vector space with multiplication, makes it in many ways like a field, like the real numbers or complexes, but without all of their properties. So the properties that the multiplication does have,
first, is that it's associative. So let's see what this looks like. So if we have a times bc is equal to ab times c.
And the way we can check this is just
by verifying the action of this on any vector. So an operator is defined by its action on all of the vectors in a vector space. And so the definition of ab can
be thought of as asking, how does it act on all the possible vectors? And this is defined just in terms of the action of a and b as you first apply b
and then you apply a. So this can be thought of as the definition of how to multiply operators. And then from this, you can easily check the associativity property, that in both cases, however you write it out, you obtain a of b of c of v. I'm writing out
all the parentheses just to emphasize this is c acting on v, and then b acting on c of v, and then a acting on all of this. The fact that this is equal, that this is the same,
no matter how ab and c are grouped, is again part of what lets us justify this right here, where we just don't use parentheses when we have operators acting. So yes, we have the associative property. Another property of multiplication
that operators satisfy is the existence of an identity. That's just the identity operator here, which for any vector space can always be defined. But there are other properties of multiplication that it doesn't have.
So inverses are not always defined. They sometimes are. You can't say that a matrix is never invertible. But for things like the reals and the complexes, every non-zero element has an inverse. And for matrices, that's not true.
And another property, a more interesting one that these lack, is that the multiplication is not commutative.
So this is something that you've seen for matrices. If you multiply two matrices, the order matters. And so it's not surprising that the same is true for operators. And just to give a quick example of that,
let's look at this example one here with polynomials. And let's consider s times t acting on the monomial x to the n.
So t is differentiation. So it sends this to n times x to the n minus 1. So we get s times n x to the n minus 1. Linearity means we can move the n past the s. s acting here multiplies by x.
And so we get n times x to the n. Whereas if we did the other order, we get t times s acting on x to the n, which is x to the n plus 1. When you differentiate this, you get n plus 1 times x to the n.
So these numbers are different, meaning that s and t do not commute. And it's kind of cute to measure, actually, to what extent do they not commute.
This is done by the commutator. And what these equations say is that if the commutator acts on x to the n, then you get n plus 1 times x to the n minus n times x to the n, which is just x to the n.
And we can write this another way as identity times x to the n. And since this is true for any choice of n, it's true for what turns out to be
a basis for the space of polynomials. So 1, x, x squared, x cubed, et cetera, these span the space of polynomials. So if you know what an operator does on all of the x to the n's, you know what it does on all the polynomials.
And so this means, actually, that the commutator of these two is the identity. And so the significance of this is, well,
I won't dwell on the physical significance of this. But it's related to what you've seen for position and momentum. And essentially, the fact that these don't commute is actually an important feature of the theory.
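A quick symbolic check of that commutator computation (just a verification sketch, using sympy for the differentiation):

```python
import sympy as sp

x = sp.symbols('x')
T = lambda p: sp.diff(p, x)   # T = d/dx
S = lambda p: x * p           # S = multiplication by x

# The commutator TS - ST acting on x^n should give back x^n.
for n in range(5):
    p = x**n
    print(sp.expand(T(S(p)) - S(T(p))))   # prints x**n (1, x, x**2, ...) each time
```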
OK, so these are some of the key properties of the space of operators. I want to also now tell you about some of the key properties of individual operators. And basically, if you're given an operator and want to know the gross features of it,
what should you look at? So one of these things is the null space of an operator. So this is the set of all v of all vectors
that are killed by the operator, that are sent to 0. This will always at least include the vector 0. But in some cases, it will be a lot bigger. So for the identity operator, the null space is only the vector 0. The only thing that gets sent to 0 is 0 itself. Whereas for the 0 operator, everything gets sent to 0.
So the null space is the entire vector space. For left shift, the null space is only 0 itself. Sorry, for right shift, the null space is only 0 itself. And what about for left shift, what's the null space here?
Yeah? Anything where a string of 0s follows the first entry. Right. Any sequence where the first number is arbitrary, but everything after the first number is 0. And so from all these examples, you might guess that this is a linear subspace, because in every case, it's been a vector space.
And in fact, this is correct. So this is a subspace of v, because if there's a vector that gets sent to 0, any multiple of it also will be sent to 0. And if there's two vectors that get sent to 0,
their sum will also be sent to 0. So the fact that it's a linear subspace can be a helpful way of understanding this set. And it's related to the properties of t as a function.
So for a function, we often want to know whether it's one to one, or injective, or whether it's onto, or surjective. And you can check that if t is injective, meaning that if u is not equal to v, then t of u
is not equal to t of v. So this property that t maps distinct vectors to distinct vectors turns out to be
equivalent to the null space being only the 0 vector. So why is that? So this statement here, that whenever u is not equal to v, t of u is not equal to t of v, another way
to write that is, whenever u is not equal to v, t of u minus v is not equal to 0. And if you would look at this statement a little more carefully, you'll realize that all we cared about
on both sides was u minus v. Here, obviously, we care about u minus v. Here, we only care if u is not equal to v. So that's the same as saying if u minus v is non-zero, then t of u minus v is non-zero.
And this, in turn, is equivalent to saying that the null space of t is only 0. In other words, the set of vectors that get sent to 0
consists only of the 0 vector itself. So the null space is, for linear operators, how we can characterize whether they're one-to-one, whether they destroy any information.
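To make the null-space-versus-injectivity connection concrete, here is a small sketch using the derivative operator written as a matrix on polynomials of degree at most 3 (this finite-dimensional truncation is my own stand-in for illustration):

```python
import sympy as sp

# d/dx on span{1, x, x^2, x^3}; column j is the image of the j-th basis vector.
D = sp.Matrix([[0, 1, 0, 0],
               [0, 0, 2, 0],
               [0, 0, 0, 3],
               [0, 0, 0, 0]])

print(D.nullspace())     # one basis vector, (1, 0, 0, 0): the constant polynomials

# Two inputs differing by a null-space vector give the same output,
# so a nontrivial null space is exactly a failure of injectivity.
a = sp.Matrix([5, 1, 2, 3])
b = a + sp.Matrix([1, 0, 0, 0])
print(D * a == D * b)    # True
```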
The other subspace that will be important that we'll use is the range of an operator.
So the range of an operator, which we can also just write as t of v, is the set of all points
that vectors in v get mapped to. So the set of all Tv for some vector v. So this, too, can be shown to be a subspace.
It takes a little more work to show it, but not very much.
If there's something in the output of t, then whatever the corresponding input is, we could have multiplied that by a scalar. And then the corresponding output also would get multiplied by a scalar. And so that, too, would be in the range. And so that means that for anything in the range,
we can multiply it by any scalar and again get something in the range. Similarly for addition. A similar argument shows that the range is closed under addition. So indeed, it's a linear subspace. Again, and since it's a linear subspace, it always contains 0. And depending on the operator, may contain a lot more.
So whereas the null space determined whether t was injective, the range determines whether t is surjective. So the range of t equals v if and only if t is surjective.
And here, this is simply the definition of being surjective. It's not really a theorem like it
was in the case of t being injective. Here, that's really what it means to be surjective is that your output is the entire space. So one important property of the range and the null space, whenever v is finite dimensional, is that the dimension of v is equal to the dimension
of the null space plus the dimension of the range.
And this is actually not trivial to prove. And I'm actually not going to prove it right now. But I want to tell you, the intuition of it is as follows. Imagine that v is some n dimensional space.
And the null space has dimension k. So that means you have input of n degrees of freedom. But t kills k of them. And so k different degrees of freedom, no matter how you vary them, have no effect on the output. They just get mapped to 0. And so what's left are n minus k degrees of freedom
that do affect the output, where if you vary them, it does change the output in some way. And those correspond to the n minus k dimensions of the range. And if you want to make that formal, you have to formalize what I was saying about what's left is n minus k. You have to talk about something like the orthogonal complement
or completing a basis, or in some way formalize that intuition. And in fact, you can go a little further and you can decompose the space. This is just dimension counting. You can even decompose the space into the null space and the complement of that and show that t is one to one on the complement
of the null space. But for now, I think this is all that we'll need for now. Any questions so far? Yeah. Why isn't it part of the range?
So, about the null space: this theorem, I guess, would be a little bit more surprising
if you realize it works not only for operators but for general linear maps. And in that case, the range is a subspace of w, because the range is about the output. And the null space is a subspace of v, which is part of the input. And so in that case, they're not even comparable.
The vectors might just have different lengths. And so it can never, like the null space and range in that case would live in totally different spaces. And in general, let me give you a very simple example.
Let's suppose that t is equal to 3, 0, minus 1, 4. So just a diagonal 4 by 4 matrix. Then the null space would be the span of e2.
That's the vector with a 1 in the second position. And the range would be the span of e1, e3, and e4.
So in fact, usually it's the opposite that happens. The null space and the range are, in this case, they're actually orthogonal subspaces. But this picture is actually a little bit deceptive
in how nice it is. So if you look at this, total space is 4, four dimensions. It divides up into one dimension that gets killed and three dimensions where the output still tells you something about the input, where there's some variation in the output.
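A numerical check of the dimension formula on this diagonal example (a sketch using numpy; the matrix rank gives the dimension of the range):

```python
import numpy as np

T = np.diag([3.0, 0.0, -1.0, 4.0])

n = T.shape[0]                       # dim V       = 4
rank = np.linalg.matrix_rank(T)      # dim range T = 3 (spanned by e1, e3, e4)
nullity = n - rank                   # dim null T  = 1 (spanned by e2)

print(rank, nullity, rank + nullity == n)   # 3 1 True
```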
But this picture makes it seem, yeah, the simplicity of this picture does not always exist. A much more horrible example is this matrix, where here,
so what's the null space for this matrix?
Informally, it's everything of this form: everything with something in the first position,
0 in the second position. In other words, it's the span of e1. And what about the range?
What's that? Yeah? It's actually also the same, the span of e1. It's the same thing. So you have this intuition that some degrees of freedom
are preserved and some are killed. And here, they look totally different. And here, they look the same. So you should be a little bit nervous about trying to apply that intuition. You should be reassured that at least the theorem is still true.
At least 1 plus 1 is equal to 2. We still have that. But the null space and the range are the same thing here. And the way around that paradox, yeah.
No, it turns out that even with a change of basis, you cannot guarantee that the null space and the range will be perpendicular.
Right, good. So if you do row reduction with different row and column operations, then what you've done is use a different input and output basis. And once you unpack what's going on in terms of the basis, it turns out that you could still have strange behavior like this. What your intuition is based on is that if the matrix is
diagonal in some basis, then you don't have this trouble. But the problem is that not all matrices can be diagonalized. Yeah. Is what you're acting on in the range what results from it? Exactly, and they could even live in different spaces.
And so comparing them is dangerous. So for the degrees of freedom corresponding to the range, what you should think about are the degrees of freedom that get sent to the range. And in this case, that would be e2.
And so then you can say that e1 gets sent to 0, and e2 gets sent to the range. And now you really have decomposed the input space into two orthogonal parts. And because we're talking about a single space, the input space, it actually makes sense to break it up into these parts.
Whereas here, they look like they're the same, but really, input and output spaces you should think of as potentially different. So this is just a mild warning about reading too much into this formula, even though the rough idea of counting degrees of freedom
is still roughly accurate. So I want to say one more thing about properties of operators, which is about invertibility.
And maybe I'll leave this up for now.
So we say that a linear operator T has a left inverse S if multiplying T on the left by S will give you the identity.
And T has a right inverse S prime. You can guess what will happen here if multiplying T on the right by S prime gives you identity.
And what if T has both? Then in that case, it turns out that S and S prime have to be the same. So here's the proof. So if both exist, then S is equal to S times
identity by the definition of the identity. And we can replace identity with TS prime.
Then we can group these and cancel them and get S prime. So if a matrix has both a left and a right inverse, then it turns out that the left and right inverse
are the same. And in this case, we say that T is invertible. And we define T inverse to be S.
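Written out as a single chain, the argument above is just: S = S times identity = S(TS') = (ST)S' = identity times S' = S'.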
So one question that you often want to ask is, when do left and right inverses exist?
Actually, maybe I'll write it here. So intuitively, there should exist a left inverse when, after we've applied T, we haven't
done irreparable damage. So whatever we're left with, there's still enough information that some linear operator can restore our original vector and give us back the identity. And so that condition of not doing irreparable damage,
of not losing information, is asking essentially whether T is injective. So there exists a left inverse if and only if T is injective.
Now for a right inverse, the situation is sort of dual to this. And here, what we want, we can multiply on the right by whatever we like, but there won't be anything on the left. So after the action of T, there
won't be any further room to explore the whole vector space. So the output of T had better cover all of our possibilities if we want to be able to achieve identity by multiplying T by something on the right. So any guesses for what the condition
is for having a right inverse? Surjective, right. So there exists a right inverse if and only if T is surjective.
Technically, I haven't proved that. I've only proved one direction. My hand waving just now proved that if T's not injective, there's no way it'll have a left inverse. If it's not surjective, there's no way it'll have a right inverse. I haven't actually proved that if it is injective, there is such a left inverse, and if it is surjective,
there is such a right inverse. But those, I think, are good exercises for you to do to make sure you understand what's going on. OK, so this takes us part of the way there. In some cases, our lives become much easier. In particular, if V is finite dimensional,
it turns out that all of these are equivalent. So T is injective if and only if T is surjective,
if and only if T is invertible.
And why should it be true that T is surjective if and only if T is injective? Why should those be equivalent statements?
Yeah? Taking vectors in V to vectors in V, and so your mapping is one to one if and only if every
vector is mapped to, because then you're not leaving anything out. That's right. Failing to be injective and failing to be surjective both look like losing information. Failing to be injective means I'm sending a whole non-zero vector and all its multiples to 0. That's a degree of freedom lost.
Failing to be surjective means once I look at all the degrees of freedom I reach, I haven't reached everything. So they intuitively look the same. So that's the right intuition. There's a proof, actually, that makes use of something on a current blackboard, though.
So from this dimension formula, you immediately get it, because if this is 0, then this is the whole vector space. And if this is non-zero, this is not the whole vector space. And this proof is sort of non-illuminating
if you don't know the proof of that thing, which I apologize for. But also, you can see immediately from that that we've used the fact that V is finite dimensional. And it turns out this equivalence breaks down if the vector space is infinite dimensional, which
is pretty weird. There's a lot of subtleties of infinite dimensional vector spaces that it's easy to overlook if you build up your intuition from matrices. So let's think of an example of an operator
on an infinite dimensional space that's surjective, but not injective. Yeah? Any guesses for such an operator? Yeah? The left shift? Yes. You'll notice I didn't erase this blackboard strategically.
Yes. The left shift operator is surjective. I can prepare any vector here I like just by putting it into the x2, x3, dot, dot, dot parts. So the range is everything, but it's not injective, because it throws away the first register. It maps things with a non-zero element in the first position,
and zeros everywhere else to 0. So this is surjective, not injective. On the other hand, if you want something that's injective and not surjective, you
don't have to look very far: the right shift is injective and not surjective. It's pretty obvious it's not surjective. There's that 0 there, which definitely means it cannot
reach every vector. And it's not too hard to see it's injective. It hasn't lost any information. It's like you're in a hotel that's infinitely long and all the rooms are full. And the person at the front desk has no problem: I'll just move everyone down one room to the right. You can take the first room.
So that policy is injective. You'll always get a room to yourself, and it's made possible by having an infinite dimensional vector space. So in infinite dimensions, we cannot say this. Instead, we can say that T is invertible if and only
if T is injective and surjective. So this statement is true in general
for infinite dimensional, whatever, vector spaces. And only in the nice special case of finite dimensions do we get this equivalence. Yeah?
Yes, the question was, do the null space and range, are they properties just of T or also of V? And definitely you also need to know V. The way I've been writing it, T is implicitly
defined in terms of V, which in turn is implicitly defined in terms of the field f. And all these things can make a difference. Yes? So if you want to be a bijection, or to be invertible. That's right. Invertible is the same as bijection.
OK, so let me now try and relate this to matrices. I've been saying that operators are like the fancy mathematicians' form of matrices.
If you're Arrested Development fans, it's like a magic trick versus an illusion. But are they different or not? Depends on your perspective. There are advantages to seeing it both ways, I think. So in any case, let me tell you how you can view an operator in a matrix form.
The way to do this, and the reason why matrices are not universally loved by mathematicians, is that I haven't specified a basis this whole time. So far, all I needed was a vector space and a linear function
between two vector spaces. Or sorry, from a vector space to itself. But if I want a matrix, I need additional structure. And mathematicians try to avoid that whenever possible. But if you're willing to take this additional structure, so if you choose a basis, v1 through vn,
it turns out you can get a simpler form of the operator that's useful to compute with. So why is that? Well, the fact that it's a basis means that any v can be written as linear combinations
of these basis elements, where a1 through an belong to the field. And since T is linear, if T acts on v,
we can rewrite it in this way. And you see that the entire action is determined by T acting on v1 through vn.
So think about if you wanted to represent an operator in a computer, you'd say, well, there's an infinite number of input vectors. And for each input vector, I have to write down the output vector. And this says, no, you don't. You only need to store on your computer what does T do to v1, what does T do to v2, et cetera.
So that's good. Now you only have to write down n vectors. And since these vectors, in turn, can be expressed in terms of the basis, you can express this just in terms of a bunch of numbers. So let's further expand Tvj in this basis.
And so there's some coefficient. So it's something times v1 plus something times v2, something times vn. And these somethings are a function of T. So I'm just going to call this T sub 1j, T sub 2j,
T sub nj. And this whole thing I can write more succinctly in this way.
And now all I need are these T sub ij. And that can completely determine for me the action of T. Because this Tv here, so Tv,
we can write as a sum over j of T times aj vj. And we can move the aj past the T. And then if we expand this out, we get that it's a sum over i from 1 to n,
sum over j from 1 to n of Tij aj vi. And so if we act on a general vector v, and we know the coefficients of v in some basis,
then we can re-express it in that basis as follows. And this output, in general, can always be written in the basis with some coefficients.
So we could always write it like this. And this formula tells you what those coefficients should be. They say if your input vector has coefficients a1 through an,
then your output vector has coefficients b1 through bn, where the b sub i are defined by this sum.
And of course, this formula is one that you've seen before. And it's often written in this more familiar form.
So this is now the familiar matrix vector multiplication. And it says that the b vector is obtained from the a vector by multiplying it
by the matrix of these Tij. And so this T is the matrix form. This is the matrix form of the operator T.
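Here is a sketch of this construction for the differentiation operator on polynomials of degree at most 3, in the basis 1, x, x squared, x cubed: column j of the matrix is just T applied to the j-th basis vector, and multiplying by the matrix reproduces the action on a general coefficient vector. (The finite-degree truncation is my own choice for illustration.)

```python
import numpy as np

def deriv(coeffs):
    """d/dx acting on the coefficient vector (a0, a1, a2, a3) of a0 + a1 x + a2 x^2 + a3 x^3."""
    out = np.zeros_like(coeffs)
    for j in range(1, len(coeffs)):
        out[j - 1] = j * coeffs[j]
    return out

n = 4
basis = np.eye(n)

# Column j of the matrix = the operator applied to the basis vector v_j.
Tmat = np.column_stack([deriv(basis[:, j]) for j in range(n)])
print(Tmat)          # rows: [0 1 0 0], [0 0 2 0], [0 0 0 3], [0 0 0 0]

a = np.array([5.0, 1.0, 2.0, 3.0])      # 5 + x + 2x^2 + 3x^3
print(np.allclose(Tmat @ a, deriv(a)))  # True: b_i = sum_j T_ij a_j
```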
And you might find this not very impressive. You say, well, look, I already knew how to multiply a matrix by a vector. But what I think is nice about this is that the usual way you learn linear algebra is someone says, a vector is a list of numbers. A matrix is a rectangle of numbers. Here are the rules for what you do with them.
If you want to put them together, you do it in this way. Here, this was not an axiom of the theory at all. We just started with linear maps from one vector space to another one. And the idea of a basis is something that you can prove has to exist.
And you can derive matrix multiplication. So matrix multiplication emerges, or sorry, matrix vector multiplication emerges as a consequence of the theory, rather than as something that you have to put in. So that, I think, is what's kind of cute about this, even if it comes back at the end to something that you had been taught before.
Any questions about that? So this is matrix vector multiplication. You can similarly derive matrix multiplication.
So if we have two operators, T and S,
and we act on a vector, v sub k, then by what I argued before, it's enough just to know how they act on the basis vectors. You don't need to know more than that. And once you do that, you can figure out how they act on any vector.
So if we just expand out what we wrote before, this is equal to T times the sum over j of S jk vj. So S vk can be re-expressed in terms of the basis
with some coefficients. And those coefficients will depend on the vector you started with, k, and on the part of the basis you're using to express it, j. Then we apply the same thing again with T. We get this is sum over i, sum over j, Tij Sjk vi.
And now, what have we done?
TS is an operator. When you act on vk, it spits out something that's a linear combination of all of the basis states, v sub i. And the coefficient of v sub i is this part
in the parentheses. And so this is the matrix element of TS. So the ik matrix element of TS is the sum over j of Tij Sjk.
And so just like we derived matrix-vector multiplication, here we can derive matrix-matrix multiplication. And so what was originally just sort of an axiom of the theory
is now kind of the only possible way it could be, if you want to define operator multiplication as first one operator acts, then the other operator acts. So this, I think,
justifies why you can think of matrices as a kind of a faithful representation of operators. And once you've chosen a basis, the square full of numbers becomes equivalent to the abstract map
between vector spaces. And they're so equivalent that I'm just going to write things like equal signs. Like I'll write identity equals a bunch of ones down the diagonal, and not worry about the fact that technically this is an operator and this is a matrix. And similarly, the zero matrix equals a matrix full of zeros.
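And here is a quick sketch of the matrix-multiplication claim, reusing the derivative operator together with a multiplication-by-x operator truncated to degree at most 3 (the truncation, which drops the x^4 term, is my own finite-dimensional stand-in): the matrix built directly from the composed operator agrees with the product of the two matrices.

```python
import numpy as np

def matrix_of(op, n=4):
    """Column j = the operator applied to the j-th basis vector."""
    basis = np.eye(n)
    return np.column_stack([op(basis[:, j]) for j in range(n)])

def deriv(c):                 # T = d/dx on coefficient vectors (a0, a1, a2, a3)
    out = np.zeros_like(c)
    out[:-1] = np.arange(1, len(c)) * c[1:]
    return out

def mul_x(c):                 # S = multiplication by x, truncated at degree 3
    out = np.zeros_like(c)
    out[1:] = c[:-1]
    return out

T, S = matrix_of(deriv), matrix_of(mul_x)
TS = matrix_of(lambda c: deriv(mul_x(c)))   # matrix of the composed operator T S

print(np.allclose(TS, T @ S))               # True: (TS)_ik = sum_j T_ij S_jk
```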
Technically, we should write, if you want to express the basis dependence, you can write things like T parentheses.
Sorry, let me write it like this.
If you really want to be very explicit about the basis, you could use this to refer to the matrix, just to really emphasize that the matrix depends not only on the operator, but also on your choice of basis. But we'll almost never bother to do this. We'll usually just sort of say it in words what the basis is.
So matrices are an important calculational tool. And we ultimately want to compute numbers of physical quantities, so we cannot always spend our lives
in abstract vector spaces. But the basis dependence is an unfortunate thing. A basis is like a choice of coordinate system. And you really don't want your physics to depend on it. And you don't want quantities you compute to be dependent on it.
So we're interested in quantities that are basis independent. And in fact, that's a big point of the whole operator picture: because the quantities we want are ultimately basis independent, it's nice to have language that is itself basis independent, terminology and theorems that do not refer to a basis.
I'll mention a few basis independent quantities. And I won't say too much more about them because you will prove properties of them on your p set.
But one of them is the trace, and another one is the determinant. And when you first look at them, OK, you can check that each one is basis independent. And it really looks kind of mysterious. I mean, who pulled these out of a hat?
They look totally different. They don't look remotely related to each other. And are these all there is? Are there many more? And it turns out that at least for matrices with eigenvalues, these can be seen as members of a much larger family.
And the reason is that the trace turns out to be the sum of all of the eigenvalues. And the determinant turns out to be the product of all of the eigenvalues. And in general, we'll see in a minute that basis independent things, actually not in a minute, in a future lecture, that basis independent things are
functions of eigenvalues and furthermore that don't care about the ordering of the eigenvalues. So they're symmetric functions of eigenvalues. And then it starts to make a little bit more sense. Because if you talk about symmetric polynomials, those are two of the most important ones, where you just add up all the things and where you multiply all the things. And then if you have this perspective
of symmetric polynomial of the eigenvalue, then you can cook up other basis independent quantities. So this is actually not the approach you should take on the p-set. The p-set asks you to prove more directly that the trace is basis independent. But the sort of framework that these fit into is symmetric functions of eigenvalues.
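A numerical sketch of both statements: the trace and determinant are unchanged under a change of basis T goes to P inverse T P, and they equal the sum and the product of the eigenvalues (the random matrices here are purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
T = rng.normal(size=(4, 4))
P = rng.normal(size=(4, 4))            # a generic random P is invertible
Tprime = np.linalg.inv(P) @ T @ P      # the same operator in a different basis

print(np.isclose(np.trace(T), np.trace(Tprime)))            # True
print(np.isclose(np.linalg.det(T), np.linalg.det(Tprime)))  # True

eigs = np.linalg.eigvals(T)
print(np.isclose(np.trace(T), eigs.sum()))        # True: trace = sum of eigenvalues
print(np.isclose(np.linalg.det(T), eigs.prod()))  # True: det = product of eigenvalues
```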
So I want to say a little bit about eigenvalues. Any questions about matrices before I do?
So eigenvalues, I guess these are basis independent quantities.
Another important basis independent quantity or property of a matrix is its eigenvalue eigenvector structure. So the place where eigenvectors come from
is by considering a slightly more general thing, which is the idea of an invariant subspace. So we say that U is a T invariant subspace
if T of U, this is an operator acting on an entire subspace. So what do I mean by that? I mean the set of all Tu for vectors u in the subspace. If T of U is contained in U. So I
take a vector in this subspace, act on it with T, and then I'm still in the subspace, no matter which vector I had. So some examples that always work.
The zero subspace is invariant. T always maps it to itself. And the entire space V. T is a linear operator on V. So by definition, it maps V to itself.
These are called trivial examples. And when usually people talk about non-trivial invariant subspaces, they mean not one of these two. The particular type that we'll be interested in are one-dimensional ones.
So this corresponds to a direction that T fixes. So U, this vector space now can be written just as the span of a single vector U.
And U being T invariant is equivalent to TU being in U, because they're just a single vector.
So all I have to do is get that single vector right, and I'll get the whole subspace right. And that, in turn, is equivalent to TU being some multiple of U.
And this equation you've seen before. This is the familiar eigenvector equation. A very important equation might normally be named after a mathematician, but this one is so important that two of the pieces of it have their own special names.
So these are called, lambda is called an eigenvalue, and U is called an eigenvector.
And more or less, it's true that all of the lambdas solving this are called eigenvalues, and all the solutions u are called eigenvectors. There's one exception, which is that there's one kind of trivial solution to this equation, which
is when U is 0: then this equation is always true. And that's not very interesting. It is true for all values of lambda. And so that doesn't count as being an eigenvalue. And you can tell it doesn't correspond to a 1D invariant subspace. It corresponds to a zero dimensional subspace,
which is the trivial case. So we say that lambda is an eigenvalue of T
if T U equals lambda U for some non-zero vector U. So the non-zero is crucial.
And then the spectrum of T is the collection of all eigenvalues.
So there's something a little bit asymmetric about this, which is we still say the zero vector is an eigenvector with all the various eigenvalues.
But we had to put this here, or everything would be an eigenvalue, and it wouldn't be very interesting. So also I want to say this term spectrum,
you'll see in other contexts, you'll see spectral theory, or spectral this or that. And that means essentially making use of the eigenvalues. So people talk about partitioning a graph using eigenvalues of the associated matrix. That's called spectral partitioning. And so throughout math, this term is used a lot.
So I have only about three minutes left. So I think I will not finish the eigenvalue discussion, but will just show you a few examples
of how it's not always as nice as you might expect. So one example that I'll consider
is this: the vector space will be the reals, 3D real space. And the operator T will be rotation about the z-axis by some small angle. Let's call it a theta rotation about the z-axis.
Turns out if you write this in matrix form, it looks like this. Cosine theta minus sine theta 0, sine theta cosine theta 0,
0, 0, 1. That 1 is because it leaves the z-axis alone. And then x and y get rotated. You can tell if theta is 0, it does nothing. So that's reassuring. And then if theta does a little bit, then it starts mixing the x and y components.
OK. So that is the rotation matrix. So can you say what an eigenvalue is of this matrix? 1. Good. And what's the eigenvector? The z-basis vector. The z-basis vector, right. So it fixes the z-basis vector.
So this is an eigenvector with eigenvalue 1. Does it have any other eigenvectors? Yes. Yeah. If you are talking about complex numbers, then yes. So it has complex eigenvalues. But if we're talking about a real vector space,
then it doesn't. And so this just has one eigenvalue and one eigenvector. And if we were to get rid of the third dimension, so if we just had t, and let's be even simpler. Let's just take theta to be pi over 2.
So let's just take a 90 degree rotation in the plane. Now t has no eigenvalues. There are no vectors other than 0 that it sends to itself.
And so this is a slightly unfortunate note to end the lecture on. You think, well, these eigenvalues are great, but maybe they exist, maybe they don't. And you'll see next time, part of the reason why we use complex numbers, even though it
looks like real space isn't complex, is because any polynomial can be completely factored in complex numbers, and every matrix has a complex eigenvalue. OK, I'll stop here.
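To close the loop numerically (a sketch, not part of the lecture itself): the 3D rotation has the single real eigenvalue 1, while over the complex numbers it also has e to the plus or minus i theta; the 90 degree rotation in the plane has only the complex eigenvalues plus or minus i, and no real ones.

```python
import numpy as np

theta = 0.3
Rz = np.array([[np.cos(theta), -np.sin(theta), 0.0],
               [np.sin(theta),  np.cos(theta), 0.0],
               [0.0,            0.0,           1.0]])
print(np.linalg.eigvals(Rz))   # exp(+i*theta), exp(-i*theta), 1 (in some order)

R90 = np.array([[0.0, -1.0],
                [1.0,  0.0]])  # rotation by 90 degrees in the plane
print(np.linalg.eigvals(R90))  # +1j, -1j: no real eigenvalues
```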