Expected Values
Formal Metadata
Title: Expected Values
Part Number: 5
Number of Parts: 16
License: CC Attribution - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.
Identifiers: 10.5446/12890 (DOI)
Transcript: English (auto-generated)
00:05
Is the course hard, easy, or just about right? I heard "hard", and a lot of giggles of affirmation, I guess. I don't know.
00:21
Yeah, I agree. It's a little different from calculus, I guess, from a different point of view. But I guess that means you just have to work a little bit harder, but you can do it.
00:43
All right. We do have class Friday. I looked at the academic calendar. It said Independence Day is observed July 4th, which means no observance on July 5th.
01:05
And because it's a holiday, I'll give you until Monday to hand in the assignment. Otherwise, I am in danger of being a Grinch. What's that? You're disappointed? You can hand it in Friday? Good. That's good. Okay.
01:20
You have the option of handing it in Friday. But you can wait until Monday. Okay.
01:45
We're still in Chapter 3 discussing jointly distributed random variables. Again, the basic general idea is that a lot of times random variables come in pairs, like your height and your foot size.
02:01
Does anybody here have a size 14 shoe or bigger? Yeah, size 14. So you have to buy online, correct? Mostly. Mostly, yeah. Do you know how I know that? My teenage son has size 14, sometimes 15 shoes.
02:25
We have to buy them online. So I'm trying to figure out what percentage of the population has shoe size 14 or higher. And it seems pretty low. But shoe size would be correlated to height.
02:43
Like if you take an individual at random, you might capture a human being, measure his foot size and measure his height, right? These are two random variables, the height and the shoe size. And you'd expect there's some relationship between them. A shorter individual you might expect would have shorter feet.
03:04
And a taller individual you might expect to have taller feet. And that's what's going on with jointly distributed random variables. You try to capture somehow in the joint distribution function the relationship between the two random variables. So for discrete random variables, this just means that there's some probability mass function on pairs.
04:06
And the probability that the random variables X and Y, which are somehow correlated, are less than or equal to little x and little y respectively is this sum. This would be a sum over an at most countable number of points.
04:21
Usually just a finite number of points. p of u, v. Or in the continuous case, this, which is called the joint distribution function, is given by an integral of some density.
05:08
We'll label this thing f sub capital X, capital Y of X, Y. Similarly here like that.
05:25
Okay, so let's continue looking at examples.
05:46
I'm sorry, what? In that order? I guess not. Right. This is the integral in this variable so I should put it first.
06:04
Thanks.
06:31
Let's try to make sense of the pair X, Y where X, Y is uniformly distributed on the disk of radius r, centered at the origin in the plane.
06:52
Well let's think back to what it would mean to be uniformly distributed on an interval on the line.
07:22
What does it mean for a random variable to be uniformly distributed on an interval a to b? It means something about the density of this random variable. It means that the density is equal to one over the length of the interval if little z happens to be in that interval and zero if not.
08:10
In other words, the density is constant on the interval and what constant is it? Well the constant is determined by the requirement that the integral of the density be what number?
08:22
One. The integral of the density over the whole line has to be one. If we integrate this function over the whole line what do we get? Well, over most of the line there's no contribution. From minus infinity to a there's no contribution. From b to infinity there's no contribution. We just get a contribution to the integral of that on this interval.
08:45
But we'd be integrating a constant over that interval. What do we get then? We get the constant times the length of the interval, and that has to be one. So let's go back up here. If something's going to be uniformly distributed on a set, what should its density be?
09:01
It should be a constant on that set and zero off it. And how do you determine the constant? The integral has to be one. So this would mean that the joint density for x, y would be one over pi r squared
09:47
if x squared plus y squared is less than or equal to r squared and zero if not.
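In symbols, the two densities just described are:

```latex
f_Z(z) = \begin{cases} \dfrac{1}{b-a}, & a \le z \le b,\\ 0, & \text{otherwise,} \end{cases}
\qquad
f_{X,Y}(x,y) = \begin{cases} \dfrac{1}{\pi r^2}, & x^2 + y^2 \le r^2,\\ 0, & \text{otherwise.} \end{cases}
```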
10:25
So here's the circle of radius r centered at the origin. Now let's look at a point with coordinates x, y. And let's look at the joint distribution function of this pair at the point little x, little y.
10:58
It's defined to be that.
11:05
And in terms of the formula I just erased it's the integral of the density from minus
11:26
infinity to x and from minus infinity to y, which means over this region of the plane. And so this would be the integral; it's a little bit complicated, but not too bad.
12:06
I don't think I'll do it for arbitrary x and y. Well let's consider trying to find out what is the distribution of x and also to discover what is the distribution of y using this formula.
12:25
So we would get the integral, actually we'd integrate one over pi r squared on this region, right? That's what that integral would be. One over pi r squared on the shaded region.
12:41
Why don't I shade this area over here? Yeah, the density is zero over here, right? Look over there, the density is zero outside of the disk. So that integral would be the integral of one over pi r squared on this set. Okay, let's try to find the probability that capital X is less than or equal to something.
13:21
What would that equal? Or can we get that, can we get this from that other information? So we do have this: probability capital X is less than or equal to little x, capital Y is less than or equal to little y.
13:43
Is this integral? Can we get rid of this restriction somehow? How? Go ahead. You'd say it's P of X less than or equal to x given Y less than or equal to y? Maybe. Times P of Y less than or equal to y.
14:02
Well, look at the picture for a minute and tell me when does this no longer become a restriction? Pardon? Yes, in other words, take this value to be what?
14:20
Infinity. If we take this to be infinity, what region does that shaded region become? I take y to go to infinity, that means it goes up here, right? And then I'm looking at everything to the left of this line and that would be what shaded region?
14:40
It would be all of this. So actually all I need is y to be bigger than r. But let's say for safety's sake, we take y to infinity. Then we're looking at everything to the left of x, right? This would be x.
15:02
So let's take y to be infinity. In other words, we integrate to infinity here and then we get the distribution function for capital X.
15:33
Now I'm not making a very good distinction between capital letters and little ones here, okay?
15:44
This would work whenever you have jointly distributed random variables. If you want to find the distribution of x and you know the joint distribution of x and y, just integrate the y variable from minus infinity to infinity. That gives you the distribution function for x. How would you get the distribution function for y?
16:05
Integrate, yeah, take x, little x to infinity. In other words, integrate this in x from minus infinity to infinity. So the distribution function of y would be gotten by this, okay?
16:29
That's just taking x to infinity here, which means you integrate to infinity, okay? So this says if you know the joint distribution of x and y, you can retrieve the single distributions
16:46
and we give those a name. We call those marginal distributions. So if you know the joint distribution of x and y, you can retrieve the individual distributions of x and y and those are called the marginal distributions. You just integrate out the other variable.
17:05
So here, we would integrate minus infinity to x, integral minus infinity to infinity, f sub x, y, little u, little v, dv, du.
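Written out, the marginal distribution function being described is:

```latex
F_X(x) = \lim_{y \to \infty} F_{X,Y}(x,y)
       = \int_{-\infty}^{x} \int_{-\infty}^{\infty} f_{X,Y}(u,v)\, dv\, du .
```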
17:35
And I think we can say kind of what that is.
18:02
It's zero if x is less than minus r. That is if I move this line over to the left of that point, what are the coordinates of this point?
18:23
Minus r, zero. It's zero if I move x over here and I have to look at the probability of being less than that. It would be zero because what's the value of the density over here? Zero, okay? But then once I get above minus r, I start picking up stuff.
18:41
It looks a little different to the left of zero than to the right. What do I get when x is to the left of zero but bigger than minus r?
19:02
Those x's right here. What would the probability of capital X being less than or equal to this value be? It'd be this area divided by pi r squared, okay?
19:21
It's that area divided by pi r squared. Are you able to find that area? Should we compute it or should I leave it to you? I know I can do it. Would you like me to do it? Okay, so let's try to do that.
19:41
I can do it easily when x is zero. What's the probability that capital X is less than or equal to zero? Well, it's the proportion of the area that this occupies in the whole thing.
20:01
Half, right? Okay, so we just need to compute that area. Let me turn it on its side. Here's a circle of radius r again.
20:20
And we want to find the area above a certain height, so let's call this height s. What are the coordinates of this point here and of this point over here if this height is s?
20:53
Well, that's r. This is s. And so this point over here has what coordinate?
21:00
Square root of r squared minus s squared, zero. And this point would have coordinates minus square root of r squared minus s squared and zero.
21:26
And what do we want to do? We want to find the area. What's the area of this rectangular region then?
21:50
S times twice this, right?
22:01
It's just the height times the base of that rectangle. And how can you get the area under the curve? It's just the integral of this function from here to here, right?
22:38
The area under a graph is the integral of the function, right?
22:43
So that'd be the integral from minus square root of r squared minus s squared to plus square root of r squared minus s squared. And what's that function? This is the circle of radius r centered at the origin, so x squared plus y squared is r squared,
23:08
so y is square root of r squared minus x squared, right?
23:24
And how do you do that integral? Make a substitution: x equals r times what? What would be a good substitution here?
23:43
You want to get rid of the square root, so you want to make a substitution so the thing inside is a square. Cosine? That's a good choice. And then dx would be minus r sine theta d theta.
24:09
And so this would be, before I make the change, let me do this.
24:35
And then it becomes twice.
24:42
With this substitution, the square root of r squared minus x squared becomes r sine theta, right? Put r squared cosine squared there, factor out the r, you get this, and then you have one minus cosine squared.
25:02
One minus cosine squared is sine squared. Square root of sine squared is sine. So I get sine here with an r. Then I get minus r, so I get minus r squared, another sine, sine squared theta, d theta.
25:28
Now, x went from here to here, so what does theta do?
25:45
When x is zero, what does theta have to be? So the theta in this picture must be this angle, right? Pi over two.
26:12
And when x is zero, yeah. No, I'm sorry, it's not that.
26:22
It's this. When x is zero, theta is pi over two, and it comes down to whatever this angle is. What would that angle be? Well, the ratio of this side to that side is the cosine, right?
26:44
So cosine of theta, when you're here, this angle would have cosine theta equals s over r, so theta is arc cosine s over r.
27:00
So we go from zero to arc cosine s over r. Okay, how do you integrate sine squared? Yeah. Oh, is that sine? Sorry. I'm sorry. Yeah, you're right.
27:27
Sine. Okay, so how do you integrate sine squared? Yeah, what's the identity we need?
27:48
Is it one half? Minus or plus? Plus. I think it's minus. You can check by putting in theta equals zero. You have to get zero here.
28:00
What do you get here if theta is zero? There you get one, so one minus that would be zero, okay? So we put in one minus cosine two theta there, and so the half and the two cancel.
28:41
This gives the theta, and then integral of cosine is sine, but we get a one half.
29:08
Does that look right? And so this becomes minus r squared times, first we evaluate the theta between the top limit and the bottom.
29:26
That gives arc sine, s over r, and nothing for zero. And then we have sine of twice arc sine, and then plus, what, when theta is zero?
29:59
When theta is zero, sine of two theta is zero, okay?
30:09
Now, if we knew, if this were arc sine in here, not twice arc sine, you could put s over r, but it's twice, so what do you do? You have to use an angle addition formula.
30:32
Well, we need, what's sine of a plus b?
30:55
Is this right? Yes or no? Okay, I always have to check this one.
31:20
So the real part here is cosine a cosine b minus sine a sine b, that's the cosine. The imaginary part gives sine of a plus b, and that would be, I think it's sine a cosine b plus cosine a sine of b, okay?
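In symbols, the check being done with the complex exponential:

```latex
e^{i(a+b)} = e^{ia} e^{ib} = (\cos a + i \sin a)(\cos b + i \sin b)
\;\Longrightarrow\;
\begin{aligned}
\cos(a+b) &= \cos a \cos b - \sin a \sin b,\\
\sin(a+b) &= \sin a \cos b + \cos a \sin b,
\end{aligned}
```

and setting a = b gives sin 2a = 2 sin a cos a.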
31:51
Alright, so we put, then we get sine of two a is sine of a cosine of a.
32:17
Okay, so we get sine of this and then cosine of that. When we take sine of that, we get s over r, and we take the cosine of that, and the final step is, pardon?
32:40
Right, and that gets rid of the half, and I should have two here, right?
32:52
Because we get the same term twice, thank you. And now what's the cosine of the arc sine of s over r?
33:04
Remember how to do those?
33:22
We just need the length of this side over r, right? What's the length of this side? Square root of r squared minus s squared. So the cosine of this angle is this divided by that. And so we get minus r squared arc sine s over r minus s over r times square root of r squared minus s squared over r.
33:55
And that's where, oh, but then we have to do what? That's this area, the blue, now we just subtract the green area.
34:05
So the final answer is, for x less than zero and bigger than minus r, that minus this area, which was two s times the square root of r squared minus s squared.
34:56
That's the area of the green, remember?
35:05
And this thing, this is the same as this thing, right? Well, I mean, think about canceling these r's. So I get minus s times the square root of r squared minus s squared, minus r squared arc sine of s over r.
35:28
And now there's been a sign mistake made somewhere. I should have a positive number here. So there's a minus sign that should be a plus sign somewhere here.
35:42
But I don't want to go looking for it. Oh yeah, maybe you know where it is. I have no idea where it is. How are we compensating for the little areas on the sides? Oh, they don't come into play because I'm integrating from here to here. I'm only interested in the area from that part to that part.
36:02
Okay, so we don't have to. Yeah, see I only integrated from this point to this point. Oh, okay. Yeah, okay. Okay, so I'm sorry, there's some sign error here somewhere. I should be getting a positive number. But I'm not sure it's educational to go through the calculus and show you where.
36:25
But I guess the, maybe you saw more calculus than you wanted to. You might want to remember how to integrate sine squared. And you might want to remember how to deal with cosines of arc sines and things like that. But anyway, this is how you would compute the marginal density.
36:44
This is only for x less than zero. For x bigger than zero, it gets to be a little bit different. Because when x is bigger than zero, then you're starting to look at areas like that. But then you could just actually subtract, take one minus something like that, which we've just done.
37:02
Okay, so if you've got the right answer here, what we've just done, or attempted to do, is find... Oh, I'm sorry, there's one last thing that I didn't do here.
37:23
I have to divide by pi r squared. It's supposed to be the relative area. And same thing here. So we've just found the area of this.
37:41
Can you use that to find areas like this? What we've just done is find the unshaded area, right? So this area would be one minus that one.
38:01
And we just did a computation to get what that's like. So I'm at a loss about where the minus sign came in because I started out integrating a non-negative quantity. And I got an answer. And somewhere in here, I introduced a minus sign that shouldn't have been introduced.
38:22
Or I didn't get rid of one somehow. Okay, I'll look through the notes and maybe put it up on the web. You're correct. Yeah. Would it have been tractable to compute the marginal probability on essentially the origin, sort of along the x-axis?
38:42
Compute it sort of with a simpler number, then work backwards? Would that work? So let me see if I understand. Rather than subtracting all that green area, for example,
39:03
You mean compute that probability? Okay, that's a half. Well, yes. Can't you find the marginal probability that way? Well, the marginal probability would be the black shaded area divided by pi r squared.
39:26
So that, I think, requires finding an area under a graph. So, somehow, at some step here, probably it was here.
39:54
I wonder, I suspect that the minus sign came in here and probably shouldn't have.
40:05
But I don't want to spend more time looking for minus signs when I get to probability. So, yeah, it's probably there.
40:35
It might have been that these limits of integration should have been reversed and that would have gotten rid of the minus sign.
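Not from the lecture: a quick numerical check (assuming SciPy is available, with arbitrary values of r and s) confirms that reversing those limits fixes the sign. The area of the piece of the disk above height s is positive and equals r squared arc cosine of s over r, minus s times the square root of r squared minus s squared.

```python
import math
from scipy.integrate import quad

r, s = 2.0, 0.7
a = math.sqrt(r**2 - s**2)  # half-width of the chord at height s

# Area between the chord at height s and the upper arc of the circle:
# integrate sqrt(r^2 - x^2) - s over [-a, a].
numeric, _ = quad(lambda x: math.sqrt(r**2 - x**2) - s, -a, a)

# Closed form for that circular segment, with the signs sorted out.
closed = r**2 * math.acos(s / r) - s * a

print(numeric, closed)            # both ~3.5414, and positive as expected
print(closed / (math.pi * r**2))  # divide by pi r^2 to get the probability
```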
40:51
Okay, so we can retrieve the marginal distribution functions. Here is the way you get the marginal, or the distribution function, for x.
42:58
In the discrete case, you take the joint PMF, probability mass function,
43:03
add up over all values of the second variable, but in the first variable only up to x. Or, in the continuous case, you integrate out the second variable, which is the y one, and then in the first variable integrate up to x.
43:22
Then you reverse this for F sub capital Y. This can also give the marginal probability mass function of x or y in the discrete case,
44:16
or the marginal densities of x or y in the continuous case.
44:39
Here, in the discrete case, the marginal PMF for capital X is just the sum over all v of p of x, v.
45:01
This is the joint PMF. The PMF for y is the sum over all u of p of u, y, where this is the joint PMF for x and y. In the continuous case, the density for the first random variable is gotten by integrating
45:40
the density for the joint variables with respect to the second coordinate, fixing the first. The density for y is gotten by integrating out the x variable.
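In symbols, the marginal formulas just stated:

```latex
p_X(x) = \sum_{v} p_{X,Y}(x,v), \qquad p_Y(y) = \sum_{u} p_{X,Y}(u,y) \quad \text{(discrete case)},
\\[6pt]
f_X(x) = \int_{-\infty}^{\infty} f_{X,Y}(x,v)\,dv, \qquad
f_Y(y) = \int_{-\infty}^{\infty} f_{X,Y}(u,y)\,du \quad \text{(continuous case)}.
```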
46:14
So, let's look at an example.
46:49
Suppose we have a continuous pair of random variables, and their joint density is given by this.
47:13
If x is between 0 and y, this is the value of the density. In all other cases, the density is 0.
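The board isn't captured in the transcript, but from the computations that follow, the joint density must be:

```latex
f_{X,Y}(x,y) = \begin{cases} \lambda^2 e^{-\lambda y}, & 0 \le x \le y,\\ 0, & \text{otherwise.} \end{cases}
```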
47:32
Let's find the marginal densities.
48:16
So, let's do, which one do we want to do first, x or y?
48:20
X, okay. So, we do this.
48:47
I guess we could start by saying that the marginal density is 0 if the variable is negative.
49:02
Why is that? Well, this is still the right formula, but what's the value of this function when x is negative? According to this, it's 0. So, let's do the computation here when x is positive, because we know what the answer is.
49:22
When x is negative, it's 0. Now, we should ask, when is this density different from 0? And the question is, in the v variable, x is fixed here. x is fixed. It's a positive number. So, for what values of v is f of x v different from 0?
49:44
Well, v has to be bigger or equal to x. So, what should the lower limit of integration be here? x, because for values of v less than x, this is 0. So, we integrate here from x to infinity, f of x v.
50:04
And then, on that range, when v is bigger than x, that's the value of the density. So, I put a lambda squared here, e to the minus lambda v dv, okay?
50:25
And this is easy to integrate. The antiderivative is minus 1 over lambda, e to the minus lambda v, isn't it? So, I get a minus lambda e to the minus lambda v, evaluated between x and infinity.
50:41
You can check that by differentiating this and getting back to here. And the value of this at infinity is, well, we're assuming lambda is positive. The value of this at infinity is 0, because of the positive lambda, but there's a minus sign in front of it.
51:04
And the value of that at x, we subtract the value of x, we get lambda e to the minus lambda x, for x bigger than 0. So, what distribution does capital X have? Exponential with parameter lambda.
51:20
So, the marginal distribution of capital X is exponential with parameter lambda. Now, let's get the marginal distribution for y, and we do that by integrating out the other variable.
51:58
So, again, this is 0 if y is negative, do you agree?
52:08
Because you still have this formula that it's the integral of the joint density, and the joint density would be 0 if y is negative. And if y is positive, we freeze the y variable and integrate out the x variable.
52:37
And now we should ask ourselves, where is this different from 0?
52:40
Remember, y is fixed, so the question is: for what values of u is this different from 0? y is a fixed positive number, and that is different from 0, in this case, when the other variable is between 0 and y. So, this integral should be from where to where? 0 to y.
53:07
And what's the value of the density in that case? Lambda squared, e to the minus lambda, what? y. y.
53:24
This is f of u, y. f of u, y is lambda squared, e to the minus lambda, y. Right?
53:41
And we're integrating in v. Whoops, u. We integrate in u. This doesn't depend on u, so I can pull it out. So, what do we get? We get a lambda squared. We get the e to the minus lambda y.
54:00
Integral from 0 to y du is? y. And this, I believe, is the gamma density with alpha equal to 2 and lambda. This is a gamma density.
54:32
Okay, so that's how you retrieve the individual ones. Let's take a break. Formulas for the marginal densities.
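Not part of the lecture: a short symbolic check of both marginals, assuming SymPy is available.

```python
import sympy as sp

x, y, u, v, lam = sp.symbols('x y u v lam', positive=True)

# Joint density lam^2 * exp(-lam*y) on 0 <= x <= y, zero elsewhere.
# Marginal of X: integrate out the second variable, v from x to infinity.
f_X = sp.integrate(lam**2 * sp.exp(-lam * v), (v, x, sp.oo))

# Marginal of Y: integrate out the first variable, u from 0 to y.
f_Y = sp.integrate(lam**2 * sp.exp(-lam * y), (u, 0, y))

print(f_X)  # lam*exp(-lam*x): exponential with parameter lambda
print(f_Y)  # lam**2*y*exp(-lam*y): gamma with alpha = 2 and lambda
```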
54:45
You can also find something called the conditional densities in the continuous case, which
55:04
means the density of capital x if you know the value of capital y. So, you should think of this as the probability capital x is this given capital y is this. And how do you do that?
55:22
How do you- what's the probability that a- what's the probability of a given b? It's probability of a intersect b over probability of b. So, in this case, you'd interpret probability of a intersect b as, well, a would be capital x is little x. b is capital y is little y. What's the probability that capital x is little x and capital y is little y?
55:42
Well, the- it's not exactly correct, but this is what that would mean. The density- joint density of the pair divided by the density for y.
56:03
So, read this as probability capital x is little x, capital y is little y, divided by probability capital y is little y. The other way, if we condition on x, get the same thing in the numerator, but now we divide by the density for capital x at little x.
56:30
For the discrete case, it's the same, except we'd use p for probability mass function, I guess.
56:54
It's the joint probability mass function divided by the marginal one and going the other way.
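In symbols, the definitions just given:

```latex
f_{X\mid Y}(x \mid y) = \frac{f_{X,Y}(x,y)}{f_Y(y)}, \qquad
f_{Y\mid X}(y \mid x) = \frac{f_{X,Y}(x,y)}{f_X(x)},
\\[6pt]
p_{X\mid Y}(x \mid y) = \frac{p_{X,Y}(x,y)}{p_Y(y)}, \qquad
p_{Y\mid X}(y \mid x) = \frac{p_{X,Y}(x,y)}{p_X(x)}.
```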
57:06
So, let's return to the example we did just before the break.
57:25
That's the continuous case and let's compute the conditional densities. Yeah. Okay. Well, um, because I told you this is a continuous case.
57:41
But, also how else? Um, what does the variable y do here? Or what do x and y range over? Are they taking on discrete values or continuous values? In- in the discrete case, they would be taking on only a finite or a countable set of values, like maybe the natural numbers.
58:01
X and y would have to be confined to the natural numbers. Are we confining x or y to anything like a finite set or a countable set? No. They can range over the whole line. So, that means continuous case. Okay. So, that's all you know. Formally, the- what's the difference between discrete and continuous when you're finding marginals?
58:22
In the discrete case, you take a sum; in the continuous case, you integrate. Okay. And do you know what the integral sign is supposed to stand for? The letter S. Okay. So, um, let's see. Let me write down the marginal densities here.
59:15
And we had lambda squared y e to the minus lambda y when y was positive.
59:24
Okay. That's what we found. So, now let's find the conditional densities.
59:57
Okay. So, um, when will this be different from-
01:00:00
Well, let's see. First of all, the definition says it's this. And since we're dividing by this, we should make sure that little x is positive, I suppose. If x is negative, what would the conditional density of x given the value of capital Y be?
01:00:28
Zero. Little x can't take on negative values even if you know the value of capital Y. Okay. And now for here, I guess we assume that maybe x is between zero and y. Otherwise, this will be zero.
01:01:04
Why is this case true? Could x ever be bigger than y? No. Okay, so the remaining case would be this one. And the numerator here is given by that formula. Lambda squared e to the minus lambda y.
01:01:21
And the denominator is lambda e to the minus lambda x. And this is lambda e to the minus lambda y minus x.
01:01:51
So if we try to capture all this in one, that covers everything.
01:02:41
Now let's do the other conditional density. Well, this is zero if y is less than zero or y is bigger than x.
01:03:12
Maybe I'll throw in bigger than zero here. Or maybe I can capture this more easily by saying if zero less than or equal to x less than or equal to y does not hold.
01:03:33
In any case where this is not true, the conditional density is zero. Otherwise... ah, it's my dyslexia coming back.
01:04:08
I'm supposed to divide by what here? Y. Sorry. Y. F sub capital Y of y. And what's F sub capital Y of y? There it is.
01:04:21
Lambda squared y e to the minus lambda y. So the ratio is one over y, for zero less than x less than y. So one over y here.
01:04:46
And here, I have to divide by the, this is where we divide by the marginal density of x. So the numerator is lambda e to the minus lambda y. The denominator is lambda e to the minus lambda x.
01:05:02
Giving the formula I had before. Lambda squared here, sorry. Lambda e to the minus lambda y minus x. So let's try to interpret these.
01:05:23
So what's the distribution of capital X if we know capital Y is little y? Here's its density. What is that density? This is a function of little x. This is fixed.
01:05:50
So y is fixed here. If this doesn't hold, that means x is either less than zero or bigger than y. So here's a density that's constant on the interval from zero to y.
01:06:02
What constant is it? One over y. And it's zero off that interval. That's the density of what kind of random variable? Yeah, so given capital Y, capital X has a uniform distribution on the interval from zero to whatever value capital Y has. What about the distribution of capital Y given the value of capital X?
01:06:28
Well, if little x is zero, what is this? If little x is zero, this is the density for an exponential. So we would say this is the density for an exponential on the interval from x to infinity.
01:06:47
So what is the distribution of x? It's exponential.
01:07:01
So capital X is an exponentially distributed random variable. And capital Y given the value of capital X is exponential on the interval from x to infinity. So what's the interpretation here? Think of light bulbs. Capital X is the lifetime of a light bulb that has an exponential distribution with parameter lambda.
01:07:23
It burns out sometime. Suppose the moment when it burns out is little x. And you're told that. Now what happens? Well, you normally replace a light bulb with another one. And what's the lifetime of that light bulb? Well, exponential. When does the second light bulb burn out?
01:07:42
At a moment y. But what would the distribution be? Well, it's not put into the socket until moment x. And then this is the time after that. So this would be when the second light bulb, y would be the time when the second light bulb burned out. So it looks like y is the sum of two exponential random variables.
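Not from the lecture: a small simulation of the light-bulb story (NumPy assumed, lambda chosen arbitrarily) matches both conclusions, Y as a sum of two exponentials and X uniform given Y.

```python
import numpy as np

rng = np.random.default_rng(0)
lam, n = 1.5, 200_000

x = rng.exponential(1 / lam, n)      # first bulb burns out at X ~ Exp(lam)
y = x + rng.exponential(1 / lam, n)  # second bulb burns out at Y = X + Exp(lam)

# Y should be gamma with alpha = 2, so its mean is 2/lam.
print(y.mean(), 2 / lam)

# Given Y near y0, X should look uniform on (0, y0), so its mean is y0/2.
y0 = 2.0
near = x[np.abs(y - y0) < 0.05]
print(near.mean(), y0 / 2)
```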
01:08:09
All right. Yeah? For the first conditional, what do we do at the boundary, where x equals zero?
01:08:30
Actually, here I guess there is a little bit of a problem when y is zero. But with densities, you don't have to be careful about whether these are strict or not strict inequalities.
01:08:46
Because in the end, what do you do? You integrate these functions. And if you change the value of a function at a point, that won't change the value of its integral. Because that contributes zero area.
01:09:03
So I won't be very careful about whether I'm bigger than or bigger or equal to when I'm writing densities. However, when you have PMFs, you have to be very careful because you're taking sums and you might be excluding a positive value if you don't sum over it. So you have to be careful with these inequalities when you're doing PMFs.
01:09:23
Discrete case, but not when you're doing densities. So let's do an example of a discrete case with jointly distributed random variables.
01:10:36
Okay. Here's an urn. And there are three coins in the urn.
01:10:42
And you're going to pick out a coin at random and toss it. And, well, these are the head sides. I didn't know what to do about drawing tails. I think faces are easier. And each one has a different probability of heads. So let's call this one, let's call this coin X.
01:11:19
Let's call this coin Y.
01:11:23
And let's call this coin Z.
01:12:08
And suppose we have the following PMF.
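The table itself isn't in the transcript, but from the numbers used below, the PMF on the board must have been:

coin x: p(x, H) = 1/12, p(x, T) = 1/4
coin y: p(y, H) = 1/6, p(y, T) = 1/6
coin z: p(z, H) = 1/9, p(z, T) = 2/9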
01:12:50
Okay, do all those numbers add up to one?
01:13:04
That's a relief. See, this adds up to one-third. This adds up to? That adds up to a third, right? All those add up to a third. So what do these probabilities mean? Well, you're going to draw a coin out and then toss it.
01:13:23
What's the probability you drew out coin Y and you got a head? Okay. This is the probability you drew a coin Z and you got a tail. So let's try to compute the marginal distributions.
01:13:49
So there are two random variables here. Capital X is the coin drawn.
01:14:04
So it takes on values little x, little y, little z. And Y will be whether it's heads or tails. So Y will take on values maybe H and T.
01:14:39
So what's the PMF for capital X?
01:14:42
What's the probability that it's coin X that you get? Well, you fix the X variable and you sum the PMF over the other variable.
01:15:03
What are the other variables? H and T. So we fix the X. These are the only two that have X in them and we sum on the second variable. So we take P of XH plus P of XT. Add them up and what do we get? One third. So what's the probability the coin little x was drawn?
01:15:23
One third. What about the marginal for capital Y? Well, we fix the little y. I'm sorry. This is still the variable capital X. What's the probability we pick the coin Y?
01:15:43
Well, we fix the little y and we sum over the other variables. There are only two terms there that have Y. P of YH and P of YT. Add up one sixth plus one sixth. That's the middle line there. You get one third. And what's the marginal probability of picking coin Z?
01:16:10
Well, that better be a third because that's the only remaining thing. But if you add one ninth and two ninths, you get one third.
01:16:20
So this is the marginal PMF for the X variable. What about for the Y variable? So for the X variable, we sum across this way. For the Y variable, what would we do? We'd fix the H and sum over the first variable.
01:16:43
So what's the marginal probability for Y of being a head? Well, it'd be P of XH plus P of YH plus P of ZH.
01:17:01
Which would be one twelfth plus one sixth plus one ninth. And what's that equal to? Say it again. Thirteen over thirty-six. Okay.
01:17:35
For the marginal probability of getting a tail, we fix the T and sum over the other variables.
01:17:41
And here, I think we should get 23 over 36. Can you verify? Is that true? Okay. So these are the marginal PMFs for the coin variable and for the heads-or-tails variable. Okay, what about the conditional distributions?
01:18:29
What's the probability we get a head if we knew we drew coin X? Well, that's the joint PMF divided by the marginal for X.
01:18:49
The joint probability is one twelfth. And what's the probability that we drew coin X? It's one third.
01:19:01
So this is one fourth. What's the probability of tails given we drew coin little x? Well, what are the two outcomes?
01:19:20
Heads or tails? The probability of heads is one fourth. So the probability of tails has to be three fourths. So this tells us what the probabilities are for coin X. Coin X is a coin that has probability one fourth of heads and three fourths of tails.
01:19:56
Similarly, you can get the conditional distributions given you have coin Y or coin Z.
01:20:05
So I won't do those. You can do those. Let's do the other way around. What's the probability that it's coin X you're tossing if you go through this procedure and you get a head?
01:20:42
Well, P of X H is one twelfth. P Y of H is thirteen over thirty six. So this is three over thirteen.
01:21:01
And we could also do the probability that it's coin Y you're tossing given that you get a head. P Y H is one sixth.
01:21:23
And the probability of getting a head again is thirteen over thirty six. So this is six over thirteen. So this is the probability that it's coin Y you selected if you get a head.
01:21:43
And then finally what would be the probability that you had coin Z given the information you got? It's the remaining four over thirteen. Three over thirteen plus six over thirteen is nine over thirteen. So P X given Y of Z given H is four over thirteen.
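Not part of the lecture: the whole coin example in a few lines of exact arithmetic, using the PMF reconstructed above.

```python
from fractions import Fraction as F

# Joint PMF on (coin, outcome), reconstructed from the lecture's numbers.
p = {('x', 'H'): F(1, 12), ('x', 'T'): F(1, 4),
     ('y', 'H'): F(1, 6),  ('y', 'T'): F(1, 6),
     ('z', 'H'): F(1, 9),  ('z', 'T'): F(2, 9)}
assert sum(p.values()) == 1  # all the numbers add up to one

# Marginals: sum out the other variable.
p_coin = {c: p[(c, 'H')] + p[(c, 'T')] for c in 'xyz'}    # each coin: 1/3
p_out = {o: sum(p[(c, o)] for c in 'xyz') for o in 'HT'}  # H: 13/36, T: 23/36

# Conditionals: joint divided by marginal.
print(p[('x', 'H')] / p_coin['x'])  # P(H | coin x) = 1/4
print(p[('x', 'H')] / p_out['H'])   # P(coin x | H) = 3/13
print(p[('y', 'H')] / p_out['H'])   # P(coin y | H) = 6/13
print(p[('z', 'H')] / p_out['H'])   # P(coin z | H) = 4/13
```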
01:22:07
That's what these look like when you have discrete random variables. Okay. So now let's do something really hard.
01:22:25
This always motivated me when I was a student. Because the teacher would say, if you get a hundred percent on this exercise set, I'll give you a hundred bonus points. That would get my attention. Or if he'd say, pay attention, this is really hard.
01:22:42
Sometimes I'd listen more. So I'm saying that to get your attention. We're going to do something very important now. I'm sure this has its own use. But a normal distribution comes up a lot in statistical problems.
01:23:25
You probably use triple E from time to time at the end of a course. Or after an exam you go look at your scores. Sometimes the professor releases the statistics. And you see a histogram of the scores on the test.
01:23:41
And what shape do you generally see? Skewed? Really? Which way? Towards A, I imagine?
01:24:02
I usually see that. That's the letter after D. Some places they'd give E instead of F. Extraordinarily bad. You usually see this.
01:24:24
This is a normal distribution. Also, if you were to plot heights of individuals, you'd see something like this in the histogram. So people study correlations to look for, you know, maybe some hidden causality, like smoking.
01:24:45
Like you might ask about individuals: Do you smoke? Do you have lung cancer? And they find some correlation between these. So then they start to investigate whether there's some causality. The joint normal distribution, it was mentioned in the book, was first noticed in the case of a pair of random variables.
01:25:10
The first element in the pair was the height of a father. The second was the height of a son. And these individually form normal distributions and they form something called a joint normal distribution.
01:25:27
So what's the joint normal distribution? X, Y has a joint normal distribution.
01:25:41
So this is a continuous pair of random variables.
01:27:13
Wait, don't write this down yet. I'm going to have to check that I have the right formula.
01:27:26
I'm running out of room. Let's see if I can rearrange here.
01:27:45
No, I think it's a rho here. Let me check. I think it's a rho there. Yes.
01:28:14
And this is for all X and Y. Okay, so rho is a number strictly between minus one and one.
01:28:23
Sigma X and sigma Y will be numbers bigger than zero. Mu X and mu Y could be any number. Rho is something that indicates the connection or the degree of dependence between X and Y.
01:28:44
Yeah, these are rhos. This is a rho, a rho, and a rho. Okay, and it's not such a simple matter to check that this is a density.
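The formula on the board isn't captured in the transcript; the standard bivariate normal density, with the parameters named here, is:

```latex
f_{X,Y}(x,y) = \frac{1}{2\pi \sigma_X \sigma_Y \sqrt{1-\rho^2}}
\exp\!\left( -\frac{1}{2(1-\rho^2)} \left[
\frac{(x-\mu_X)^2}{\sigma_X^2}
- \frac{2\rho (x-\mu_X)(y-\mu_Y)}{\sigma_X \sigma_Y}
+ \frac{(y-\mu_Y)^2}{\sigma_Y^2} \right] \right).
```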
01:29:04
That means if you integrate this over all of R2, you get one. Okay, so let's leave that there for a moment or two and check for something called independence.
01:29:41
This is an extension of the idea of independent events to independent random variables.
01:30:30
A pair of random variables is independent if their joint distribution function factors into the product of their individual distribution functions. So let's do a couple of examples.
01:30:59
In this example, take rho equal to zero.
01:31:25
Then F becomes... Oh, and by the way, if you have a density, what would this become? The density factors into the product of densities.
01:31:44
That would be the continuous case. What about the discrete case? The joint PMF would factor into the product of the individual PMFs.
01:32:06
Okay, in this example here, if we take rho equal to zero, what does it become? We get one over two pi sigma X sigma Y; I guess sigma X and sigma Y can come out of the square root because they're squared inside there.
01:32:20
Then I have exponential minus a half little x minus mu sub capital X squared over two sigma sub capital X squared plus little y minus mu sub capital Y squared over two sigma sub capital Y squared.
01:32:45
What happens to the next term? It vanishes. And I can write this as one over square root two pi sigma sub X squared E to the minus one half little x minus mu sub capital X squared over...
01:33:10
Oh, no two here and no two here. I already had it out there. So, sigma sub capital X squared times one over square root two pi sigma sub Y squared
01:33:22
E to the minus one half Y minus mu sub capital Y squared over sigma sub Y squared. So, the joint density factors into the product of two functions.
01:33:41
What property do these two functions have to have? All the X variables have to be in one and all the Y variables have to be in the other and then they'll be independent. This gives the marginal density for capital X, this gives the marginal density for capital Y and they are independent.
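In symbols, the factorization just carried out (with rho equal to zero):

```latex
f_{X,Y}(x,y)
= \underbrace{\frac{1}{\sqrt{2\pi\sigma_X^2}}\, e^{-(x-\mu_X)^2 / 2\sigma_X^2}}_{f_X(x)}
\cdot
\underbrace{\frac{1}{\sqrt{2\pi\sigma_Y^2}}\, e^{-(y-\mu_Y)^2 / 2\sigma_Y^2}}_{f_Y(y)} .
```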
01:34:04
So, in this case X and Y are independent. In any case where rho is different from zero, this won't factor and they will be dependent.
01:34:21
We'll say they're dependent. Let's consider the case of this density.
01:34:57
If X and Y have this density, are they independent?
01:35:21
Can this density be factored into the product of a function of X times a function of Y?
01:35:42
Let's think for a moment about what consequence this might have. What's the conditional density, then, when X and Y are independent?
01:36:10
This is the definition. If they're independent, this factors as the product of the marginal densities and that would just give the density for capital X.
01:36:30
In other words, given information about the value of capital Y, how does that change the statistical behavior of capital X? Not at all. Information about Y gives you no information about X.
01:36:43
Is that true here? That if I know something about Y, it doesn't tell me anything about X? Suppose I tell you Y is bigger than 1 minus epsilon times r.
01:37:06
Y bigger than 1 minus epsilon times r, or maybe even... Do you like epsilons? Something like that.
01:37:24
Does that tell me anything about X? As epsilon gets closer and closer to 0, this line goes higher and higher. Where does X range? X ranges from here to here. So as epsilon gets smaller, X gets more determined, confined closer and closer to the value 0.
01:37:50
So if I know Y is really big, then I know X has to be kind of small. So does that give me information about X if I know Y is big? Yes. So are X and Y here independent? No.
01:38:02
In fact, this function can't factor into a product of a function of X times a function of Y. So we'll do more on independence later. Well, actually a lot more. So have a happy Fourth. Remember the assignment is not due till Monday.
01:38:22
And I'll put up the next assignment I hope this evening.