Survey Sampling
Formal Metadata
Part Number: 10
Number of Parts: 16
License: CC Attribution - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.
DOI: 10.5446/12879
Transcript: English (auto-generated)
00:06
OK, so it's time to do a little bit with the joint normal, which is a very important distribution for a pair of random variables.
00:32
Well, I hope blue is OK with you, for the first hour at least.
01:01
Is this visible? Not so good, not so bad? Let me try a different one. So here's the joint normal density for a pair, a long-winded formula.
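The long-winded formula on the board is presumably the standard joint (bivariate) normal density; in cleaned-up notation it reads

f_{X,Y}(x, y) = \frac{1}{2\pi \sigma_X \sigma_Y \sqrt{1-\rho^2}} \exp\left\{ -\frac{1}{2(1-\rho^2)} \left[ \left(\frac{x-\mu_X}{\sigma_X}\right)^2 + \left(\frac{y-\mu_Y}{\sigma_Y}\right)^2 - 2\rho \frac{(x-\mu_X)(y-\mu_Y)}{\sigma_X \sigma_Y} \right] \right\},

which matches the substitutions u = (x - \mu_X)/\sigma_X and v = (y - \mu_Y)/\sigma_Y made below.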
02:37
And this is a density, let me change the P to an F.
02:44
Not that this is wrong, but you're used to seeing P for a probability mass function and F for a density, so I'll write F. OK, so especially if you're going to take 131B, you'll see this character again.
03:08
Let's compute the marginal density. You should have seen the look on this person's face here. How do you compute marginals if you have the joint?
03:25
You integrate out the other variable, right? So we have to integrate out the Y variable if we're going to get the marginal for the X.
03:50
OK, so it looks pretty ugly, but let me tell you, it's not as bad as it looks. We have to integrate this thing in Y.
04:03
And what do we know? We know that this is 1.
04:35
OK, so we want to use that somehow.
04:45
Here we have y minus mu sub Y squared, but we have a bunch of other stuff, and the problem is that it's also here. So let's... I think one bad thing about this, or one thing you probably don't like,
05:01
is there are so many things you have to write down. So let's simplify it a little bit. Let's write U equal to X minus mu sub X over sigma sub X. And let's write V as Y minus mu sub capital Y over sigma sub capital Y.
05:26
And then in the exponent, at least in the square brackets, we'd have an exponential. Well, I'll do the whole exponential. We get minus 1 over 2, 1 minus rho squared.
05:43
And then we have U squared plus V squared minus 2 rho U V. And it looks like we want to integrate dV, more or less, right?
06:08
And if we can get this into some form like that, we'll be in good shape. So let's complete the square here. This is usually what you do in computations with normals or joint normals.
06:20
You complete the square in the exponent. So how do we complete the square here? Let's start with V squared. We're going to integrate V squared.
06:41
So we have V squared. Here we have a V, and we have a 2 rho U as a coefficient. We're going to complete the square in V: V squared minus 2 rho U V. So we take the coefficient of V, divide by 2. That gives us rho U. And we square that and add it in.
07:07
So we add in rho squared U squared. And then we have to subtract it off. And we have a U squared here.
07:42
So then in V, we have V minus rho U quantity squared here. And then here, we have 1 minus rho squared U squared.
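In symbols, the completing-the-square step just described is

u^2 + v^2 - 2\rho u v = (v - \rho u)^2 + (1 - \rho^2) u^2,

so the exponent splits as

-\frac{1}{2(1-\rho^2)}\left[ (v - \rho u)^2 + (1-\rho^2) u^2 \right] = -\frac{(v - \rho u)^2}{2(1-\rho^2)} - \frac{u^2}{2}.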
08:08
OK. So when you're writing your notes, leave a little space between this constant and the integral
08:21
because I'm going to factor something out. What am I going to factor out? We're going to integrate in Y and change variables to V.
08:41
So let's do the change of variable first. dV is going to be 1 over sigma Y dY. OK. So this should be an integral dY, but I take away this thing, put it with a dY, and call it dV.
09:29
And now, I'm integrating with respect to V. And here's an exponential of minus this times something that involves V.
09:45
And then plus minus this times something that does not involve V. So this thing that does not involve V is a constant as far as V is concerned and can be factored out of the integral. So I can factor out E to the minus this times this.
10:03
OK. But what do I get when I multiply minus 1 over 2, 1 minus rho squared times this thing here? The 1 minus rho squared cancels here and here. So what I'll factor out is going to be E to the minus U squared over 2.
10:31
And then what's left inside is E to the minus V minus rho U squared over 2 times 1 minus rho squared.
11:00
OK.
11:00
Now is a good time to use what we know. Here's what we know. And we want to use that to compute this integral. Does it look at all like that? It does with, well, the variable is different, but never mind that.
11:21
Mu is this thing. Does that come into the answer at all? Well, so mu would be this, and what would be sigma squared? 1 minus rho squared. 1 minus rho squared. So this integral should be square root 2 pi times 1 minus rho squared.
11:57
In other words, that integral is the inverse of this constant.
12:04
That integral is root 2 pi sigma squared. What's the sigma squared? It's this, right? So it's root 2 pi sigma squared. And now we see some more cancellation.
12:22
The root of 1 minus rho squared here and here cancels. Here's a root 2 pi in the numerator, 2 pi in the denominator. So you get 1 over root 2 pi. I'm going to put the sigma X inside the square root. And then it becomes a square. And I get E to the minus U squared over 2.
12:41
What's U? U is here. So it would be minus little x minus mu sub capital X squared over 2 sigma sub X squared. So there's the marginal density for capital X if (X, Y) is joint normal.
13:04
And what's the name of that? Normal with mean mu X and variance sigma X squared. What about the marginal for Y? Just change the role of X and Y and you'll get normal with mean
13:21
mu Y and variance sigma Y squared. So that wasn't so bad, was it? All right. Well, let's do something more. Question?
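In symbols, the marginal just computed is

f_X(x) = \frac{1}{\sqrt{2\pi \sigma_X^2}} \, e^{-(x - \mu_X)^2 / (2\sigma_X^2)},

that is, X \sim N(\mu_X, \sigma_X^2), and by symmetry Y \sim N(\mu_Y, \sigma_Y^2).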
13:44
Over there would it be 1 minus rho squared squared? OK. Here? The form of that 1 over rho squared. This one? Yeah. OK. Right. So sigma squared in here is 1 minus rho squared.
14:04
Sigma squared is 1 minus rho squared. And notice it's square root of sigma squared here. And it's square root of sigma squared there. Does that answer it? OK. All right. Let's find the conditional density.
15:10
OK. What a mess. This will be, huh? This is the definition of conditional density, correct? And so we just plug in our formulas.
16:47
And this should be simplified. I think there should be some cancellation and consolidation.
17:02
So sigma Y cancels. You should get a root 2 pi here.
17:25
And so the constant out in front becomes only this, right? Invert and multiply. And all you get for the constant out in front is that. And then? And then what?
17:42
Well, here we have a coefficient of minus 1 over 2 times the quantity 1 minus rho squared. And we want to add to that plus Y, plus this thing, right? We're going to put this up here. But if I'm going to put it inside the square bracket, I should multiply it by 1 minus rho squared and divide by 1 minus rho squared.
18:10
And when I do that, when I put it inside here,
18:35
it's going to be, this part will stay outside, the part that I multiply down here.
18:41
But the numerator will have a factor of 1 minus rho squared. This 2 will stay outside. And when I put it inside here, it's going to have, since this is a minus sign, it's going to have a plus sign. So I'm going to have, I'm sorry, it's going to have a minus sign.
19:01
So then, I have this term.
19:21
And then this term came from the denominator. Notice there's a minus, a minus. That makes it a plus, right? And when I move this to the numerator, it has to become a plus. And then I have this term. So everything inside the square bracket is multiplied by that and then it's exponential of that thing.
20:00
Okay, well let's combine these two and it becomes that.
21:13
Because there's this term and then minus 1 minus rho squared times this term. And this part, the minus 1 times this stuff and plus that will cancel, leaving plus rho squared times it.
21:30
And you stare at that for a while and eventually come to the realization.
21:43
Inside the square bracket is a perfect square.
22:57
I think that's correct, isn't it?
23:03
Now what's the variable for this density? It's little x. And usually when you have a normal with mean mu and variance sigma squared, how does the density look? It looks like e to the minus little x minus mu squared over 2 sigma squared, right?
23:22
And then the 1 over root 2 pi sigma squared. Does that last expression look like this? Not yet. But we can bake it a little bit and it'll come out to be like that. What do we have to do? We have to get it into the form x minus something, over sigma sub x, all squared.
23:48
So let's see if we can do that.
24:31
I want to get this into something, a form that looks like that.
24:42
So I have x over what? Sigma x. So this should be the sigma x. And then I have to subtract mu from that. So what should mu be?
25:06
What if I have this? And think of this as being a. How could I combine this to look something like that? I should multiply and divide by sigma sub capital x, right?
25:20
So I can combine the fractions. And then I'd have x minus mu sub capital x minus a sigma x over sigma x. So how am I going to combine these? I should just multiply this by sigma x and put it here.
25:58
So that would give me a rho sigma sub x over sigma sub y times little y minus mu sub y over sigma sub capital x squared.
26:17
One last step puts it into this form.
27:01
One more parenthesis there. So what's the conditional distribution of capital x given that capital y is little y?
27:23
Conditional distribution of capital x given capital y is little y is normal with mean, this guy here, and variance sigma sub x squared times 1 minus rho squared. So it's normal but a little bit different from what it would be.
27:47
I mean the marginal was normal mean mu sub x and variance sigma sub x squared. When would x and y be independent?
28:02
Exactly when rho is zero. Independence would say that the conditional density is the same as the marginal density. When is this conditional density going to be the same as that? Well, if rho is zero, you get a one here. You get a one here. You get a zero there.
28:21
You get exactly that. So normals are- joint normals are independent if and only if rho is zero.
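Summarizing the derivation in symbols,

X \mid Y = y \;\sim\; N\!\left( \mu_X + \rho \frac{\sigma_X}{\sigma_Y}(y - \mu_Y), \;\; \sigma_X^2 (1 - \rho^2) \right),

which reduces to the marginal N(\mu_X, \sigma_X^2) exactly when \rho = 0.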
28:51
So there were some homework problems that involved conditional expectations. You don't have to hand those in until the next homework set, which is due Monday.
29:04
Conditional expectations are just expectations using conditional PMFs or conditional densities.
29:25
In the case of a discrete random variable, the definition would be this.
29:51
And that is again provided that when you put an absolute value here, the sum converges. I get tired of writing that down, but this is provided when you put an absolute value here.
30:01
The sum converges. In the case of a continuous random variable, you integrate x against the conditional density.
30:29
And again, this is provided this integral converges when you put an absolute value of x there. So as a first exercise, let's compute what?
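In symbols, the two definitions just described are

E[X \mid Y = y] = \sum_x x \, p_{X \mid Y}(x \mid y) \quad \text{(discrete case)}, \qquad E[X \mid Y = y] = \int_{-\infty}^{\infty} x \, f_{X \mid Y}(x \mid y) \, dx \quad \text{(continuous case)},

each provided the sum or integral converges absolutely.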
30:50
How about the conditional expectation of a normal? Why not? We have it there. How hard could it be after all the things we've done?
31:45
Okay, well, this will equal- there's the density there on the other board.
32:34
Okay, this dot, dot, dot is- I'm a little tired of writing that over and over again.
32:47
Okay, it's that thing here. All right, well, for a moment, let's suppose we have minus dot, dot, dot here. And I did that integral.
33:00
What if I had the integral x minus this thing against that density? Or, in other words, what if I integrated this expression? What would I get?
33:32
Well, change variables, put y equal to x minus mu, and you get that.
33:52
And that integral is zero because y times e to the minus y squared is an odd function. In other words, the integral of x times e to the minus x minus mu squared over 2 sigma squared, times 1 over root 2 pi sigma squared, from minus infinity to infinity dx is what number?
34:18
It's mu. That's just the mean, okay?
34:23
So if I had minus dot, dot, dot, I'd get zero. But what happens if I don't? I just get the dot, dot, dot, dot, right? In other words, this would be mu capital X.
34:41
Well, let me write it down here. This is just the mean of the normal that we have in this expression. So it'd be mu sub x plus rho sigma sub capital X over sigma sub capital Y times little y minus mu sub capital Y.
35:02
Very simple. Just the mean of that normal density.
35:20
Any question on that? The integral of, I'm just using this fact here. Oops, I'm sorry. I missed something here. This integral is mu.
35:43
That's the significance of mu being there. When you integrate x against this, you get mu. So what would the expected value of x be without conditioning?
36:08
Well, what's the distribution of capital X, the marginal distribution? Right, mu sub x, right? So if you know something about y, it changes the conditional expectation if rho is different from zero.
36:25
If rho is zero, they're the same, but if rho is different from zero, this would be the expected value without information from y. But if you find that capital Y has this value, then the conditional expectation of x given that information is this.
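As an illustrative sketch (not part of the lecture, and with arbitrary parameter values), the conditional-mean formula can be checked numerically in Python by simulating a joint normal pair and conditioning on Y landing near a chosen value:

import numpy as np

# Minimal sketch (hypothetical parameter values) checking that for a joint normal
# pair, E[X | Y = y] is close to mu_X + rho * (sigma_X / sigma_Y) * (y - mu_Y).
rng = np.random.default_rng(0)
mu_x, mu_y, sigma_x, sigma_y, rho = 1.0, -2.0, 2.0, 3.0, 0.6
n = 1_000_000

# Build (X, Y) with correlation rho from independent standard normals.
z1 = rng.standard_normal(n)
z2 = rng.standard_normal(n)
x = mu_x + sigma_x * z1
y = mu_y + sigma_y * (rho * z1 + np.sqrt(1 - rho**2) * z2)

# Condition on Y landing in a thin slab around y0 and compare the sample mean of X
# with the conditional-mean formula from the lecture.
y0 = 0.5
near_y0 = np.abs(y - y0) < 0.05
print(x[near_y0].mean())                                   # empirical conditional mean
print(mu_x + rho * (sigma_x / sigma_y) * (y0 - mu_y))      # formula value: 2.0 here

With these numbers the formula gives 1 + 0.6 * (2/3) * (0.5 - (-2)) = 2.0, and the empirical average should land close to that.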
36:51
Let's try another example.
37:56
Let's find the conditional expectation of u given capital V is little v.
38:06
So I like to think of exponential random variables as being lifetimes of light bulbs. So u is the time when a light bulb burns out. You replace it, and then v would be the time when the second light bulb burned out.
38:26
So if you know when the second light bulb burned out, when did you expect the first one to have burned out? That's what this is asking. So what do we need to compute conditional expectation like this? You need the conditional density.
38:42
What do you need to compute the conditional density? You need the joint density and the marginal ones, right? So let's try to find the joint density of u and v. How do you find the joint density of u and v? Well, that's for T1 and T2, but u and v aren't independent.
39:07
We have to use that transformation stuff. We have to use the Jacobian. So remember the formula.
39:21
The joint density for u and v is where this is G1 of T1, T2, and this is G2 of T1, T2.
40:45
So what's, remember H1 and H2 are the things that invert this transformation.
41:06
So what's H1 of (u, v)? It's u. What's H2 of (u, v)? How do you get, I'm sorry, how do you get T2 if you know u and v?
41:26
Well, if we subtract u from v, we get T2, don't we?
41:40
Subtract this from this, you get T2. Okay? So what's this determinant? What's G1 of two variables? This is just the first variable. So this would be 1, and this would be 0.
42:11
This would be 1, and this is 1. So the determinant is 1. So this doesn't appear.
42:37
And so the joint density, well, the density of T1 and T2 is what?
42:58
Yeah, they're independent and exponentials, so each one has a factor of lambda.
43:04
So we just multiply the two densities together, and this is for s and t bigger than or equal to 0. Right? That's just multiplying two exponential densities together. Lambda e to the minus lambda s times lambda e to the minus lambda t.
43:24
Okay? So we evaluate that now at s equal to H1 and t equal to H2. So what's H1 plus H2? v. And we're calling that the variable, okay, that's v, right?
43:41
So we get lambda squared e to the minus lambda H1 of u, v, plus H2 of u, v. And that's equal to lambda squared e to the minus lambda v. And now what are the limits on the variables?
44:03
u and v are non-negative, right? u is T1, v is T1 plus T2, so which is bigger? u is less than or equal to v. Yeah, u is less than or equal to v.
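Collecting the steps: with U = T1 and V = T1 + T2 for independent Exponential(\lambda) lifetimes, the inverse transformation is (h_1, h_2)(u, v) = (u, v - u) with Jacobian 1, so

f_{U,V}(u, v) = \lambda e^{-\lambda u} \cdot \lambda e^{-\lambda (v - u)} = \lambda^2 e^{-\lambda v}, \qquad 0 \le u \le v.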
44:20
So there's the joint density. Now, we want to compute the conditional expectations, so we should find the conditional density of u given v. That's the joint over the marginal.
44:50
Alright, so we have this. Do we have this? Not yet. How do we get the marginal? We have to integrate the joint over the variable u.
45:05
So I'll erase this.
45:27
And what are the limits of integration? Zero to what? Zero to v. So we integrate, factor out the lambda squared, and what do we have here?
45:44
We're integrating du, right? And what do we have in the joint density? We have a v. So as far as u is concerned, that's constant, so I pull that out also. And then I have to do this integral, zero to v du, which is nice.
46:06
That's v. And this is good for v bigger than zero. There it is. And what's that?
46:21
That's one of the ones you know. That's a gamma with parameter alpha equal two and lambda. That's where gammas come from, sums of, well, typically, sums of exponentials. OK, so now we have this and we have this.
46:41
So in the numerator, we have lambda squared e to the minus lambda v. In the denominator, we have lambda squared v, e to the minus lambda v. And this is one over v. And that's for u in what range? From zero to v.
47:01
So what's the conditional density of u given v? Uniform on zero to v. So in other words, that light bulb was as likely to burn out anywhere in that interval as it is anywhere else. First light bulb has a uniform distribution given the time when the second light bulb burned out.
47:24
So what's the expected value then of u given capital v is little v? Yeah, it'd be v over two.
47:40
The expected value is right in the middle. But we can compute it. How do we do that? We integrate u times the density, right? It's u times the conditional density, which is one over v.
48:03
And it goes from zero to v. And one over v comes out. Integral of u du is u squared over two. Evaluated from zero to v is v squared over two.
48:23
So we get v over two right in the middle. That's what we wanted. It's a midpoint. So let's take a break. Any questions on that? Get back to work?
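As a quick illustrative check (not part of the lecture; the rate and the conditioning value are arbitrary choices), a Python simulation of the light-bulb example might look like this: sample many pairs, keep those whose total V lands near a chosen v0, and look at the first burnout time U.

import numpy as np

# Minimal sketch: given V = T1 + T2 is near v0, the first burnout time U = T1
# should look Uniform(0, v0), so its conditional mean should be about v0 / 2.
rng = np.random.default_rng(1)
lam, v0, n = 2.0, 1.5, 2_000_000
t1 = rng.exponential(scale=1 / lam, size=n)
t2 = rng.exponential(scale=1 / lam, size=n)
u, v = t1, t1 + t2

near_v0 = np.abs(v - v0) < 0.01
print(u[near_v0].mean(), v0 / 2)                             # both close to 0.75
print(np.histogram(u[near_v0], bins=5, range=(0, v0))[0])    # roughly equal counts

The kept values of U should be roughly evenly spread over (0, v0), with average close to v0 / 2.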
48:46
This is a function of little y. Has anybody watched these on the...
49:03
Is that really the way I look? Or sound? Okay. Well, you should watch yourself sometime. You might be a little surprised. Okay, so this is a function of y, that variable.
49:21
And we can evaluate that at the random variable capital y. And we call that expected value of capital x given capital y. That would be g of capital y.
49:42
And what would happen if I took the expected value of g of capital y? How do you take the expected value of a function of a random variable? Well, let's say we're in the discrete case.
50:03
It'd be the sum over little y, g of little y times the pmf for capital y. I'm sorry, not little g, capital g.
50:21
That's how you take the expectation of a function of a random variable. Now let's see what this is. It's this conditional expectation.
50:50
And now let's go a little further. We're in the discrete case, so that would be this.
51:03
And I'm going to write all the sums out in front here. Sum over little x, sum over little y. What do I have? I have a little x. I have a conditional probability of capital x given capital y, at little x given little y.
51:21
And then times the marginal for capital y. And now let me do the sum on y first. So I'm going to pull this out.
51:58
Now what is that product there of conditional probability times the marginal?
52:06
That's the joint, right?
52:22
Because the definition of this is this thing divided by that thing. And now if I sum this over all y, what do I get? The marginal for x. So if we sum that thing over all y, we get the marginal for x.
52:43
And what's this? That's the expected value of capital x. And the same computation, using integrals and densities instead of sums and PMFs, says that this would be equal to that for continuous random variables as well.
53:08
But what is this thing? This is expected value of capital x given the random variable y. Pardon? That's right.
53:22
It's called the law of total probability. So you can compute an expected value by first computing a conditional expectation. And then this is a random variable in y and you take the expected value of that. So let's look at an example.
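In symbols, the identity just derived (the law of total expectation, or tower property) is

E\big[ E[X \mid Y] \big] = \sum_y \Big( \sum_x x \, p_{X \mid Y}(x \mid y) \Big) p_Y(y) = \sum_x x \sum_y p_{X,Y}(x, y) = \sum_x x \, p_X(x) = E[X],

with the same statement holding for densities, replacing sums by integrals.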
53:43
I'll write this over here.
55:46
OK, so you have a geometric sum of exponential random variables. OK, so no web surfing during class. If you have a computer you should put it away.
56:11
OK, well, what were we just doing? We were just doing this.
56:20
This wouldn't be so bad if that were a fixed number, right? Because what's the expected value of this thing? If this is a fixed number like 10 or 11 or even 10 to the 23, this would be the sum of the expected values.
56:44
And the expected value of an exponential with parameter lambda is 1 over lambda, so this would be n over lambda. OK? So what does that say this would be?
57:05
Let's start with this. This is our g of n, g of little n. Given capital N is little n, all we do is place a little n up here and now we can exchange the expectation and the sum.
57:50
Now what do we know about capital N and the random variable Ti? What's their joint distribution? They're independent of each other. So the expected value of Ti given capital N is the same as the expected value of Ti, and that's 1 over lambda.
58:17
So we get n over lambda again. OK?
58:21
So what did we do in this? We defined g to be the conditional expectation of x given the other thing and then we evaluate that function at the random variable capital Y. So we evaluate g now at capital N. That's the expected value of our sum given capital N.
58:48
And the expected value of the sum is the expected value of g of n.
59:04
Right? It's the expected value of this given that. Right? That's the law of total probability. Right there. But we just computed this. What is this? Evaluated at capital N. That would be capital N over lambda.
59:24
And what's that? Well, 1 over lambda is a constant. I pull that out. What's the expected value of a geometric with parameter p? If you're rolling a die waiting for a 1, what's the expected number of rolls until you get a 1?
59:40
6, right? It's 1 over p, right? So we get 1 over p lambda. So this gives us an easy way to do this. Let's do something a little more interesting.
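Recapping the expected-value computation in symbols before moving on: if N is Geometric(p), independent of the i.i.d. Exponential(\lambda) variables T_i, then

E\Big[ \sum_{i=1}^{N} T_i \,\Big|\, N = n \Big] = \frac{n}{\lambda}, \qquad E\Big[ \sum_{i=1}^{N} T_i \Big] = E\Big[ \frac{N}{\lambda} \Big] = \frac{1}{\lambda} E[N] = \frac{1}{p\lambda}.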
01:00:11
Let's compute the moment generating function of that thing.
01:00:55
OK, I conditioned inside on variable capital N. Now I take this.
01:01:06
So this means I treat this as a constant.
01:01:22
What's this thing? Well, e to a sum becomes a product of e's, right? Products of exponentials.
01:01:47
And the t sub i's are independent random variables, so the exponentials of them are independent. So what do we know about the expected value of products of independent random variables? It's the product of the expectations, right?
01:02:02
And all of the expectations would be the same. So we get the expected value of e to the t T1, raised to the power n. OK, but what's that when you have an exponential random variable? Well, we did that already. It's lambda over lambda minus t.
01:02:49
And this is good for t less than lambda. Otherwise the integrand goes to infinity as s goes to infinity, when t is bigger than lambda.
01:03:05
OK, so this would be lambda over lambda minus t to the n. OK, so what do we have here?
01:03:22
This would be expected value of lambda over lambda minus t to the power capital N. See, that's our little g of n.
01:03:51
And we take g of capital N then. That'd be this. So what's that expected value? This is expected value of some number raised to a geometric random variable.
01:04:02
So as a function of a geometric random variable, how do you compute expected values of functions of random variables? You have to have the PMF. So we evaluate this by taking the sum. What values can a geometric random variable have? You're waiting until the number of first success, right?
01:04:22
Number of trials until first success. So it could happen the first time. So it's k equal 1 to infinity. Then we take this function to the power k, or to the k, or evaluated at k. And then we have a 1 minus p to the k minus 1 for k minus 1 failures.
01:04:43
And then a success. And can we simplify this?
01:05:08
Well, the only thing that this reminds me of is the geometric series, right? Whenever you have a geometric random variable, you should try to think about, if you want to evaluate something, usually you can use a geometric series.
01:05:31
Now we don't quite have that, do we? Because we start at 1, go to infinity, we have something to the k, something to the k minus 1, and then an extra p. Can we shake this up a little bit and get it to look like that?
01:06:06
And then inside we have this raised to the k minus first power. Okay, substitute j equal k minus 1.
01:06:50
Let's see, we get a p lambda over lambda minus t times 1 over 1 minus, see when I
01:07:00
said j equal k minus 1, this becomes a sum, j equals 0 to infinity, of this thing to the j. So it'd be 1 over 1 minus this.
01:07:37
Good question.
01:07:44
What about lambda over lambda minus t? So this is valid if 1 minus p, times lambda over lambda minus t, is less than 1.
01:08:05
That would say 1 minus p less than lambda minus t over lambda, which would be 1 minus t over lambda, right?
01:08:25
Or, what? p lambda bigger than t. So as long as we keep t less than p lambda, we're fine. And that's still enough to have an interval about the origin in the t values, right?
01:08:45
Because this is positive. So as long as t is smaller than this, so if we go lambda p over 2, actually we don't have to worry about the other side here, 0. As long as t is over here, then all of this complication is good.
01:09:04
And that's enough to determine the distribution if we can recognize the moment generating function. Or should we simplify it a little bit before we try to recognize it?
01:09:21
Does it simplify at all? Well, if you have a over b times c over d, that's ac over bd, right? So we should multiply the numerators, and that gives us p lambda. And multiply the denominators, lambda minus t times that gives what?
01:09:50
Right, this cancels, right? And you get lambda minus t minus (1 minus p) times lambda, which is p lambda minus t. What's that?
01:10:33
Does that tell us anything? Is that a moment generating function you're familiar with?
01:10:43
Have you ever seen that anywhere before? Have you ever seen that anywhere before? Yes, exponential with parameter lambda has this.
01:11:01
So this must be exponential with parameter p lambda. So it says geometric sum of exponential random variables is exponential.
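Putting the moment generating function computation together,

E\big[ e^{t \sum_{i=1}^{N} T_i} \big] = E\Big[ \Big( \frac{\lambda}{\lambda - t} \Big)^{N} \Big] = \sum_{k=1}^{\infty} \Big( \frac{\lambda}{\lambda - t} \Big)^{k} (1 - p)^{k-1} p = \frac{p\lambda}{\lambda - t - (1-p)\lambda} = \frac{p\lambda}{p\lambda - t}, \qquad t < p\lambda,

which is the moment generating function of an Exponential(p\lambda) random variable.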
01:11:21
If you take a course on stochastic processes and study Markov processes, you'll come across this. So this is a nice combination of conditional expectations and moment generating functions.
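As an illustrative sketch (not from the lecture; the parameter values are arbitrary), this conclusion can also be checked by simulation in Python:

import numpy as np

# Minimal sketch: a Geometric(p) sum of independent Exponential(lam) variables
# should behave like an Exponential(p * lam) variable, whose mean is 1 / (p * lam)
# and whose variance is 1 / (p * lam)**2.
rng = np.random.default_rng(2)
p, lam, n = 0.25, 3.0, 200_000

counts = rng.geometric(p, size=n)                            # N for each replication
sums = np.array([rng.exponential(1 / lam, k).sum() for k in counts])

print(sums.mean(), 1 / (p * lam))                            # means should agree
print(sums.var(), 1 / (p * lam) ** 2)                        # variances should agree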
01:11:45
Okay, so I think you should be able to do the problems now on conditional expectations, and you can hand those in Monday, as I said before, a week from today. Any questions on anything? No? Okay. Pardon?
01:12:14
No, the next one will be due Monday. I'll shift it to Monday, if that's all right with you.
01:13:41
Okay, this is sometimes called Markov's inequality. Sometimes it's called Chebyshev's inequality.
01:14:04
And it's actually very easy. Let's establish this when X is a continuous random variable.
01:14:24
Probability of X being in some set E is going to be the integral over E against the density of X. Our probability is the probability of X being in some set E.
01:14:44
What's E? It's an interval centered at mu of radius a. That's what our probability is there.
01:15:32
Does everybody agree with this? E is just the interval from mu minus a to mu plus a.
01:15:40
This is the same as X is in E if E is the interval mu minus a to mu plus a. Okay, now if our little x is in there, this number is what? Bigger than one or less than one?
01:16:07
I'm sorry. I have the wrong direction here. I want that.
01:16:38
C means complement, so it means not in that interval.
01:16:46
If X isn't in here, then the distance from X to mu must be bigger than a. If X is not in here, if X is not between mu minus a and mu plus a, then its distance to mu must be bigger than a.
01:17:03
That means this number must be bigger than a. That means this ratio must be bigger than one. So if there's a one here, if I replace the one by this, I get something bigger.
01:17:24
So I repeat. If X is not in this interval, which is where we're integrating, then the distance from X to mu, which is this, must be bigger than a. That means this ratio is bigger than one. That means if I insert this in the interval, I get bigger than what I had before because here I have a one, here I have something bigger than one.
01:17:44
Okay, and now this is non-negative, and so if I integrate over the whole space instead of just this set, I get something bigger still. Oh, let me square because if I square a number bigger than one, I get something bigger yet, right?
01:18:06
Okay, and then if I integrate over the whole space, I get something even bigger, okay?
01:18:24
And now the one over a squared can come out of the integral, and what's left? I'm integrating X minus mu squared against the density of X. That's the variance. This is the variance of X over a squared.
01:18:45
So just a couple quick steps to show that the probability that distance from X to its mean is bigger than a is less than sigma squared over a squared, where sigma squared is the variance.
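In symbols, the argument just given is Chebyshev's inequality: with \mu = E[X] and \sigma^2 = \mathrm{Var}(X),

P(|X - \mu| \ge a) = \int_{\{|x - \mu| \ge a\}} f_X(x)\, dx \;\le\; \int_{\{|x - \mu| \ge a\}} \frac{(x - \mu)^2}{a^2} f_X(x)\, dx \;\le\; \frac{1}{a^2} \int_{-\infty}^{\infty} (x - \mu)^2 f_X(x)\, dx = \frac{\sigma^2}{a^2}.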
01:19:08
Let's use that.
01:20:31
What's the expected value of Sn divided by n? Well, this is a constant, so I can pull that out of the expected value.
01:20:51
And expected value of the sum is the sum of the expected values, and each one of
01:21:09
these is mu, and I have n terms like that, so this would be n times mu. But then there's a one over n here to cancel the n, and this becomes mu.
01:21:27
Let's compute the variance of Sn over n. This would be one over n squared times the variance of Sn.
01:21:42
And do you remember the formula we had last time for variance of a sum? It is the sum of the variances plus twice the sum of the covariances over the pairs below the diagonal.
01:22:20
And if you have independent random variables, what's the covariance?
01:22:24
Zero. So for independent random variables, the variance of the sum is the sum of the variances. Variance of each term would be sigma squared. How many terms do we have? N of them. So this becomes n sigma squared divided by n squared would be sigma squared over n.
01:22:51
So variance here is getting smaller. What does variance mean? That means how far the random variable can be from its expected value.
01:23:04
And this means not very far. So Sn over n is sort of concentrating near what? Mu. Let's use Chebyshev's inequality.
01:23:48
What would go here? It's the variance of this thing.
01:24:04
So this is saying the probability that Sn over n is more than A away from the expected value is going to zero like a constant over n. In other words, this probability is going to zero. So here we say Sn over n converges to mu in probability.
01:24:31
The probability that it's far from mu is going to zero. This is called the weak law of large numbers.
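In symbols: for i.i.d. X_1, X_2, \ldots with mean \mu and variance \sigma^2, and S_n = X_1 + \cdots + X_n,

E\Big[\frac{S_n}{n}\Big] = \mu, \qquad \mathrm{Var}\Big(\frac{S_n}{n}\Big) = \frac{\sigma^2}{n}, \qquad P\Big( \Big| \frac{S_n}{n} - \mu \Big| \ge a \Big) \le \frac{\sigma^2}{n a^2} \longrightarrow 0,

which is the weak law of large numbers: S_n / n \to \mu in probability.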
01:24:56
So, for example, if you roll a die a thousand times and you count one every time you get the number one.
01:25:09
So Xi would be one if a one appears, and zero otherwise. What would S one thousand be?
01:25:30
That would be the number of times a one appeared when you rolled a die a thousand times.
01:25:48
And then one over a thousand times that would be the relative frequency of ones in the thousand trials.
01:26:03
And what should this number be close to? According to the law of large numbers, the probability that it's far from what would be the expected value of Xi? One sixth. The probability that this is close to one sixth is very close to one.
01:26:22
Or the probability that this is far from one sixth would be going to zero. It'd be, you could say then, the probability that S one thousand over one thousand minus one sixth is bigger than .01 would be less than the variance of the Xi divided by one thousand times .01.
01:27:02
But what's the variance of Xi? Well what values can Xi have, one or zero?
01:27:32
What's the probability that it's one? It's one whenever a one comes up, that'd be one sixth. So you get one sixth times one minus one sixth squared.
01:27:44
And then the other value you could have would be zero. And that happens with the probability five sixths. So we get one sixth times twenty five over thirty six plus five over six times one over thirty six.
01:28:04
That's thirty over two sixteen. So up here we get thirty and I can put two sixteen down here.
01:28:29
And a thousand times .01, that's ten right? So it'd be thirty over two thousand one hundred and sixty, we can cancel that.
01:28:42
Three over two sixteen, that must be one over seventy two, right? So only one chance, at most one chance in seventy two, that this frequency that you observe is more than one hundredth away from one sixth.
01:29:24
Actually it turns out that a stronger theorem is true under the same circumstances.
01:30:08
This one doesn't actually say that if I take a realization of S n over n that it'll actually converge to mu. It just says that the size of the set of realizations where you're close to mu is very big.
01:30:23
Here this says that actually S n over n will converge to mu. There's a subtle difference between those two. Here's an application.
01:30:48
Yeah, quick question? Oh yeah, I think, yeah I'm sorry.
01:31:01
It's supposed to be a squared, thank you. So this would be ten to the minus four right? So let me put in three more zeros here.
01:31:28
And now this is, I get another zero here and I get that.
01:31:49
So he's pointing out that I forgot the squared a. So I changed the n to a million instead of a thousand.
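With the squared a restored and n raised to a million, the corrected bound works out as

P\Big( \Big| \frac{S_{10^6}}{10^6} - \frac{1}{6} \Big| > 0.01 \Big) \le \frac{\mathrm{Var}(X_i)}{10^6 \cdot (0.01)^2} = \frac{5/36}{100} = \frac{1}{720}.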
01:32:03
And this was ten to the sixth, ten to the minus four so that's ten squared. So I have two zeros here, one cancels and then I get one over seven hundred twenty. Okay, so suppose g is a continuous function on zero one. Then we can integrate of course. And this would be the expected value of g of u where u is uniform on zero one.
01:32:41
If we take u one through u n uniformly distributed on zero one and independent.
01:33:13
And we set x i equal to g of u i. Since g is continuous on this interval it's bounded so it has a finite mean.
01:33:24
And g of u has a finite mean, and there's the expected value, right? That's our mu. If I sum the x i's, n of them, and divide by n, that converges to what?
01:33:46
That converges to the mean. This is s n over n. That converges to the expected value mu or the integral. So this sum will be close to that integral.
01:34:03
So this is a way of doing numerical integration. You get a sample of independent uniformly distributed random variables. You evaluate your function at those random variables or those realizations. Add them up, divide by the number of realizations.
01:34:21
That should be close to the integral. I think you do realize that you cannot integrate every function g. As good as you think you are at integration, you cannot integrate every function g in closed form. There are functions g you can't integrate. One example would be e to the minus x squared over two.
01:34:43
There's no closed form expression for the integral from zero to one of e to the minus x squared over two dx. Even something as explicit as that you can't integrate. So how could you get a value for it? Well, this is one way. There are tables where you can read off a thousand, two thousand realizations of uniformly distributed independent random variables.
01:35:12
You can evaluate g at those and just add them up and divide by the number you've got and that should be close to the integral.
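As an illustrative sketch (not part of the lecture; the sample size is an arbitrary choice), here is the method in Python for the example just mentioned, the integral from 0 to 1 of e to the minus x squared over two:

import numpy as np

# Minimal Monte Carlo integration sketch: approximate the integral of g over [0, 1]
# by averaging g at independent Uniform(0, 1) samples, as justified by the law of
# large numbers.
rng = np.random.default_rng(3)

def g(x):
    return np.exp(-x**2 / 2)

u = rng.uniform(0.0, 1.0, size=100_000)
values = g(u)
estimate = values.mean()
std_error = values.std(ddof=1) / np.sqrt(len(u))             # rough error estimate
print(estimate, "+/-", std_error)

The printed estimate should be close to the true value of roughly 0.856, and the error shrinks as the number of samples grows.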
01:35:21
And there are precise error estimates or pretty good error estimates for that. OK, so see you Wednesday.