Random Sampling
Formal Metadata
Title: Random Sampling
Part Number: 14
Number of Parts: 16
License: CC Attribution - ShareAlike 3.0 Unported: You are free to use, adapt and copy, distribute and transmit the work or content in adapted or unchanged form for any legal and non-commercial purpose as long as the work is attributed to the author in the manner specified by the author or licensor and the work or content is shared also in adapted form only under the conditions of this license.
Identifiers: 10.5446/12884 (DOI)
Transcript: English(auto-generated)
00:05
Okay, so we're discussing chapter seven.
00:29
And last time we considered a simple case where population had assigned to it values minus one, zero, or one according to whether the individual in question was in favor
00:45
of candidate B, had no preference, or was in favor of candidate A. I think we almost computed the covariance; I left one step to you, subtracting the expected value of xi times xj.
01:01
Let's review a little bit. There's some population of size capital N. Individuals in this population have values assigned to them.
01:24
And the list of values would be given by these small x's with subscripts.
02:00
So we have a list of capital N values.
02:04
In the example we did last time, the values were all of the form minus one, zero, or one. But they could be heights, so some measurement in inches, or could be weights, so some measurement in kilograms,
02:22
or could be total wealth, so that'd be units of dollars, or many other possibilities. In the book there's a running example about hospitals, and these values could be the number of patients discharged in a month. Okay, so there are many, many situations that cover this.
02:43
Could be scores on an exam for a class. So for example, I printed out the histogram from exam one. How many of you have seen it? You've seen it? Okay, good.
03:00
So let me just sort of reproduce what it looks like on the board here. This is, oh I see, I have printed one out from another class.
03:23
Okay, but it's still an illustrative example.
03:44
I don't know if I'll be able to get all these in, and you probably can't read these from the back of the room.
04:06
Okay, and now the histogram for this particular class looked like this, and there's a number four above that. Seven, eight, ten, eight and eight, nine and nine,
04:53
three, two, zero, six, two, zero, six again, two, four, zero, zero.
05:27
All right, so what do these numbers mean? That means four people had a score between 95 and 99, and seven had a score between 90 and 94, I guess.
05:42
So if we were to round down the scores, in other words, just record things like 95, if this person had 98, we'll record 95, okay? If the person had 86, we'll record 85. So I'm going to list the numbers now that we see in this population.
06:03
So we'd see 95 four times, whoops. And then we'd see 90 seven times.
06:27
And then we'd see 85 five times. I'm not going to do the whole list. And then we'd see 80 eight times.
06:52
And we'd keep going like this, and then we'd see 15 twice. And then we'd see ten four times.
07:08
And that would be the list. That would be this list, X1 through X of capital N, okay? And there is a lot of repetition in this list.
07:27
So we might just record the numbers you see. We see the score is 95, 90, 85, 80, 75, 70, et cetera, 15, 10.
08:04
And then we record how many times those scores appear. So we saw 95 four times. We saw 90 seven times. We saw 85 five times. We saw 80 eight times. We saw 75 ten times, right?
08:21
We saw 70 eight times. 15 we saw twice. 10 we saw four times, okay? So up here is just the complete list of the numbers you see in the population. And here's the number of times each of those numbers appeared.
08:43
So sometimes we write c1, c2, through cm
09:24
as the complete list of the distinct values that you see here. So there'll be no repetitions in this list. This would be this list here, okay? We just record the distinct values that occur in a population without repetition.
09:41
But then we also record how many times each of those values occurs.
10:05
So little n1 would be the number of times C1 occurs. Little n2 would be the number of times C2 occurs, et cetera.
10:27
Little nM would be the number of times CM occurs.
10:42
So little m is the number of distinct values in this list.
11:02
So in this example here there would be 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 distinct values.
11:26
Now we're interested in sampling to try to determine things like what's the population mean. So the population mean, again, is given the name mu
11:50
and it's exactly the average of the values that occur in the population. We add up all these numbers, divide by the total number of numbers.
12:07
So in the example of the scores, we'd add up 95 plus 95 plus 95 plus 95 plus 90 plus 90 all the way down to here. And then we divide by how many scores there were. That would be we divide by 4 plus 7 plus 5 plus 8 plus 10, et cetera.
12:26
That would be the population mean. Now there's another way to compute the population mean. Here mu would be what? Well, we add up 95 four times. Another way of doing that is multiplying 4 times 95, right?
12:47
And then we multiply 7 times 90, right? Because we have 90 in there 7 times.
13:03
All right. So it's 4 times 95, because there are four 95s, plus 7 times 90, plus 5 times 85, et cetera, plus dot, dot, dot. And then we get down to 2 times 15.
13:21
And then we divide by capital N, which I didn't compute. It's the total number of students. OK? So how could we formulate that in terms of, in a general case where we have these distinct values occurring this many times? Well, we'd have the 1 over N.
13:40
And we'd sum over the number of distinct values, j equal 1 to m. How many times do we add in the number c1? N1 times. How many times do we add in the value c2? N2 times. How many times do we add in the value c sub j? N sub j times.
14:00
So what do we do here? N sub j times c sub j. It's another way of expressing the population mean.
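As a quick check of that identity, here is a small Python sketch with made-up scores (not the actual exam data), computing the population mean both from the raw list and from the distinct values with their counts:

```python
from collections import Counter

# Made-up population of exam scores (not the real class data)
scores = [95]*4 + [90]*7 + [85]*5 + [80]*8 + [75]*10 + [15]*2 + [10]*4

N = len(scores)
mu_raw = sum(scores) / N                     # (1/N) * sum of all x_i

counts = Counter(scores)                     # maps each distinct value c_j to its count n_j
mu_counts = sum(n_j * c_j for c_j, n_j in counts.items()) / N   # (1/N) * sum of n_j * c_j

print(mu_raw, mu_counts)   # the two ways of computing mu agree
```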
14:32
Another statistic that sometimes is of interest is the population total.
14:51
For example, I might want to know, here's a population of 43 students.
15:02
And I want to know how many iPhones there are in this population. So what numbers would I have? This would be 43. And here I'd have the values 1 or 0
15:21
according to whether this person that is labeled 1 has an iPhone or not. If person number 1 has an iPhone, this would be 1. If not, it would be 0. If person number 2 has an iPhone, that would be 1. If not, it would be 0. What would tau be?
15:41
I add in a 1 every time a person has an iPhone, and a 0 if not, so this sum would be the total number of iPhones. Or it could be the number of cars, whatever. And it doesn't have to be dichotomous, actually. I could count 2 if you have 2 iPhones,
16:04
or 3 if you have 3 iPhones. Or it might be computers. So this would be the population total. Could be like the total wealth. So anyway, this is capital N times mu.
16:21
And because it's capital N times mu, we also have this way of computing it. Here's mu. If I multiply it by capital N, it just erases the 1 over N over there.
16:49
And then there's the population variance, sigma squared, which is one over capital N times the sum of the xi squared, minus mu squared.
17:21
It can also be expressed in terms of the distinct values.
17:54
And xi squared occurs how many times
18:08
when I'm adding up like this? I can use this thing. I guess if xi is equal to cj, it occurs nj times.
18:27
So I could rewrite this as one over capital N times the sum over j of nj times cj squared, minus mu squared.
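Collecting the board formulas in one place (the only added step is expanding the square to get the second form of sigma squared):

```latex
\mu = \frac{1}{N}\sum_{i=1}^{N} x_i = \frac{1}{N}\sum_{j=1}^{m} n_j c_j,
\qquad
\tau = \sum_{i=1}^{N} x_i = N\mu = \sum_{j=1}^{m} n_j c_j,
\qquad
\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i-\mu)^2 = \frac{1}{N}\sum_{j=1}^{m} n_j c_j^2 - \mu^2 .
```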
18:45
And finally, there's what we call the dichotomous case.
19:27
This is the case when one of the values
19:52
that appears on the list is 0. Another of the values that appears on the list is 1. And that's all of them because this is
20:00
the number of times 0 appears. This is the number of times 1 appears. And if I add those together, I get all the numbers, capital N. It's called the dichotomous case. And this is used in determining whether or not you like something like chocolate ice cream.
20:20
So every one of you think, well, do I like chocolate ice cream or not? So raise your hands if you like chocolate ice cream. You can raise it higher if you want. All of you people with raised hands, you get 1's. The rest of you get assigned the value of 0. And now there's some true population proportion
20:43
that likes, in this population, ice cream. The population proportion that likes ice cream would be P equal to N2, the number who raised their hands, times C2, or just N2, divided by capital N.
21:22
This is the number of people who have 1 assigned divided by the total population size. This is called the true proportion sometimes. So that would be mu, but we give it a special name in the dichotomous case.
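In the dichotomous case the count formulas above specialize directly; this short worked step, used later when the standard error of a proportion is written down, is worth recording:

```latex
\mu = \frac{n_1\cdot 0 + n_2\cdot 1}{N} = \frac{n_2}{N} = p,
\qquad
\sigma^2 = \frac{n_1\cdot 0^2 + n_2\cdot 1^2}{N} - \mu^2 = p - p^2 = p(1-p).
```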
21:41
Dichotomous case is also used in opinion polls about elections. Okay, now the goal usually is to try to estimate mu
22:03
or sigma squared or P. As another example, P might be of interest to a company studying preferences for one of its products in a population.
22:21
This could also be applied to whether a certain medication works or not. You have a population of individuals of this size. They're given medications,
22:42
and you want to know whether it works or not, so each value would be 1 or 0 according to whether the medication worked or not, and you want to know how often this medication is working. Okay, so what you do to estimate these things is you sample the population.
23:54
In most cases, capital N is a lot bigger than little n.
25:26
We sample without replacement. That means we reach in and grab a group of size little n from this group of size capital N. So as I said last time, think of this as being an urn with capital N balls, and you reach in and grab little n of them.
25:42
Then we treat these as being random variables. The randomness comes from the selection procedure, and they are not independent. These are not independent, which doesn't provide much of a problem
26:02
if little n over capital N is small.
26:33
Let's compute their marginal and joint distributions. They are identically distributed,
27:31
and the only values they can take are c1, c2, c3, up to cm, because those are the only values that appear in the population.
27:41
No matter which one you take, the probability that it's equal to cj is the same. There's no preference for x1 being cj as opposed to x2 being cj because you blindly reach in and sample. In other words, they're identically distributed,
28:13
identically distributed, but not independent. So let's compute what this number is.
28:21
What's the probability that x1 is equal to cj?
28:43
You reach in and grab an individual at random. What's the probability that that individual has the value cj? Well, it would just be the proportion of the population that has the value of cj. How many individuals have the value of cj?
29:00
I erased it, but it was little nj, and how many are in the population? Capital N. So this is simply little nj over capital N. The number of individuals having this value divided by the total number of individuals.
29:23
And this would be true for any of them: if I put x2 here, or x3, or whatever, I'd get the same thing, okay?
29:49
Let's compute this probability. The probability xl is cj given xm is ci.
30:02
And there are two cases here. One is when i is equal to j. Another is when i is not equal to j. Let's consider this one first, i equal j.
30:20
So we reached in and grabbed an individual, and that individual had value ci. Now we ask what's the probability that this one has value ci also. Well, when this happens, how many values of ci remain in the population?
30:40
N i minus 1. And what's the total population size after we draw one out? So what would this be? This would be N i minus 1 over capital N minus 1.
31:03
And what if i is different from j? Well, how many remaining elements or individuals in population have this value if we picked out a different value? We still have nj, but how many individuals remain in the population?
31:20
Capital N minus 1. So this would be nj over capital N minus 1. So this reminds me of a joke. You seem a little bit sleepy, so I'll tell you a joke. My mother-in-law asked me once,
31:42
Michael, if there are five crows sitting in a tree and you shoot one, how many are left? I said four. She said, well, she doesn't speak English. She said, fun Micah, which means stupid Michael. There'll be none left. They'll all fly away.
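Before moving on, here is a small simulation sketch (with a made-up urn, not anything from the lecture) that checks the marginal and conditional probabilities just derived for sampling without replacement:

```python
import random

# Made-up population: the value 7 appears 3 times, 2 appears 5 times, 4 appears twice
population = [7]*3 + [2]*5 + [4]*2
N = len(population)
n_7 = 3

trials = 200_000
first_is_7 = 0
second_is_7_given_first_7 = 0
for _ in range(trials):
    draw = random.sample(population, 2)   # two draws without replacement
    if draw[0] == 7:
        first_is_7 += 1
        if draw[1] == 7:
            second_is_7_given_first_7 += 1

print(first_is_7 / trials, n_7 / N)                  # marginal: n_j / N
print(second_is_7_given_first_7 / first_is_7,        # conditional: (n_i - 1) / (N - 1)
      (n_7 - 1) / (N - 1))
```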
32:02
Okay, but we were talking about how many are left. Okay, so now you seem a little more alert. Okay, so we want to estimate things like mu or whatever.
32:22
So an estimator for mu is the sample mean.
32:59
What's the expected value of x-bar?
33:01
Well, the expectation is over all possible samplings.
33:21
Well, of course, that doesn't change, and the expected value of a sum is the sum of the expected values. Samplings, I should be more precise: samplings of little n individuals from the population. So what's the expected value of xi?
33:52
It's a sum. L equal 1 to m. Oh, I shouldn't use m here. Sorry.
34:01
Let me use k here. m is reserved for the total number of different values you can see. Sorry about that.
34:22
So just from the definition of expectation, we get this. You sum over all the values that the discrete random variable xi can have. That would be these. And then you multiply each one of those by the probability of taking on that value. And what's the probability of taking on the value of cl?
34:54
It's that, as we just observed over there. And I can factor out the capital n.
35:13
And this is mu, the true or population mean.
35:20
It's the expected value of... Yeah, it's mu. So this up here would be what? I'd have mu here this many times. So when I form that sum, I get n times mu. Divide by n. You get mu.
35:41
So the expected value of x-bar is mu. So we're using x-bar to predict or approximate or estimate mu. And the nice feature here is on average it's equal to mu. So x-bar is called an unbiased estimator.
36:14
That's because its expected value is equal to the parameter we're using x-bar to approximate or predict.
36:25
Its expected value is equal to the parameter we're using it to predict. We're using x-bar to estimate mu. And its expected value is mu, so we call it an unbiased estimator.
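Written out compactly, the unbiasedness argument just given is:

```latex
E(\bar X) = \frac{1}{n}\sum_{i=1}^{n} E(X_i)
= \frac{1}{n}\sum_{i=1}^{n}\sum_{l=1}^{m} c_l\,\frac{n_l}{N}
= \frac{1}{n}\cdot n\,\mu = \mu .
```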
36:46
Okay, one way of telling how good an estimator is might also be how spread out the distribution of x-bar is. If variance of x-bar is small, then we'd think x-bar is a good estimator for mu.
37:57
That means that usually the values of x-bar aren't very far from its expected value.
38:03
If the variance of x-bar is big, that means that there are a lot of times when x-bar is far from its mean. So we'd like this to be small. So we should try to compute this. And the variance means what? This is the expected value of x-bar squared minus the expected value of x-bar and then squaring that.
38:32
And this expectation means with respect to the random selection of individuals in the population. Now, so let's compute.
38:43
The variance of x-bar is the variance of a normalized sum, like that. And whenever you have a variance of a multiple of a random variable, what can you do with that constant?
39:03
It comes out with a square, right?
39:43
The variance of a sum becomes the double sum of the covariances of the xi and xj.
40:03
We've seen this formula before. And so we need to compute the covariance of xi and xj.
40:49
First step to simplifying that is to write the covariance this way. And we know expected value of xi and expected value of xj. Those are both mu.
41:00
So this would be expected value of xi times xj minus mu squared here. So that leads us to computing this.
41:43
Well, xi might be CL. xj might be CK. And then we multiply by the joint distribution, PMF.
42:12
And I'm going to change the letters over here.
42:47
You don't have to do that because I'll just record it over there. Again, when K equals L, that means one individual with the value cL has been taken out. So there's one less left.
43:01
So we get this probability. When L is different from K, there are still n sub L individuals with that value. So the probability is this. I do that because here I'm going to do conditional probability.
43:39
The probability of A intersect B is the probability of A given B times the probability of B.
43:50
And I know the conditional probability now and the marginal. So I'll substitute those values.
44:29
But now this depends on whether L is equal to K or not. So I should break up this sum into L different from K and L equal to K.
44:46
For L different from K, that's this case. So the conditional probability is n sub L over capital N minus one. And the marginal probability is n sub K over capital N.
45:08
And then for L equal to K, these two things are the same.
45:21
So I get CL squared. And then the conditional probability now is n sub L minus one over capital N minus one. And the marginal probability is n sub L over capital N.
45:47
And the difference here between this and this is this minus one.
46:02
So let me separate that part out. So I'll take this part here and I'm going to hit control V. And then down here I'm going to hit control C.
46:24
I think you all know what I'm talking about. Oh, OK. You are paying attention. And then plus, I'll take the part here with, this is the minus one part here.
47:12
And then this part goes here. OK, now if I look at this and this, I guess look at these two, I could recombine, I could put this back in here.
47:30
This gives me, what's missing here? It's the diagonal term, right? It's the term L equal K that's missing here. What would the term L equal K look like? What would I have here?
47:41
CL squared, what would I have there? NL over N minus one, what would I have there? NL over N. Well, that's exactly what you have here. This is the missing diagonal term from here.
48:40
OK, let's see if we can make sense out of this.
48:48
I can erase this.
49:05
How do I simplify this?
49:24
I think I can rewrite this as, first I'll sum on K, and then here I'll sum on L.
50:10
And these two are equal, let's see why. When I multiply this term by this, I can just put the sum, forget this for a moment, I can just put the sum outside.
50:20
So this is just like when you integrate, if you have double integral, and it's F of X times G of Y, DX DY, you can integrate over one variable first, and then over the second one later. So this just becomes a double sum out here, L and K going from one to M, this times this.
50:42
But that's what you have here. Now let's cover up that part. Does this give this term? Well what's the power up, what do you get here? You get C K squared, and then N K over N, and over N minus one. Well that's the same as this, but with K instead of L.
51:04
I'm using the summation index K instead of L, but it's exactly the same stuff. So this can be rewritten this way.
51:26
Now factor out the N times N minus one, and what would be left for this part would be tau squared. And I get N K times C K squared for the second part.
51:51
Tau, remember, was, I guess I didn't write it.
52:08
I'll use the notation tau equals capital N times mu, or capital N times the expected value of x bar. And this first part is tau squared.
52:27
If I square this thing, that's exactly what I would get here.
52:48
Okay, and then using this relation, we can replace this by N mu squared over N minus one.
54:00
And this thing here is the expected value of X squared.
54:26
Because how do you compute the expected value of X squared? You take the values squared times the probability of taking on those values. So that would be N K over N. So this thing here is N times expected value of X I squared, which would be, keeping in mind that there's this N minus one here.
54:46
This N and this N cancel. And the expected value of a random variable squared is its variance plus its mean squared.
55:05
So if I have the expected value of X squared, it would be sigma squared plus mu squared. Okay, so if we finally combine the mu squared terms, I get N mu squared minus mu squared, over N minus one, minus sigma squared over N minus one.
55:32
That's mu squared minus sigma squared over N minus one. So that's the end of the computation of the expected value of Xi Xj.
55:51
And so we have to put that in here. If we do that, we get mu squared minus sigma squared over N minus one, minus mu squared.
56:05
Which is minus sigma squared over capital N minus one. So the answer for the covariance of Xi and Xj with simple random sampling is minus sigma squared over capital N minus one.
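Collecting the steps of that covariance computation into one chain (this just restates the board work above in symbols):

```latex
E(X_iX_j) = \frac{\big(\sum_l n_l c_l\big)^2 - \sum_l n_l c_l^2}{N(N-1)}
= \frac{\tau^2 - N(\sigma^2+\mu^2)}{N(N-1)}
= \mu^2 - \frac{\sigma^2}{N-1},
\qquad
\operatorname{Cov}(X_i,X_j) = E(X_iX_j) - \mu^2 = -\frac{\sigma^2}{N-1}
\quad (i \neq j).
```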
56:24
So that's the thing to remember here at the end. The next step after the break is we have to write down, we're writing the variance of X bar. Okay, I guess we can finish. The variance of X bar is this.
56:45
So every one of these terms is this. Minus sigma squared over N minus one, capital N minus one. So we put that in here every time. How many terms are in a sum like this? I J equal little one to N. Little n squared terms.
57:01
So we get little n squared times this divided by little n squared, so that cancels. And so what does this become? Minus sigma squared over capital N minus one. That's the variance of X bar. Variance of the sample mean, I think I made a little mistake somewhere along the way.
57:53
Let me fix it after the break. This isn't quite right. Sorry. This is correct, but I think I made a mistake somewhere.
58:06
So this is correct. Let's stop here and I'll finish it afterwards. How many terms have i equal to j? Little n of them. So how many have i different from j? Little n squared minus little n. So here we'd have n squared minus n over this thing, n squared.
58:28
Times this minus sigma squared over capital N minus one. Then we'd have plus one over N squared, little n squared.
58:41
Sum I equal one to N. Covariance of xi xi, that's also called the variance of xi. That's the variance of xi. Covariance of xi xi is the same as the variance of xi. So what's the variance of xi?
59:34
This is just doing what? Adding up all the values squared times the number of times they appear.
59:41
So this would be one over N times the sum I equal one to capital N, x sub i squared minus mu squared.
01:00:03
That's equal to 1 over capital N sum i equal 1 to capital N x of i squared minus 1 over capital N sum i equal 1 to capital N x of i, and then squaring afterwards, because this is mu.
01:00:22
And this is just 1 over N sum i equal 1 to capital N xi minus mu squared, which is sigma squared.
01:00:45
So the mean of xi is the population mean, and the variance of xi is the population variance. So this term here is sigma squared for all
01:01:03
terms in the sum. So I'd say that the variance of x bar is, well, this simplifies a bit, I think. You get minus little n minus 1 over little n, times sigma squared over capital N minus 1.
01:01:23
And then plus, we have little n terms. We're dividing by little n squared, so it'd be 1 over little n. And each term is sigma squared. Factoring out sigma squared, what do we get? We get sigma squared over, I'll factor out little n as
01:01:46
well, and then 1 minus little n minus 1 over capital N minus 1.
01:02:06
So that's the variance of x bar.
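A small simulation sketch (again with a made-up population) can be used to check this variance formula against repeated simple random samples:

```python
import random
from statistics import mean, pvariance

# Made-up population of N = 50 values
population = [random.gauss(70, 12) for _ in range(50)]
N = len(population)
n = 10

sigma2 = pvariance(population)                      # population variance (divide by N)
formula = (sigma2 / n) * (1 - (n - 1) / (N - 1))    # variance of x-bar with the correction

xbars = [mean(random.sample(population, n)) for _ in range(100_000)]
print(pvariance(xbars), formula)   # empirical variance of x-bar vs the formula
```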
01:02:26
So contrast this with the case of IID.
01:02:56
We did this computation before. This was just sigma squared over little n.
01:03:06
And what was the reason? Well, what's the variance of a sum of independent random variables? It's the sum of the covariances.
01:03:23
Or you could look back up here. What's the covariance of xi and xj when xi and xj are independent? It's 0. So this term isn't there. This term is there. The variance of each one is sigma squared. How many times are we adding it up? Little n times.
01:03:41
So we get, from the sum, little n times sigma squared. Divide by n squared, you get sigma squared over little n. So in the independent case, you get this. When you're sampling without replacement from a finite population, you get this extra term here. It's called a finite population correction.
01:04:21
And this is due to the dependence between the random variables. Well, how do you know it's from the dependence? Well, it's because the covariance of xi and xj for i different from j doesn't vanish. So this comes from the dependence in the random sample. Why do I say finite population correction?
01:04:40
What if the population that we're sampling from is infinite, capital N is infinite? What would that number be? 1 minus 0 or 1, right? So there'd be no correction. So there's a little bit of correction for the variance of x bar. It's what you would have if you had independence, but times
01:05:00
a little factor. So if capital N is a lot bigger than small n, this isn't a very big number. It's pretty close to 1, in fact, if capital N is a lot bigger than small n. So if you have a population of, I don't know, 100 million voters, we probably don't have 100 million voters in this country, but let's say we do.
01:05:23
And you sample 5,000 of them. This ratio would be something like 5,000 over 100 million, or more precisely 4,999 over 100 million minus 1, that is, 4,999 over 99,999,999, which is tiny.
01:05:42
So the correction factor, 1 minus that, is very, very close to 1, and the variance is essentially sigma squared over little n. The correction comes into play when little n over capital N is something like 0.2. OK, so let me just summarize, rather than go through more
01:06:07
computations, how these corrections come in when you're doing sampling.
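For that voter example, the finite population correction works out numerically as follows (my arithmetic, just making the "very close to 1" claim concrete):

```latex
1 - \frac{n-1}{N-1} = 1 - \frac{4{,}999}{99{,}999{,}999} \approx 1 - 0.00005 = 0.99995,
\qquad\text{so}\qquad
\operatorname{Var}(\bar X) \approx \frac{\sigma^2}{5{,}000}.
```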
01:06:40
The square root of this variance of x bar is called the standard error; it's how far off you might typically be from the thing you're estimating.
01:07:37
So the standard error would be the square root of the variance of the estimator.
01:07:52
So for mu, the standard error would be the square root of the variance of x bar. We just computed the variance of x bar. So the square root of the variance would be sigma over
01:08:02
little root n square root 1 minus little n minus 1 over capital N minus 1. It indicates how far x bar is on average from mu. So here you'd want the standard error to be small. So that means you'd probably want to take the sample
01:08:22
size big enough compared to the population variance so that the standard error isn't very big, so that x bar isn't varying much from mu on average. Next, the population total: tau was capital N times mu, so our estimator for it, call it T, is capital N times x bar.
01:08:44
So if we take the variance of T, that would be capital N squared times the variance of x bar. And then we take square roots, which just gets rid of the square.
01:09:17
So what would we get?
01:09:22
Estimating mu with x bar means you estimate tau, capital N times mu, with capital N times x bar. And then for the standard error of T we get sigma times capital N over the square root of little n,
01:09:47
times the square root of 1 minus little n minus 1 over capital N minus 1. In the proportion case, sigma squared is just p times 1
01:10:06
minus p. So we just replace that here. And coming back to the standard error of x bar, you'd get the square root of p times 1 minus
01:10:22
p over little n, times the finite population correction. So p hat here is still just x bar, but keep in mind, these x's are now
01:10:41
zeros or ones. So this is what you can read off from the computation here.
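To tie those three standard errors together in one place, here is a small Python sketch (the helper names and the example numbers are mine, not from the lecture):

```python
import math

def fpc(n, N):
    """Finite population correction factor inside the square root."""
    return 1 - (n - 1) / (N - 1)

def se_mean(sigma, n, N):
    """Standard error of x-bar under simple random sampling."""
    return sigma / math.sqrt(n) * math.sqrt(fpc(n, N))

def se_total(sigma, n, N):
    """Standard error of the estimated population total T = N * x-bar."""
    return N * se_mean(sigma, n, N)

def se_proportion(p, n, N):
    """Standard error of p-hat in the dichotomous case, where sigma^2 = p(1 - p)."""
    return math.sqrt(p * (1 - p) / n) * math.sqrt(fpc(n, N))

# Hypothetical numbers: population of 8,000, sample of 100, sigma = 0.8, p = 0.4
print(se_mean(0.8, 100, 8000))      # roughly 0.08
print(se_total(0.8, 100, 8000))     # roughly 640
print(se_proportion(0.4, 100, 8000))
```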
01:12:18
OK, recall the sample variance. If you have sample random variables x1 through xn, the
01:12:21
sample variance is defined as s squared equals 1 over n minus 1 times the sum of xi minus x bar, quantity squared. In the case when x1 through xn are independent and identically distributed, s squared is an unbiased
01:12:50
estimator of sigma squared.
01:13:30
That is, the expected value of s squared is sigma squared. Remember, an unbiased estimator of a parameter is
01:13:41
a random variable whose expected value is equal to that parameter. So the parameter here is sigma squared. The expected value of s squared is sigma squared. For us, in sampling without replacement, or it's called
01:14:02
simple random sampling, there is dependence. The random variables aren't independent anymore. And in this case, we have to correct by
01:14:23
the following means. We take s squared sub x bar to be s squared over n times 1 minus n over capital N. And this becomes an
01:14:42
unbiased estimator of the variance of x bar.
01:15:16
And this uses a computation we did a moment ago about the variance of x bar.
01:15:25
But I won't go through it. So we take this thing, divide by little n, and
01:15:46
multiply by 1 minus little n over capital N.
01:16:07
So keep this in.
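The computation the lecture skips can be sketched as follows (my own filling-in of the step, using the variance formula for x bar derived earlier):

```latex
E\Big[\sum_{i=1}^{n}(X_i-\bar X)^2\Big]
= \sum_{i=1}^{n}E(X_i^2) - n\,E(\bar X^2)
= n(\sigma^2+\mu^2) - n\big(\operatorname{Var}(\bar X)+\mu^2\big)
= (n-1)\,\sigma^2\,\frac{N}{N-1},
```
```latex
\text{so}\quad
E(s^2) = \sigma^2\,\frac{N}{N-1}
\quad\text{and}\quad
E\Big[\frac{s^2}{n}\Big(1-\frac{n}{N}\Big)\Big]
= \frac{\sigma^2}{n}\cdot\frac{N-n}{N-1}
= \operatorname{Var}(\bar X).
```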
01:16:48
So you don't always give just a point estimate for mu, as you do with x bar. Sometimes you say that with high probability, mu lies in
01:17:04
a certain interval. And we use a normal approximation for x bar to make a statement like that. So x bar is now not the average of independent random
01:17:21
variables, as in the central limit theorem.
01:17:45
There, you took the average of independent random variables, subtracted mu, and divided by sigma over root n; or you took the sum, subtracted n times mu, and divided by sigma times root n.
01:18:01
And this is approximately normal, when these were independent. We're no longer in the independent case, but when n is large enough, little n that is, and the ratio
01:18:23
little n over capital N is small, then this is
01:19:03
approximately normal, where sigma x bar is
01:19:20
the square root of the variance with the finite population correction. So the central limit theorem still holds. In the independent case, you need little n big enough for the approximation to be true. But here you also need little n over capital N to be small in order to get
01:19:43
enough independence between the random variables. And if both these hold, then this is approximately normal. Now here, I'm sorry, I have x bar.
01:20:01
So I'm dividing everything by n here. I forgot. So it's x bar minus mu over sigma x bar.
01:20:51
that parameter.
01:21:01
This means we select a number alpha between 0 and 1, close to 1, say like 0.95 or 0.9, and find a randomly located
01:21:31
interval I, such that the probability that mu is in I,
01:21:48
here the randomness is in I, not mu. Mu is a population parameter; that's a real, fixed number. But I is random. And we want the probability that mu is in I to be alpha.
01:22:04
So we don't say exactly what, we don't give exactly a point estimate for mu, but we give an interval, and we're pretty sure, like alpha times 100% of the time, mu will be in that interval. This is called alpha percent confidence interval.
01:22:29
I'm sorry, 100 alpha percent confidence interval.
01:22:41
So it's called a 100 alpha percent confidence interval. OK, so let's use x bar and sigma x bar to get a confidence interval for mu in a specific example.
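The general recipe, which the example below instantiates, is to standardize x bar and unwind the inequality (Phi here is the standard normal distribution function from the table):

```latex
P\!\left(-z_{\alpha} \le \frac{\bar X - \mu}{\sigma_{\bar X}} \le z_{\alpha}\right) \approx \alpha
\;\Longrightarrow\;
I = \big[\bar X - z_{\alpha}\,\sigma_{\bar X},\; \bar X + z_{\alpha}\,\sigma_{\bar X}\big],
\qquad
\Phi(z_{\alpha}) = \frac{1+\alpha}{2}.
```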
01:23:21
Suppose we have a population of 8,000 condominiums, and
01:23:41
you're going to build a parking lot for the people who occupy these condominiums, so you want to know how many cars they have. So you sample 100 residents, and you find that the
01:24:08
average number of cars that they have is 1.6, where xi is
01:24:20
1 if ith resident has car, 0 otherwise. Well, OK, I'm sorry.
01:24:41
It's xi is the number of cars ith resident has. So maybe the ith resident has a Prius for weekdays, and then a Lamborghini for weekends, so that'd be 2.
01:25:04
And then on average, in this small sample, a resident has 1.6 cars. Suppose s, as I've computed before, s squared is 1 over n
01:25:28
minus 1 times the sum of the squares of xi minus x bar. Suppose s is equal to 0.8, so that s x bar is 0.8 over
01:25:43
square root of 100 times the square root of 1 minus n over capital N. Now I'm going to fudge a little bit here.
01:26:06
This number, n over capital N, is almost the same as n minus 1 over capital N minus 1. So it'd be 0.8 over 10 times the square root of 1 minus 100 over 8,000. 100 over 8,000 is what?
01:26:22
1 over 80. So 1 minus that is 79 over 80, which is very, very near to 1.
01:26:40
So let's replace it by 1. So 0.8 over 10 would be 0.08. 0.8 over 10 times this would be 0.08. Oops, I don't want to erase this.
01:27:07
So x bar minus 1.6 over 0.08 is approximately normal.
01:27:24
I think we could say that 1 over 80 is small. And little n equal 100 is big enough. So we want to find, let's say, a 95% confidence
01:27:45
interval for mu.
01:28:08
So alpha here, 100 alpha would be 95. So alpha is 0.95. x bar is our estimator for mu.
01:28:40
And the probability, we look for a number of z of alpha
01:28:49
such that the probability that minus z of alpha is less than or equal to x bar minus 1.6 over 0.08
01:29:01
is less than or equal to z of alpha is equal to alpha, where alpha is 0.95. Then if we solve this inequality for x bar,
01:29:23
we get what? 0.08 times 1.6 minus z of alpha on the left.
01:29:41
Sorry, I did the wrong order. 1.6 minus 0.08 z of alpha is less than or equal to x bar is less than or equal to 1.6 plus 0.08 times z of alpha
01:30:03
is equal to alpha, or 0.95. And then our confidence interval is the interval from here to here,
01:30:41
where alpha is 0.95. Because the probability that x bar is in there is 0.95, the probability that mu is in there is also 0.95. So that's our 95% confidence interval. We still have to find the value of z of alpha
01:31:03
that makes this probability equal to 0.95. So how do we do that? We look in the table.
01:31:21
First of all, let's, for simplicity, write this as Z, where Z is a standard normal random variable, mean 0 and variance 1. And we have to find the value z of alpha
01:31:48
that makes this true from looking in the table, we know we can only find values of probabilities
01:32:06
of this type, the probability that a normal random variable is less than or equal to some positive number. So I write this as the probability that capital Z is smaller than that value, minus the probability capital Z is smaller than this value. Again, this is just using probability of A union B
01:32:22
is probability of A plus probability of B. When A and B are disjoint, this would be A. This would be B. A union B is this. And A and B are disjoint because z can't simultaneously
01:32:43
be between these two numbers and less than this number. OK, but now I know I can't read off things like this probability from the table when you have a negative number on the right-hand side. So I use the symmetry of the normal distribution.
01:33:11
This is the area under the normal density
01:33:21
from minus infinity to minus z of alpha. That would be equal to the area under the normal density from z of alpha to infinity, because the density is an even function. And now I can't read that off from a table directly, but I can read off probabilities of the form Z less than or equal to something. So I use: the probability of A is 1 minus the probability of A complement.
01:33:52
OK, and then this becomes up here. I can write it twice the probability that capital Z is less than or equal to Z of alpha minus 1.
01:34:03
OK, and now what did I want to find? I want to find the Z of alpha that makes this true. So let me solve this for probability that Z is less than or equal to Z of alpha. I want to find a Z of alpha.
01:34:26
So this is true. I add 1 to both sides and divide by 2. OK, now if alpha is 0.95, 1 plus alpha over 2 is 1.95 over 2, which would be 0.975.
01:34:55
So now I look in table 2 for the normal,
01:35:09
and I look in there to find 0.975, and I find 0.975 at 1.96. Z of alpha is 1.96.
01:35:21
It's in the, if I go to the left side, I get 1.9. And over to the column with 0.06, I get 0.975. So Z of alpha is 1.96. Z of alpha is 1.96. OK, so now let me write down the 95% confidence
01:35:42
interval for the average number of cars. Is this here?
01:36:01
It's 1.6 minus 0.08 times 1.96 to 1.6 plus 0.08 times 1.96.
01:36:35
Average number of cars per condo. The probability that the average number of cars lies in that interval is 0.95.
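Putting the whole condominium example into a few lines of Python (the numbers are the lecture's; using 1.96 from the normal table, the interval works out to roughly 1.44 to 1.76 cars):

```python
import math

N = 8000       # population size (condominium residents)
n = 100        # sample size
xbar = 1.6     # sample mean number of cars
s = 0.8        # sample standard deviation
z = 1.96       # z_alpha with Phi(z_alpha) = (1 + 0.95) / 2 = 0.975, from the table

# Estimated standard error of x-bar with the finite population correction
s_xbar = s / math.sqrt(n) * math.sqrt(1 - n / N)

lower = xbar - z * s_xbar
upper = xbar + z * s_xbar
print(s_xbar)          # about 0.08
print(lower, upper)    # roughly (1.44, 1.76), the 95% confidence interval
```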
01:36:44
And maybe you've heard of margin of error and things like that in public opinion polls. That's what they're talking about. They say they're 95% sure that the results are correct; that means they found a 95% confidence interval. They talk about a margin of error of plus or minus 3%.
01:37:00
So they're giving a confidence interval statement and things like that. So we'll do more examples like this, but that's it for today.