Lecture 13 Notes

Tuesday, March 8, 2011

Confidence Intervals (Chapter 7)

Three cases for confidence intervals for the population mean (μ):

When the population distribution is Normal and σ is known
When we have a large sample (the population distribution may not be normal and σ may not be known)
When we have a small sample from a normal distribution and σ is unknown

Confidence Interval: A range of values $(l,u)$ such that we are $x$ % sure that the mean $\mu$ of a population lies within that range.

Note: a confidence interval is random, because its endpoints are computed from a random sample, but μ is fixed: it never changes. Therefore, any particular interval we compute might not contain μ.
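A short simulation can make this note concrete. The sketch below assumes a hypothetical normal population (μ = 50, σ = 10, values chosen only for illustration), repeatedly draws samples, builds the interval $\bar{X} \pm z\,\sigma/\sqrt{n}$ for each one, and reports the fraction of intervals that contain μ. The function name and all numbers are illustrative, not from the lecture.

```python
import random
from statistics import NormalDist, mean

def coverage(mu=50.0, sigma=10.0, n=25, conf=0.95, trials=2000, seed=0):
    """Fraction of simulated intervals that contain the true mean mu."""
    rng = random.Random(seed)                       # fixed seed for repeatability
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)    # about 1.96 when conf = 0.95
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = mean(sample)
        if x_bar - half_width <= mu <= x_bar + half_width:
            hits += 1
    return hits / trials

print(coverage())   # close to 0.95, but not 1.0: some intervals miss mu
```

The fraction should land near the nominal confidence level, which is what "95% sure" refers to: the procedure captures μ about 95% of the time, while any single interval either contains μ or does not.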

Case 1: Normal Population Distribution

Given a sample $X_{1},\ldots ,X_{n}$ of size $n$ from a normal distribution with (unknown) mean $\mu$ and (known) standard deviation $\sigma$ , an $x$ % confidence interval for $\mu$ is:

${\bar {X}}\pm z_{\alpha /2}{\frac {\sigma }{\sqrt {n}}}=\left({\bar {X}}-z_{\alpha /2}{\frac {\sigma }{\sqrt {n}}},\ {\bar {X}}+z_{\alpha /2}{\frac {\sigma }{\sqrt {n}}}\right)$

where $n$ is the sample size and $z_{\alpha /2}$ is the value such that the area under the Z-curve (the standard normal density) from $-z_{\alpha /2}$ to $z_{\alpha /2}$ equals the confidence level $x$ , written as a proportion (for example, 0.95 for 95%).

In α notation: $\alpha =1-x$ . For example, a 95% confidence interval has $\alpha =0.05$ and $z_{\alpha /2}=z_{0.025}=1.96$ .
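As a minimal sketch of the Case 1 formula, the code below computes $z_{\alpha /2}$ from the confidence level and then the interval itself. It uses `statistics.NormalDist` from the Python standard library; the function names and the sample numbers (mean 100, σ = 15, n = 36) are hypothetical.

```python
from statistics import NormalDist

def z_critical(conf):
    """z_{alpha/2} for a confidence level given as a proportion (e.g. 0.95)."""
    alpha = 1 - conf
    return NormalDist().inv_cdf(1 - alpha / 2)

def z_interval(x_bar, sigma, n, conf=0.95):
    """Known-sigma interval: x_bar +/- z_{alpha/2} * sigma / sqrt(n)."""
    half_width = z_critical(conf) * sigma / n ** 0.5
    return x_bar - half_width, x_bar + half_width

print(round(z_critical(0.95), 2))   # 1.96
# Hypothetical numbers, not from the lecture: x_bar = 100, sigma = 15, n = 36
print(z_interval(100, 15, 36))
```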

Note: A larger sample size will decrease the width of the confidence interval, since the width is proportional to $1/{\sqrt {n}}$ .

Controlling Interval Width

Suppose we want the width of our confidence interval to be at most $w$ :

${\begin{aligned}2\cdot z_{\alpha /2}{\frac {\sigma }{\sqrt {n}}}&\leq w\\\left(2z_{\alpha /2}{\frac {\sigma }{w}}\right)^{2}&\leq n\end{aligned}}$

This means that we need a sample of size at least $\left(2z_{\alpha /2}{\frac {\sigma }{w}}\right)^{2}$ from the population in order to keep the same confidence level with an interval of width at most $w$ .
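The bound on $n$ can be evaluated directly. The following sketch rounds the bound up to the next whole number, since a sample size must be an integer; the function name and the inputs (σ = 15, w = 5) are hypothetical.

```python
import math
from statistics import NormalDist

def required_n(sigma, w, conf=0.95):
    """Smallest n with 2 * z_{alpha/2} * sigma / sqrt(n) <= w."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    return math.ceil((2 * z * sigma / w) ** 2)

# Hypothetical inputs: sigma = 15, desired total width w = 5
print(required_n(sigma=15, w=5))
# Halving the width roughly quadruples the required sample size
print(required_n(sigma=15, w=2.5))
```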

Case 2: Large Sample

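We do not have a normal population, but because the sample is large, the Central Limit Theorem (CLT) lets us treat the distribution of the sample average as approximately normal. With a large sample, the sample standard deviation $s$ is also a good estimate of σ, so we can substitute $s$ for the population standard deviation: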

${\bar {X}}\pm z_{\alpha /2}\cdot {\frac {s}{\sqrt {n}}}$
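A sketch of the large-sample interval applied to raw data follows. The data are simulated from a skewed (exponential) distribution purely for illustration, to mimic a non-normal population; nothing here comes from the lecture. `statistics.stdev` uses the $n-1$ divisor, which matches the usual definition of $s$.

```python
import random
from statistics import NormalDist, mean, stdev

def large_sample_ci(data, conf=0.95):
    """x_bar +/- z_{alpha/2} * s / sqrt(n), intended for large n."""
    n = len(data)
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    half_width = z * stdev(data) / n ** 0.5   # stdev uses the n - 1 divisor
    x_bar = mean(data)
    return x_bar - half_width, x_bar + half_width

# Illustrative, simulated data: skewed population with true mean 3.0
rng = random.Random(1)
data = [rng.expovariate(1 / 3.0) for _ in range(200)]
print(large_sample_ci(data))
```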

Case 3: Normal Population with Small Sample

Standard deviation σ is unknown.

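When σ is known, the standardized sample average follows a standard normal distribution. When σ is replaced by the sample standard deviation $s$ and the sample is small, the standardized quantity instead follows a T distribution with $n-1$ degrees of freedom:

$Z={\frac {{\bar {X}}-\mu }{\sigma /{\sqrt {n}}}}\sim N(0,1)$ , but $T={\frac {{\bar {X}}-\mu }{s/{\sqrt {n}}}}\sim t_{n-1}$

The resulting confidence interval is:

${\bar {X}}\pm t_{n-1,\alpha /2}\cdot {\frac {s}{\sqrt {n}}}$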

T distribution

PDF: $f(x)={\frac {\Gamma \left({\frac {\nu +1}{2}}\right)}{{\sqrt {\nu \pi }}\,\Gamma \left({\frac {\nu }{2}}\right)}}\left(1+{\frac {x^{2}}{\nu }}\right)^{-{\frac {\nu +1}{2}}}$

The T distribution is a more spread-out version of the Z-curve. For sample size $n$ , $T$ has $\nu =n-1$ degrees of freedom (df).

Similar to α notation, $t_{\nu ,\alpha }$ implies that $P\left(T_{\nu }>t_{\nu ,\alpha }\right)=\alpha$ .
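To illustrate how T critical values compare with Z critical values, here is a rough sketch using only the Python standard library. Since the standard library has no T quantile function, it integrates the density numerically (Simpson's rule) and finds $t_{\nu ,\alpha }$ by bisection. The helper names are illustrative, and in practice one would normally read the value from a T table or a statistics package.

```python
import math
from statistics import NormalDist

def t_pdf(x, nu):
    """Density of the T distribution with nu degrees of freedom."""
    log_c = (math.lgamma((nu + 1) / 2) - math.lgamma(nu / 2)
             - 0.5 * math.log(nu * math.pi))
    return math.exp(log_c - (nu + 1) / 2 * math.log1p(x * x / nu))

def t_upper_tail(x, nu, steps=2000):
    """P(T_nu > x) for x >= 0, via Simpson's rule on [0, x] (steps is even)."""
    if x == 0:
        return 0.5
    h = x / steps
    total = t_pdf(0, nu) + t_pdf(x, nu)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, nu)
    return 0.5 - total * h / 3

def t_critical(alpha, nu):
    """t_{nu, alpha}: value with upper-tail area alpha (0 < alpha < 0.5)."""
    lo, hi = 0.0, 1.0
    while t_upper_tail(hi, nu) > alpha:   # expand until the tail is small enough
        hi *= 2
    for _ in range(60):                   # bisection on the tail area
        mid = (lo + hi) / 2
        if t_upper_tail(mid, nu) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(t_critical(0.025, 4))              # small sample: well above 1.96
print(t_critical(0.025, 30))             # larger df: closer to 1.96
print(NormalDist().inv_cdf(0.975))       # z_{0.025}
```

The T critical value is larger than the corresponding Z value for every ν, so T intervals are wider, and the gap shrinks as the degrees of freedom grow.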

Examples

Apartment Rental Fees

Suppose we want to find the average cost to rent a 1-bedroom apartment in College Station.

Let $X$ represent this rental cost. We want to find μ for the population, but note that we do not know the real value of μ or the distribution of $X$ .

Suppose we sample 32 apartment facilities and find that our sample average (a statistic) is $450. We can estimate μ with the expression:

${\hat {\mu }}=450$

Is $450 a good estimate? What about $460? $420?

What if we had an interval $(l,u)$ , from a lower bound to an upper bound, for μ such that we are 95% sure that μ is between $l$ and $u$ ?

$P(\mu \in (l,u))=0.95$

This interval is called a confidence interval and can be generated from random sample values.

GPR at A&M

Suppose the average GPR at Texas A&M is 3.0. Dr. Jun decides to ask students at a bar on Northgate about their GPR. She finds with 95% confidence that students who go to bars have a mean GPR between 2.5 and 2.9. What does this mean? Because the entire interval lies below 3.0, the data suggest that students who go to bars tend to have a lower GPR than the university average.

Lecture 14 Notes

Thursday, March 10, 2011

Confidence Interval of a population proportion for a large sample

Each member of the population is classified as either a success or a failure.
The population proportion $p$ of successes can be estimated by:

${\hat {p}}={\frac {X}{n}}$

where $X$ is the number of successes in a sample of size $n$ .

A 100 × (1 − α)% confidence interval for $p$ is:

${\hat {p}}\pm z_{\alpha /2}{\sqrt {\frac {{\hat {p}}(1-{\hat {p}})}{n}}}$

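Thus, to control the width of our confidence interval (so that the width is at most $w$ ), we need:

$n\geq {\frac {4{z_{\alpha /2}}^{2}{\hat {p}}(1-{\hat {p}})}{w^{2}}}$

Since ${\hat {p}}(1-{\hat {p}})\leq {\frac {1}{4}}$ , the right-hand side is at most ${\frac {{z_{\alpha /2}}^{2}}{w^{2}}}$ , so choosing $n\geq {\frac {{z_{\alpha /2}}^{2}}{w^{2}}}$ is always sufficient, even before ${\hat {p}}$ is known.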
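The sample-size calculation for a proportion can be sketched as follows. Because $\hat{p}(1-\hat{p})$ is never larger than 1/4, using 1/4 gives a conservative sample size that works for any $\hat{p}$; a preliminary estimate of $\hat{p}$, if one is available, gives a smaller requirement. The function name and example inputs are hypothetical.

```python
import math
from statistics import NormalDist

def n_for_proportion(w, conf=0.95, p_hat=None):
    """Sample size so that the interval for p has total width at most w."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    # With no preliminary estimate, use the worst case p_hat * (1 - p_hat) = 1/4
    pq = 0.25 if p_hat is None else p_hat * (1 - p_hat)
    return math.ceil(4 * z * z * pq / (w * w))

print(n_for_proportion(0.1))               # conservative, no estimate of p
print(n_for_proportion(0.1, p_hat=0.25))   # smaller, using a preliminary estimate
```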

Example

A random sample of 539 households from a city was selected, and 133 of these households were found to own at least one firearm. Give a 95% confidence interval for the proportion of all households in this city that own at least one firearm.

${\hat {p}}={\frac {133}{539}}$

${\mbox{CI}}={\frac {133}{539}}\pm 1.96{\sqrt {\frac {\left({\frac {133}{539}}\right)\left(1-{\frac {133}{539}}\right)}{539}}}$
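The expression above can be evaluated numerically. A minimal sketch, using the 133 successes out of 539 households from the example (the function name is illustrative):

```python
from statistics import NormalDist

def proportion_ci(successes, n, conf=0.95):
    """Large-sample interval: p_hat +/- z_{alpha/2} * sqrt(p_hat (1 - p_hat) / n)."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    half_width = z * (p_hat * (1 - p_hat) / n) ** 0.5
    return p_hat - half_width, p_hat + half_width

lower, upper = proportion_ci(133, 539)
print(f"{lower:.3f} to {upper:.3f}")   # 0.210 to 0.283
```

So we are 95% confident that roughly 21% to 28% of households in the city own at least one firearm.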

Sample Variance Distribution

If the population follows a normal distribution, the following scaled form of the sample variance $s^{2}$ follows a chi-squared distribution with $n-1$ degrees of freedom:

${\frac {(n-1)s^{2}}{\sigma ^{2}}}\sim \chi _{n-1}^{2}$

Therefore we can construct a 100 × (1 − α)% confidence interval for the population variance $\sigma ^{2}$ , where $\chi _{\alpha ,\ n-1}^{2}$ denotes the value with upper-tail area α:

${\begin{array}{rcl}\chi _{1-\alpha /2,\ n-1}^{2}\leq &{\frac {(n-1)s^{2}}{\sigma ^{2}}}&\leq \chi _{\alpha /2,\ n-1}^{2}\\{\frac {(n-1)s^{2}}{\chi _{\alpha /2,\ n-1}^{2}}}\leq &\sigma ^{2}&\leq {\frac {(n-1)s^{2}}{\chi _{1-\alpha /2,\ n-1}^{2}}}\end{array}}$
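The variance interval can be sketched numerically as well. The Python standard library has no chi-squared quantile function, so this sketch computes the chi-squared CDF from a series for the regularized incomplete gamma function and inverts it by bisection; a statistics package or a chi-squared table would normally be used instead. The helper names and the example values ($n=10$, $s^{2}=4$) are hypothetical.

```python
import math

def chi2_cdf(x, k):
    """P(chi2_k <= x), via the series for the regularized lower incomplete gamma."""
    if x <= 0:
        return 0.0
    a, h = k / 2, x / 2
    term = total = 1.0
    n = 0
    while term > 1e-16 * total:
        n += 1
        term *= h / (a + n)
        total += term
    return math.exp(-h + a * math.log(h) - math.lgamma(a + 1)) * total

def chi2_upper_critical(alpha, k):
    """Value c with upper-tail area alpha, P(chi2_k > c) = alpha, by bisection."""
    lo, hi = 0.0, 1.0
    while 1 - chi2_cdf(hi, k) > alpha:    # expand until the tail is small enough
        hi *= 2
    for _ in range(200):
        mid = (lo + hi) / 2
        if 1 - chi2_cdf(mid, k) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def variance_ci(s2, n, conf=0.95):
    """Interval for sigma^2 from a normal sample with sample variance s2."""
    alpha = 1 - conf
    upper_crit = chi2_upper_critical(alpha / 2, n - 1)       # chi^2_{alpha/2, n-1}
    lower_crit = chi2_upper_critical(1 - alpha / 2, n - 1)   # chi^2_{1-alpha/2, n-1}
    return (n - 1) * s2 / upper_crit, (n - 1) * s2 / lower_crit

# Hypothetical example: n = 10 observations with sample variance 4
print(variance_ci(4.0, 10))
```

Note that the interval is not symmetric around $s^{2}$ , because the chi-squared distribution is skewed to the right.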

χ² Distribution

Only takes positive values (similar to gamma distribution).

PDF: $f(x)={\frac {1}{2^{k/2}\Gamma (k/2)}}\;x^{k/2-1}e^{-x/2}$ for $x>0$ , where $k$ is the number of degrees of freedom.

STAT 211 Topic 6
