STAT 211 Topic 8

Lecture 16 Notes

Tuesday, April 5, 2011


Comparing 2 Samples

In Topic 7, we built confidence intervals and hypothesis tests that compare a single sample with its population. Now we compare two samples with each other.

  • The two samples ($X_1,\ldots,X_m$ and $Y_1,\ldots,Y_n$) must be independent and come from separate populations


We use the difference in sample averages $\bar{X}-\bar{Y}$ to estimate the difference of population means $\mu_X-\mu_Y$, and compare it with the proposed difference $\Delta_0$:

$$\begin{aligned} E(\bar{X}-\bar{Y}) &= E(\bar{X}) - E(\bar{Y}) = \mu_X-\mu_Y \\ V(\bar{X}-\bar{Y}) &= V(\bar{X}) + V(\bar{Y}) = \frac{\sigma_X^2}{m} + \frac{\sigma_Y^2}{n} \end{aligned}$$

The variances add (rather than subtract) because the two samples are independent.

Cases 1&2: Normal Population or Large Sample

When both populations are Normal, $\bar{X}-\bar{Y}$ is Normal; when both sample sizes are large, $\bar{X}-\bar{Y}$ is approximately Normal by the Central Limit Theorem.

Use the same methods as in STAT 211 Topic 7, substituting $\mu = \mu_X-\mu_Y$ and $\mu_0 = \Delta_0$, and replacing the variance of the sample mean with $\tfrac{\sigma_X^2}{m}+\tfrac{\sigma_Y^2}{n}$.

$$\begin{aligned} \bar{X}-\bar{Y} & \sim \mathrm{Normal}\left(\mu_X-\mu_Y,\ \frac{\sigma_X^2}{m}+\frac{\sigma_Y^2}{n}\right) \\ z & = \frac{\bar{X} - \bar{Y} - \Delta_0}{\sqrt{\frac{\sigma_X^2}{m} + \frac{\sigma_Y^2}{n}}} \end{aligned}$$


Example

A realtor from the Northeast claims that houses there are more valuable (sell at a higher price) than anywhere else in the US.

  • $\mu_1$: average sales price in the Northeast
  • $\mu_2$: average sales price anywhere but the Northeast
  • $H_0: \mu_1 - \mu_2 = 0$
  • $H_a: \mu_1 - \mu_2 > 0$
  • Test statistic: $z = \tfrac{\bar{X} - \bar{Y} - 0}{\sqrt{s_X^2/m+s_Y^2/n}}$, where $\bar{X}$ and $\bar{Y}$ are the sample means for the Northeast and for everywhere else
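
As a rough illustration of the example above, the following Python sketch computes the large-sample z statistic and an upper-tailed p-value from summary statistics. The function name `two_sample_z` and all of the numbers are made up for illustration; they are not data from the lecture.

```python
import math
from statistics import NormalDist


def two_sample_z(xbar, s_x, m, ybar, s_y, n, delta0=0.0):
    """Return (z, upper-tailed p-value) for H0: mu_X - mu_Y = delta0."""
    # Standard error of X-bar minus Y-bar, with the sample variances
    # standing in for the unknown population variances (large samples).
    se = math.sqrt(s_x**2 / m + s_y**2 / n)
    z = (xbar - ybar - delta0) / se
    # Ha: mu_X - mu_Y > delta0, so the p-value is the upper tail area.
    p_value = 1.0 - NormalDist().cdf(z)
    return z, p_value


# Hypothetical summary statistics (prices in thousands of dollars).
z, p_value = two_sample_z(xbar=250.0, s_x=60.0, m=40,
                          ybar=220.0, s_y=50.0, n=50)
print(f"z = {z:.3f}, p-value = {p_value:.4f}")
```

A small p-value here would support the realtor's claim; with real data, the conclusion depends entirely on the observed means, standard deviations, and sample sizes.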


Case 3: Small Sample from Normal Population

The procedure depends on whether the two population variances (which are usually unknown) are equal:

$$\begin{cases} \sigma_X^2 = \sigma_Y^2 & \text{pooled t-test} \\ \sigma_X^2 \ne \sigma_Y^2 & \text{unpooled t-test} \end{cases}$$

Rule of thumb: if the smaller standard deviation is greater than half of the larger one, we treat the two variances as equal. With $\sigma_1$ the smaller and $\sigma_2$ the larger standard deviation (estimated in practice by the sample standard deviations):

$$\sigma_1 > \frac{\sigma_2}{2}$$

Pooled Sample Variance

$$s_p^2 = \frac{(m-1)s_X^2 + (n-1)s_Y^2}{m+n-2}$$

Use the usual t-test with $\nu = m+n-2$ degrees of freedom:

$$t = \frac{\bar{X} - \bar{Y} - \Delta_0}{s_p\sqrt{\frac{1}{m} + \frac{1}{n}}}$$
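
The pooled calculation can be sketched in Python as follows. This is a minimal illustration using only the standard library; the helper name `pooled_t` and the two tiny samples are invented for the example, and looking up the p-value in a t table with $\nu$ degrees of freedom is left to the reader.

```python
import math
from statistics import mean, variance


def pooled_t(x, y, delta0=0.0):
    """Return (t, degrees of freedom, pooled variance) for two samples."""
    m, n = len(x), len(y)
    # Pooled variance: the two sample variances weighted by their df.
    sp2 = ((m - 1) * variance(x) + (n - 1) * variance(y)) / (m + n - 2)
    t = (mean(x) - mean(y) - delta0) / math.sqrt(sp2 * (1 / m + 1 / n))
    return t, m + n - 2, sp2


# Made-up samples; the smaller sample SD (about 1.58) exceeds half of
# the larger one (about 2.58), so the rule of thumb allows pooling.
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8]
t, df, sp2 = pooled_t(x, y)
print(f"t = {t:.4f}, df = {df}, pooled variance = {sp2:.4f}")
```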

Unpooled Sample Variance

The t-test proceeds as usual, but with the two sample variances kept separate in the standard error and a more complicated formula for the degrees of freedom (this is why the pooled test is more common):

$$\begin{aligned} t & = \frac{\bar{X} - \bar{Y} - \Delta_0}{\sqrt{\frac{s_X^2}{m} + \frac{s_Y^2}{n}}} \\ \nu & = \frac{\left(\frac{s_X^2}{m} + \frac{s_Y^2}{n}\right)^2}{\frac{\left(s_X^2/m\right)^2}{m-1} + \frac{\left(s_Y^2/n\right)^2}{n-1}} \end{aligned}$$
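
A minimal Python sketch of the unpooled calculation, assuming the standard error $\sqrt{s_X^2/m + s_Y^2/n}$ and the approximate degrees of freedom built from the same two terms. The name `unpooled_t` and the data are hypothetical. The resulting $\nu$ is generally not an integer; a common convention is to round it down before using a t table.

```python
import math
from statistics import mean, variance


def unpooled_t(x, y, delta0=0.0):
    """Return (t, approximate degrees of freedom) without pooling."""
    m, n = len(x), len(y)
    vx = variance(x) / m  # s_X^2 / m
    vy = variance(y) / n  # s_Y^2 / n
    t = (mean(x) - mean(y) - delta0) / math.sqrt(vx + vy)
    # Approximate df: both terms in the denominator are squared.
    nu = (vx + vy) ** 2 / (vx**2 / (m - 1) + vy**2 / (n - 1))
    return t, nu


# Made-up samples with clearly different spreads.
x = [1, 2, 3, 4, 5]
y = [2, 10, 18, 26]
t, nu = unpooled_t(x, y)
print(f"t = {t:.4f}, nu = {nu:.2f}, rounded down: {math.floor(nu)}")
```

The approximate $\nu$ always falls between the smaller of $m-1$ and $n-1$ and the pooled value $m+n-2$, which gives a quick sanity check on the arithmetic.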


Paired Data

Data are paired when the two samples are related to each other through a third variable (e.g. two children of the same mother, or one student who takes two exams), so the samples are not independent.

Use paired t-test:

  1. calculate the differences between the paired observations: $D_i = X_i - Y_i$
  2. calculate the average $\bar{D}$ and standard deviation $s_D$ of the differences
  3. use a regular one-sample t-test on the differences, with $n-1$ degrees of freedom: $t = \tfrac{\bar{D} - \Delta_0}{s_D/\sqrt{n}}$


Confidence Interval

$$\bar{D} \pm t_{\alpha/2,\ n-1} \frac{s_D}{\sqrt{n}}$$
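
The paired procedure and its confidence interval can be sketched as below. The exam scores are invented, the helper names are hypothetical, and the critical value $t_{0.025,\ 4} = 2.776$ is read from a standard t table rather than computed, since the Python standard library has no t distribution.

```python
import math
from statistics import mean, stdev


def paired_t(x, y, delta0=0.0):
    """Return (d_bar, s_d, t, df) for paired samples x and y."""
    if len(x) != len(y):
        raise ValueError("paired samples must have the same length")
    d = [xi - yi for xi, yi in zip(x, y)]  # step 1: differences
    n = len(d)
    d_bar, s_d = mean(d), stdev(d)         # step 2: mean and SD
    t = (d_bar - delta0) / (s_d / math.sqrt(n))  # step 3: one-sample t
    return d_bar, s_d, t, n - 1


def paired_ci(d_bar, s_d, n, t_crit):
    """Confidence interval for the mean difference, given a table value."""
    half_width = t_crit * s_d / math.sqrt(n)
    return d_bar - half_width, d_bar + half_width


# Made-up scores for five students on two exams.
exam1 = [78, 85, 90, 72, 88]
exam2 = [74, 83, 91, 70, 84]
d_bar, s_d, t, df = paired_t(exam1, exam2)
lo, hi = paired_ci(d_bar, s_d, n=len(exam1), t_crit=2.776)  # 95%, df = 4
print(f"t = {t:.3f} on {df} df; 95% CI = ({lo:.3f}, {hi:.3f})")
```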


Comparing Two Population Proportions

Lecture 17 Notes

Suppose we have two sample proportions $\hat{p}_X$ and $\hat{p}_Y$, based on samples of sizes $m$ and $n$ from two different populations.

We are interested in the difference between the two proportions, which is approximately Normal for large samples. Therefore, we can standardize and perform a z-test with the following parameters:

  • $H_0$: $p_X-p_Y = \Delta_0$
  • $H_a$: $\begin{cases}p_X - p_Y \ne \Delta_0 \\ p_X - p_Y > \Delta_0 \\ p_X - p_Y < \Delta_0 \end{cases}$
  • Test Statistic: $z = \frac{\hat{p}_X - \hat{p}_Y - \Delta_0}{\sqrt{\hat{p}(1-\hat{p})\left(\frac{1}{m} + \frac{1}{n}\right)}}$

The standard error depends on $p_X$ and $p_Y$, which we don't know, so we estimate them with the pooled sample proportion $\hat{p}$ (appropriate for the usual case $\Delta_0 = 0$, where $H_0$ says the two proportions are equal):

$$\hat{p} = \frac{m}{m+n} \hat{p}_X + \frac{n}{m+n} \hat{p}_Y$$
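
A short Python sketch of the two-proportion z-test with the pooled estimate, for the case $\Delta_0 = 0$. The function name `two_proportion_z` and the counts are hypothetical. Note that the weighted average defining $\hat{p}$ is the same as total successes divided by total trials.

```python
import math
from statistics import NormalDist


def two_proportion_z(x_successes, m, y_successes, n):
    """Return (pooled p-hat, z, two-sided p-value) for H0: p_X = p_Y."""
    p_x = x_successes / m
    p_y = y_successes / n
    # Pooled estimate: sample proportions weighted by sample size.
    p_pool = (m / (m + n)) * p_x + (n / (m + n)) * p_y
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / m + 1 / n))
    z = (p_x - p_y) / se
    # Ha: p_X != p_Y, so both tails count toward the p-value.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_pool, z, p_value


# Made-up counts: 60 successes out of 100 versus 45 out of 100.
p_pool, z, p_value = two_proportion_z(60, 100, 45, 100)
print(f"pooled p-hat = {p_pool:.3f}, z = {z:.3f}, p-value = {p_value:.4f}")
```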


Review

(See STAT 211 Topic 3)

In general, the exact distribution of the success counts behind our sample proportions is Binomial (subscripts 1 and 2 here correspond to $X$ and $Y$ above):

$$\hat{p}_1 = \frac{\sum_{i=1}^m X_i}{m}, \qquad \hat{p}_2 = \frac{\sum_{i=1}^n Y_i}{n}$$

$$\sum_{i=1}^m X_i \sim \mathrm{Bin}(m, p_1), \qquad \sum_{i=1}^n Y_i \sim \mathrm{Bin}(n, p_2)$$

$$E\left(\sum_{i=1}^m X_i\right) = mp_1, \qquad E\left(\sum_{i=1}^n Y_i\right) = np_2$$

$$V\left(\sum_{i=1}^m X_i\right) = mp_1(1-p_1), \qquad V\left(\sum_{i=1}^n Y_i\right) = np_2(1-p_2)$$

Dividing the counts by the sample sizes and applying the Normal approximation for large $m$ and $n$, the difference is approximately distributed as:

$$\hat{p}_1 - \hat{p}_2 \sim \mathrm{Normal}\left(p_1-p_2,\ \frac{p_1(1-p_1)}{m} + \frac{p_2(1-p_2)}{n}\right)$$


Comparing Two Variances

Instead of the $\chi^2$ distribution, we use the $F$ distribution (F-test) to compare two variances.

Suppose we have two samples $X_1,\ldots,X_m$ and $Y_1, \ldots, Y_n$:

  • $H_0$: $\sigma_X^2 = \sigma_Y^2$
  • Test statistic: $F = \frac{s_X^2}{s_Y^2}$

Rejection region at significance level $\alpha$, where $F_{\alpha,\ m-1,\ n-1}$ denotes the upper-tail critical value:

  • $H_a$: $\sigma_X^2 > \sigma_Y^2$. Reject $H_0$ if $F > F_{\alpha,\ m-1,\ n-1}$
  • $H_a$: $\sigma_X^2 < \sigma_Y^2$. Reject $H_0$ if $F < F_{1-\alpha,\ m-1,\ n-1}$
  • $H_a$: $\sigma_X^2 \ne \sigma_Y^2$. Reject $H_0$ if $F > F_{\alpha/2,\ m-1,\ n-1}$ or $F < F_{1-\alpha/2,\ m-1,\ n-1}$


Confidence Interval

A $100(1-\alpha)\%$ confidence interval for the ratio $\sigma_X^2/\sigma_Y^2$:

$$\left(\frac{s_X^2/s_Y^2}{F_{\alpha/2,\ m-1,\ n-1}},\ \frac{s_X^2/s_Y^2}{F_{1-\alpha/2,\ m-1,\ n-1}}\right)$$
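
The sketch below illustrates the F statistic and the interval in Python. It assumes critical values are read from an F table, because the standard library has no F distribution; the helper names and the samples are made up. Lower-tail values are obtained from the standard identity $F_{1-\alpha,\ m-1,\ n-1} = 1/F_{\alpha,\ n-1,\ m-1}$, since most tables list only upper-tail values. The table values 9.12 and 6.59 used here should be checked against your own table.

```python
from statistics import variance


def f_statistic(x, y):
    """Return (F, numerator df, denominator df) for H0: equal variances."""
    return variance(x) / variance(y), len(x) - 1, len(y) - 1


def variance_ratio_ci(f, f_upper, f_lower):
    """CI for sigma_X^2 / sigma_Y^2 from the two critical values."""
    return f / f_upper, f / f_lower


# Made-up samples with m = 5 and n = 4.
x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8]
f, df1, df2 = f_statistic(x, y)

# 90% interval (alpha = 0.10), using upper-tail table values:
f_upper = 9.12        # F_{0.05, 4, 3}, read from an F table
f_lower = 1 / 6.59    # F_{0.95, 4, 3} = 1 / F_{0.05, 3, 4}
lo, hi = variance_ratio_ci(f, f_upper, f_lower)
print(f"F = {f:.3f} on ({df1}, {df2}) df; 90% CI = ({lo:.4f}, {hi:.4f})")
```

If the interval contains 1, the data are consistent with equal variances at the corresponding two-sided level.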