**Why Should I Care?**

Your sample data shows a wild difference from what you expected. Is this a fluke; are these rogue observations? Or is there a story here, such as gender discrimination, or a localized phenomenon that contradicts commonly held beliefs?

There is a way to test whether this difference is statistically significant, or whether it could easily have arisen by chance.

This technique is useful when sample sizes are large.

**Definitions**

**Test –** A verification of validity.

**Test Statistic –** A statistic calculated from the sample so that it can be compared to a benchmark critical value.

**Critical Value –** A standardized distribution value that encloses a given area under a benchmark curve.

For example, Z-values of ±1.96 enclose the central 95% of the area under the normal curve, excluding the tails.

| Confidence level | Rounded Z | Precise Z |
|---|---|---|
| 68% | 1 | 0.994 |
| 95% | 2 | 1.960 |
| 99% | 3 | 2.576 |
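The precise Z values can be recovered from the standard normal curve. The following is a minimal sketch using Python's standard library; the helper name `critical_z` is an illustrative choice, not something defined in this text.

```python
from statistics import NormalDist


def critical_z(confidence):
    """Return the two-tailed critical Z that encloses the given central area."""
    # The leftover area (1 - confidence) is split evenly between the two tails,
    # so the upper cut-off sits at the (1 + confidence) / 2 quantile.
    upper_quantile = (1 + confidence) / 2
    return NormalDist().inv_cdf(upper_quantile)


for confidence in (0.68, 0.95, 0.99):
    print(f"{confidence:.0%} -> Z = {critical_z(confidence):.3f}")
```

The same call works for any other confidence level, for example `critical_z(0.90)`.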

**Hypothesis** – A statement of expectations. A proposed relationship between two variables that is intended to be verified empirically.

**Null Hypothesis –** A hypothesis that is tested. A statement of equality; a statement of no difference; a statement of chance. In the case of a hypothesis test involving a single sample mean (that is compared to a known population mean), the null is typically a statement of the value of the population mean.

**Hypothesis Testing –** A verification of the validity of a hypothesis, using the test statistic.

**Standard Error –** A measure of how far a sample mean is likely to fall from the population mean. It is the standard deviation of the sampling distribution of sample means.
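In symbols, assuming the population standard deviation σ is known and the sample contains N observations (the notation used in Exercise 1), the standard error is usually written as:

```latex
\mathrm{SE} = \frac{\sigma}{\sqrt{N}}
```

With the exercise values σ = 0.06 and N = 100, this gives SE = 0.06 / 10 = 0.006.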

**Type I Error –** Rejecting the null hypothesis when it is true, due to chance.

**Type II Error –** Failing to reject the null hypothesis when it is false, because the sample did not “pick up” the phenomenon.
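To see what a Type I error looks like in practice, here is a small simulation sketch. It draws many samples from a population where the null hypothesis really is true and counts how often a two-tailed Z test rejects it anyway. The sample size, the normal population, the seed, and the function name `type_one_error_rate` are all assumptions made for illustration.

```python
import random
from statistics import NormalDist


def type_one_error_rate(trials=10_000, n=30, mu=0.0, sigma=1.0, alpha=0.05, seed=1):
    """Estimate how often a true null hypothesis is rejected by chance."""
    rng = random.Random(seed)                       # fixed seed for repeatability
    critical = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    standard_error = sigma / n ** 0.5
    rejections = 0
    for _ in range(trials):
        # Every sample comes from the population described by the null hypothesis.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        sample_mean = sum(sample) / n
        z = (sample_mean - mu) / standard_error
        if abs(z) > critical:
            rejections += 1                         # a Type I error
    return rejections / trials


print(f"Share of true nulls rejected: {type_one_error_rate():.3f}")
```

The printed share should land close to the chosen α of 0.05, which is the sense in which a rejection carries a 5 percent risk of a Type I error.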

**Formula and Algorithm**

- Formulate the null hypothesis H_{o}: sample mean = population mean
- Determine a level of significance: α = 0.05 (95% confidence)
- Identify the critical value: ±1.96 for exactly 95% (roughly 2)
- Calculate the test Z statistic: Z = (sample mean - population mean) / (σ / √N)
- Compare the test Z to the critical value
- If |Z| > critical value, then REJECT the null hypothesis
- If |Z| < critical value, then FAIL TO REJECT the null hypothesis
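The steps above can be sketched as a short function. This is only an illustration, assuming a two-tailed test with a known population standard deviation; the function name `z_test` and the numbers in the usage line are made up for the example.

```python
from math import sqrt
from statistics import NormalDist


def z_test(sample_mean, population_mean, sigma, n, alpha=0.05):
    """Two-tailed one-sample Z test with a known population standard deviation."""
    # Critical value for the chosen level of significance (about 1.96 for 0.05).
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    # Standard error of the sample mean.
    standard_error = sigma / sqrt(n)
    # Test statistic: distance from the population mean, in standard errors.
    z = (sample_mean - population_mean) / standard_error
    # Compare the size of Z, ignoring its sign, to the critical value.
    decision = "REJECT" if abs(z) > critical else "FAIL TO REJECT"
    return z, critical, decision


# Hypothetical numbers: sample mean 52, population mean 50, sigma 10, N = 100.
z, critical, decision = z_test(sample_mean=52, population_mean=50, sigma=10, n=100)
print(f"Z = {z:.2f}, critical = ±{critical:.2f}, decision: {decision}")
```

Comparing the absolute value of Z handles both tails at once, so a large negative Z leads to a rejection just as a large positive Z does.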

**Interpretation**

The null hypothesis represents the idea that the sample mean does not truly differ from the population mean, and that any observed difference is due to chance.

In science, you would prefer to show that you are NOT WRONG.

This is more powerful, scientifically, than being RIGHT.

There are always differences between a sample mean and a population mean. So this test measures how much of a difference is tolerated.

If you FAIL TO REJECT the null hypothesis, that is because your sample mean is close enough to the population mean that the difference is not extreme.

If you REJECT the null hypothesis, that is because your sample mean is far from the population mean.

It is always better to use many samples. There is always a chance that your sample randomly ended up being an extreme case. Depending on the project, you can repeat the exercise with several samples to see if the sample means are really different from, or if they are close to, the population mean.

If the sample mean ends up being significantly different from the population mean, it may be very useful for the science. You may have discovered an unusual phenomenon or a significant change over time (if the population represents historical data).

FAIL TO REJECT – The sample phenomenon is a usual occurrence. The variation from the population mean is not statistically significant.

REJECT – The sample phenomenon is unusual, and statistically significant.

**Graphical Analysis**

A population was measured on the following graph, with single occurrences of each value between 1 and 8.

Population size is 8, population mean is 4.5.

A sample was randomly selected, with single occurrences between 3 and 7.

Sample size is 5, sample mean is 5.

Does the sample mean fall outside a not-too-large zone (similar to confidence intervals), around the population mean?

How large is not-too-large? Let’s say it would be inside the zone, 95 percent of the time.

Critical value is ±1.96 for a chosen 0.05 level of significance (95% confidence).

How do we measure that?

We need the standard error. That represents the standard deviation of sample means.

Let’s say our

Then we can create a confidence interval,

where

So is it possible that the sample mean is reasonably close to the population mean?

Can we say it’s not an extreme event?

Sure.

Let’s see what the official calculation tells us.

H_{o}: 5 = population mean

Therefore, we FAIL TO REJECT the null hypothesis at the 0.05 level of significance (95% confidence). The sample mean is a probable occurrence and is close enough to the population mean.
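The numbers behind this conclusion can be checked with a short sketch. It assumes the population is exactly the values 1 through 8, the sample is exactly 3 through 7, and the standard error is built from the population standard deviation of those eight values. These are assumptions made here for illustration, so the figures may differ from the ones originally shown with the graph.

```python
from math import sqrt
from statistics import NormalDist, mean, pstdev

population = list(range(1, 9))   # assumed values 1, 2, ..., 8
sample = list(range(3, 8))       # assumed values 3, 4, ..., 7

mu = mean(population)            # population mean, 4.5
x_bar = mean(sample)             # sample mean, 5
sigma = pstdev(population)       # population standard deviation

standard_error = sigma / sqrt(len(sample))
critical = NormalDist().inv_cdf(0.975)   # about 1.96 for 95% confidence

# Zone around the population mean in which 95% of sample means should fall.
lower = mu - critical * standard_error
upper = mu + critical * standard_error

# Test statistic for the observed sample mean.
z = (x_bar - mu) / standard_error

print(f"Standard error: {standard_error:.3f}")
print(f"95% zone: {lower:.2f} to {upper:.2f}")
print(f"Test Z: {z:.2f}")
print("REJECT" if abs(z) > critical else "FAIL TO REJECT")
```

Under these assumptions the sample mean of 5 falls inside the zone and the test Z is well below 1.96, which agrees with the FAIL TO REJECT result.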

**Phrasing**

When you write (or read) about the result of a hypothesis test, you will get a tricky piece of text.

**REJECT**

“This study has rejected the null hypothesis, with the knowledge that there is a 5 percent chance of having committed a Type I error. Variations are important enough to be statistically significant.”

You can’t stray from this type of writing. You can go on to write that you have identified a new trend, but no more.

You CANNOT write that you have proven, or discovered, a phenomenon. This is not a revolution yet. You will need more data and more sampling, and the result would have to be replicated by scientists everywhere on the planet before it is accepted as mainstream theory.

**FAIL TO REJECT**

“This study has failed to reject the null hypothesis. The sample statistics appear to be representative of the population. Any variations are not statistically significant.”

You CANNOT write that you have ACCEPTED the null hypothesis. There is a big difference. We are estimating using rules of inference. We are not measuring the absolute truth.

**Exercise 1**

The following samples have measured price inflation in Canadian cities. Each sample is a Consumer Price Index average for 100 products. For example, prices in Toronto have increased on average by 2 percent, over the year. The sample mean for Toronto is 0.02.

Each city has a different sample mean.

Identify the cities for which the result is statistically significant, i.e. the null hypothesis is rejected at the 0.05 level of significance (95% confidence).

σ = 0.06 , N = 100, µ = 0.025

| Sample (fake data) | Sample mean | Critical Z | Test Z | Result | Interpretation |
|---|---|---|---|---|---|
| 1 – Toronto | 0.02 | ±1.96 | -0.83 | Fail to Reject | Small difference |
| 2 – Montreal | 0.03 | ±1.96 | 0.83 | Fail to Reject | Small difference |
| 3 – Calgary | 0.04 | ±1.96 | 2.50 | Reject | Big difference |
| 4 – Edmonton | 0.08 | ±1.96 | 9.17 | Reject | Big difference |
| 5 – Vancouver | 0.05 | ±1.96 | 4.17 | Reject | Big difference |
| 6 – Ottawa | 0.01 | ±1.96 | -2.50 | Reject | Big difference |
| 7 – Halifax | 0.02 | ±1.96 | -0.83 | Fail to Reject | Small difference |

H_{o}: Toronto sample mean = population mean

Fail to reject H_{o} if |Z| < critical value at 95%

Fail to reject the null hypothesis for Toronto. Its sample mean is close to the population mean.

Which cities have price inflation that is:

- way below the Canadian norm? Ottawa
- way above the Canadian norm? Calgary, Edmonton, Vancouver
- not significantly different from the Canadian norm? Toronto, Montreal, Halifax
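The test Z values in the exercise can be reproduced with a few lines of code. This is a sketch using the exercise's own (fake) data, with σ = 0.06, N = 100 and µ = 0.025; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

SIGMA, N, MU = 0.06, 100, 0.025   # values given in the exercise

sample_means = {
    "Toronto": 0.02,
    "Montreal": 0.03,
    "Calgary": 0.04,
    "Edmonton": 0.08,
    "Vancouver": 0.05,
    "Ottawa": 0.01,
    "Halifax": 0.02,
}

critical = NormalDist().inv_cdf(0.975)   # about 1.96 for 95% confidence
standard_error = SIGMA / sqrt(N)         # 0.06 / 10 = 0.006

results = {}
for city, x_bar in sample_means.items():
    z = (x_bar - MU) / standard_error
    # Reject when Z is beyond the critical value in either tail.
    decision = "Reject" if abs(z) > critical else "Fail to Reject"
    results[city] = (round(z, 2), decision)
    print(f"{city:<10} Z = {z:6.2f}  {decision}")
```

The sign of Z shows the direction of the difference: negative for cities below the Canadian norm, positive for cities above it.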