Chapter 8 – Quantitative Methods – Sample

Quantitative Analysis for Social Science Research

Chapter 8 – Using Inference to Test a Scientific Hypothesis

 

1 – Simple Sample Mean Hypothesis Test

Why Should I Care?

Your sample data shows some wild difference from what you expected. Is this a fluke? Are these rogue observations? Or is there a story here, such as gender discrimination, or a localized phenomenon that contradicts commonly held beliefs?

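There is a way to measure how likely it is that a difference this large arose by chance alone: a hypothesis test. If chance is an unlikely explanation, the difference is called statistically significant.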

This technique is useful when sample sizes are large.

Definitions

Test – A verification of validity.

Test Statistic – A statistic calculated to be compared to a benchmark critical value.

Critical Value – A standardized distribution value that encompasses an area under a benchmark curve.

For example, a Z-value of ±1.96 encloses the central 95% of the area under the normal curve, excluding the tails.

| Confidence level | Rounded Z | Precise Z |
|---|---|---|
| 68% | 1 | 0.994 |
| 95% | 2 | 1.960 |
| 99% | 3 | 2.576 |
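The precise Z values can be recovered from the standard normal distribution. Here is a minimal sketch using Python's standard-library `statistics.NormalDist`. The helper name `critical_z` is an illustrative assumption, not something from the course material.

```python
from statistics import NormalDist

def critical_z(confidence: float) -> float:
    """Two-sided critical Z that encloses `confidence` of the central area."""
    tail = (1.0 - confidence) / 2.0          # area left over in each tail
    return NormalDist().inv_cdf(1.0 - tail)  # quantile of the standard normal

for level in (0.68, 0.95, 0.99):
    print(f"{level:.0%} -> Z = {critical_z(level):.3f}")
```

The printed values should agree with the Precise Z column to within rounding.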

Hypothesis – A statement of expectations. A proposed relationship between two variables that is intended to be verified empirically.

Null Hypothesis – A hypothesis that is tested. A statement of equality; a statement of no difference; a statement of chance. In the case of a hypothesis test involving a single sample mean (that is compared to a known population mean), the null is typically a statement of the value of the population mean.

Hypothesis Testing – A verification of the validity of a hypothesis, using the test statistic.

Standard Error – A measure of how far a sample mean is likely to fall from the population mean.

It is the standard deviation of the sampling distribution of sample means.
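To see what "the standard deviation of sample means" looks like in practice, here is a small simulation sketch. It draws many samples, records each sample mean, and compares the spread of those means with the usual formula σ/√n. The population values (mean 50, σ = 10), the sample size of 25, and the number of samples are all made-up assumptions for illustration.

```python
import random
import statistics
from math import sqrt

random.seed(0)  # fixed seed so the run is repeatable

SIGMA = 10.0      # assumed population standard deviation
N = 25            # assumed size of each sample
N_SAMPLES = 5000  # assumed number of samples to draw

# Draw many samples and record the mean of each one.
sample_means = [
    statistics.fmean(random.gauss(50.0, SIGMA) for _ in range(N))
    for _ in range(N_SAMPLES)
]

simulated_se = statistics.stdev(sample_means)  # spread of the sample means
theoretical_se = SIGMA / sqrt(N)               # sigma / sqrt(n)

print(f"simulated SE   = {simulated_se:.3f}")
print(f"theoretical SE = {theoretical_se:.3f}")
```

The two numbers should land close to each other, which is the point: the standard error describes how much sample means vary from one sample to the next.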

Type I Error – Rejecting the null hypothesis when it is true, due to chance.

Type II Error – Failing to reject the null hypothesis when it is false, because the sample did not “pick up” the phenomenon.
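A Type I error can be watched happening in a simulation. The sketch below repeatedly samples from a population where the null hypothesis really is true, and counts how often a two-sided test at ±1.96 rejects it anyway. The population values, sample size, and trial count are assumptions chosen only for illustration; Type II errors are not simulated here.

```python
import random
from math import sqrt

random.seed(1)  # fixed seed so the run is repeatable

MU, SIGMA, N = 100.0, 15.0, 30  # assumed population mean, SD and sample size
CRITICAL_Z = 1.96
TRIALS = 4000                   # assumed number of repeated samples

se = SIGMA / sqrt(N)
rejections = 0
for _ in range(TRIALS):
    # The sample really does come from the population, so the null is true.
    sample_mean = sum(random.gauss(MU, SIGMA) for _ in range(N)) / N
    z = (sample_mean - MU) / se
    if abs(z) > CRITICAL_Z:
        rejections += 1         # a Type I error: rejecting a true null

type_i_rate = rejections / TRIALS
print(f"share of true nulls rejected: {type_i_rate:.3f}")
```

The share should come out near 0.05, which is what a 0.05 level of significance means.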

Formula and Algorithm

  1. Formulate the null hypothesis: Ho: $\bar{x}$ = population mean
  2. Determine a level of significance: α = 0.05 (95% confidence)
  3. Identify the critical value: ±1.96 (the precise Z for 95%; roughly 2)
  4. Calculate the test Z statistic: $Z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}}$
  5. Compare the absolute value of the test Z to the critical value
    1. If |Z| > critical value, then REJECT the null hypothesis
    2. If |Z| ≤ critical value, then FAIL TO REJECT the null hypothesis
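The steps above can be sketched as a short function. This is one possible way to write it, not an official implementation. The function name `z_test` and its parameter names are assumptions, and the standard error is taken to be the population standard deviation divided by the square root of the sample size. The input values are borrowed from the Calgary row of Exercise 1 in this chapter.

```python
from math import sqrt

def z_test(sample_mean, pop_mean, pop_sd, n, critical_z=1.96):
    """Return (test Z, decision) for a two-sided test of one sample mean."""
    standard_error = pop_sd / sqrt(n)              # sigma of the sample means
    z = (sample_mean - pop_mean) / standard_error  # step 4: test statistic
    # Step 5: compare the size of Z, ignoring its sign, to the critical value.
    decision = "REJECT" if abs(z) > critical_z else "FAIL TO REJECT"
    return z, decision

z, decision = z_test(sample_mean=0.04, pop_mean=0.025, pop_sd=0.06, n=100)
print(f"Z = {z:.2f}: {decision}")
```

Taking the absolute value of Z matters: a sample mean far below the population mean is just as unusual as one far above it.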

Interpretation

The null hypothesis represents the idea that nothing unusual is going on: any difference between the sample mean and the population mean is due to chance.

In science, you would prefer to show that you are NOT WRONG.

This is more powerful, scientifically, than being RIGHT.

There are always differences between a sample mean and a population mean. So this test measures how much of a difference is tolerated.

If you FAIL TO REJECT the null hypothesis, that is because your sample mean is close enough to the population mean that the difference is not extreme.

If you REJECT the null hypothesis, that is because your sample mean is far away from the population mean.

It is always better to use many samples. There is always a chance that your sample randomly ended up being an extreme case. Depending on the project, you can repeat the exercise with several samples to see if the sample means are really different from, or if they are close to, the population mean.

If the sample mean ends up being significantly different from the population mean, it may be very useful for the science. You may have discovered an unusual phenomenon or a significant change over time (if the population represents historical data).

FAIL TO REJECT – The sample phenomenon is a usual occurrence.

The variation from the population mean is not statistically significant.

REJECT – The sample phenomenon is unusual, and statistically significant.

Graphical Analysis

A population was measured on the following graph: single occurrences of each value between 1 and 8.

Population size is 8; population mean is 4.5.

A sample was randomly selected, with single occurrences of each value between 3 and 7.

Sample size is 5; sample mean is 5.

Does the sample mean fall outside a not-too-large zone (similar to confidence intervals), around the population mean?

How large is not-too-large? Let’s say the sample mean would be inside the zone 95 percent of the time.
The critical value is 1.96 for a chosen 0.05 level of significance (95% confidence).

How do we measure that?

We need the standard error. That represents the standard deviation of sample means.

Let’s say our $SE = \frac{2}{\sqrt{8}} = 0.707$

Then we can create a confidence interval,

where $CI = \mu \pm 1.96 \times 0.707 = \mu \pm 1.386$, that is, from 3.114 to 5.886 around the population mean of 4.5.

So is it possible that the sample mean is reasonably close to the population mean?

Can we say it’s not an extreme event?

Sure.

Let’s see what the official calculation tells us.

Ho: 5 = population mean

$Z = \frac{5 - 4.5}{0.707} = 0.707$, which is smaller than the critical value of 1.96.

Therefore, we FAIL TO REJECT the null hypothesis at the 95% confidence level. The sample mean is a probable occurrence and close enough to the population mean.
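The same check can be run as a few lines of code. This sketch only reuses the numbers stated in this example (population mean 4.5, sample mean 5, standard error 0.707, critical value 1.96); the variable names are illustrative assumptions.

```python
POP_MEAN = 4.5     # population mean stated in the example
SAMPLE_MEAN = 5.0  # sample mean stated in the example
SE = 0.707         # standard error as given in the example
CRITICAL_Z = 1.96  # critical value for 95%

# The "not-too-large zone" around the population mean.
margin = CRITICAL_Z * SE
lower, upper = POP_MEAN - margin, POP_MEAN + margin
inside = lower <= SAMPLE_MEAN <= upper

# The test statistic for the same comparison.
z = (SAMPLE_MEAN - POP_MEAN) / SE

print(f"zone: {lower:.3f} to {upper:.3f}; sample mean inside: {inside}")
print(f"test Z = {z:.3f}; reject: {abs(z) > CRITICAL_Z}")
```

Both views give the same answer: a sample mean inside the zone corresponds to a test Z smaller in size than the critical value.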

Phrasing

When you write (or read) about the result of a hypothesis test, you will get a tricky piece of text.

  • REJECT

“This study has rejected the null hypothesis, with the knowledge that there is a 5 percent chance of having committed a Type I error. Variations are important enough to be statistically significant.”

You can’t stray from this type of writing. You can go on to write that you have identified a new trend, but no more.

You CANNOT write that you have proven, or discovered, a phenomenon. This is not a revolution yet. You will need more data and more sampling, and the result would have to be replicated by scientists everywhere on the planet, before it is accepted as mainstream theory.

  • FAIL TO REJECT

“This study has failed to reject the null hypothesis. The sample statistics appear to be representative of the population. Any variations are not statistically significant.”

You CANNOT write that you have ACCEPTED the null hypothesis. There is a big difference. We are estimating using rules of inference. We are not measuring the absolute truth.

Exercise 1

The following samples have measured price inflation in Canadian cities. Each sample is a Consumer Price Index average for 100 products. For example, prices in Toronto have increased on average by 2 percent, over the year. The sample mean for Toronto is 0.02.

Each city has a different sample mean.

Identify the cities for which the result is statistically significant.

That is, the cities for which the null hypothesis is rejected at the 95% confidence level.

σ = 0.06, N = 100, µ = 0.025, so the standard error is 0.06/√100 = 0.006.

| Sample (fake data) | Sample mean | Critical Z | Test Z | Result | Interpretation |
|---|---|---|---|---|---|
| 1 – Toronto | 0.02 | ±1.96 | -0.83 | Fail to Reject | Small difference |
| 2 – Montreal | 0.03 | ±1.96 | 0.83 | Fail to Reject | Small difference |
| 3 – Calgary | 0.04 | ±1.96 | 2.50 | Reject | Big difference |
| 4 – Edmonton | 0.08 | ±1.96 | 9.17 | Reject | Big difference |
| 5 – Vancouver | 0.05 | ±1.96 | 4.17 | Reject | Big difference |
| 6 – Ottawa | 0.01 | ±1.96 | -2.50 | Reject | Big difference |
| 7 – Halifax | 0.02 | ±1.96 | -0.83 | Fail to Reject | Small difference |

Ho: Toronto sample mean = population mean

$Z = \frac{\bar{x} - \mu}{\sigma_{\bar{x}}} = \frac{0.02 - 0.025}{0.006} = -0.833$

Ho: fail to reject if |Z| < critical value at 95%

$|-0.833| = 0.833 < 1.96$

Fail to reject the null hypothesis for Toronto. Its sample mean is close to the population mean.

Which cities have price inflation that is

way below the Canadian norm? Ottawa

way above the Canadian norm? Calgary, Edmonton, Vancouver

not significantly different from the Canadian norm? Toronto, Montreal, Halifax
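The whole exercise can be checked with a short loop. This sketch uses only the values given in the exercise (σ = 0.06, N = 100, µ = 0.025 and the seven sample means); the variable names are illustrative assumptions.

```python
from math import sqrt

SIGMA, N, MU = 0.06, 100, 0.025  # values given in the exercise
CRITICAL_Z = 1.96

sample_means = {
    "Toronto": 0.02, "Montreal": 0.03, "Calgary": 0.04, "Edmonton": 0.08,
    "Vancouver": 0.05, "Ottawa": 0.01, "Halifax": 0.02,
}

standard_error = SIGMA / sqrt(N)  # 0.06 / 10 = 0.006

results = {}
for city, mean in sample_means.items():
    z = (mean - MU) / standard_error
    # Reject when Z is far from zero in either direction.
    results[city] = (z, "Reject" if abs(z) > CRITICAL_Z else "Fail to Reject")
    print(f"{city:<10} Z = {z:6.2f}  {results[city][1]}")
```

The sign of Z tells you the direction: a rejected city with a negative Z is below the Canadian norm, and one with a positive Z is above it.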

Research Methods Part 3.6 – MLA Style

Part 2 – The Research Methods

This is a short presentation of the main research methods, as they apply to social science.

6 – MLA Style

Why Should I Care?

Some journals (and academics in that same field) use the MLA style, which is useful to know.

  • The Basics

MLA stands for Modern Language Association. This group of “English Lit” professors shares a common style for the publishing of papers, which are fiction, and usually prose.

It is known to be a very short format, without a cover page, which saves trees.

The style is also used by many historians, anthropologists, and many other disciplines in the humanities. If the text is not fiction, there is a specific style for citations (paraphrase and quote).

Here is the complete set of guidelines

  • The Format

 

  • The Bibliography

Research Methods – Part 2.1 – Survey

Part 2 – The Research Methods

This is a short presentation of the main research methods, as they apply to social science.

1 – Survey

Why Should I Care?

Surveys are very common. However, many are not done well. There are many traps most people don’t know about that reduce their scientific validity.

Definitions

Survey: the act of measuring objects.

Social survey: a research technique that obtains information from a sample of individuals by asking questions and analyzing the responses.

Questionnaire: a written set of questions organized in a sequence appropriate to the purpose of a survey, or psychological test.

Interview: a loose set of questions, mostly designed to produce an open-ended conversation.

Usefulness

Surveys allow for a “real-time” expression of opinions and attitudes on a particular topic.

In politics, they make it possible to identify shifts in opinion and relate them to a particular event (speech, riot, etc.).

Business people need them to prepare marketing strategies, advertising campaigns, etc.

Objects of Measurement

Sampling

Hopefully random and large.

Random sampling is possible using phone books as population lists, but not so much through email and the internet.

With interviews, samples are non-random and tiny.

Types of Surveys

  1. Cross-sectional: compares many independent variables to a dependent variable
  2. Longitudinal: compares a few variables over time
    1. Trend: similar samples taken at different time points
    2. Panel: the same sample followed through time

Instruments

  1. Questionnaire: set question list, closed-ended questions, larger sample
    1. In-person
    2. Telephone
    3. Internet
    4. Group
  2. Interview: starting question list, open-ended questions, smaller sample
    1. Field interview
    2. Formal face-to-face interview

Scientific Power

Exploratory: possible but not likely if the topic is taboo or difficult to discuss.

Mostly descriptive studies, which focus on who, where, how and what.

But may also be used for explanatory studies which validate hypotheses and their causal relationships (why).

Warning – “Surveys” are also used as a commercial ploy to build email mailing lists for advertising.
