72  Hypothesis Testing

72.1 What is Hypothesis Testing?

A hypothesis is a statement about a population parameter that is to be verified or rejected on the basis of sample evidence. Hypothesis testing is the formal statistical procedure for deciding whether to reject or fail to reject a hypothesis based on a sample.

Two competing hypotheses are stated:

Tip: Null and Alternative Hypotheses

| Hypothesis | Statement | Default position |
|---|---|---|
| Null hypothesis (H₀) | A statement of “no effect”, “no difference”, or “status quo” | Assumed true until evidence rejects it |
| Alternative hypothesis (H₁ or Ha) | The contradiction of H₀; what we are trying to support | What we accept if H₀ is rejected |

Sample evidence can lead us to reject the null hypothesis, but it can never prove it. The conclusion is therefore worded as “reject H₀” or “fail to reject H₀”, never “accept H₀”.

72.2 Type I and Type II Errors

Because hypothesis testing rests on probability, two errors are possible:

Tip: Type I and Type II Errors

| Decision | H₀ True | H₀ False |
|---|---|---|
| Reject H₀ | Type I error (α): false positive | Correct (Power = 1 − β) |
| Fail to reject H₀ | Correct | Type II error (β): false negative |

Tip: Two Statistical Errors

| Error | Symbol | Consequence |
|---|---|---|
| Type I (False positive) | α | Reject a true H₀: convict the innocent |
| Type II (False negative) | β | Fail to reject a false H₀: let the guilty go free |
| Power | 1 − β | Probability of correctly rejecting a false H₀ |

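The textbook trade-off: for a fixed sample size, reducing α tends to raise β. A larger sample lowers β at any given α, which makes it possible to reduce both error probabilities together.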
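
The trade-off between α, β, and sample size can be made concrete for a right-tailed z-test with known σ. The sketch below is illustrative only: the function name `z_test_power` and the numbers (μ₀ = 100, true mean 103, σ = 10) are assumptions chosen for the example, not values from this chapter.

```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def z_test_power(mu0, mu_true, sigma, n, alpha):
    """Power of a right-tailed z-test of H0: mu = mu0 when the true mean is mu_true."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha)        # rejection cutoff on the z scale
    shift = (mu_true - mu0) / (sigma / sqrt(n))   # true effect in standard-error units
    return 1 - STD_NORMAL.cdf(z_crit - shift)     # P(reject H0 | mu = mu_true)


# Hypothetical numbers, chosen only for illustration.
for alpha in (0.05, 0.01):
    for n in (25, 100):
        power = z_test_power(mu0=100, mu_true=103, sigma=10, n=n, alpha=alpha)
        print(f"alpha={alpha:.2f}  n={n:3d}  power={power:.3f}  beta={1 - power:.3f}")
```

At a fixed n, moving from α = 0.05 to α = 0.01 lowers the power (raises β). At a fixed α, moving from n = 25 to n = 100 raises the power.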

72.3 The Five-Step Process

Tip: Five-Step Hypothesis-Testing Process

| # | Step |
|---|---|
| 1 | State the null and alternative hypotheses |
| 2 | Choose the level of significance (α, typically 0.05 or 0.01) |
| 3 | Choose the appropriate test statistic and identify its sampling distribution |
| 4 | Compute the test statistic from sample data; find the critical value or p-value |
| 5 | Decide: reject H₀ if the test statistic falls in the critical region (equivalently, if p-value ≤ α); otherwise fail to reject H₀ |

flowchart LR
  H[State H₀, H₁] --> A[Choose α]
  A --> T[Choose test statistic]
  T --> C[Compute statistic /<br/>p-value]
  C --> D{Reject H₀?}
  D -- Yes --> R[Reject H₀]
  D -- No --> F[Fail to reject H₀]
  style H fill:#E3F2FD,stroke:#1565C0
  style R fill:#FFEBEE,stroke:#C62828
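
The five steps can be traced in a short one-sample z-test. This is a minimal sketch under stated assumptions: the sample summary (n = 64, sample mean 52.5), the hypothesised mean μ₀ = 50, and the known σ = 8 are all hypothetical values made up for the example.

```python
from math import sqrt
from statistics import NormalDist

# Step 1: hypotheses. H0: mu = 50 versus H1: mu != 50 (two-tailed).
mu0 = 50.0

# Step 2: level of significance.
alpha = 0.05

# Step 3: test statistic. Sigma is assumed known, so z is standard normal under H0.
sigma, n, sample_mean = 8.0, 64, 52.5   # hypothetical sample summary

# Step 4: compute the statistic, the critical value, and the p-value.
z = (sample_mean - mu0) / (sigma / sqrt(n))
z_crit = NormalDist().inv_cdf(1 - alpha / 2)
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

# Step 5: decision. The critical-value rule and the p-value rule give the same answer.
reject = abs(z) > z_crit
print(f"z = {z:.2f}, critical value = {z_crit:.2f}, p-value = {p_value:.4f}")
print("Reject H0" if reject else "Fail to reject H0")
```

Here the standard error is 8 / √64 = 1, so z = 2.5, which lies beyond the two-tailed 5% cutoff of about 1.96.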

72.4 One-tailed vs Two-tailed Tests

Tip: One-tailed vs Two-tailed Tests

| Type | When to use | Critical region |
|---|---|---|
| Two-tailed | Testing for any difference (H₁: μ ≠ μ₀) | Both tails of the distribution |
| One-tailed (right) | Testing if greater (H₁: μ > μ₀) | Right tail only |
| One-tailed (left) | Testing if less (H₁: μ < μ₀) | Left tail only |
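
The effect of the tail choice on the critical region can be sketched for a z statistic. The helper names `critical_region` and `in_critical_region` are hypothetical, and the observed value z = 1.8 is an assumed example.

```python
from statistics import NormalDist


def critical_region(alpha, tail):
    """Return (lower, upper) z cutoffs; None marks a tail with no rejection region."""
    std = NormalDist()
    if tail == "two":
        cut = std.inv_cdf(1 - alpha / 2)   # alpha is split across both tails
        return (-cut, cut)
    if tail == "right":
        return (None, std.inv_cdf(1 - alpha))
    if tail == "left":
        return (std.inv_cdf(alpha), None)
    raise ValueError("tail must be 'two', 'right', or 'left'")


def in_critical_region(z, alpha, tail):
    """True when z falls in the rejection region for the chosen tail."""
    lower, upper = critical_region(alpha, tail)
    return (lower is not None and z < lower) or (upper is not None and z > upper)


# The same observed z can lead to different decisions depending on H1.
for tail in ("two", "right", "left"):
    print(tail, critical_region(0.05, tail), in_critical_region(1.8, 0.05, tail))
```

With α = 0.05, z = 1.8 is rejected by the right-tailed test (cutoff about 1.645) but not by the two-tailed test (cutoffs about ±1.96), which is why the direction of H₁ should be fixed before looking at the data.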

72.5 Test Statistics — When to Use Which

Tip: Common Test Statistics

| Test | Used for | Distribution |
|---|---|---|
| z-test | Mean (large sample, known σ) or proportion | Standard normal |
| t-test | Mean (small sample, unknown σ) | Student’s t |
| Paired t-test | Same units measured twice | Student’s t |
| Independent samples t-test | Compare means of two groups | Student’s t |
| F-test | Compare variances; ANOVA | F |
| Chi-square goodness-of-fit | Observed vs expected categorical frequencies | Chi-square |
| Chi-square independence | Two categorical variables | Chi-square |
| ANOVA | Compare means across 3+ groups | F |
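
As one illustration from the table, a paired t-test reduces to a one-sample t-test on the per-unit differences. The sketch below assumes made-up before/after scores for four units, and `paired_t_statistic` is a hypothetical helper name.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(before, after):
    """Return (t, df) for H0: mean difference = 0, using per-unit differences."""
    if len(before) != len(after):
        raise ValueError("paired samples must have equal length")
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    standard_error = stdev(diffs) / sqrt(n)   # stdev uses the n - 1 denominator
    return mean(diffs) / standard_error, n - 1


# Hypothetical scores for the same four units measured twice.
before = [10, 12, 9, 11]
after = [12, 14, 10, 13]
t_stat, df = paired_t_statistic(before, after)
print(f"t = {t_stat:.2f} on {df} degrees of freedom")
```

The differences are 2, 2, 1, 2, giving a mean of 1.75 and a standard error of 0.25, so t = 7.0 on 3 degrees of freedom. A standard t table lists roughly 3.182 as the two-tailed 5% critical value for 3 degrees of freedom, so H₀ would be rejected in this made-up case.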

72.5.1 When to use t vs z

Tip: t vs z

| Use z when | Use t when |
|---|---|
| Population σ is known, or | Population σ is unknown, and |
| the sample is large (rule of thumb: n > 30), even with σ unknown | the sample is small |

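The t-distribution converges to the standard normal (z) distribution as the degrees of freedom grow with the sample size, which is why z serves as an approximation for large samples even when σ is unknown.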
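
The convergence can be checked numerically without a statistics package. The sketch below is an assumption-laden illustration: it integrates the Student's t density with Simpson's rule to get a two-tailed p-value for a fixed statistic of 2.0, and the helper names `t_pdf` and `t_two_tailed_p` are hypothetical.

```python
from math import exp, lgamma, log, pi
from statistics import NormalDist


def t_pdf(x, df):
    """Density of Student's t with df degrees of freedom (log-gamma avoids overflow)."""
    log_norm = lgamma((df + 1) / 2) - lgamma(df / 2) - 0.5 * log(df * pi)
    return exp(log_norm - (df + 1) / 2 * log(1 + x * x / df))


def t_two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value via Simpson's rule on the density from 0 to |t|."""
    upper = abs(t)
    h = upper / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3                   # P(0 < T < |t|)
    return 1 - 2 * area


z_p = 2 * (1 - NormalDist().cdf(2.0))
for df in (5, 30, 1000):
    print(f"df={df:4d}  p={t_two_tailed_p(2.0, df):.4f}")
print(f"normal   p={z_p:.4f}")
```

The p-value for the same statistic shrinks toward the normal value as the degrees of freedom increase, reflecting the heavier tails of t at small sample sizes.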

72.6 p-Value Approach

The p-value is the probability, computed assuming the null hypothesis is true, of observing a sample statistic at least as extreme as the one actually observed.

Tip: p-Value Decision Rules

| Comparison | Decision |
|---|---|
| p-value ≤ α | Reject H₀ |
| p-value > α | Fail to reject H₀ |

A common misinterpretation: the p-value is NOT the probability that H₀ is true. It is the probability of data at least as extreme as those observed, assuming H₀ is true.
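
The decision rule can be sketched for a z statistic. The function names `z_p_value` and `decide`, and the observed value z = 2.1, are assumptions made for this example.

```python
from statistics import NormalDist


def z_p_value(z, tail="two"):
    """p-value of a z statistic: probability under H0 of a result at least as extreme."""
    std = NormalDist()
    if tail == "two":
        return 2 * (1 - std.cdf(abs(z)))
    if tail == "right":
        return 1 - std.cdf(z)
    if tail == "left":
        return std.cdf(z)
    raise ValueError("tail must be 'two', 'right', or 'left'")


def decide(p_value, alpha=0.05):
    """Apply the rule from the table: reject when p-value <= alpha."""
    return "Reject H0" if p_value <= alpha else "Fail to reject H0"


p = z_p_value(2.1)   # hypothetical observed statistic
for alpha in (0.05, 0.01):
    print(f"p = {p:.4f}, alpha = {alpha}: {decide(p, alpha)}")
```

The same p-value can lead to rejection at α = 0.05 and non-rejection at α = 0.01, so α has to be chosen before the test is run.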

72.7 ANOVA — Analysis of Variance

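When comparing the means of three or more groups, running multiple pairwise t-tests inflates the overall Type I error rate: with three pairwise tests at α = 0.05, the chance of at least one false positive would be roughly 1 − 0.95³ ≈ 0.14 if the tests were independent. ANOVA handles this with a single F-test (Ronald A. Fisher, 1925):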

Tip: ANOVA: Three Sources of Variance

| Source | What it captures |
|---|---|
| Between groups (treatment) | Variation due to group differences |
| Within groups (error) | Variation due to chance within groups |
| Total | Sum of the two |

The F-statistic:

\[F = \frac{\text{Mean Square Between}}{\text{Mean Square Within}}\]

If F > critical value (or p < α), reject the null that all group means are equal.
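
The F-statistic can be computed by hand from the between-group and within-group sums of squares. The sketch below uses three small made-up groups, and `one_way_anova` is a hypothetical helper name.

```python
from statistics import mean


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of samples, one list per group."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k, n_total = len(groups), len(all_values)

    # Between groups: how far each group mean sits from the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within groups: spread of observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between, df_within = k - 1, n_total - k
    ms_between = ss_between / df_between   # Mean Square Between
    ms_within = ss_within / df_within      # Mean Square Within
    return ms_between / ms_within, df_between, df_within


# Hypothetical data: three groups of three observations each.
groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_b, df_w = one_way_anova(groups)
print(f"F = {f_stat:.2f} with ({df_b}, {df_w}) degrees of freedom")
```

For these numbers the sums of squares are 26 between and 6 within, so the mean squares are 13 and 1 and F = 13. A standard F table lists roughly 5.14 as the 5% critical value for (2, 6) degrees of freedom, so the null of equal means would be rejected in this made-up case.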

Tip: Three Common ANOVA Designs

| Design | What it captures |
|---|---|
| One-way ANOVA | One independent variable, three or more groups |
| Two-way ANOVA | Two independent variables; tests main and interaction effects |
| Repeated-measures ANOVA | Same units measured multiple times |

72.8 Chi-Square Tests

Tip: Two Common Chi-Square Tests

| Test | What it does | Formula |
|---|---|---|
| Goodness-of-fit | Compare observed frequencies with expected | χ² = Σ (O − E)² ÷ E |
| Test of independence | Test whether two categorical variables are independent | χ² = Σ (O − E)² ÷ E, with each cell's E = (row total × column total) ÷ grand total from the contingency table |
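
Both tests use the same statistic and differ only in where the expected counts come from. The sketch below assumes made-up counts (60 die rolls and a 2 × 2 table), and the helper names `chi_square` and `expected_counts` are hypothetical.

```python
def chi_square(observed, expected):
    """Chi-square statistic: sum of (O - E)^2 / E over all categories or cells."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def expected_counts(table):
    """Expected cell counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


# Goodness-of-fit: 60 hypothetical die rolls against a fair die (10 expected per face).
rolls = [8, 9, 12, 11, 10, 10]
gof = chi_square(rolls, [10] * 6)

# Independence: hypothetical 2 x 2 contingency table (rows: group, columns: outcome).
table = [[30, 20], [20, 30]]
expected = expected_counts(table)
indep = chi_square(
    [o for row in table for o in row],
    [e for row in expected for e in row],
)

print(f"goodness-of-fit chi-square = {gof:.2f} (df = {len(rolls) - 1})")
print(f"independence chi-square = {indep:.2f} (df = {(len(table) - 1) * (len(table[0]) - 1)})")
```

Standard chi-square tables list roughly 11.07 (5 degrees of freedom) and 3.84 (1 degree of freedom) as the 5% critical values. In these made-up cases the die data do not reject fairness (χ² = 1), while the contingency table rejects independence (χ² = 4).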

72.9 Practice Questions

Q 01 Definition Easy

The null hypothesis (H₀) is best described as:

  • A. A statement we want to prove
  • B. A statement of "no effect" or "no difference" assumed true until evidence rejects it
  • C. A statement that always holds
  • D. A guess

Correct Option: B
H₀ is the default position of "no effect", assumed true unless rejected by evidence.

Q 02 Type I Medium

A Type I error occurs when:

  • A. A true H₀ is rejected (false positive)
  • B. A false H₀ is not rejected (false negative)
  • C. The sample is too small
  • D. The variance is unknown

Correct Option: A
Type I = false positive: rejecting a true H₀, with probability α. Type II = false negative, with probability β.

Q 03 Power Medium

The "power" of a statistical test is:

  • A. α
  • B. β
  • C. 1 − β
  • D. 1 − α

Correct Option: C
Power = 1 − β, the probability of correctly rejecting a false H₀.

Q 04 t vs z Medium

A t-test is preferred over a z-test when:

  • A. Sample size is large and σ is known
  • B. Sample size is small and σ is unknown
  • C. Variables are categorical
  • D. Population is non-normal

Correct Option: B
Use t when σ is unknown and n is small. Use z when σ is known or n is large.

Q 05 p-value Medium

If the p-value of a test is 0.02 and α = 0.05, the appropriate decision is:

  • A. Reject H₀
  • B. Accept H₀
  • C. Inconclusive
  • D. Increase sample size

Correct Option: A
p (0.02) ≤ α (0.05) → reject H₀. We never "accept" H₀; we only fail to reject.

Q 06 ANOVA Medium

When comparing means across three or more groups, the appropriate test is:

  • A. t-test
  • B. z-test
  • C. ANOVA
  • D. Chi-square

Correct Option: C
ANOVA avoids the Type I error inflation of multiple t-tests. The F-test compares between-group with within-group variance.

Q 07 Chi-square Medium

A test of whether two categorical variables (gender × purchase) are independent uses:

  • A. t-test
  • B. F-test
  • C. Chi-square test of independence
  • D. z-test

Correct Option: C
Categorical × categorical → chi-square test of independence.

Q 08 Tails Medium

A two-tailed test is appropriate when the alternative hypothesis is:

  • A. μ > μ₀
  • B. μ < μ₀
  • C. μ ≠ μ₀
  • D. μ = μ₀

Correct Option: C
"Different" (μ ≠ μ₀) is two-tailed: the mean could be greater or less. A directional H₁ (>, <) is one-tailed.

Important: Quick recall
  • Hypothesis test = sample-based decision about a population parameter. H₀ vs H₁; we reject or fail to reject H₀, never “accept”.
  • Type I (α) = false positive; Type II (β) = false negative; Power = 1 − β.
  • Five-step process: hypotheses → α → test statistic → compute → decide.
  • One-tailed (directional) vs two-tailed.
  • Tests: z (large/known σ), t (small/unknown σ), F (variance/ANOVA), chi-square (categorical), ANOVA (3+ group means).
  • p-value ≤ α → reject H₀.
  • ANOVA decomposes variance into between-group + within-group; F = MSbetween/MSwithin.
  • Chi-square tests: goodness-of-fit and independence.