73  Hypothesis Testing

73.1 Concept

Hypothesis testing is a statistical procedure for deciding whether sample evidence supports a particular claim about a population. The modern framework was developed by Jerzy Neyman and Egon Pearson (1928, 1933), building on the work of R.A. Fisher (1925) and Karl Pearson (1900).

73.2 Key Concepts

Tip: Hypothesis testing terms
  • Null Hypothesis (H₀) — no difference / no effect; the status quo.
  • Alternative Hypothesis (H₁ or Hₐ) — researcher’s claim.
  • One-tailed test — directional.
  • Two-tailed test — non-directional.
  • Test statistic — Z, t, χ², F.
  • Significance level (α) — probability of Type I error (typically 0.05 or 0.01).
  • p-value — probability of observing data at least as extreme as ours, given H₀ is true.
  • Power (1 − β) — probability of correctly rejecting H₀ when it is false.
  • Critical region / Rejection region.
  • Degrees of freedom (df).
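The link between α, the critical region, and the p-value can be sketched numerically for a Z statistic. The snippet below is a minimal illustration using only Python's standard library; the helper names `critical_z` and `p_value_z` are hypothetical and not part of the chapter, and the one-tailed case is assumed to be an upper-tailed test.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def critical_z(alpha: float, two_tailed: bool = True) -> float:
    """Positive critical Z value for significance level alpha."""
    tail_area = alpha / 2 if two_tailed else alpha
    return STD_NORMAL.inv_cdf(1 - tail_area)


def p_value_z(z: float, two_tailed: bool = True) -> float:
    """p-value for an observed Z (one-tailed version assumes an upper-tailed H1)."""
    if two_tailed:
        return 2 * (1 - STD_NORMAL.cdf(abs(z)))
    return 1 - STD_NORMAL.cdf(z)


for alpha in (0.05, 0.01):
    print(
        f"alpha={alpha}: two-tailed ±{critical_z(alpha):.3f}, "
        f"one-tailed {critical_z(alpha, two_tailed=False):.3f}"
    )
```

An observed statistic falls in the rejection region exactly when its p-value is below α, so the two views lead to the same decision.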

73.3 Type I and Type II Errors

Tip: Decision matrix
| Decision | H₀ true | H₀ false |
|---|---|---|
| Reject H₀ | Type I error (α), false positive | Correct (power, 1 − β) |
| Fail to reject H₀ | Correct (1 − α) | Type II error (β), false negative |
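To make the trade-off between α, β, and power concrete, the sketch below computes power for an upper-tailed Z-test with known σ. The function name `power_upper_tailed_z` and the illustrative values (μ₀ = 50, true mean 52, σ = 10, n = 100) are assumptions chosen for the example, not results from the chapter.

```python
from math import sqrt
from statistics import NormalDist


def power_upper_tailed_z(mu0, mu1, sigma, n, alpha=0.05):
    """Power of an upper-tailed Z-test of H0: mu = mu0 when the true mean is mu1."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha)      # reject H0 when Z > z_crit
    shift = (mu1 - mu0) / (sigma / sqrt(n))     # mean of Z under the true mu1
    return 1 - std_normal.cdf(z_crit - shift)   # P(Z > z_crit | mu = mu1)


power = power_upper_tailed_z(mu0=50, mu1=52, sigma=10, n=100)
beta = 1 - power  # Type II error probability
print(f"power = {power:.3f}, beta = {beta:.3f}")
```

When the true mean equals μ₀, the same function returns α, which is the Type I error rate in the top-left cell of the matrix.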

73.4 Steps in Hypothesis Testing

Tip: Six-step hypothesis-testing procedure
  1. State H₀ and H₁.
  2. Choose the significance level α (0.05, 0.01).
  3. Select the appropriate test statistic (Z, t, χ², F).
  4. Compute the test statistic from the sample data.
  5. Determine the p-value or critical value.
  6. Decide: reject or fail to reject H₀, then interpret the result in its business / managerial context.
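The procedure can be traced end to end with a small two-tailed Z-test. This is a sketch only: it reuses the figures from the numerical practice question in this chapter (n = 100, x̄ = 52, μ = 50, σ = 10) and assumes σ is known, so the Z statistic applies.

```python
from math import sqrt
from statistics import NormalDist

# Step 1: H0: mu = 50 versus H1: mu != 50 (two-tailed).
mu0 = 50.0

# Step 2: choose the significance level.
alpha = 0.05

# Step 3: sigma is treated as known and n is large, so the Z statistic is used.
sample_mean, sigma, n = 52.0, 10.0, 100

# Step 4: compute the test statistic.
z = (sample_mean - mu0) / (sigma / sqrt(n))

# Step 5: p-value and critical value from the standard normal distribution.
std_normal = NormalDist()
p_value = 2 * (1 - std_normal.cdf(abs(z)))
z_crit = std_normal.inv_cdf(1 - alpha / 2)

# Step 6: decide, then interpret in context.
reject_h0 = p_value < alpha
print(f"z = {z:.2f}, critical = ±{z_crit:.2f}, p = {p_value:.4f}, reject H0: {reject_h0}")
```

Here Z = 2.0 exceeds the critical value of about 1.96, so H₀ is rejected at α = 0.05; at α = 0.01 the same data would not lead to rejection.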

73.5 Parametric Tests

Parametric tests assume that the data follow a specific distribution (usually normal) and that certain other conditions are met.

Tip: Major parametric tests
| Test | Use case |
|---|---|
| Z-test | Large samples (n ≥ 30) or known σ |
| One-sample t-test | Single mean, σ unknown, small n |
| Independent samples t-test | Two means, independent samples |
| Paired t-test | Two means, dependent samples (before-after) |
| F-test | Equality of variances |
| ANOVA | 3+ group means (R.A. Fisher) |
| One-way ANOVA | Single factor |
| Two-way ANOVA | Two factors with interaction |
| MANOVA | Multiple dependent variables |
| Pearson Correlation Test | |
| Linear Regression Test | |
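As one worked instance from the table, the paired t-test reduces to a one-sample t-test on the within-pair differences. The sketch below uses made-up before/after scores, and the critical value 2.365 (two-tailed, α = 0.05, df = 7) is taken from a standard t-table rather than computed, since the standard library has no t distribution.

```python
from math import sqrt
from statistics import mean, stdev

# Hypothetical before/after scores for the same eight subjects.
before = [72, 65, 80, 58, 69, 75, 61, 70]
after = [75, 70, 82, 60, 74, 79, 63, 71]

# The paired t-test works on the per-subject differences.
diffs = [a - b for a, b in zip(after, before)]
n = len(diffs)
mean_diff = mean(diffs)
std_error = stdev(diffs) / sqrt(n)  # stdev uses the n - 1 denominator

t_stat = mean_diff / std_error
df = n - 1

T_CRIT_DF7 = 2.365  # assumed table value: two-tailed, alpha = 0.05, df = 7
reject_h0 = abs(t_stat) > T_CRIT_DF7
print(f"t = {t_stat:.3f}, df = {df}, reject H0: {reject_h0}")
```

An independent samples t-test on the same numbers would ignore the pairing and typically give a much smaller t, which is why the design determines the test.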

73.6 Non-Parametric Tests

Non-parametric tests make no (or only weak) assumptions about the population distribution and are suited to ordinal or nominal data.

Tip: Major non-parametric tests
| Test | Use case | Parametric equivalent |
|---|---|---|
| Chi-square (χ²) Goodness of Fit | Observed vs expected | |
| Chi-square Test of Independence | Two categorical variables | |
| Mann-Whitney U | Two independent samples | Independent samples t-test |
| Wilcoxon Signed-Rank | Paired samples | Paired t-test |
| Kruskal-Wallis H | 3+ independent groups | One-way ANOVA |
| Friedman Test | 3+ related groups | Repeated-measures ANOVA |
| Spearman’s Rank Correlation | Ordinal data | Pearson correlation |
| Kolmogorov-Smirnov (K-S) | Goodness of fit, normality | |
| Sign Test | | |
| Runs Test | Randomness | |
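The chi-square test of independence compares observed cell counts with the counts expected if the two variables were unrelated. A minimal sketch follows; the function name `chi_square_independence` and the 2×2 counts are illustrative assumptions, and the critical value 3.841 (α = 0.05, df = 1) is quoted from a standard χ² table.

```python
def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence: row total * column total / N.
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


# Hypothetical counts: rows = two customer segments, columns = bought / did not buy.
observed = [[30, 20], [20, 30]]
chi2, df = chi_square_independence(observed)

CHI2_CRIT_DF1 = 3.841  # assumed table value: alpha = 0.05, df = 1
print(f"chi2 = {chi2:.2f}, df = {df}, reject H0: {chi2 > CHI2_CRIT_DF1}")
```

Every expected count here is 25, so each cell contributes (5²)/25 = 1 and the statistic is 4.0.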

73.7 ANOVA — Analysis of Variance

ANOVA, introduced by R.A. Fisher (1925), tests the equality of three or more group means by partitioning total variance into a between-group and a within-group component.

\[F = \frac{\text{MS}_{\text{between}}}{\text{MS}_{\text{within}}}\]
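The F ratio above can be computed directly from its definition. The following is a sketch for the one-way case with made-up data; the function name `one_way_anova_f` is hypothetical, and MS_between and MS_within are taken to be the usual sums of squares divided by k − 1 and N − k degrees of freedom.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of samples, one per group."""
    k = len(groups)
    all_values = [x for g in groups for x in g]
    n_total = len(all_values)
    grand_mean = mean(all_values)

    # Between-group sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between, df_within = k - 1, n_total - k
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between / ms_within, df_between, df_within


# Hypothetical scores for three groups.
groups = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
f_stat, df_b, df_w = one_way_anova_f(groups)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")  # F(2, 6) = 12.00
```

With SS_between = 24 and SS_within = 6, the mean squares are 12 and 1, giving F = 12, well above the tabulated 5% critical value of about 5.14 for (2, 6) degrees of freedom.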

Tip: ANOVA types
  • One-way ANOVA — single factor.
  • Two-way ANOVA — two factors + interaction.
  • N-way ANOVA.
  • Repeated-measures ANOVA.
  • ANCOVA — Analysis of Covariance.
  • MANOVA — multiple DVs.
  • Latin Square Design.
  • Randomised Block Design.
  • Factorial designs.

73.8 Choice of Test

Tip: Decision rule — which test?
  • Sample size — large (Z) vs small (t).
  • Variance known? — Z (known) vs t (unknown).
  • Number of groups — 1 (one-sample), 2 (two-sample), 3+ (ANOVA).
  • Dependent vs independent samples.
  • Distributional assumptions — parametric vs non-parametric.
  • Measurement scale — interval/ratio (parametric); ordinal/nominal (non-parametric).
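These criteria can be read as a simple lookup. The sketch below encodes only the pairings listed in this chapter's tables; the function `suggest_test` and its parameters are hypothetical, the n ≥ 30 cut-off is the rule of thumb quoted above, and the one-sample branch covers the parametric route only. It is a study aid, not a substitute for checking assumptions.

```python
# (design, parametric?) -> test name, following the tables in this chapter.
TEST_TABLE = {
    ("two independent", True): "Independent samples t-test",
    ("two independent", False): "Mann-Whitney U",
    ("two paired", True): "Paired t-test",
    ("two paired", False): "Wilcoxon Signed-Rank",
    ("3+ independent", True): "One-way ANOVA",
    ("3+ independent", False): "Kruskal-Wallis H",
    ("3+ related", True): "Repeated-measures ANOVA",
    ("3+ related", False): "Friedman Test",
}


def suggest_test(n_groups, related=False, parametric=True, sigma_known=False, n=0):
    """Suggest a test for comparing means / locations across n_groups samples."""
    if n_groups < 1:
        raise ValueError("n_groups must be at least 1")
    if n_groups == 1:
        # One-sample mean (parametric only): Z if sigma is known or n is large, else t.
        return "Z-test" if (sigma_known or n >= 30) else "One-sample t-test"
    if n_groups == 2:
        design = "two paired" if related else "two independent"
    else:
        design = "3+ related" if related else "3+ independent"
    return TEST_TABLE[(design, parametric)]


print(suggest_test(2, related=True))         # Paired t-test
print(suggest_test(3, parametric=False))     # Kruskal-Wallis H
print(suggest_test(1, n=20))                 # One-sample t-test
```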

73.9 Effect Size

Tip: Common effect-size measures
  • Cohen’s d — for mean differences (Small 0.2, Medium 0.5, Large 0.8).
  • η² (eta squared) — ANOVA.
  • r — correlation.
  • Odds Ratio · Hazard Ratio — for categorical / survival outcomes.
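Cohen's d can be sketched as the difference in means divided by a pooled standard deviation. The data below are made up, the function names are hypothetical, and treating 0.2 / 0.5 / 0.8 as lower bounds for "small", "medium", and "large" is one common reading of Cohen's benchmarks rather than a strict rule.

```python
from math import sqrt
from statistics import mean, variance


def cohens_d(a, b):
    """Cohen's d for two independent samples, using the pooled standard deviation."""
    n_a, n_b = len(a), len(b)
    pooled_var = ((n_a - 1) * variance(a) + (n_b - 1) * variance(b)) / (n_a + n_b - 2)
    return (mean(a) - mean(b)) / sqrt(pooled_var)


def effect_label(d):
    """Map |d| to Cohen's conventional labels (thresholds treated as lower bounds)."""
    size = abs(d)
    if size >= 0.8:
        return "large"
    if size >= 0.5:
        return "medium"
    if size >= 0.2:
        return "small"
    return "negligible"


# Hypothetical scores for a treatment and a control group.
treatment = [6, 8, 10, 12, 14]
control = [5, 7, 9, 11, 13]
d = cohens_d(treatment, control)
print(f"d = {d:.3f} ({effect_label(d)})")
```

Unlike a p-value, d does not shrink or grow with sample size alone, which is why it is reported alongside significance tests.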

73.10 p-value Controversy and Modern Critique

Tip: p-value issues
  • The p < 0.05 threshold is an arbitrary convention (popularised by Fisher).
  • The American Statistical Association's 2016 statement cautions against over-reliance on p-values.
  • Calls for confidence intervals, effect sizes, and Bayesian alternatives.
  • Replication crisis in the social sciences.
  • Pre-registration of hypotheses is becoming standard.

73.11 Bayesian Hypothesis Testing

Tip: Bayesian approach
  • Updates prior beliefs with evidence to form a posterior.
  • Bayes Factor — strength of evidence for H₁ vs H₀.
  • Avoids many p-value pitfalls.
  • Computational tools: MCMC sampling, for example via Stan or JAGS.
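For a simple case, both the posterior update and the Bayes factor have closed forms and need no MCMC. The sketch below assumes a binomial experiment (k successes in n trials), a point null H₀: p = 0.5, and a Uniform(0, 1) prior on p under H₁; these modelling choices and the function names are illustrative assumptions.

```python
from math import comb


def bayes_factor_10(k, n, p0=0.5):
    """Bayes factor for H1: p ~ Uniform(0, 1) against H0: p = p0."""
    # Marginal likelihood under H1: the binomial likelihood integrated over
    # a uniform prior equals 1 / (n + 1) for every k.
    marginal_h1 = 1 / (n + 1)
    likelihood_h0 = comb(n, k) * p0**k * (1 - p0) ** (n - k)
    return marginal_h1 / likelihood_h0


def posterior_mean_uniform_prior(k, n):
    """Posterior mean of p: Uniform prior = Beta(1, 1), posterior = Beta(1 + k, 1 + n - k)."""
    return (k + 1) / (n + 2)


k, n = 8, 10  # hypothetical data: 8 successes in 10 trials
bf10 = bayes_factor_10(k, n)
post_mean = posterior_mean_uniform_prior(k, n)
print(f"BF10 = {bf10:.2f}, posterior mean of p = {post_mean:.2f}")
```

A Bayes factor of about 2 means the data are roughly twice as likely under H₁ as under H₀, which is usually read as weak evidence even though the observed proportion is 0.8.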

73.13 Practice Questions

Q 01 · H₀ · Easy

The Null Hypothesis (H₀) typically asserts:

  • A. No difference / no effect
  • B. The researcher's claim
  • C. An alternative
  • D. Always rejected

Correct Option: A
H₀ represents the status quo.

Q 02 · Type I · Medium

A Type I error is:

  • A. Rejecting H₀ when it is true
  • B. Failing to reject H₀ when it is false
  • C. Sampling error
  • D. Non-response

Correct Option: A
False positive; probability α.

Q 03 · p-value · Medium

At α = 0.05, a p-value less than 0.05 indicates:

  • A. Statistically significant; reject H₀
  • B. Accept H₀
  • C. High Type II error
  • D. Power = 1

Correct Option: A
The p-value is below the conventional threshold α = 0.05, so H₀ is rejected.

Q 04 · ANOVA · Medium

ANOVA was developed by:

  • A. R.A. Fisher
  • B. Gosset
  • C. Pearson
  • D. Neyman

Correct Option: A
R.A. Fisher (1925).

Q 05 · Non-parametric · Medium

The non-parametric equivalent of an independent t-test is:

  • A. Wilcoxon Signed-Rank
  • B. Mann-Whitney U
  • C. Kruskal-Wallis
  • D. Chi-square

Correct Option: B
Mann-Whitney U.

Q 06 · Kruskal · Hard

Kruskal-Wallis is the non-parametric equivalent of:

  • A. t-test
  • B. One-way ANOVA
  • C. Two-way ANOVA
  • D. Chi-square

Correct Option: B
Both compare 3+ independent groups.

Q 07 · Cohen's d · Hard

Cohen's d = 0.5 indicates an effect that is:

  • A. Small
  • B. Medium
  • C. Large
  • D. Negligible

Correct Option: B
Cohen: 0.2 small, 0.5 medium, 0.8 large.

Q 08 · Two-tailed · Medium

A two-tailed test is appropriate when H₁ specifies:

  • A. No direction
  • B. Greater than a specified value
  • C. Less than a specified value
  • D. Only categorical data

Correct Option: A
Non-directional H₁.

Q 09 · Chi-square · Medium

The Chi-square Test of Independence tests the association between:

  • A. Two categorical variables
  • B. Two means
  • C. Three means
  • D. Correlation

Correct Option: A
Association between two categorical variables.

Q 10 · Power · Medium

The power of a test is:

  • A. α
  • B. 1 − α
  • C. β
  • D. 1 − β

Correct Option: D
Power = 1 − β, the probability of correctly rejecting H₀ when it is false.

Q 11 · Paired · Medium

A before-after design uses:

  • A. Independent t-test
  • B. Paired t-test
  • C. ANOVA
  • D. Chi-square

Correct Option: B
Paired samples → paired t-test.

Q 12 · F-test · Medium

The F-test compares:

  • A. Means
  • B. Variances (also the basis of ANOVA)
  • C. Proportions
  • D. Frequencies

Correct Option: B
The F statistic is a ratio of variances.

Q 13 · Neyman-Pearson · Hard

The modern hypothesis-testing framework (1928, 1933) is due to:

  • A. Neyman & Pearson
  • B. Fisher
  • C. Karl Pearson
  • D. Gosset

Correct Option: A
Jerzy Neyman & Egon Pearson.

Q 14 · Bonferroni · Hard

The Bonferroni correction addresses:

  • A. Sampling error
  • B. Multiple-testing inflation
  • C. Heteroskedasticity
  • D. Outliers

Correct Option: B
Each of k tests is run at α / k, which limits the inflation of the overall Type I error rate.

Q 15 · Bootstrap · Hard

The bootstrap technique was introduced in 1979 by:

  • A. Bradley Efron
  • B. Fisher
  • C. Neyman
  • D. Bayes

Correct Option: A
Bradley Efron (1979).

73.13.1 Advanced Format Questions

AR 1 · Assertion-Reason · Hard

A: A one-tailed test rejects H₀ in one direction only.
R: In a one-tailed test, the critical region is split equally between the two tails.

  • A. Both true; R explains A
  • B. Both true; R does not explain A
  • C. A true, R false
  • D. A false, R true

Correct Option: C
Splitting the critical region between two tails is the behaviour of a two-tailed test.

S 1 · Statement-based · Medium

Which of the following are steps in hypothesis testing? (i) State H₀/H₁. (ii) Choose α. (iii) Compute the statistic. (iv) Decide.

  • A. All four (in order)
  • B. (i) and (ii) only
  • C. (iii) and (iv) only
  • D. (iv) only

Correct Option: A

N 1 · Numerical · Medium

Sample n = 100; x̄ = 52; μ = 50; σ = 10. The Z-statistic is:

  • A. 2.0
  • B. 1.0
  • C. 0.5
  • D. 5.0

Correct Option: A
Z = (52 − 50)/(10/√100) = 2/1 = 2.

N 2 · Numerical · Hard

At α = 0.05, two-tailed, the critical Z value is:

  • A. ±1.96
  • B. ±1.645
  • C. ±2.576
  • D. ±3.0

Correct Option: A
α/2 = 0.025 in each tail → Z = ±1.96.

73.14 Quick Recall

Important: Quick recall
  • Hypothesis testing — Neyman-Pearson (1928, 1933) building on Fisher (1925).
  • H₀ vs H₁; One-tailed vs Two-tailed.
  • α (Type I) vs β (Type II); Power = 1 − β.
  • 6 steps: H₀/H₁ → α → test stat → compute → p-value → decide.
  • Parametric: Z · t (one-sample, independent, paired) · F · ANOVA (Fisher 1925) · ANCOVA · MANOVA · Pearson.
  • Non-parametric: χ² · Mann-Whitney U · Wilcoxon Signed-Rank · Kruskal-Wallis H · Friedman · Spearman · K-S · Sign · Runs.
  • Choice: based on sample size, σ, # groups, samples dependent?, scale.
  • Effect size: Cohen’s d (0.2 / 0.5 / 0.8) · η² · r · OR.
  • p-value critique — ASA 2016; replication crisis.
  • Bayesian: posterior ∝ prior × likelihood; Bayes Factor.
  • Modern: Bayesian · Bonferroni · FDR (Benjamini-Hochberg) · A/B testing · causal inference · pre-registration · effect-size focus · Bootstrap (Efron 1979) · permutation · ML-driven.
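Two of the multiple-testing corrections named in the last bullet, Bonferroni and Benjamini-Hochberg (FDR), can be sketched in a few lines. The function names and the p-values below are illustrative assumptions; the Benjamini-Hochberg version is the basic step-up procedure, which assumes independent (or positively dependent) tests.

```python
def bonferroni_reject(p_values, alpha=0.05):
    """Reject H0 for each test whose p-value is at most alpha / k."""
    threshold = alpha / len(p_values)
    return [p <= threshold for p in p_values]


def benjamini_hochberg_reject(p_values, alpha=0.05):
    """Basic Benjamini-Hochberg step-up procedure controlling the FDR at alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])  # indices by ascending p

    # Find the largest rank r (1-based) with p_(r) <= (r / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank

    # Reject every hypothesis ranked at or below that rank.
    reject = [False] * m
    for idx in order[:max_rank]:
        reject[idx] = True
    return reject


# Hypothetical p-values from five separate tests.
p_values = [0.001, 0.012, 0.025, 0.035, 0.20]
print("Bonferroni:", bonferroni_reject(p_values))
print("Benjamini-Hochberg:", benjamini_hochberg_reject(p_values))
```

On these values Bonferroni rejects only the first hypothesis, while Benjamini-Hochberg rejects the first four, which illustrates the usual trade-off: stricter control of any false positive versus more power under false-discovery-rate control.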