72 Hypothesis Testing

72.1 What is Hypothesis Testing?

A hypothesis is a statement about a population parameter that is to be verified or rejected on the basis of sample evidence. Hypothesis testing is the formal statistical procedure for deciding whether to reject or fail to reject a hypothesis based on a sample.

Two competing hypotheses are stated:

Null and Alternative Hypotheses

Hypothesis	Statement	Default position
Null hypothesis (H₀)	A statement of “no effect”, “no difference”, “status quo”	Assumed true until evidence rejects it
Alternative hypothesis (H₁ or Ha)	The contradiction of H₀; what we are trying to support	What we accept if H₀ is rejected

The null hypothesis is only rejected — never proven. We say “reject” or “fail to reject” — never “accept” H₀.

72.2 Type I and Type II Errors

Because hypothesis testing rests on probability, two errors are possible:

Type I and Type II Errors

	H₀ True	H₀ False
Reject H₀	Type I error (α) — false positive	Correct (Power = 1 − β)
Fail to reject H₀	Correct	Type II error (β) — false negative

Two Statistical Errors

Error	Symbol	Consequence
Type I (False positive)	α	Reject true H₀ — convict the innocent
Type II (False negative)	β	Fail to reject false H₀ — let the guilty go free
Power	1 − β	Probability of correctly rejecting false H₀

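The textbook trade-off: for a fixed sample size, reducing α tends to raise β. A larger sample lowers β at any given α, so increasing n is the way to reduce both error rates together.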
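
The trade-off can be checked numerically. The sketch below assumes a right-tailed z-test with known σ; the function name `power_right_tailed_z` and the numbers (μ₀ = 100, true mean 103, σ = 15) are made up for illustration and are not from the text above.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def power_right_tailed_z(mu0, mu1, sigma, n, alpha):
    """Power of a right-tailed z-test of H0: mu = mu0 when the true mean is mu1."""
    z_crit = STD_NORMAL.inv_cdf(1 - alpha)      # reject H0 when z > z_crit
    shift = (mu1 - mu0) / (sigma / n ** 0.5)    # mean of z when the true mean is mu1
    return 1 - STD_NORMAL.cdf(z_crit - shift)   # P(reject H0 | true mean = mu1)


if __name__ == "__main__":
    for n in (25, 100):
        for alpha in (0.05, 0.01):
            power = power_right_tailed_z(mu0=100, mu1=103, sigma=15, n=n, alpha=alpha)
            print(f"n={n:3d}  alpha={alpha:.2f}  beta={1 - power:.3f}  power={power:.3f}")
```

At each sample size, the smaller α gives the larger β; at each α, the larger sample gives the smaller β.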

72.3 The Five-Step Process

Five-Step Hypothesis-Testing Process

#	Step
1	State the null and alternative hypotheses
2	Choose the level of significance (α — typically 0.05 or 0.01)
3	Choose the appropriate test statistic and identify its sampling distribution
4	Compute the test statistic from sample data; find the critical value or p-value
5	Decision: reject H₀ if the test statistic falls in the critical region (equivalently, if p-value ≤ α); otherwise fail to reject H₀

flowchart LR
  H[State H₀, H₁] --> A[Choose α]
  A --> T[Choose test statistic]
  T --> C[Compute statistic /<br/>p-value]
  C --> D{Reject H₀?}
  D -- Yes --> R[Reject H₀]
  D -- No --> F[Fail to reject H₀]
  style H fill:#E3F2FD,stroke:#1565C0
  style R fill:#FFEBEE,stroke:#C62828
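
The five steps can be sketched as a single function. This is a minimal illustration, assuming a two-tailed one-sample z-test with known σ; the function name `one_sample_z_test` and the sample figures are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-tailed one-sample z-test. Returns (z, p_value, reject_h0)."""
    # Step 1: H0: mu = mu0 against H1: mu != mu0.
    # Step 2: the significance level alpha is supplied by the caller.
    # Step 3: with sigma known, the statistic is z, standard normal under H0.
    z = (sample_mean - mu0) / (sigma / sqrt(n))
    # Step 4: two-tailed p-value (area in both tails beyond |z|).
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Step 5: decision.
    return z, p_value, p_value <= alpha


if __name__ == "__main__":
    # Illustrative numbers: sample mean 52 from n = 36, testing mu0 = 50 with sigma = 6.
    z, p, reject = one_sample_z_test(sample_mean=52, mu0=50, sigma=6, n=36)
    print(f"z = {z:.2f}, p = {p:.4f}, reject H0: {reject}")  # z = 2.00, p = 0.0455, reject H0: True
```

With the same data and α = 0.01 the decision flips to "fail to reject", which shows why α must be fixed before the statistic is computed.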

72.4 One-tailed vs Two-tailed Tests

One-tailed vs Two-tailed Tests

Type	When to use	Critical region
Two-tailed	Testing for any difference (H₁: μ ≠ μ₀)	Both tails of the distribution
One-tailed (right)	Testing if greater (H₁: μ > μ₀)	Right tail only
One-tailed (left)	Testing if less (H₁: μ < μ₀)	Left tail only
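
The critical regions in the table can be sketched for a z-test as follows. The function name `in_critical_region` and the tail labels "two", "right" and "left" are choices made for this example only.

```python
from statistics import NormalDist


def in_critical_region(z, alpha, tail):
    """Return True when z lies in the rejection region of a z-test.

    tail is "two", "right", or "left" (labels chosen for this sketch).
    """
    std = NormalDist()
    if tail == "two":
        return abs(z) > std.inv_cdf(1 - alpha / 2)  # alpha split across both tails
    if tail == "right":
        return z > std.inv_cdf(1 - alpha)           # all of alpha in the right tail
    if tail == "left":
        return z < std.inv_cdf(alpha)               # all of alpha in the left tail
    raise ValueError(f"unknown tail: {tail!r}")


if __name__ == "__main__":
    for tail in ("two", "right", "left"):
        print(tail, in_critical_region(1.80, 0.05, tail))
```

For z = 1.80 at α = 0.05, only the right-tailed test rejects: the two-tailed cut-off is about 1.96, while the right-tailed cut-off is about 1.645. The choice of tail therefore has to follow from H₁, not from the data.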

72.5 Test Statistics — When to Use Which

Common Test Statistics

Test	Used for	Distribution
z-test	Mean (known σ, or large sample) or proportion	Standard normal
t-test	Mean (small sample, unknown σ)	Student’s t
Paired t-test	Same units measured twice	Student’s t
Independent samples t-test	Compare means of two groups	Student’s t
F-test	Compare variances; ANOVA	F
Chi-square goodness-of-fit	Observed vs expected categorical frequencies	Chi-square
Chi-square independence	Two categorical variables	Chi-square
ANOVA	Compare means across 3+ groups	F

72.5.1 When to use t vs z

t vs z

Use z when	Use t when
Population σ is known, OR	Population σ is unknown
Large sample (rule of thumb: n > 30), where z is a close approximation even with σ unknown	Small sample with unknown σ

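The t-distribution converges to the standard normal (z) distribution as the sample size, and with it the degrees of freedom, grows. This is why z is an acceptable approximation for large samples.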
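
For a one-sample test of a mean, the two statistics differ only in which standard deviation is used. The notation is introduced here for illustration: x̄ is the sample mean, μ₀ the hypothesised mean, σ the population standard deviation, s the sample standard deviation and n the sample size.

\[z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}}, \qquad t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} \quad \text{with } n - 1 \text{ degrees of freedom}\]

Because s is itself estimated from the sample, t has heavier tails than z, and so larger critical values, when n is small.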

72.6 p-Value Approach

The p-value is the probability, computed assuming the null hypothesis is true, of observing a sample statistic at least as extreme as the one actually observed.

p-Value Decision Rules

Comparison	Decision
p-value ≤ α	Reject H₀
p-value > α	Fail to reject H₀

A common misinterpretation: the p-value is NOT the probability that H₀ is true. It is the probability of data at least as extreme as those observed, assuming H₀ is true.
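
The definition can be made concrete by simulation: generate many samples in a world where H₀ is true and count how often the sample mean is at least as extreme as the observed one. The sketch below assumes a right-tailed test on normally distributed data with known σ; the function name `simulated_p_value` and all numbers are illustrative.

```python
import random
from statistics import fmean


def simulated_p_value(observed_mean, mu0, sigma, n, draws=20_000, seed=0):
    """Estimate a right-tailed p-value by simulating sample means under H0."""
    rng = random.Random(seed)  # fixed seed so the estimate is reproducible
    at_least_as_extreme = 0
    for _ in range(draws):
        # One sample of size n drawn from the population that H0 describes.
        sample_mean = fmean(rng.gauss(mu0, sigma) for _ in range(n))
        if sample_mean >= observed_mean:
            at_least_as_extreme += 1
    return at_least_as_extreme / draws


if __name__ == "__main__":
    # Observed mean 52 from n = 16, testing mu0 = 50 with sigma = 4 (so z = 2).
    print(simulated_p_value(observed_mean=52, mu0=50, sigma=4, n=16))
```

The estimate should land close to the analytic right-tail area beyond z = 2, about 0.023. Every simulated sample comes from a world where H₀ holds, so the result says nothing about how likely H₀ itself is.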

72.7 ANOVA — Analysis of Variance

When comparing means of three or more groups, multiple t-tests inflate the overall Type I error rate. ANOVA (Ronald A. Fisher, 1925) handles this with a single F-test that partitions the total variation:

ANOVA — Three Sources of Variance

Source	What it captures
Between groups (treatment)	Variation due to group differences
Within groups (error)	Variation due to chance within groups
Total	Sum of the two

The F-statistic:

\[F = \frac{\text{Mean Square Between}}{\text{Mean Square Within}}\]

If F exceeds the critical value (or p ≤ α), reject the null hypothesis that all group means are equal.
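
The F-statistic for a one-way design can be computed directly from the two sums of squares. The sketch below is a minimal illustration; the function name `one_way_anova_f` and the three groups of scores are made up.

```python
from statistics import fmean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of numbers."""
    all_values = [x for group in groups for x in group]
    grand_mean = fmean(all_values)
    group_means = [fmean(group) for group in groups]

    # Between-group sum of squares: group means around the grand mean.
    ss_between = sum(
        len(group) * (mean - grand_mean) ** 2
        for group, mean in zip(groups, group_means)
    )
    # Within-group sum of squares: observations around their own group mean.
    ss_within = sum(
        (x - mean) ** 2
        for group, mean in zip(groups, group_means)
        for x in group
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


if __name__ == "__main__":
    # Made-up scores for three groups (illustration only).
    scores = [[4, 5, 6], [6, 7, 8], [8, 9, 10]]
    f_stat, df_b, df_w = one_way_anova_f(scores)
    print(f"F({df_b}, {df_w}) = {f_stat:.2f}")  # F(2, 6) = 12.00
```

Here Mean Square Between is 24 ÷ 2 = 12 and Mean Square Within is 6 ÷ 6 = 1, so F = 12. The tabulated 5% critical value for F(2, 6) is about 5.14, so the null of equal means would be rejected for these data.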

Three Common ANOVA Designs

Design	Description
One-way ANOVA	One independent variable, three or more groups
Two-way ANOVA	Two independent variables; tests main and interaction effects
Repeated-measures ANOVA	Same units measured multiple times

72.8 Chi-Square Tests

Two Common Chi-Square Tests

Test	What it does	Formula
Goodness-of-fit	Compare observed frequencies with expected	χ² = Σ (O − E)² ÷ E
Test of independence	Test whether two categorical variables are independent	χ² = Σ (O − E)² ÷ E (with contingency table)
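
Both tests use the same statistic and differ only in where the expected counts come from. The sketch below assumes hypothetical helper names (`chi_square_statistic`, `expected_counts`) and made-up counts.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over matching observed and expected counts."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def expected_counts(table):
    """Expected cell counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


if __name__ == "__main__":
    # Goodness-of-fit: observed counts against counts expected under H0.
    print(chi_square_statistic([50, 30, 20], [40, 40, 20]))  # 5.0

    # Independence: a 2 x 2 contingency table of made-up counts.
    table = [[30, 20], [20, 30]]
    expected = expected_counts(table)
    flat_observed = [cell for row in table for cell in row]
    flat_expected = [cell for row in expected for cell in row]
    print(chi_square_statistic(flat_observed, flat_expected))  # 4.0
```

The goodness-of-fit example has 3 − 1 = 2 degrees of freedom, and 5.0 is below the tabulated 5% critical value of 5.991, so H₀ is not rejected. The 2 × 2 table has (2 − 1)(2 − 1) = 1 degree of freedom, and 4.0 exceeds 3.841, so independence is rejected at α = 0.05.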

72.9 Practice Questions

Q 01 Definition Easy

The null hypothesis (H₀) is best described as:

A. A statement we want to prove
B. A statement of "no effect" or "no difference" assumed true until evidence rejects it
C. A statement that always holds
D. A guess

Correct Option: B

H₀ is the default position of "no effect" — assumed true unless rejected by evidence.

Q 02 Type I Medium

A Type I error occurs when:

A. A true H₀ is rejected (false positive)
B. A false H₀ is not rejected (false negative)
C. The sample is too small
D. The variance is unknown

Correct Option: A

Type I = false positive — reject a true H₀. Probability = α. Type II = false negative; probability = β.

Q 03 Power Medium

The "power" of a statistical test is:

A. α
B. β
C. 1 − β
D. 1 − α

Correct Option: C

Power = 1 − β — probability of correctly rejecting a false H₀.

Q 04 t vs z Medium

A t-test is preferred over a z-test when:

A. Sample size is large and σ is known
B. Sample size is small and σ is unknown
C. Variables are categorical
D. Population is non-normal

Correct Option: B

Use t when σ is unknown and n is small. Use z when σ is known or n is large.

Q 05 p-value Medium

If the p-value of a test is 0.02 and α = 0.05, the appropriate decision is:

A. Reject H₀
B. Accept H₀
C. Inconclusive
D. Increase sample size

Correct Option: A

p (0.02) ≤ α (0.05) → reject H₀. We never "accept" H₀; we only fail to reject.

Q 06 ANOVA Medium

When comparing means across three or more groups, the appropriate test is:

A. t-test
B. z-test
C. ANOVA
D. Chi-square

Correct Option: C

ANOVA avoids the Type I error inflation of multiple t-tests. The F-test compares between-group variance with within-group variance.

Q 07 Chi-square Medium

A test of whether two categorical variables (gender × purchase) are independent uses:

A. t-test
B. F-test
C. Chi-square test of independence
D. z-test

Correct Option: C

Categorical × categorical → chi-square test of independence.

Q 08 Tails Medium

A two-tailed test is appropriate when the alternative hypothesis is:

A. μ > μ₀
B. μ < μ₀
C. μ ≠ μ₀
D. μ = μ₀

Correct Option: C

"Different" (μ ≠ μ₀) is two-tailed — could be greater or less. Directional H₁ (>, <) is one-tailed.

Quick recall

Hypothesis test = sample-based decision about a population parameter. H₀ vs H₁; we reject or fail to reject H₀, never “accept”.
Type I (α) = false positive; Type II (β) = false negative; Power = 1 − β.
Five-step process: hypotheses → α → test statistic → compute → decide.
One-tailed (directional) vs two-tailed.
Tests: z (large/known σ), t (small/unknown σ), F (variance/ANOVA), chi-square (categorical), ANOVA (3+ group means).
p-value ≤ α → reject H₀.
ANOVA decomposes variance into between-group + within-group; F = MS_between/MS_within.
Chi-square tests: goodness-of-fit and independence.