71  Sampling: Concept, Process and Techniques

71.1 What is Sampling?

Sampling is the process of selecting a subset of units from a population in order to make inferences about the whole. Studying every unit (a census) is often impossible or impractical: it takes too many people, too much money, and too much time. Sampling theory makes statistically valid inference possible from a manageable sample.

Cochran’s classical text defines a sample as “a part of a population, or a subset from a set of units, which is provided by some process or other, usually by deliberate selection” (Cochran, 1977).

Tip: Key Sampling Terms

| Term | Definition |
|---|---|
| Population | All units of interest |
| Sampling frame | Operational list of units from which the sample is drawn |
| Sample | Subset of the population actually studied |
| Sampling unit | The unit being sampled (person, household, firm) |
| Element | The unit on which information is sought |
| Parameter | Population characteristic (e.g., μ, σ) |
| Statistic | Sample characteristic (e.g., x̄, s) |
| Sampling error | Difference between statistic and parameter due to sampling |
| Non-sampling error | Errors not due to sampling, such as measurement, non-response, and processing errors |
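
To make the parameter / statistic / sampling error distinction concrete, here is a minimal sketch. The population is synthetic (hypothetical "incomes" drawn from a normal distribution), and the sizes 10,000 and 200 are arbitrary assumptions chosen only for illustration.

```python
import random
import statistics

random.seed(42)  # fixed seed so the illustration is repeatable

# Hypothetical population: 10,000 synthetic "incomes" (illustrative data only).
population = [random.gauss(50_000, 12_000) for _ in range(10_000)]

# Parameter: a characteristic of the whole population.
mu = statistics.fmean(population)

# Statistic: the same characteristic computed on a sample of 200 units.
sample = random.sample(population, k=200)
x_bar = statistics.fmean(sample)

# Sampling error: the gap between statistic and parameter due to chance in selection.
sampling_error = x_bar - mu
print(f"mu = {mu:.1f}, x_bar = {x_bar:.1f}, sampling error = {sampling_error:.1f}")
```

Re-running with a different seed gives a different sample and therefore a different sampling error, while the parameter of a fixed population does not change.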

71.2 Sampling Process

Tip: Six-Step Sampling Process

| # | Step |
|---|---|
| 1 | Define the target population |
| 2 | Determine the sampling frame |
| 3 | Choose the sampling technique |
| 4 | Determine the sample size |
| 5 | Execute the sampling process |
| 6 | Validate the sample |

71.3 Sampling Techniques — Two Families

Tip: Probability vs Non-Probability Sampling

| Family | What it does |
|---|---|
| Probability sampling | Each unit has a known, non-zero probability of selection |
| Non-probability sampling | Selection is judgemental or convenience-based |

Tip: Five Probability Sampling Techniques

| Technique | Description |
|---|---|
| Simple random sampling (SRS) | Every unit, and every possible sample of size n, has an equal probability of selection |
| Systematic sampling | Pick every kth unit after a random start |
| Stratified sampling | Divide the population into internally homogeneous strata; sample from each stratum |
| Cluster sampling | Divide the population into internally heterogeneous clusters; sample whole clusters |
| Multistage sampling | Sampling in successive stages (e.g., country → state → district → city → household) |
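
The first two probability techniques can be sketched with the Python standard library. This is an illustrative sketch, not a survey-grade implementation: the frame of unit IDs 1..100 and the helper names `simple_random_sample` and `systematic_sample` are assumptions introduced here.

```python
import random

def simple_random_sample(frame, n, rng):
    """SRS without replacement: every subset of size n is equally likely."""
    return rng.sample(frame, n)

def systematic_sample(frame, n, rng):
    """Pick every k-th unit after a random start in the first interval.

    Assumes n <= len(frame). If len(frame) is not a multiple of n, the
    units beyond position n*k can never be selected in this simple version.
    """
    k = len(frame) // n          # sampling interval
    start = rng.randrange(k)     # random start in [0, k)
    return [frame[start + i * k] for i in range(n)]

rng = random.Random(7)           # seeded generator for repeatability
frame = list(range(1, 101))      # hypothetical sampling frame: unit IDs 1..100

srs = simple_random_sample(frame, 10, rng)
sys_sample = systematic_sample(frame, 10, rng)
print("SRS:       ", sorted(srs))
print("Systematic:", sys_sample)
```

In the systematic sample, consecutive IDs are always exactly k = 10 apart, which is what distinguishes it from SRS even though each unit has the same selection probability.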

Tip: Five Non-Probability Sampling Techniques

| Technique | Description |
|---|---|
| Convenience sampling | Whoever is easy to reach |
| Judgemental / Purposive | Selected by expert judgement |
| Quota sampling | Fill quotas matching population strata, but unit selection is non-random |
| Snowball sampling | Existing respondents refer others |
| Self-selection | Respondents volunteer |

flowchart TB
  S[Sampling Techniques] --> P[Probability]
  S --> NP[Non-probability]
  P --> SRS[Simple Random]
  P --> SY[Systematic]
  P --> ST[Stratified]
  P --> CL[Cluster]
  P --> MS[Multistage]
  NP --> CO[Convenience]
  NP --> J[Judgemental]
  NP --> Q[Quota]
  NP --> SN[Snowball]
  NP --> SF[Self-selection]
  style S fill:#FCE4EC,stroke:#AD1457

71.4 Stratified vs Cluster Sampling

Tip: Stratified vs Cluster Sampling

| Feature | Stratified | Cluster |
|---|---|---|
| Internal homogeneity | Strata are homogeneous within | Clusters are heterogeneous within |
| Between-group variation | High | Low (clusters are similar to each other) |
| Sample drawn from | All strata | Selected clusters only |
| Cost | Higher | Lower |
| Purpose | Reduce variance | Reduce cost |
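
The "sample drawn from" row is the key operational difference, and a short sketch can show it. The household data, the income bands, the village labels, and both helper functions below are hypothetical. The stratified helper uses proportional allocation (the same sampling fraction in every stratum), which is one common choice among several.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, fraction, rng):
    """Draw the same sampling fraction from EVERY stratum."""
    strata = defaultdict(list)
    for u in units:
        strata[stratum_of(u)].append(u)
    sample = []
    for members in strata.values():
        n_h = max(1, round(fraction * len(members)))  # at least one unit per stratum
        sample.extend(rng.sample(members, n_h))
    return sample

def cluster_sample(units, cluster_of, n_clusters, rng):
    """Select a few whole clusters at random and keep every unit inside them."""
    clusters = defaultdict(list)
    for u in units:
        clusters[cluster_of(u)].append(u)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [u for c in chosen for u in clusters[c]]

rng = random.Random(11)
# Hypothetical households: (id, income_band, village)
households = [(i, ("low", "mid", "high")[i % 3], f"village_{i % 8}") for i in range(240)]

strat = stratified_sample(households, lambda h: h[1], 0.10, rng)
clust = cluster_sample(households, lambda h: h[2], 2, rng)

print(len(strat), sorted({h[1] for h in strat}))   # every income band is represented
print(len(clust), sorted({h[2] for h in clust}))   # only the two chosen villages appear
```

The stratified sample touches all three income bands, while the cluster sample contains every household from just two villages and none from the other six.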

71.5 Sample Size Determination

The required sample size depends on:

Tip: Factors Affecting Sample Size

| Factor | Direction |
|---|---|
| Required confidence level | Higher → larger sample |
| Required precision (margin of error) | Tighter → larger sample |
| Population variability | Higher → larger sample |
| Population size | Has limited effect once the population is large relative to the sample |
| Cost and time | Larger sample costs more |

The classical formula for estimating a mean, assuming an effectively infinite population:

\[n = \left( \frac{Z_{\alpha/2} \cdot \sigma}{E} \right)^2\]

For a proportion:

\[n = \frac{Z_{\alpha/2}^2 \cdot p(1-p)}{E^2}\]

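where \(\sigma\) is the population standard deviation, \(p\) is the assumed proportion (use 0.5 if unknown, since it maximizes \(p(1-p)\) and therefore \(n\)), \(E\) is the margin of error, and \(Z_{\alpha/2}\) is the critical z-value for the chosen confidence level (about 1.96 for 95%).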
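
The two formulas above can be evaluated directly. The sketch below uses only the Python standard library; the function names and the example inputs (σ = 15, E = 2, E = 0.05) are assumptions chosen for illustration, and results are rounded up because a sample size must be a whole number.

```python
import math
from statistics import NormalDist

def z_critical(confidence):
    """Two-sided critical value Z_{alpha/2} for a given confidence level."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

def n_for_mean(sigma, margin, confidence=0.95):
    """n = (Z * sigma / E)^2, rounded up."""
    return math.ceil((z_critical(confidence) * sigma / margin) ** 2)

def n_for_proportion(margin, p=0.5, confidence=0.95):
    """n = Z^2 * p * (1 - p) / E^2, rounded up."""
    z = z_critical(confidence)
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)

print(n_for_mean(sigma=15, margin=2))     # 217
print(n_for_proportion(margin=0.05))      # 385
print(n_for_proportion(margin=0.025))     # 1537: halving E roughly quadruples n
```

The familiar "about 385 respondents for ±5% at 95% confidence" figure comes from the proportion formula with the conservative choice p = 0.5.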

71.6 Sampling Error and Standard Error

The standard error is the standard deviation of the sampling distribution of a statistic. For the sample mean:

\[\text{SE}(\bar{x}) = \frac{\sigma}{\sqrt{n}}\]

A larger sample gives a smaller SE and therefore a more precise estimate. Because the SE falls with the square root of n, halving the SE requires four times the sample size.
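
The square-root relationship can be checked both from the formula and by simulation. In this sketch the population is assumed to be normal with σ = 10, and the 1,000 repetitions are an arbitrary choice; `empirical_se` is a hypothetical helper that estimates the SE as the standard deviation of many simulated sample means.

```python
import math
import random
import statistics

def standard_error(sigma, n):
    """Theoretical SE of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

print(standard_error(10, 100))   # 1.0
print(standard_error(10, 400))   # 0.5 (four times the sample, half the SE)

rng = random.Random(0)

def empirical_se(n, sigma=10, reps=1000):
    """Standard deviation of the means of `reps` simulated samples of size n."""
    means = [
        statistics.fmean([rng.gauss(0, sigma) for _ in range(n)])
        for _ in range(reps)
    ]
    return statistics.stdev(means)

se_100 = empirical_se(100)
se_400 = empirical_se(400)
print(f"simulated SE: n=100 -> {se_100:.2f}, n=400 -> {se_400:.2f}")
```

The simulated values should land close to the theoretical 1.0 and 0.5, differing only by simulation noise.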

Tip: Two Types of Errors

| Type | What it captures | How to reduce |
|---|---|---|
| Sampling error | Statistic ≠ parameter because of chance in selection | Larger n or better design |
| Non-sampling error | Measurement, non-response, and processing errors | Better instruments and execution |

71.7 Practice Questions

Q 01 Definition Easy

A complete enumeration of every unit in the population is called:

  • A. Sampling
  • B. Census
  • C. Frame
  • D. Cluster

Correct Option: B
Studying every unit = census. Sampling studies a subset.

Q 02 Stratified Medium

In stratified sampling, the strata are typically:

  • A. Internally heterogeneous
  • B. Internally homogeneous; differ between strata
  • C. Identical to each other
  • D. Random

Correct Option: B
Strata = internally homogeneous, differ across strata. Cluster = the opposite: heterogeneous within, similar across.

Q 03 Probability Easy

Which is NOT a probability sampling technique?

  • A. Simple random sampling
  • B. Systematic sampling
  • C. Stratified sampling
  • D. Convenience sampling

Correct Option: D
Convenience = non-probability. The first three are probability methods.

Q 04 Snowball Medium

Existing respondents referring further respondents — useful for hidden populations — is:

  • A. Quota sampling
  • B. Snowball sampling
  • C. Cluster sampling
  • D. Stratified sampling

Correct Option: B
Snowball sampling is useful for hidden / hard-to-reach populations (rare diseases, illegal activities).

Q 05 Standard Error Medium

The standard error of the sample mean is:

  • A. σ × n
  • B. σ ÷ √n
  • C. σ × √n
  • D. σ²

Correct Option: B
SE(x̄) = σ ÷ √n. Halving the SE requires quadrupling the sample.

Q 06 Sample Size Medium

Doubling the desired precision (halving the margin of error) typically requires the sample size to:

  • A. Double
  • B. Quadruple
  • C. Halve
  • D. Stay the same

Correct Option: B
Sample size scales with the inverse square of the margin of error. Halving E requires 4× the sample.

Q 07 Errors Medium

Non-response and measurement errors fall under:

  • A. Sampling error
  • B. Non-sampling error
  • C. Standard error
  • D. Type I error

Correct Option: B
Non-sampling errors: measurement, non-response, processing, coverage. They cannot be reduced by a larger sample size.

Q 08 Multistage Medium

India's National Sample Survey draws households via state → district → village → household. This is:

  • A. Simple random
  • B. Stratified random
  • C. Multistage sampling
  • D. Quota sampling

Correct Option: C
Multistage sampling: sampling is done at multiple stages (state, district, village, household). Common in large-scale surveys.

Important: Quick recall
  • Sampling = study a subset to infer about the whole. Census = study every unit.
  • Key terms: Population, Sampling frame, Sample, Sampling unit, Element, Parameter, Statistic.
  • Six-step process: Define population → Frame → Technique → Sample size → Execute → Validate.
  • Probability sampling: SRS · Systematic · Stratified · Cluster · Multistage.
  • Non-probability sampling: Convenience · Judgemental · Quota · Snowball · Self-selection.
  • Stratified = homogeneous within strata, heterogeneous between strata; Cluster = the opposite.
  • Sample size: depends on confidence, precision, variability. n ∝ 1/E².
  • SE(x̄) = σ / √n.
  • Sampling error (chance in selection) vs non-sampling error (measurement, non-response, processing, coverage).