70 Data Collection and Questionnaire Design

70.1 What is Data Collection?

Data collection is the systematic process of gathering information for analysis and decision making. It is the raw material of every statistical and managerial study. Naresh Malhotra’s standard text frames it as “the process by which the researcher collects information from the chosen units of inquiry” (Malhotra, 2019).

Two Sources of Data

Source	Description	Examples
Primary	Collected first-hand for the study	Surveys, observation, experiments
Secondary	Already collected for some other purpose	Government reports, journals, company records

Comparing Primary and Secondary Data

Feature	Primary	Secondary
Cost	High	Low
Time	Long	Short
Customisation	High	Low
Currency	Fresh	Possibly outdated
Availability	Always (can be collected as needed)	Sometimes (depends on existing sources)

70.2 Methods of Primary Data Collection

Six Common Methods of Primary Data Collection

Method	What it does	Use
Observation	Watch behaviour without interaction	Retail traffic, ethnography
Survey	Standardised questions to many respondents	Most common method
Interview	One-on-one (structured / semi / unstructured)	Depth
Focus group	Small moderated discussion	Qualitative insight
Experiment	Manipulate variables, observe outcomes	Cause-effect
Projective techniques	Indirect — word association, sentence completion	Subconscious motives

70.2.1 Survey Modes

Five Survey Modes

Mode	Strengths	Weaknesses
Personal interview	Depth, completion rate	Costly, interviewer bias
Telephone	Speed, moderate cost	Sample bias (non-mobile)
Mail	Wide reach, low cost	Low response rate
Online / web	Low cost, fast, large reach	Sample bias (digitally connected)
Mobile / SMS	Reach low-tech areas, fast	Limited length

70.3 The Questionnaire — Steps in Design

The questionnaire is the central instrument of survey research. Malhotra’s classical 10-step design process (Malhotra, 2019) runs as follows:

Ten Steps in Questionnaire Design

#	Step
1	Specify the information needed
2	Choose the type of interview / mode
3	Determine the content of individual questions
4	Design the questions to overcome the respondent’s inability and unwillingness to answer
5	Decide the question structure (open / closed)
6	Determine question wording
7	Arrange the questions in proper order
8	Design the form and layout
9	Reproduce the questionnaire
10	Pre-test, revise, finalise

flowchart LR
  S[Specify info<br/>needed] --> M[Mode]
  M --> C[Content]
  C --> Q[Question type<br/>open / closed]
  Q --> W[Wording]
  W --> O[Order]
  O --> L[Layout]
  L --> P[Pre-test]
  P --> F[Finalise]
  style S fill:#E3F2FD,stroke:#1565C0
  style F fill:#E8F5E9,stroke:#2E7D32

70.4 Question Types

Two Main Types of Questions

Type	Description	Strength
Open-ended	No fixed response options	Rich depth, qualitative
Closed-ended	Fixed response options	Easier to code and analyse

Closed-ended Question Formats

Format	Description
Dichotomous	Yes/No
Multiple choice	Pick one or more
Likert scale	Strongly Agree → Strongly Disagree (typically 5 or 7 points)
Semantic differential	Bipolar adjective scale (e.g. cold ↔ warm)
Stapel scale	Single adjective rated from +5 to −5, with no neutral zero point
Constant sum	Allocate fixed total across options
Rank order	Rank items in order of preference
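Closed-ended formats are easy to code because every answer maps to a number. The following is a minimal Python sketch of that idea, not a standard library routine. The middle labels ("Disagree", "Neutral", "Agree"), the 1 to 5 codes, the 100-point total and the function names `code_likert` and `is_valid_constant_sum` are all assumptions made for illustration.

```python
# Assumed 5-point agreement labels and numeric codes (illustrative convention).
LIKERT_5 = {
    "Strongly Disagree": 1,
    "Disagree": 2,
    "Neutral": 3,
    "Agree": 4,
    "Strongly Agree": 5,
}


def code_likert(label, reverse=False):
    """Map an agreement label to 1..5; reverse-score negatively worded items."""
    score = LIKERT_5[label]
    return 6 - score if reverse else score


def is_valid_constant_sum(allocation, total=100):
    """A constant-sum answer is valid if points are non-negative and add up to the total."""
    values = allocation.values()
    return all(v >= 0 for v in values) and sum(values) == total


print(code_likert("Agree"))                # 4
print(code_likert("Agree", reverse=True))  # 2
print(is_valid_constant_sum({"price": 50, "quality": 30, "service": 20}))  # True
print(is_valid_constant_sum({"price": 60, "quality": 30, "service": 20}))  # False
```

Reverse scoring is included because a negatively worded Likert item has to be flipped before it is summed with the other items.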

70.5 Common Errors in Questionnaires

Common Errors in Question Design

Error	Description
Leading question	Suggests the desired answer
Loaded question	Contains emotionally charged language
Double-barrelled	Asks two things in one question
Ambiguous	Vague or unclear wording
Negative wording	Negatives or double negatives that confuse respondents
Long-winded	Too long; respondent gives up
Sequence bias	Earlier questions affect later answers
Social desirability	Pressure to give the “acceptable” answer
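Some of these errors leave traces in the wording itself, so a draft questionnaire can be given a crude automated screen before human review. The sketch below is a keyword heuristic only: the function name `screen_question`, the 25-word limit and the trigger words are assumptions, and it cannot detect leading, loaded or ambiguous questions, which need a human reader.

```python
import re


def screen_question(text, max_words=25):
    """Return heuristic warnings for one question; flags are hints, not verdicts."""
    warnings = []
    words = re.findall(r"[A-Za-z']+", text.lower())
    # "and" / "or" often join two separate questions into one.
    if "and" in words or "or" in words:
        warnings.append("possible double-barrelled")
    # Plain negatives and contractions such as "don't".
    if any(w in {"not", "never", "no"} or w.endswith("n't") for w in words):
        warnings.append("negative wording")
    if len(words) > max_words:
        warnings.append("long-winded")
    return warnings


print(screen_question("Do you find this product good and worth its price?"))
# ['possible double-barrelled']
```

A flagged question is a candidate for rewording, for example by splitting it into one question on quality and one on value.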

70.6 Reliability and Validity

Two essential tests of any measurement instrument:

Reliability vs Validity

Concept	What it captures	How it is assessed
Reliability	Consistency: repeated measurement gives the same result	Test-retest, split-half, Cronbach’s α
Validity	Accuracy: the instrument measures what it claims to measure	Content, criterion (concurrent / predictive), construct

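A test can be reliable but not valid: a miscalibrated weighing scale gives the same wrong reading every time. A valid test, however, must be reliable. A Cronbach’s alpha of 0.7 or higher is the usual rule-of-thumb threshold for acceptable reliability.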
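Cronbach’s alpha can be computed directly from item scores with the standard formula α = k/(k − 1) × (1 − Σ item variances / variance of total scores), where k is the number of items. The sketch below uses only the Python standard library; the function name `cronbach_alpha` and the response data are made up for illustration, and sample variances are assumed.

```python
from statistics import variance


def cronbach_alpha(rows):
    """rows: one list of item scores per respondent (same items, same order)."""
    k = len(rows[0])                              # number of items
    items = list(zip(*rows))                      # transpose to per-item columns
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])  # variance of each respondent's total
    return (k / (k - 1)) * (1 - item_var_sum / total_var)


# Hypothetical data: 5 respondents answering 3 Likert items coded 1..5.
responses = [
    [4, 5, 4],
    [2, 2, 3],
    [5, 4, 5],
    [3, 3, 3],
    [1, 2, 1],
]
alpha = cronbach_alpha(responses)
print(round(alpha, 3))   # about 0.945 for this made-up data
print(alpha >= 0.7)      # True: meets the usual threshold
```

Alpha rises when the items move together across respondents, which is why it is read as a measure of internal consistency.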

70.7 Pre-testing the Questionnaire

A pre-test is the trial run of the questionnaire on a small sample to identify problems before full deployment. Pre-testing checks:

Length and respondent fatigue.
Clarity of instructions.
Wording problems.
Skip patterns and routing.
Recording and coding ease.
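Skip patterns are one of the checks that can be automated on pre-test data. The sketch below assumes a hypothetical routing rule, "ask Q2 only if Q1 is Yes", and a hypothetical record layout with keys `id`, `q1` and `q2`; a real questionnaire would have its own rules.

```python
def routing_errors(responses):
    """Return ids of respondents whose answers break the assumed skip rule."""
    errors = []
    for r in responses:
        answered_q2 = r.get("q2") is not None
        should_answer_q2 = r["q1"] == "Yes"   # assumed rule: Q2 only after a Yes
        if answered_q2 != should_answer_q2:
            errors.append(r["id"])
    return errors


# Made-up pre-test records.
pretest = [
    {"id": 1, "q1": "Yes", "q2": "Weekly"},
    {"id": 2, "q1": "No", "q2": None},
    {"id": 3, "q1": "No", "q2": "Monthly"},  # answered a question that should be skipped
    {"id": 4, "q1": "Yes", "q2": None},      # skipped a question that should be answered
]
print(routing_errors(pretest))  # [3, 4]
```

Several respondents breaking the same rule in a pre-test usually points to unclear routing instructions rather than to careless respondents.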

70.8 Practice Questions

Q 01 Sources Easy

Government statistical reports used in a marketing study are an example of:

A. Primary data
B. Secondary data
C. Tertiary data
D. Experimental data

Correct Option: B

Already collected for some other purpose = secondary data.

Q 02 Likert Easy

A 5-point "Strongly Agree to Strongly Disagree" scale is a:

A. Likert scale
B. Semantic differential
C. Stapel scale
D. Constant-sum scale

Correct Option: A

Likert scale (Rensis Likert, 1932) — agreement scale, typically 5 or 7 points.

Q 03 Errors Medium

"Do you find this product good and worth its price?" is an example of:

A. Leading question
B. Loaded question
C. Double-barrelled question
D. Open-ended question

Correct Option: C

The question asks two things at once — quality AND value. Double-barrelled.

Q 04 Reliability Medium

A measure that gives consistent results when repeated under similar conditions is said to be:

A. Valid
B. Reliable
C. Predictive
D. Concurrent

Correct Option: B

Consistency = reliability. Validity = accuracy. A test can be reliable but invalid; a valid test must be reliable.

Q 05 Cronbach Medium

Cronbach's alpha is used to assess:

A. Internal-consistency reliability
B. Predictive validity
C. Sample size
D. Skewness

Correct Option: A

Cronbach's α (1951) — most-used measure of internal consistency. Threshold typically α ≥ 0.7.

Q 06 Methods Medium

Word association and sentence completion are examples of:

A. Closed-ended scales
B. Projective techniques
C. Experiments
D. Observations

Correct Option: B

Projective techniques — indirect methods to surface subconscious motives.

Q 07 Survey Mode Medium

An advantage of online surveys over personal interviews is:

A. Lower cost and faster turnaround
B. Better depth
C. Higher response rate always
D. No sample bias

Correct Option: A

Online surveys are cheap and fast. They suffer from sample bias toward the digitally connected.

Q 08 Pre-test Easy

The purpose of pre-testing a questionnaire is to:

A. Replace the actual survey
B. Identify problems and refine before full deployment
C. Compute final results
D. Train respondents

Correct Option: B

Pre-test = trial run to identify wording, length, routing and coding problems.

Quick recall

Data collection — primary (first-hand, customised, costly) vs secondary (already collected, cheap, fast).
Six methods of primary data: observation, survey, interview, focus group, experiment, projective.
Survey modes: personal, telephone, mail, online, mobile.
Malhotra’s 10 steps in questionnaire design.
Question types: open vs closed. Scales: Likert, Semantic Differential, Stapel, Constant Sum, Rank Order.
Common errors: leading, loaded, double-barrelled, ambiguous, negative wording, long-winded, sequence bias, social desirability.
Reliability (consistency) vs Validity (accuracy). Cronbach’s α ≥ 0.7.
Pre-testing is essential before full deployment.