BUSS1020 · Quantitative Business Analysis
Foundations: Data, Variables & Sampling
Week 1 (Berenson Ch 1) sets the vocabulary the whole subject rests on. You learn to tell a population from a sample and a parameter from a statistic, to classify variables as categorical or numerical and place them on the four levels of measurement, and to choose a sampling method that gives a representative picture without bias. Getting these distinctions right is what lets every later technique (probabilities, intervals, tests, regression) be applied to the correct kind of data.
What this chapter covers
- 01 Population vs sample; parameter (Greek) vs statistic (Latin)
- 02 Categorical variables: nominal vs ordinal
- 03 Numerical variables: discrete vs continuous
- 04 Levels of measurement: nominal · ordinal · interval · ratio
- 05 Data sources: primary vs secondary; structured vs unstructured
- 06 Probability sampling: simple random, systematic, stratified, cluster
- 07 Non-probability sampling: convenience, judgement
- 08 Sampling vs non-sampling error; coverage and selection bias
Classify variables and identify a sampling method
- 1 mark: (a) Preferred store is categorical with no natural order → nominal.
- 1 mark: (b) Satisfaction 1–5 is categorical with a meaningful order but unequal/undefined gaps → ordinal.
- 1 mark: (c) Number of items is numerical and countable → discrete (ratio level, since 0 means none and ratios are meaningful).
- 1 mark: (d) Total spend in dollars is numerical and measurable on a continuum → continuous (ratio level).
- 1 mark: Selecting every 20th shopper is systematic sampling.
- 1 mark: It risks coverage/selection bias: only shoppers who actually entered and exited the physical store are sampled, so online-only or non-visiting customers are excluded.
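To see what "every 20th shopper" means mechanically, here is a minimal Python sketch. The exit log `shoppers`, the helper name `systematic_sample` and the random starting point within the first 20 are assumptions made for illustration; the worked answer itself only says that every 20th shopper is selected.

```python
import random

def systematic_sample(units, k, seed=None):
    """Return every k-th unit, starting at a random position among the first k."""
    if k < 1:
        raise ValueError("k must be a positive integer")
    rng = random.Random(seed)
    start = rng.randrange(k)  # random start index in 0..k-1 (an assumption)
    return units[start::k]    # then take every k-th unit after that

# Hypothetical exit log: shoppers numbered in the order they left the store.
shoppers = list(range(1, 201))
sample = systematic_sample(shoppers, k=20, seed=1)
print(len(sample))  # 200 shoppers at interval 20 gives 10 selected
```

Note that the list being sampled contains only people who walked out of the store, which is exactly where the coverage problem in the last mark comes from: no choice of `k` can bring an online-only customer into the sample.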
Key terms
- Population vs sample: the population is the entire group of interest; a sample is the subset you actually observe and from which you infer.
- Nominal vs ordinal: both are categorical. Nominal categories have no order (e.g. store location); ordinal categories have a meaningful order but no fixed spacing (e.g. a 1–5 rating).
- Discrete vs continuous: discrete numerical data come from counting and take separate values (number of items); continuous data come from measuring and can take any value in a range (dollars, time).
- Stratified sampling: a probability method that splits the population into homogeneous groups (strata) and takes a random sample within each, improving representativeness for known subgroups.
- Sampling vs non-sampling error: sampling error is the natural variation from observing only a sample rather than the whole population; non-sampling error comes from bias, bad measurement or coverage gaps and is not reduced by a larger sample.
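The stratified sampling definition can be sketched in a few lines of Python. Everything specific here is hypothetical: the `member`/`guest` strata, the 80/20 split, the 10% sampling fraction and the helper name `stratified_sample`. Proportional allocation (the same fraction from each stratum) is one common choice, not the only one.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, fraction, seed=None):
    """Take a simple random sample of the same fraction from every stratum."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)  # group units by stratum
    sample = []
    for name in sorted(strata):
        members = strata[name]
        n = max(1, round(fraction * len(members)))  # at least one per stratum
        sample.extend(rng.sample(members, n))       # random draw within stratum
    return sample

# Hypothetical population: 80 loyalty members and 20 guests.
customers = [("member", i) for i in range(80)] + [("guest", i) for i in range(20)]
picked = stratified_sample(customers, stratum_of=lambda c: c[0], fraction=0.1, seed=0)
counts = {g: sum(1 for c in picked if c[0] == g) for g in ("member", "guest")}
print(counts)  # {'member': 8, 'guest': 2}
```

Because each stratum is sampled separately, the 80/20 split of the population is reproduced exactly in the sample, which a simple random sample of 10 would only match by chance.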
Foundations: Data, Variables & Sampling FAQ
Why does the parameter/statistic distinction matter so much?
Because every inference technique in the subject estimates an unknown population parameter (like μ or π) from a sample statistic (like X̄ or p). Mixing the symbols up is one of the most common ways to lose easy marks, so the Greek-for-population, Latin-for-sample rule is worth locking in early.
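A short Python sketch makes the distinction concrete. The population of 10,000 spend values is invented for illustration, and the names `mu` and `x_bar` simply mirror the symbols μ and X̄. In practice you would not have the whole population to hand, which is the reason μ has to be estimated.

```python
import random
import statistics

rng = random.Random(42)

# Hypothetical population: spend in dollars for all 10,000 customers.
population = [round(rng.uniform(5, 200), 2) for _ in range(10_000)]
mu = statistics.mean(population)   # parameter: fixed, usually unknown

# A simple random sample of 50 customers drawn without replacement.
sample = rng.sample(population, 50)
x_bar = statistics.mean(sample)    # statistic: changes from sample to sample

print(f"mu = {mu:.2f}, x_bar = {x_bar:.2f}")
```

Re-running the sampling step with a different seed gives a different `x_bar` but the same `mu`: the parameter describes the population, the statistic describes whichever sample you happened to draw.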
Is a postcode numerical data?
No — even though it looks like a number, a postcode is nominal categorical data because the values are just labels with no arithmetic meaning. This 'looks numerical but isn't' trap appears in MCQs.
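The trap is easy to demonstrate in Python. The six postcodes below are made-up survey responses; the point is only that counting them is meaningful while averaging them is not.

```python
import statistics
from collections import Counter

# Hypothetical responses, stored as text because they are labels.
postcodes = ["2000", "2006", "2000", "2050", "2006", "2000"]

counts = Counter(postcodes)          # frequencies are valid for nominal data
mode = statistics.mode(postcodes)    # so is the most common category
print(counts.most_common(1))         # [('2000', 3)]
print(mode)                          # 2000

# Forcing the labels into numbers lets the arithmetic run,
# but the result does not describe any respondent or any place.
as_numbers = [int(p) for p in postcodes]
meaningless_mean = statistics.mean(as_numbers)
print(meaningless_mean)
```

Storing such variables as text rather than integers is a simple way to stop software from averaging them by accident.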
What makes a sample 'good'?
A good sample is representative and chosen by a probability method (every unit has a known, non-zero chance of selection), which lets you quantify sampling error. Convenience and judgement samples are easier but introduce bias you cannot measure.
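The difference between error you can shrink and bias you cannot is easiest to see in a small simulation. This is a rough sketch with invented numbers: the in-store and online-only spend figures, the sample sizes and the helper `typical_error` are all assumptions, chosen so that sampling only in-store shoppers leaves out a group with different spending.

```python
import random
import statistics

rng = random.Random(7)

# Hypothetical population (all figures invented for illustration):
# 7,000 in-store customers spending about $60, 3,000 online-only about $100.
in_store = [rng.gauss(60, 15) for _ in range(7_000)]
online_only = [rng.gauss(100, 15) for _ in range(3_000)]
population = in_store + online_only
mu = statistics.mean(population)  # the parameter we are trying to estimate

def typical_error(frame, n, reps=200):
    """Average absolute gap between a sample mean and mu over many samples."""
    gaps = [abs(statistics.mean(rng.sample(frame, n)) - mu) for _ in range(reps)]
    return statistics.mean(gaps)

results = {}
for n in (25, 400):
    results[n] = {
        "random sample of everyone": typical_error(population, n),
        "in-store shoppers only": typical_error(in_store, n),
    }
    print(n, {k: round(v, 1) for k, v in results[n].items()})
```

With the probability sample, the typical error falls as n goes from 25 to 400. With the in-store-only frame it stays close to the gap built into the invented data (roughly $12), no matter how many shoppers are sampled.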
Exam move
This week is pure vocabulary, and it is the cheapest set of MCQ marks in the whole unit — so over-learn it. Make a one-page table: variable type → level of measurement → an example → a typical chart, and drill the 'trap' cases (postcodes, ratings, years). For sampling, memorise the four probability methods and the two non-probability ones, and be able to name the bias each one risks. Because Week 1 feeds the in-semester test and the final's Part A, a few minutes of flashcards here pays off all semester.