ECON 1012 · Data Analytics
Sampling Distributions & the CLT
Sampling Distributions & the CLT (Module 5, Week 5) is the second module of the Inferential Statistics block of ECON 1012: it studies how a statistic computed from a random sample — the sample mean X̄ or the sample proportion p̂ — behaves across repeated samples, so that one sample can be used to reason about an entire population. You learn that X̄ has mean μ and standard error σ/√n, and meet the Central Limit Theorem: X̄ is approximately normal for a sufficiently large sample (usually n ≥ 30) whatever the population's shape, and exactly normal for every n when the population itself is normal. The module also covers the finite population correction factor, Bernoulli trials and the binomial distribution (μ = np, σ² = np(1−p)), and the normal approximation that lets Z-table methods handle p̂. Everything in Weeks 6 and 7 — confidence intervals and hypothesis tests — stands on this foundation.
What this chapter covers
- 01 Sampling distribution: the probability distribution of a sample statistic (X̄ or p̂) over repeated random samples
- 02 Mean and spread of X̄: μ_X̄ = μ, σ²_X̄ = σ²/n, standard error σ_X̄ = σ/√n
- 03 Central Limit Theorem: X̄ ≈ normal for n ≥ 30 from ANY population; exactly normal for every n if the population is normal
- 04 Standardising a sample mean: Z = (X̄ − μ)/(σ/√n)
- 05 Finite population correction √((N−n)/(N−1)): omit when the population is at least 20× the sample
- 06 Bernoulli trials and the binomial: μ = np, σ² = np(1−p)
- 07 Sample proportion p̂ = X/n: E(p̂) = p, V(p̂) = pq/n, Z = (p̂ − p)/√(pq/n)
- 08 Normal approximation for p̂ is valid when np ≥ 10 and nq ≥ 10
One cup vs the mean of 16 cups, then a sample proportion
- [2 marks] (a)(i) One cup varies with σ itself: Z = (246 − 250)/8 = −0.50, and the area-to-left Z table gives P(X < 246) = P(Z < −0.50) = 0.3085.
- [1 mark] (a)(ii) The population is normal, so X̄ is exactly normal for n = 16 and no CLT is needed. Standard error: σ/√n = 8/√16 = 8/4 = 2 mL.
- [2 marks] (a)(ii) Z = (246 − 250)/2 = −2.00, so P(X̄ < 246) = P(Z < −2.00) = 0.0228.
- [1 mark] (a) Explain: averaging cancels individual variation. The standard error (2 mL) is a quarter of σ (8 mL), so the same 4 mL shortfall is common for one cup (30.85%) but rare for a 16-cup average (2.28%).
- [1 mark] (b) Check the normal approximation first: np = 100 × 0.20 = 20 ≥ 10 and nq = 100 × 0.80 = 80 ≥ 10, so the approximation is valid.
- [1 mark] (b) Standard error of p̂: √(pq/n) = √(0.20 × 0.80/100) = √0.0016 = 0.04.
- [2 marks] (b) Z = (0.25 − 0.20)/0.04 = 1.25, so P(p̂ > 0.25) = 1 − P(Z < 1.25) = 1 − 0.8944 = 0.1056.
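The marked steps above can be checked numerically. The sketch below is only an illustration: it reads μ = 250 mL, σ = 8 mL, n = 16, p = 0.20 and n = 100 off the worked solution (the question text itself is not reproduced here), and it uses Python's standard-library `NormalDist` in place of the printed Z table. The variable names are assumptions made for this example.

```python
from math import sqrt
from statistics import NormalDist

# Values read from the worked solution: cup volume is normal, mean 250 mL, sd 8 mL.
MU, SIGMA = 250.0, 8.0
Z = NormalDist()  # standard normal: mean 0, standard deviation 1

# (a)(i) One cup: standardise with sigma itself.
z_one = (246 - MU) / SIGMA
p_one = Z.cdf(z_one)            # area to the left of z_one

# (a)(ii) Mean of n = 16 cups: standardise with the standard error sigma / sqrt(n).
n = 16
se_mean = SIGMA / sqrt(n)
z_mean = (246 - MU) / se_mean
p_mean = Z.cdf(z_mean)

# (b) Sample proportion with p = 0.20 and sample size 100.
p, m = 0.20, 100
q = 1 - p
approx_ok = m * p >= 10 and m * q >= 10   # course rule: np >= 10 and nq >= 10
se_prop = sqrt(p * q / m)
z_prop = (0.25 - p) / se_prop
p_prop = 1 - Z.cdf(z_prop)      # upper tail, so subtract the left area from 1

print(f"one cup:      Z = {z_one:.2f}, P = {p_one:.4f}")
print(f"mean of 16:   SE = {se_mean:.2f}, Z = {z_mean:.2f}, P = {p_mean:.4f}")
print(f"proportion:   valid = {approx_ok}, SE = {se_prop:.2f}, Z = {z_prop:.2f}, P = {p_prop:.4f}")
```

To four decimals the three probabilities agree with the table values 0.3085, 0.0228 and 0.1056 used in the solution.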
Key terms
- Sampling distribution
- The probability distribution of a sample statistic (such as the sample mean X̄) when random samples are repeatedly drawn from a population — the bridge between one observed sample and inference about the whole population.
- Central Limit Theorem (CLT)
- For a random sample drawn from any population, the sampling distribution of X̄ is approximately normal for a sufficiently large sample size (usually n ≥ 30); if the population itself is normal, X̄ is exactly normal for every n.
- Standard error of the mean
- The standard deviation of X̄ across repeated samples, σ_X̄ = σ/√n; it shrinks with the square root of the sample size, which is why averages are more stable than single observations.
- Finite population correction
- The factor √((N−n)/(N−1)) that multiplies σ/√n when the sample is a noticeable share of a finite population of size N; by course convention it is dropped when the population is at least 20 times the sample size.
- Bernoulli trial
- A random experiment with exactly two outcomes — success with probability p and failure with probability q = 1 − p — where p is identical across trials and trials are independent; n such trials generate the binomial distribution with μ = np and σ² = np(1−p).
- Sample proportion p̂
- The number of successes divided by the sample size, p̂ = X/n; it has E(p̂) = p and V(p̂) = pq/n, and is approximately normal when np ≥ 10 and nq ≥ 10.
Sampling Distributions & the CLT FAQ
What is the difference between σ and the standard error σ/√n?
σ describes how much individual observations vary in the population; the standard error σ/√n describes how much the sample mean varies across repeated samples. Averaging cancels noise, so the standard deviation of the mean of n observations is smaller than that of a single observation by a factor of √n. Before standardising, always ask: is this question about one draw or about a sample mean?
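A small simulation can make the difference concrete. This is a sketch under assumed values, not part of the module's materials: it borrows μ = 250 and σ = 8 from the worked example, draws many single observations and many samples of n = 16 from a normal population, and compares the two spreads. The seed and the number of repetitions are arbitrary choices.

```python
import random
from statistics import fmean, stdev

random.seed(1012)  # arbitrary seed so the run is repeatable

MU, SIGMA = 250.0, 8.0    # assumed population mean and standard deviation
N_SAMPLE = 16             # observations averaged in each sample
REPS = 20_000             # number of repeated samples (arbitrary)

# Spread of single observations: should be close to sigma.
singles = [random.gauss(MU, SIGMA) for _ in range(REPS)]
sd_single = stdev(singles)

# Spread of sample means: should be close to sigma / sqrt(n) = 2.
means = [fmean(random.gauss(MU, SIGMA) for _ in range(N_SAMPLE))
         for _ in range(REPS)]
sd_means = stdev(means)

print(f"sd of single draws  ~ {sd_single:.2f}")
print(f"sd of sample means  ~ {sd_means:.2f}")
print(f"ratio               ~ {sd_single / sd_means:.2f}  (sqrt(16) = 4)")
```

The simulated values fluctuate slightly from run to run if the seed is changed, but the ratio should stay near √16 = 4.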
When can I treat the sample mean as normal in ECON 1012?
Two routes. If the population is normal, X̄ is exactly normal for any sample size. If the population's shape is unknown or non-normal, the Central Limit Theorem makes X̄ approximately normal once the sample is sufficiently large — the course's working rule is n ≥ 30. MCQs regularly test the exact wording of both statements.
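The second route can be illustrated with a simulation. The sketch below assumes an exponential population with rate 1 (mean 1, standard deviation 1), chosen here only because it is strongly right-skewed; it is not an example from the course. If X̄ is close to normal at n = 30, roughly 95% of simulated sample means should fall within 1.96 standard errors of μ.

```python
import random
from math import sqrt
from statistics import fmean

random.seed(5)  # arbitrary seed so the run is repeatable

# Assumed skewed population: exponential with rate 1, so mean = 1 and sd = 1.
MU, SIGMA = 1.0, 1.0
n = 30            # the course's working threshold for "sufficiently large"
REPS = 20_000     # number of repeated samples (arbitrary)

se = SIGMA / sqrt(n)  # standard error of the sample mean

# Draw REPS samples of size n and keep each sample mean.
means = [fmean(random.expovariate(1.0) for _ in range(n)) for _ in range(REPS)]

# Share of sample means within 1.96 standard errors of the population mean.
within = sum(abs(m - MU) <= 1.96 * se for m in means) / REPS

print(f"share of sample means within 1.96 SE of mu: {within:.3f}")
```

A share near 0.95 is what a normal sampling distribution would give, even though the individual observations are far from normal.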
When do I need the finite population correction factor?
When you sample without replacement from a small finite population, the standard error becomes (σ/√n)·√((N−n)/(N−1)). The course convention is to treat a population at least 20 times larger than the sample as effectively infinite and omit the factor. Standard-error MCQs sometimes include a trap variant where N is small enough that the correction is required.
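The rule can be written as a short helper. This is a minimal sketch: the function name `standard_error`, its parameters and the example numbers (σ = 8, n = 16, N = 100 or 1000) are hypothetical, and the threshold of 20 encodes the course convention stated above rather than a universal rule.

```python
from math import sqrt

def standard_error(sigma, n, N=None, ratio=20):
    """Standard error of the sample mean, with the finite population
    correction applied when the population is less than `ratio` times
    the sample size. N=None means the population is treated as infinite.
    """
    se = sigma / sqrt(n)
    if N is not None and N < ratio * n:
        # Finite population correction: sqrt((N - n) / (N - 1))
        se *= sqrt((N - n) / (N - 1))
    return se

# Hypothetical numbers: sigma = 8, sample of 16.
se_small_pop = standard_error(8, 16, N=100)    # 100 < 20 * 16, correction applied
se_large_pop = standard_error(8, 16, N=1000)   # 1000 >= 20 * 16, correction omitted
se_infinite = standard_error(8, 16)            # no N given, plain sigma / sqrt(n)

print(f"N = 100:  {se_small_pop:.4f}")
print(f"N = 1000: {se_large_pop:.4f}")
print(f"infinite: {se_infinite:.4f}")
```

The correction factor is always below 1 when n > 1, so applying it can only shrink the standard error; forgetting it in the small-N case overstates the spread of X̄.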
Do I have to calculate binomial probabilities in this module?
The module uses the binomial through its mean np and variance np(1−p), then switches to the normal approximation for the sample proportion (valid when np ≥ 10 and nq ≥ 10, standardised as Z = (p̂ − p)/√(pq/n)). Full binomial pmf calculations do not appear in the module's teaching materials — if unsure what is examinable, check the current unit outline and myLearning.
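For readers curious how good the approximation is, the sketch below compares it with an exact binomial tail for the worked example's values (n = 100, p = 0.20). The exact calculation goes beyond what the module asks for and is shown only as a check; the helper `binom_tail_above` is a name made up for this illustration.

```python
from math import comb, sqrt
from statistics import NormalDist

def binom_tail_above(n, p, k):
    """Exact P(X > k) for X ~ Binomial(n, p), summed from the pmf."""
    return sum(comb(n, x) * p**x * (1 - p)**(n - x) for x in range(k + 1, n + 1))

n, p = 100, 0.20
q = 1 - p

# Binomial mean and variance as used in the module.
mean = n * p        # np
var = n * p * q     # np(1 - p)

# Conditions for the normal approximation to p-hat.
approx_ok = n * p >= 10 and n * q >= 10

# P(p-hat > 0.25) is the same event as P(X > 25) when n = 100.
exact = binom_tail_above(n, p, 25)
approx = 1 - NormalDist().cdf((0.25 - p) / sqrt(p * q / n))

print(f"np = {mean:.0f}, np(1-p) = {var:.0f}, conditions met: {approx_ok}")
print(f"exact binomial tail:  {exact:.4f}")
print(f"normal approximation: {approx:.4f}")
```

The two answers differ a little because the normal curve is continuous while the binomial count is discrete; the module's method uses the approximation without a continuity correction.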
Exam move
Drill the distinction that decides most marks here: a single observation standardises with σ, a sample mean with σ/√n — using plain σ for X̄ is the classic slip. Watch the finite-population trap: when the population is not at least 20 times the sample, multiply the standard error by √((N−n)/(N−1)). For proportions, check np ≥ 10 and nq ≥ 10 before assuming normality, and build the standard error from the stated population p. Learn the CLT as two precise statements — approximately normal for n ≥ 30 from any population, exactly normal for every n from a normal population — because MCQs test the wording. The Module 5 quiz on myLearning is re-attemptable with randomised questions; repeat it until the standard-error reflex is automatic.