BUSS1020 · Quantitative Business Analysis
Numerical Descriptive Measures
Numerical Descriptive Measures (Week 2, Berenson Ch 3.1–3.4, 3.6 & Ch 2) is where you summarise a dataset's centre, spread and shape in a handful of numbers. You compute the mean, median and mode; measure variation with the range, variance, standard deviation and the coefficient of variation; standardise values with Z-scores; and locate position with quartiles, the IQR and the five-number summary that drives the boxplot. The Empirical and Chebyshev rules connect spread to how much data falls within a given number of standard deviations.
What this chapter covers
- 01 Central tendency: mean, median, mode
- 02 Geometric mean and geometric rate of return
- 03 Variation: range, variance S², standard deviation S
- 04 Coefficient of variation (CV) for relative spread
- 05 Z-scores: standardising a value
- 06 Quartiles, IQR and the five-number summary
- 07 Boxplots and the link between shape and skew
- 08 Empirical Rule (68–95–99.7) and Chebyshev's Rule
Centre, spread, CV and a Z-score
- (2 marks) Mean X̄ = (40 + 52 + 47 + 38 + 61 + 50)/6 = 288/6 = 48.
- (1 mark) Deviations from the mean: −8, 4, −1, −10, 13, 2; squared: 64, 16, 1, 100, 169, 4; sum = 354.
- (1 mark) Sample variance S² = 354/(6 − 1) = 354/5 = 70.8.
- (1 mark) Sample standard deviation S = √70.8 ≈ 8.41 customers.
- (1 mark) Coefficient of variation CV = (S/X̄) × 100% = (8.41/48) × 100% ≈ 17.5%.
- (1 mark) Z-score for 61: Z = (61 − 48)/8.41 ≈ 1.55, so the busiest week is about 1.5 SD above the mean.
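The hand calculation above can be checked with a short Python sketch. It uses only the standard library's `statistics` module, whose `variance` and `stdev` functions use the (n − 1) divisor; the variable names are illustrative. Because the code keeps S at full precision, it gives Z ≈ 1.545 rather than the 1.55 obtained by hand with S rounded to 8.41.

```python
import statistics

# Weekly customer counts taken from the worked solution above.
data = [40, 52, 47, 38, 61, 50]

mean = statistics.mean(data)            # arithmetic mean, sum(X) / n
variance = statistics.variance(data)    # sample variance, divides by n - 1
sd = statistics.stdev(data)             # sample standard deviation, sqrt of the above
cv = sd / mean * 100                    # coefficient of variation, in percent
z_61 = (61 - mean) / sd                 # Z-score of the busiest week (61 customers)

print(f"mean = {mean}")
print(f"S^2  = {variance:.1f}")
print(f"S    = {sd:.2f}")
print(f"CV   = {cv:.1f}%")
print(f"Z    = {z_61:.3f}")
```

In an exam, carry the rounded S through as the worked solution does; the small gap in the Z-score is purely a rounding effect.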
Key terms
- Mean vs median: The mean is the arithmetic average ΣXᵢ/n and is pulled toward extreme values; the median is the middle of the ranked data and is resistant to outliers, making it the better centre for skewed data.
- Coefficient of variation (CV): CV = (S/X̄) × 100%, a unit-free measure of relative variability that lets you compare spread across datasets with different means or units.
- Z-score: Z = (X − X̄)/S, the number of standard deviations a value lies from the mean; it standardises values so they can be compared across distributions.
- Interquartile range (IQR): IQR = Q₃ − Q₁, the spread of the middle 50% of the data, and the basis of the boxplot's box; it is resistant to outliers.
- Empirical Rule: For a bell-shaped distribution, about 68%, 95% and 99.7% of values fall within ±1, ±2 and ±3 standard deviations of the mean.
- Chebyshev's Rule: For any distribution, at least (1 − 1/k²) × 100% of values lie within k standard deviations of the mean (for k > 1). It is weaker than the Empirical Rule but needs no assumption about shape.
Numerical Descriptive Measures FAQ
When should I use the geometric mean instead of the arithmetic mean?
Use the geometric mean for multiplicative quantities like investment returns or growth rates over time. The arithmetic average of percentage returns overstates compound performance whenever the returns vary from period to period; the geometric rate of return instead multiplies the period factors (1 + Rᵢ), takes the n-th root and subtracts 1.
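To see why the arithmetic average misleads, here is a minimal Python sketch. The two-year returns (+100% then −50%) are made-up illustration data, not from the course materials, and `geometric_rate_of_return` is a hypothetical helper name.

```python
import math

def geometric_rate_of_return(returns):
    """Geometric mean rate of return for a list of per-period returns.

    Each return is a decimal (0.10 means +10%). The period factors
    (1 + R_i) are multiplied, the n-th root is taken, and 1 is subtracted.
    """
    factors = [1 + r for r in returns]
    return math.prod(factors) ** (1 / len(factors)) - 1

# Hypothetical investment: doubles in year 1, halves in year 2.
returns = [1.00, -0.50]

arithmetic = sum(returns) / len(returns)        # 0.25, i.e. +25% per year
geometric = geometric_rate_of_return(returns)   # 0.0, i.e. no growth at all

print(f"arithmetic mean return: {arithmetic:.1%}")
print(f"geometric rate of return: {geometric:.1%}")
```

The investment ends exactly where it started, which the geometric rate of 0% reflects and the arithmetic average of 25% does not.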
How do I tell skew direction from a boxplot?
Look at the whiskers and the median's position in the box. A long upper whisker and a median toward the bottom of the box signal right (positive) skew; a long lower whisker signals left (negative) skew. Symmetric data have roughly equal whiskers and a centred median.
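The five numbers behind a boxplot can be computed with a short Python sketch. One caution: `statistics.quantiles` interpolates between ranked values at position p(n + 1), so its Q₁ and Q₃ can differ slightly from the hand rules in your textbook, which may round the position instead. Treat this as a check on your working, not a replacement for the course's quartile rules. The data are the six weekly counts from the worked example, and `five_number_summary` is an illustrative name.

```python
import statistics

def five_number_summary(values):
    """Return (min, Q1, median, Q3, max) for a list of numbers.

    Quartiles come from statistics.quantiles with its default
    'exclusive' method, which interpolates linearly between ranked values.
    """
    ranked = sorted(values)
    q1, q2, q3 = statistics.quantiles(ranked, n=4)
    return ranked[0], q1, q2, q3, ranked[-1]

data = [40, 52, 47, 38, 61, 50]
x_min, q1, median, q3, x_max = five_number_summary(data)
iqr = q3 - q1   # spread of the middle 50%, the length of the boxplot's box

print(f"min={x_min}, Q1={q1}, median={median}, Q3={q3}, max={x_max}, IQR={iqr}")
```

To read shape from these values, compare the whisker lengths (Q₁ − min against max − Q₃) together with where the median sits between Q₁ and Q₃.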
What's the difference between the Empirical Rule and Chebyshev's Rule?
The Empirical Rule gives sharp percentages (68–95–99.7) but only for bell-shaped data; Chebyshev's Rule gives a guaranteed minimum percentage for any distribution shape. Use Empirical when normality is reasonable, Chebyshev when shape is unknown.
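The gap between the two rules is easy to tabulate. The sketch below evaluates Chebyshev's guaranteed minimum, (1 − 1/k²) × 100%, for k = 2 and k = 3 and sets it beside the approximate Empirical Rule percentages quoted above; `chebyshev_min_percent` is a hypothetical helper name.

```python
def chebyshev_min_percent(k):
    """Minimum percentage of values within k standard deviations of the mean.

    Holds for any distribution shape, but only says something useful for k > 1.
    """
    if k <= 1:
        raise ValueError("Chebyshev's Rule requires k > 1")
    return (1 - 1 / k**2) * 100

# Approximate Empirical Rule percentages for bell-shaped data.
empirical = {2: 95.0, 3: 99.7}

for k in (2, 3):
    print(f"k={k}: Chebyshev at least {chebyshev_min_percent(k):.1f}%, "
          f"Empirical about {empirical[k]}%")
```

For k = 2 Chebyshev guarantees only 75%, against roughly 95% under the Empirical Rule: the price of making no assumption about shape.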
Exam move
Drill the hand-computation pipeline until it is automatic: rank the data, find the five-number summary, compute the mean and the (n − 1) standard deviation, then standardise. Keep a small reference card of the quartile position rules and the CV formula. Practise reading shape from both a histogram and a boxplot, and be ready to state in one sentence what a CV or Z-score means in business terms. This week underpins everything that follows, so make sure you can move fluently between centre, spread and shape.