ECON1012 · Data Analytics
Describing Data: Graphs & Numbers
Describing Data: Graphs & Numbers (Module 2, Week 2) is the double-width descriptive-statistics module of ECON 1012 — the only week with two practice quizzes, one for the graphical strand and one for the numerical strand. On the graphical side you build histograms (choosing classes with Sturges' formula), frequency polygons and cumulative relative frequency graphs (ogives), and read shape: symmetric, positively or negatively skewed, unimodal or bimodal. On the numerical side you compute measures of central location (mean, median, mode), relative standing (percentiles, quartiles, quintiles, deciles via Lₚ = (n+1)·P/100) and variability (range, variance, standard deviation, IQR, coefficient of variation), then bound data with the Empirical Rule and Chebyshev's Theorem. Workshops practise the charts in Excel; the final exam expects the same measures by hand. This module also underpins the Descriptive Statistics Case Study (10%).
What this chapter covers
- 01 Histograms for numerical data: touching bars, height = class frequency; classes via Sturges K = 1 + 3.3·log₁₀(n), width ≈ range/K
- 02 Frequency polygons and the ogive: cumulative relative frequency = proportion below each class upper limit
- 03 Central location: mean x̄ = Σxᵢ/n · median (middle of the sorted data) · mode (most frequent value)
- 04 Shape rules: symmetric ⇒ mean = median = mode; positive skew ⇒ mean > median > mode; negative skew reverses it
- 05 Relative standing: percentiles, quartiles, quintiles, deciles; location Lₚ = (n+1)·P/100, then interpolate
- 06 Variability: range · variance s² = Σ(xᵢ − x̄)²/(n−1) · standard deviation s · IQR = Q3 − Q1 · CV = s/x̄
- 07 Empirical Rule (bell-shaped only): ≈68% within 1s, ≈95% within 2s, ≈99.7% within 3s
- 08 Chebyshev's Theorem (any shape): at least 1 − 1/k² of data within k standard deviations (k = 2 → at least 75%)
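The class-count and ogive items in the list above can be sketched in a few lines of Python. This is a minimal illustration, not the workshop's Excel method: the function names are hypothetical, rounding K up to a whole class is an assumption (check the rounding convention used in the course slides), and the class frequencies are made-up numbers used only to show the running proportion.

```python
import math
from itertools import accumulate


def sturges_classes(n):
    """Sturges' formula K = 1 + 3.3 * log10(n), rounded UP (an assumption)."""
    return math.ceil(1 + 3.3 * math.log10(n))


def class_width(data, k):
    """Approximate class width: range divided by the number of classes."""
    return (max(data) - min(data)) / k


def cumulative_relative(frequencies):
    """Proportion of observations up to each class upper limit (ogive heights)."""
    total = sum(frequencies)
    return [running / total for running in accumulate(frequencies)]


k = sturges_classes(50)           # 1 + 3.3 * log10(50) is about 6.6, so 7 classes
freqs = [2, 3, 2, 1]              # hypothetical class frequencies, n = 8
ogive = cumulative_relative(freqs)
print(k, ogive)                   # 7 [0.25, 0.625, 0.875, 1.0]
```

The ogive values never decrease and end at 1.0, which is a quick sanity check on any cumulative relative frequency column.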
Centre, spread and relative standing for daily revenue
- (a) Mean [1 mark]: x̄ = Σxᵢ/n = (42 + 48 + 51 + 55 + 58 + 62 + 70 + 94)/8 = 480/8 = $60.
- (a) Median [1 mark]: n = 8 is even, so average the 4th and 5th ordered values: (55 + 58)/2 = $56.50.
- (b) [2 marks]: The $94 day is a high outlier. It drags the mean above the median (60 > 56.50), the signature of positive skew. Because the mean is seriously affected by extreme values, the median is the better description of a typical day here.
- (c) [2 marks]: 75th percentile location: Lₚ = (n+1)·P/100 = 9 × 0.75 = 6.75. This is a LOCATION, not the value. Interpolate between the 6th value (62) and the 7th (70): 62 + 0.75 × (70 − 62) = 62 + 6 = $68.
- (d) Shortcut variance [2 marks]: s² = [Σxᵢ² − (Σxᵢ)²/n]/(n − 1) = (30638 − 480²/8)/7 = (30638 − 28800)/7 = 1838/7 ≈ 262.57.
- (d) Standard deviation [1 mark]: s = √262.57 ≈ $16.20, interpreted in dollars, the data's native unit.
- (e) [1 mark]: CV = s/x̄ = 16.20/60 ≈ 0.27: day-to-day revenue varies by about 27% of the mean. Being unit-free, the CV lets you compare this truck's relative stability with another whose average takings differ.
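The hand calculations above can be checked with a short Python sketch. It assumes only the eight revenue figures from the worked solution; `percentile_n_plus_1` is a hypothetical helper written to mirror the course's Lₚ = (n+1)·P/100 rule, and it is not necessarily what Excel or other software does by default.

```python
import math
import statistics

revenue = [42, 48, 51, 55, 58, 62, 70, 94]  # daily revenue in dollars


def percentile_n_plus_1(data, p):
    """Value at location L = (n + 1) * p / 100, with linear interpolation."""
    xs = sorted(data)
    n = len(xs)
    loc = (n + 1) * p / 100          # a 1-based LOCATION, not the value
    lower = math.floor(loc)
    frac = loc - lower
    if lower < 1:                    # location falls before the first value
        return xs[0]
    if lower >= n:                   # location falls at or past the last value
        return xs[-1]
    return xs[lower - 1] + frac * (xs[lower] - xs[lower - 1])


mean = statistics.mean(revenue)
median = statistics.median(revenue)
p75 = percentile_n_plus_1(revenue, 75)

# Shortcut formula: [sum of squares - (sum)^2 / n] / (n - 1)
n = len(revenue)
shortcut_var = (sum(x * x for x in revenue) - sum(revenue) ** 2 / n) / (n - 1)
sample_var = statistics.variance(revenue)    # definitional form, divides by n - 1
sd = math.sqrt(sample_var)
cv = sd / mean

print(mean, median, p75)                          # 60 56.5 68.0
print(round(sample_var, 2), round(sd, 2), round(cv, 2))   # 262.57 16.2 0.27
```

The shortcut and definitional variances agree, which is a useful cross-check before taking the square root.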
Key terms
- Histogram
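- The standard graph for numerical data: class frequencies drawn as touching bars (no gaps, unlike a bar chart), where the height of each rectangle is the class frequency. With equal class widths, the area of each rectangle is therefore proportional to the number of observations in that class.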
- Ogive (cumulative relative frequency graph)
- The graph of the cumulative relative frequency distribution: the proportion of observations below each class upper limit, rising from 0 to 1 and never decreasing.
- Percentile
- The Pth percentile is the value for which at most P% of measurements are smaller and at most (100 − P)% are larger; its location in sorted data is Lₚ = (n+1)·P/100. Quartiles, quintiles and deciles are the quarter, fifth and tenth special cases.
- Interquartile range (IQR)
- Q3 − Q1, the spread of the middle 50% of observations; unlike the full range it is not inflated by a single extreme value.
- Coefficient of variation (CV)
- s/x̄ for a sample (σ/μ for a population): the standard deviation expressed as a proportion of the mean, so variability can be compared across data sets with different scales or units.
- Chebyshev's Theorem
- For data of ANY shape, at least 1 − 1/k² of measurements lie within k standard deviations of the mean (k > 1): at least 75% within 2s, and at least 8/9 ≈ 89% within 3s. The Empirical Rule's 68–95–99.7 pattern applies only to bell-shaped data.
Describing Data: Graphs & Numbers FAQ
Why does Module 2 of ECON 1012 have two quizzes?
Because the module spans two strands: graphical description (histograms, frequency polygons, ogives) and numerical description (central location, relative standing, variability). myLearning carries a separate practice quiz for each — 'Graphical Description of Data' and 'Numerical Description of Data' — and the final exam's MCQs draw on both, so treat them as one chapter with two halves.
When should I use the median instead of the mean in ECON 1012?
The course rule: the mean is generally the first selection, but it is seriously affected by outliers — with skewed data or extreme values the median describes a typical value better. The mean also applies only to numerical data. Watch for MCQ phrasings like 'half of respondents earned more than X': that quotes a median, not a mean.
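To see how differently the two measures react to an extreme value, here is a small Python sketch that reuses the eight daily revenue figures from the worked example and drops the $94 day. Removing the observation is done purely for illustration; it is not a recommendation to delete outliers.

```python
import statistics

revenue = [42, 48, 51, 55, 58, 62, 70, 94]          # from the worked example
without_outlier = [x for x in revenue if x != 94]   # illustration only

# How far does each measure move when the $94 day is included?
mean_shift = statistics.mean(revenue) - statistics.mean(without_outlier)
median_shift = statistics.median(revenue) - statistics.median(without_outlier)

print(round(mean_shift, 2), median_shift)   # 4.86 1.5
```

One extreme day moves the mean by almost $5 but the median by only $1.50, which is why the median is preferred for skewed data.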
What is the difference between the Empirical Rule and Chebyshev's Theorem?
The Empirical Rule needs bell-shaped data and gives approximations: about 68% within 1s, 95% within 2s, 99.7% within 3s. Chebyshev's Theorem works for ANY shape but only gives a lower bound: at least 1 − 1/k², so at least 75% within 2 standard deviations. If a question does not say the data are bell-shaped, Chebyshev is the safe tool.
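The contrast can be made concrete with a short Python sketch. The function names are hypothetical, and the data are the eight revenue figures from the worked example, which are not bell-shaped, so only the Chebyshev bound is guaranteed to hold for them.

```python
import statistics


def chebyshev_lower_bound(k):
    """Minimum proportion within k standard deviations, for any shape (k > 1)."""
    if k <= 1:
        raise ValueError("Chebyshev's Theorem is only informative for k > 1")
    return 1 - 1 / k ** 2


def share_within(data, k):
    """Observed proportion of data within k sample standard deviations of the mean."""
    mean = statistics.mean(data)
    s = statistics.stdev(data)
    inside = [x for x in data if abs(x - mean) <= k * s]
    return len(inside) / len(data)


# Empirical Rule approximations, valid for bell-shaped data only
EMPIRICAL_RULE = {1: 0.68, 2: 0.95, 3: 0.997}

revenue = [42, 48, 51, 55, 58, 62, 70, 94]
print(chebyshev_lower_bound(2))             # 0.75
print(round(chebyshev_lower_bound(3), 3))   # 0.889
print(share_within(revenue, 2))             # 0.875
```

For the revenue data, 7 of 8 days (87.5%) fall within 2 standard deviations: above Chebyshev's guaranteed 75%, but below the 95% the Empirical Rule would suggest for a bell shape.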
Do I need Excel for this topic, or will the exam make me draw histograms by hand?
Both skills matter, in different places. The Week 2 workshop practises building histograms and cumulative distribution graphs in Excel (with a practice-data workbook), and that feeds the Descriptive Statistics Case Study (10%) — check myLearning for the current brief. The final exam is hand-calculation: you read pictured histograms and compute means, standard deviations, CVs and percentiles with a non-wireless calculator.
Exam move
Drill the traps this module is known for. First: the percentile location Lₚ = (n+1)·P/100 is a location, not the value — sort, locate, then interpolate; the Week 2 slides warn about this explicitly. Second: divide by n − 1 for a sample variance, and report s in native units but CV unit-free. Third: know the ordering mean > median > mode for positive skew, and read skew and modality off a pictured histogram. Fourth: the Empirical Rule needs a bell shape; Chebyshev works anywhere but only bounds ('at least 75%'). Rehearse range/4 as a quick estimate of s for bell-shaped data; remember an outlier inflates range, variance and s together. In the practice exam, Module 2 supplied more MCQs than any other module — these distinctions pay off twice.
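The point that an outlier inflates range, variance and s together can be checked with a short Python sketch on the worked example's revenue figures, with and without the $94 day. The range/4 figure is printed only as a rough comparison: eight observations are too few to treat as bell-shaped, so it should not be expected to match s closely here.

```python
import statistics


def spread_summary(data):
    """Return (range, sample variance, sample standard deviation, range / 4)."""
    data_range = max(data) - min(data)
    return (
        data_range,
        statistics.variance(data),   # divides by n - 1
        statistics.stdev(data),
        data_range / 4,              # quick estimate of s for bell-shaped data
    )


revenue = [42, 48, 51, 55, 58, 62, 70, 94]
without_outlier = [x for x in revenue if x != 94]   # illustration only

for label, data in [("with $94", revenue), ("without $94", without_outlier)]:
    rng, var, sd, quick = spread_summary(data)
    print(label, rng, round(var, 2), round(sd, 2), quick)
# with $94 52 262.57 16.2 13.0
# without $94 28 86.14 9.28 7.0
```

All three measures drop sharply once the extreme day is excluded, while the middle of the data barely changes.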