University of Adelaide · FACULTY OF BUSINESS & ECONOMICS

ECON1012 · Data Analytics

Chapter 10 of 11 · ECON 1012

Regression II: Inference & Fit

Regression II: Inference & Fit (Module 10, Week 10) finishes what Week 9 started: instead of just writing down the simple linear regression model, you now estimate it, test it and judge it. First comes the estimation recipe — from summary sums (Σx, Σy, Σx², Σxy) to s_x² and s_xy, then β̂₁ = s_xy/s_x² and β̂₀ = ȳ − β̂₁x̄. Next you measure how badly the line misses: SSE, the sum of squares for error, and the standard error of estimate s_ε = √(SSE/(n−2)). Inference arrives with the slope test — H₀: β₁ = 0 against H_A: β₁ ≠ 0, a t statistic on n − 2 degrees of freedom — and fit is graded by the coefficient of determination R², the proportion of the variation in Y explained by X. ECON 1012 uses one X variable only; multiple regression is out of scope.

In this chapter

What this chapter covers

  • 01 Estimation recipe: summary sums → s_x², s_xy → β̂₁ = s_xy/s_x², β̂₀ = ȳ − β̂₁x̄
  • 02 SSE = Σ(yᵢ − ŷᵢ)² = (n−1)(s_y² − s_xy²/s_x²): the unexplained variation
  • 03 Standard error of estimate s_ε = √(SSE/(n−2)): the typical residual size
  • 04 Slope test: H₀: β₁ = 0 (no linear relationship) vs H_A: β₁ ≠ 0, t = β̂₁/s_{β̂₁}, df = n − 2
  • 05 Standard error of the slope s_{β̂₁} = s_ε/√((n−1)s_x²)
  • 06 R² = SSR/SST = 1 − SSE/SST = r²: proportion of Y's variation explained by X
  • 07 Partition SST = SSR + SSE; naming trap: this course's SSE is elsewhere called RSS
  • 08 One X variable only: multiple regression is out of scope
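
The SSE shortcut and the R² = r² identity in the list above both follow from substituting the least squares estimates into the residuals. The derivation below is a sketch using only the symbols already defined in this chapter, offered as a consistency check rather than as the course's own proof; it takes r = s_xy/(s_x·s_y) as the definition of the sample correlation.

```latex
\begin{align*}
\mathrm{SSE} &= \sum_{i=1}^{n} (y_i - \hat{y}_i)^2
  = \sum_{i=1}^{n} \bigl[(y_i - \bar{y}) - \hat\beta_1 (x_i - \bar{x})\bigr]^2
  && \text{since } \hat\beta_0 = \bar{y} - \hat\beta_1 \bar{x} \\
 &= (n-1)\,s_y^2 - 2\hat\beta_1 (n-1)\,s_{xy} + \hat\beta_1^2 (n-1)\,s_x^2 \\
 &= (n-1)\Bigl(s_y^2 - \frac{s_{xy}^2}{s_x^2}\Bigr)
  && \text{using } \hat\beta_1 = s_{xy}/s_x^2 \\[4pt]
R^2 &= 1 - \frac{\mathrm{SSE}}{\mathrm{SST}}
  = 1 - \frac{(n-1)\bigl(s_y^2 - s_{xy}^2/s_x^2\bigr)}{(n-1)\,s_y^2}
  = \frac{s_{xy}^2}{s_x^2\, s_y^2} = r^2
\end{align*}
```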
Worked example

Full regression from summary statistics: line, fit and slope test

Q [12 marks]. A hardware chain studies whether local radio advertising moves sales. For n = 26 stores in one week it records x = radio ads aired and y = weekly sales ($1000s), with summary statistics Σx = 130, Σy = 260, Σx² = 1400, Σy² = 2725, Σxy = 1450. (a) Estimate the least squares regression line and interpret the slope. (b) Compute SSE and the standard error of estimate. (c) Test at the 5% level of significance whether a linear relationship exists between ads and sales. (d) Compute and interpret the coefficient of determination.
  • [2 marks] (a) Building blocks first: x̄ = 130/26 = 5 and ȳ = 260/26 = 10; s_x² = [Σx² − (Σx)²/n]/(n − 1) = (1400 − 650)/25 = 30; s_xy = [Σxy − (Σx)(Σy)/n]/(n − 1) = (1450 − 1300)/25 = 6.
  • [2 marks] (a) β̂₁ = s_xy/s_x² = 6/30 = 0.200 and β̂₀ = ȳ − β̂₁x̄ = 10 − 0.200 × 5 = 9.000, so ŷ = 9.000 + 0.200x. Interpretation: each additional ad aired is, on average, associated with 0.200 × $1000 = $200 more weekly sales.
  • [1 mark] (b) s_y² = [Σy² − (Σy)²/n]/(n − 1) = (2725 − 2600)/25 = 5, so SSE = (n − 1)(s_y² − s_xy²/s_x²) = 25 × (5 − 36/30) = 25 × 3.8 = 95.
  • [1 mark] (b) Standard error of estimate: s_ε = √(SSE/(n − 2)) = √(95/24) = √3.9583 ≈ 1.990 (in $1000s).
  • [2 marks] (c) H₀: β₁ = 0 (no linear relationship) vs H_A: β₁ ≠ 0, two-tail at α = 0.05. Standard error of the slope: s_{β̂₁} = s_ε/√((n − 1)s_x²) = 1.990/√750 ≈ 1.990/27.386 ≈ 0.0727. Test statistic t = (0.200 − 0)/0.0727 ≈ 2.75 with df = n − 2 = 24.
  • [2 marks] (c) Critical values ±t₀.₀₂₅,₂₄ = ±2.064. Since 2.75 > 2.064, reject H₀: there is sufficient evidence at the 5% level of significance of a linear relationship between ads aired and weekly sales.
  • [2 marks] (d) R² = s_xy²/(s_x²·s_y²) = 36/(30 × 5) = 36/150 = 0.24. Cross-check via the partition: SST = (n − 1)s_y² = 125 and 1 − SSE/SST = 1 − 95/125 = 0.24. About 24% of the variation in weekly sales is explained by variation in ads aired; the remaining 76% sits in SSE.
Answer: (a) ŷ = 9.000 + 0.200x, so each extra ad is associated with about $200 more weekly sales on average; (b) SSE = 95 and s_ε ≈ 1.990; (c) t ≈ 2.75 > 2.064, so reject H₀ at the 5% level and conclude that a linear relationship exists; (d) R² = 0.24, so 24% of sales variation is explained by advertising.
Sia tip: Degrees of freedom in regression are n − 2, because two coefficients (β̂₀ and β̂₁) are estimated before any residual exists. Read the t table at the df = n − 2 row, one degree of freedom fewer than the n − 1 row used in a one-mean problem. And keep significance separate from fit: here the slope is significant at the 5% level (t ≈ 2.75 > 2.064), yet R² is only 0.24, a real but modest relationship.
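
The whole chain in this worked example can be re-run in a few lines of Python as a check on the hand arithmetic. This is a minimal sketch, not course-provided code: the function name `regression_from_sums` and the dictionary keys are illustrative choices, and only the standard library is assumed.

```python
import math

def regression_from_sums(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Simple linear regression from summary sums (hypothetical helper)."""
    x_bar = sum_x / n
    y_bar = sum_y / n

    # Sample variances and covariance via the shortcut formulas
    s_x2 = (sum_x2 - sum_x ** 2 / n) / (n - 1)
    s_y2 = (sum_y2 - sum_y ** 2 / n) / (n - 1)
    s_xy = (sum_xy - sum_x * sum_y / n) / (n - 1)

    # Least squares estimates
    b1 = s_xy / s_x2
    b0 = y_bar - b1 * x_bar

    # Unexplained variation, standard errors, test statistic and fit
    sse = (n - 1) * (s_y2 - s_xy ** 2 / s_x2)
    s_eps = math.sqrt(sse / (n - 2))
    se_b1 = s_eps / math.sqrt((n - 1) * s_x2)
    t_stat = b1 / se_b1          # tests H0: beta1 = 0, df = n - 2
    r2 = s_xy ** 2 / (s_x2 * s_y2)

    return {"b0": b0, "b1": b1, "sse": sse, "s_eps": s_eps,
            "se_b1": se_b1, "t": t_stat, "r2": r2}

# Summary statistics from the hardware-chain question
result = regression_from_sums(26, 130, 260, 1400, 2725, 1450)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

Carrying full precision gives t ≈ 2.753 rather than the hand-rounded 2.75; the comparison against the table value 2.064 still has to be made by hand, because the sketch does not compute critical values.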
Glossary

Key terms

Sum of squares for error (SSE)
The sum of squared vertical gaps between the observations and the fitted line, SSE = Σ(yᵢ − ŷᵢ)², with shortcut SSE = (n−1)(s_y² − s_xy²/s_x²). It measures the variation in Y the line leaves unexplained — the smaller, the better the fit.
Standard error of estimate
s_ε = √(SSE/(n−2)): roughly the typical size of a residual, in the units of Y. It feeds directly into the standard error of the slope and hence the slope test.
Slope significance test
The t-test of H₀: β₁ = 0 (no linear relationship) against H_A: β₁ ≠ 0 (a linear relationship exists), using t = (β̂₁ − β₁)/s_{β̂₁} with n − 2 degrees of freedom; under H₀ the hypothesised β₁ is 0, so this reduces to t = β̂₁/s_{β̂₁}. A two-tail test is the usual choice, though one-tail versions (H_A: β₁ > 0 or H_A: β₁ < 0) exist.
Standard error of the slope
s_{β̂₁} = s_ε/√((n−1)s_x²): the estimated sampling variability of the slope estimate β̂₁, and the denominator of the slope t statistic. More spread in X (larger s_x²) makes the slope estimate more precise.
Coefficient of determination (R²)
The proportion of the variation in Y explained by the variation in X: R² = SSR/SST = 1 − SSE/SST, which also equals the square of the correlation, r². It lies between 0 (no linear relationship) and 1 (perfect fit) and has no critical value for hypothesis testing.
SST = SSR + SSE partition
The total variation of Y around its mean, SST = Σ(yᵢ − ȳ)² = (n−1)s_y², splits into SSR, the variation explained by X, plus SSE, the unexplained remainder. In this course SSR always means the regression (explained) sum of squares.
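
The partition and the R² = r² identity can also be checked directly from raw observations instead of summary sums. The sketch below uses a small made-up data set (the five x and y values are hypothetical, chosen only to keep the arithmetic easy) and the standard library only.

```python
import math

# Hypothetical data, chosen only to keep the arithmetic small
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Least squares line from deviations about the means
sxx = sum((xi - x_bar) ** 2 for xi in x)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar
y_hat = [b0 + b1 * xi for xi in x]

# The three sums of squares, in this course's naming
sst = sum((yi - y_bar) ** 2 for yi in y)               # total
ssr = sum((fi - y_bar) ** 2 for fi in y_hat)           # regression (explained)
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, y_hat))  # error (unexplained)

r = sxy / math.sqrt(sxx * sst)   # sample correlation
r2 = ssr / sst

print(f"SST = {sst:.2f}, SSR = {ssr:.2f}, SSE = {sse:.2f}")
print(f"R^2 = {r2:.2f}, 1 - SSE/SST = {1 - sse / sst:.2f}, r^2 = {r ** 2:.2f}")
```

For these numbers SST = 6, SSR = 3.6 and SSE = 2.4, so the partition holds and all three routes to R² give 0.6.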
FAQ

Regression II: Inference & Fit FAQ

Why does the slope test use n − 2 degrees of freedom instead of n − 1?

Because the fitted line estimates two coefficients — β̂₀ and β̂₁ — from the data before any residual can be computed, two degrees of freedom are used up. Both the standard error of estimate s_ε = √(SSE/(n−2)) and the slope t statistic therefore run on df = n − 2. Reading the t table at n − 1, the habit carried over from one-mean problems, picks the wrong critical value and is one of the easiest marks to lose in Week 10.

What does R² actually measure, and can I hypothesis-test it?

R² measures the proportion of the variation in Y that is explained by the variation in X: R² = SSR/SST = 1 − SSE/SST, and it equals the square of the correlation coefficient, r². It sits between 0 and 1 — R² = 1 means the points lie exactly on the line, R² = 0 means no linear relationship. The course is explicit that R² has no critical value for testing hypotheses: if you need a formal yes/no on whether a linear relationship exists, run the t-test on the slope, not anything on R².

Is SSE in ECON 1012 the same thing as SSR?

No — and the slides flag this trap directly. In ECON 1012, SSE is the sum of squares for ERROR (the residual, unexplained variation) and SSR is the REGRESSION (explained) sum of squares, with SST = SSR + SSE. Some other textbooks and websites use SSR or RSS for the residual quantity this course calls SSE. On the exam, stick to the course convention and, if in doubt, write the defining formula next to the symbol so your meaning is unambiguous.

How does regression show up on the ECON 1012 final exam?

The final exam is 25 MCQs plus 3 case-study questions covering Weeks 1–10 (180 minutes, invigilated, one double-sided A4 note sheet, non-wireless calculators, Z and t tables provided), and Week 10 material is squarely in scope. In the practice materials the regression case study follows a set shape: interpret the fitted coefficients in context, explain and compute R², run a full slope significance test from reported standard errors showing every step, then bracket the p-value from the t table and say whether the conclusion changes at a different α. Confirm current details on myLearning.
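
Bracketing a p-value from the t table means finding the two neighbouring critical values that trap the test statistic. The sketch below does this for the df = 24 row, using the upper-tail values as they appear in a standard t table; the column choice is an assumption, so check it against the table actually supplied in the exam. The function name `bracket_two_tail_p` is illustrative, and t = 2.75 is taken from the worked example in this chapter.

```python
# Upper-tail areas and critical values for df = 24 from a standard t table.
# Assumption: the exam table prints these same columns.
T_ROW_DF24 = [(0.10, 1.318), (0.05, 1.711), (0.025, 2.064),
              (0.01, 2.492), (0.005, 2.797)]

def bracket_two_tail_p(t_stat, row):
    """Bound the two-tail p-value: lower < p <= upper.

    A bound of None means the table cannot bound that side.
    `row` must be ordered from largest tail area to smallest.
    """
    t_abs = abs(t_stat)
    lower, upper = None, None
    for tail_area, crit in row:
        if t_abs >= crit:
            upper = 2 * tail_area   # p is at most twice this tail area
        else:
            lower = 2 * tail_area   # p exceeds twice this tail area
            break
    return lower, upper

lower, upper = bracket_two_tail_p(2.75, T_ROW_DF24)
print(f"{lower} < p-value <= {upper}")

for alpha in (0.05, 0.01):
    if upper is not None and upper <= alpha:
        decision = "reject H0"
    elif lower is not None and lower >= alpha:
        decision = "do not reject H0"
    else:
        decision = "table cannot decide"
    print(f"alpha = {alpha}: {decision}")
```

With t ≈ 2.75 the statistic falls between 2.492 and 2.797, so the two-tail p-value lies between 0.01 and 0.02: H₀ is rejected at the 5% level but not at the 1% level, which is the kind of change in conclusion the case study asks about.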


Study strategy

Exam move

Drill the chain that workshop questions, the Module 10 quiz and the practice case study all reuse: sums → s_x², s_xy → β̂₁, β̂₀ → SSE → s_ε → s_{β̂₁} → t — and put it on your A4 note sheet in that order. Three habits protect marks: use df = n − 2 (never n − 1) for every t lookup; keep naming straight — here SSE is the unexplained (residual) sum of squares, SSR the explained one, though other books flip them; and R² has no critical value, so significance always comes from the slope t-test. Show every step of that test and close with a plain-language conclusion at the stated α — 'do not reject', never 'accept'. Re-run the Module 10 quiz on myLearning until the chain is automatic.
