University of Melbourne · S1 2026 · FACULTY OF BUSINESS & ECONOMICS

ECON20003 · Quantitative Methods 2

50% final exam · hurdle · 14 Chapters · 9-page Bible
Chapter 6 of 12 · ECON20003

Correlation & Simple Linear Regression

Correlation & Simple Linear Regression moves from comparing groups to modelling a relationship between two numerical variables. You measure linear association with covariance and the correlation coefficient r, then fit the ordinary-least-squares line ŷ = β̂₀ + β̂₁x. Inference centres on the t-test of the slope (is there a real linear relationship?), the coefficient of determination R² (how much variation the line explains, equal to r² in simple regression), and the distinction between a confidence interval for the mean response and a wider prediction interval for an individual value.

In this chapter

What this chapter covers

  • 01 Covariance and correlation r ∈ [−1, 1]; test ρ = 0 with t = r√(n−2)/√(1−r²)
  • 02 OLS estimates: β̂₁ = S_xy/S²_x, β̂₀ = ȳ − β̂₁x̄
  • 03 Gauss-Markov assumptions for valid inference
  • 04 Slope t-test: t = β̂₁/SE(β̂₁), df = n − 2
  • 05 R² = 1 − SSE/SST = r² in simple regression
  • 06 Confidence interval for the mean response vs the wider prediction interval
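The formulas in the list above can be checked by hand on a tiny dataset. The sketch below uses five made-up (x, y) pairs, which are an assumption for illustration and not data from the subject, and computes the sample covariance, r, the OLS estimates and R² using only the Python standard library.

```python
from math import sqrt

# Hypothetical data, for illustration only (not from the subject materials).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# Sample covariance S_xy and sample variances S²_x, S²_y (divide by n - 1).
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / (n - 1)
s2_x = sum((xi - x_bar) ** 2 for xi in x) / (n - 1)
s2_y = sum((yi - y_bar) ** 2 for yi in y) / (n - 1)

# Correlation coefficient and OLS estimates.
r = s_xy / sqrt(s2_x * s2_y)
b1 = s_xy / s2_x           # slope: S_xy / S²_x
b0 = y_bar - b1 * x_bar    # intercept: y-bar minus slope times x-bar

# R² from its definition, 1 - SSE/SST.
fitted = [b0 + b1 * xi for xi in x]
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
sst = sum((yi - y_bar) ** 2 for yi in y)
r_squared = 1 - sse / sst

print(f"b1 = {b1:.3f}, b0 = {b0:.3f}, r = {r:.4f}, R^2 = {r_squared:.3f}")
```

On this data the slope is 0.6, the intercept is 2.2, and R² and r² both come out at 0.6, which is the self-check item 05 describes.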
Worked example

t-test on a regression slope and R²

Q [8 marks]. A simple regression of weekly sales on advertising spend uses n = 20 weeks and returns a slope estimate β̂₁ = 0.45 with standard error SE(β̂₁) = 0.15, and a correlation r = 0.60. Test at α = 0.05 (two-tailed) whether the slope is non-zero, and state R².
  • [1 mark] State the hypotheses for the slope (two-tailed): H₀: β₁ = 0 versus H₁: β₁ ≠ 0.
  • [1 mark] Identify the distribution: t with df = n − 2 = 18.
  • [2 marks] Compute the test statistic: t = β̂₁/SE(β̂₁) = 0.45/0.15 = 3.0.
  • [1 mark] Decision rule: the two-tailed critical value is t₀.₀₂₅,₁₈ = 2.101. Since 3.0 > 2.101, reject H₀.
  • [2 marks] Coefficient of determination: in simple regression R² = r² = 0.60² = 0.36.
  • [1 mark] Conclude in context: there is significant evidence of a positive linear relationship; advertising spend explains about 36% of the variation in weekly sales.
t = 0.45/0.15 = 3.0 > 2.101, so we reject H₀ and conclude the slope is significantly non-zero; R² = 0.36, so 36% of sales variation is explained.
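The arithmetic of the worked solution can be replayed in a few lines. This is a minimal sketch using only the figures given in the question. The critical value 2.101 is copied from the solution rather than computed, and the variable names are illustrative.

```python
# Figures given in the question.
n = 20
slope = 0.45
se_slope = 0.15
r = 0.60

# Two-tailed 5% critical value for t with 18 df, as quoted in the solution.
t_crit = 2.101

df = n - 2                          # slope inference uses n - 2, not n - 1
t_stat = slope / se_slope           # t = estimate / standard error
reject_h0 = abs(t_stat) > t_crit    # two-tailed decision rule
r_squared = r ** 2                  # R² = r² in simple regression

print(f"df = {df}, t = {t_stat:.2f}, reject H0: {reject_h0}, R^2 = {r_squared:.2f}")
# prints: df = 18, t = 3.00, reject H0: True, R^2 = 0.36
```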
Sia tip — In simple regression R² and r² are the same number, and the slope t-test answers the same question as the correlation test — so quote both consistently. Keep df = n − 2 for slope inference; it is the classic place to lose a mark by writing n − 1.
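The tip's claim that the slope test and the correlation test answer the same question can be checked numerically. The sketch below computes both t statistics on five made-up data points (an assumption for illustration) and shows that they agree. The function names are hypothetical helpers, not part of any library.

```python
from math import sqrt

def pearson_r(x, y):
    """Sample correlation coefficient r."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_yy = sum((yi - y_bar) ** 2 for yi in y)
    return s_xy / sqrt(s_xx * s_yy)

def correlation_t(r, n):
    """t statistic for H0: rho = 0, with n - 2 degrees of freedom."""
    return r * sqrt(n - 2) / sqrt(1 - r ** 2)

def slope_t(x, y):
    """t statistic for H0: beta_1 = 0 in a simple OLS regression of y on x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    se_b1 = sqrt(sse / (n - 2)) / sqrt(s_xx)  # standard error of the slope
    return b1 / se_b1

# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]

t_from_r = correlation_t(pearson_r(x, y), len(x))
t_from_slope = slope_t(x, y)
print(round(t_from_r, 4), round(t_from_slope, 4))
```

Both statistics come out at about 2.1213 on this data. Plugging the question's r = 0.60 and n = 20 into correlation_t gives about 3.18 rather than exactly 3.0, which suggests the question's figures are rounded or illustrative rather than taken from a single dataset.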
Glossary

Key terms

Correlation coefficient r
A unit-free measure of linear association between two variables, ranging from −1 to +1. It captures only linear strength and direction, so two variables can be strongly related along a curve and still have r near 0.
OLS slope β̂₁
The least-squares estimate of how much ŷ changes per one-unit increase in x: β̂₁ = S_xy/S²_x. Together with the intercept β̂₀, it minimises the sum of squared residuals between the observed and fitted values.
Coefficient of determination R²
The proportion of variation in Y explained by the model, R² = 1 − SSE/SST, between 0 and 1. In simple linear regression R² equals r².
Confidence vs prediction interval
A confidence interval estimates the MEAN response at a given x; a prediction interval estimates an INDIVIDUAL future value at that x. The prediction interval is wider because it also covers the variability of a single observation, which appears as an extra +1 under the square root in its standard error.
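For reference, the two intervals in their usual textbook form are written out below. The symbols x₀ (the x value of interest), s_e (the standard error of the regression) and the explicit sum of squares are notation assumed here, not taken from the chapter. The only difference between the two lines is the leading 1 under the root.

```latex
% s_e = \sqrt{SSE/(n-2)}, \qquad \hat{y}_0 = \hat\beta_0 + \hat\beta_1 x_0

% Confidence interval for the mean response at x_0
\hat{y}_0 \;\pm\; t_{\alpha/2,\,n-2}\; s_e
  \sqrt{\frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}}

% Prediction interval for an individual value at x_0
\hat{y}_0 \;\pm\; t_{\alpha/2,\,n-2}\; s_e
  \sqrt{1 + \frac{1}{n} + \frac{(x_0 - \bar{x})^2}{\sum_{i=1}^{n}(x_i - \bar{x})^2}}
```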
FAQ

Correlation & Simple Linear Regression FAQ

Why is a prediction interval wider than a confidence interval at the same x?

A confidence interval captures uncertainty about the average response at that x; a prediction interval must also capture the scatter of a single new observation around that average, so it adds an extra variance term (a +1 inside the standard-error root) and is always wider.
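A small numerical sketch makes the width difference concrete. It assumes the standard textbook interval formulas, five made-up data points, and a critical value t₀.₀₂₅,₃ = 3.182 read from t tables. None of these come from the chapter, and the function name is a hypothetical helper.

```python
from math import sqrt

# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.0, 5.0, 4.0, 5.0]
n = len(x)

x_bar, y_bar = sum(x) / n, sum(y) / n
s_xx = sum((xi - x_bar) ** 2 for xi in x)
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
b1 = s_xy / s_xx
b0 = y_bar - b1 * x_bar

# Standard error of the regression, with n - 2 degrees of freedom.
sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
s_e = sqrt(sse / (n - 2))

def half_widths(x0, t_crit):
    """Return (CI half-width, PI half-width) at x0 for a given critical value."""
    leverage = 1 / n + (x0 - x_bar) ** 2 / s_xx
    se_mean = s_e * sqrt(leverage)        # uncertainty about the mean response
    se_pred = s_e * sqrt(1 + leverage)    # adds the scatter of one new observation
    return t_crit * se_mean, t_crit * se_pred

# Two-tailed 5% critical value for t with n - 2 = 3 df, from tables (assumed).
ci_half, pi_half = half_widths(x0=3.0, t_crit=3.182)
print(f"CI half-width = {ci_half:.3f}, PI half-width = {pi_half:.3f}")
```

Because the prediction standard error has the extra 1 under the root, its half-width exceeds the confidence half-width at every x₀, not just at the mean of x.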

Does a significant slope prove advertising causes sales?

No. Regression establishes association, not causation. A significant slope means a linear relationship is statistically detectable, but omitted variables, reverse causality or confounding can drive it — a point the specification chapter develops further.

Study strategy

Exam move

Practise reading the slope's Estimate, Std. Error, t value and Pr(>|t|) straight off an R lm summary, and convert the slope into a one-sentence interpretation in the units of the problem. Keep df = n − 2 and remember R² = r² as quick self-checks.
