University of Sydney · S1 2026 · FACULTY OF BUSINESS & ECONOMICS

BUSS1020 · Quantitative Business Analysis

- one subject, every graph, every model, every mark
50% final exam · hurdle · 14 Chapters · 11-page Bible
Our own words - no uploaded lecturer files
Built to mirror S1 2026 · updated this semester
Chapter 10 of 11 · BUSS1020

Simple Linear Regression

Simple Linear Regression (Weeks 10–11, Berenson Ch 13) models the straight-line relationship between one predictor X and a response Y. You measure association with covariance and correlation, fit the line Ŷ = b₀ + b₁X by least squares, assess fit with the coefficient of determination r² and the standard error of the estimate, then move to inference: testing the slope, building a confidence interval for it, and distinguishing a confidence interval for the mean of Y from a wider prediction interval for an individual Y.

In this chapter

What this chapter covers

  • 01 Covariance and the correlation coefficient r
  • 02 Correlation measures linear association, not causation
  • 03 Least-squares line Ŷ = b₀ + b₁X; slope and intercept meaning
  • 04 Sums of squares: SST = SSR + SSE
  • 05 Coefficient of determination r² = SSR/SST
  • 06 Standard error of the estimate S_ER
  • 07 t-test and confidence interval for the slope β₁
  • 08 Confidence interval for mean Y vs wider prediction interval for individual Y
Worked example · free

Interpret a fitted regression line, r² and a residual

Q [7 marks]. Across several branches, weekly advertising spend X ($000) predicts weekly sales Y ($000) with the fitted line Ŷ = 18 + 2.5X and r² = 0.64. (a) Predict sales when ad spend is $12k. (b) State the value of r and what r² means. (c) A branch with X = 8 actually sold $42k — find its residual.
  • [2 marks] (a) Substitute X = 12 (in $000): Ŷ = 18 + 2.5(12) = 18 + 30 = 48, i.e. predicted sales of $48,000.
  • [1 mark] (b) Since the slope is positive, r = +√0.64 = +0.80.
  • [1 mark] (b) Interpret r²: 64% of the variation in weekly sales is explained by advertising spend through the linear model.
  • [1 mark] (c) Predicted sales at X = 8: Ŷ = 18 + 2.5(8) = 18 + 20 = 38, i.e. $38,000.
  • [1 mark] (c) Residual = actual − predicted = 42 − 38 = +4, i.e. $4,000 above the line.
  • [1 mark] Interpret the slope in context: each extra $1,000 of advertising is associated with about $2,500 more weekly sales.
Answer: (a) predicted sales of $48,000; (b) r = +0.80, and r² = 0.64 means 64% of the variation in sales is explained by ad spend; (c) the residual is +$4,000 (the branch sold above its predicted value).
Sia tip — The sign of r matches the sign of the slope. A residual is always actual minus predicted, so a positive residual means the point lies above the line. Always interpret the slope as a 'per one-unit change' statement in the variables' real units.
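The arithmetic in the worked example can be checked with a few lines of Python. This is only a sketch: the function names (`predict`, `residual`, `r_from_r_squared`) are made up for illustration, and the coefficients are the ones given in the question.

```python
import math

# Fitted line from the worked example: Y-hat = 18 + 2.5 X (both in $000).
B0, B1 = 18.0, 2.5
R_SQUARED = 0.64


def predict(x):
    """Predicted weekly sales ($000) for advertising spend x ($000)."""
    return B0 + B1 * x


def residual(x, actual_y):
    """Residual = actual minus predicted (positive means above the line)."""
    return actual_y - predict(x)


def r_from_r_squared(r_squared, slope):
    """In simple linear regression, r is the square root of r-squared
    carrying the same sign as the slope."""
    return math.copysign(math.sqrt(r_squared), slope)


part_a = predict(12)                        # part (a): sales at $12k ad spend
part_b = r_from_r_squared(R_SQUARED, B1)    # part (b): correlation coefficient
part_c = residual(8, 42)                    # part (c): residual at X = 8, Y = 42

print(f"(a) predicted sales: {part_a:.1f} ($000)")
print(f"(b) r = {part_b:+.2f}")
print(f"(c) residual: {part_c:+.1f} ($000)")
```

Working in $000 throughout, as the question does, avoids the most common slip here: substituting 12,000 instead of 12.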
Glossary

Key terms

Correlation coefficient r
A standardised measure of the strength and direction of a LINEAR relationship between two numerical variables, ranging from −1 to +1; it does not imply causation.
Least-squares line
The line Ŷ = b₀ + b₁X that minimises the sum of squared vertical residuals; b₁ is the slope and b₀ the intercept.
Coefficient of determination r²
The proportion of variation in Y explained by the regression, r² = SSR/SST, between 0 and 1; it is the square of the correlation in simple regression.
Residual
The vertical gap between an observed value and the fitted line, residual = Y − Ŷ; positive residuals lie above the line.
Confidence interval vs prediction interval
A confidence interval estimates the MEAN of Y at a given X (narrower); a prediction interval brackets an INDIVIDUAL future Y at that X (wider, because it also includes individual scatter).
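To see how the glossary quantities fit together, here is a minimal sketch that fits a least-squares line from scratch and splits the variation into SST = SSR + SSE. The five data points are invented purely for illustration, and the function name `fit_simple_regression` is hypothetical; in the exam these numbers would normally be read from an Excel-style output.

```python
import math


def fit_simple_regression(xs, ys):
    """Least-squares fit of Y-hat = b0 + b1*X, plus the sums of squares."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n

    # Slope = SSXY / SSX; the intercept makes the line pass through the means.
    ssx = sum((x - x_bar) ** 2 for x in xs)
    ssxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = ssxy / ssx
    b0 = y_bar - b1 * x_bar

    fitted = [b0 + b1 * x for x in xs]
    residuals = [y - f for y, f in zip(ys, fitted)]

    sst = sum((y - y_bar) ** 2 for y in ys)      # total variation
    ssr = sum((f - y_bar) ** 2 for f in fitted)  # explained by the line
    sse = sum(e ** 2 for e in residuals)         # unexplained (residual)

    r_squared = ssr / sst
    r = math.copysign(math.sqrt(r_squared), b1)  # r shares the slope's sign
    s_est = math.sqrt(sse / (n - 2))             # standard error of the estimate

    return {"b0": b0, "b1": b1, "sst": sst, "ssr": ssr, "sse": sse,
            "r_squared": r_squared, "r": r, "s_est": s_est,
            "residuals": residuals}


# Hypothetical data, not taken from the course materials.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
fit = fit_simple_regression(xs, ys)

print(f"line: Y-hat = {fit['b0']:.2f} + {fit['b1']:.2f} X")
print(f"SST = {fit['sst']:.2f}, SSR = {fit['ssr']:.2f}, SSE = {fit['sse']:.2f}")
print(f"r-squared = {fit['r_squared']:.2f}, r = {fit['r']:+.3f}")
```

For these made-up points the slope is 0.6 and the intercept 2.2, so r² = 3.6/6 = 0.60, and the least-squares residuals sum to zero (up to rounding).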
FAQ

Simple Linear Regression FAQ

What's the difference between covariance and correlation?

Both measure how two variables move together, but covariance is in the product of the original units and so is hard to interpret, while correlation r is standardised to lie between −1 and +1, giving a comparable measure of linear strength and direction.
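A quick way to see the units problem is to rescale one variable and watch what changes. The sketch below uses invented data and hypothetical helper names (`sample_covariance`, `correlation`); it uses the sample (n − 1) form of the covariance.

```python
import math


def sample_covariance(xs, ys):
    """Sample covariance: sum of cross-deviations divided by n - 1."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    return sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / (n - 1)


def correlation(xs, ys):
    """r = cov(X, Y) / (s_X * s_Y), which cancels the units."""
    s_x = math.sqrt(sample_covariance(xs, xs))
    s_y = math.sqrt(sample_covariance(ys, ys))
    return sample_covariance(xs, ys) / (s_x * s_y)


# Hypothetical data: ad spend in $000 and the same spend in dollars.
spend_thousands = [1, 2, 3, 4, 5]
spend_dollars = [x * 1000 for x in spend_thousands]
sales = [2, 4, 5, 4, 5]

cov_k = sample_covariance(spend_thousands, sales)
cov_d = sample_covariance(spend_dollars, sales)
r_k = correlation(spend_thousands, sales)
r_d = correlation(spend_dollars, sales)

print(f"covariance: {cov_k:.2f} (X in $000) vs {cov_d:.2f} (X in $)")
print(f"correlation: {r_k:.4f} (X in $000) vs {r_d:.4f} (X in $)")
```

The covariance grows by a factor of 1,000 just because the units changed, while r is identical in both cases, which is why r is the quantity to compare across data sets.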

Why is the prediction interval always wider than the confidence interval?

The confidence interval estimates the average Y at a given X, so its width reflects only the uncertainty in the fitted line. The prediction interval brackets a single new observation, so it must also include the natural scatter of individual points around the line. That extra scatter is the additional '1' under the square root in the prediction-interval formula.
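Written out side by side, the two intervals differ only by that '1'. The notation below is an assumption based on the usual textbook convention: S_{YX} stands for the standard error of the estimate (listed as S_ER in the topic list above), and t is the critical value with n − 2 degrees of freedom.

```latex
% Assumed notation: S_{YX} = standard error of the estimate,
% \hat{Y}_i = predicted value at X = X_i, n = sample size.
\[
h_i = \frac{1}{n} + \frac{(X_i - \bar{X})^2}{\sum_{j=1}^{n} (X_j - \bar{X})^2}
\]
\[
\text{CI for the mean of } Y:\quad
\hat{Y}_i \pm t_{\alpha/2,\,n-2}\; S_{YX}\sqrt{h_i}
\]
\[
\text{PI for an individual } Y:\quad
\hat{Y}_i \pm t_{\alpha/2,\,n-2}\; S_{YX}\sqrt{1 + h_i}
\]
```

Because 1 + h_i is always larger than h_i, the prediction interval is wider at every X. Both intervals are narrowest at X = X̄ and widen as X moves away from the mean.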

Does a significant slope prove that X causes Y?

No. Regression and correlation establish association, not causation. A significant slope means X helps predict Y in this data, but a lurking variable or reverse causation could explain the link — a point examiners often test in interpretation questions.

Study strategy

Exam move

Treat regression as two layers: first the fitting/description (slope, intercept, r, r², residuals) and then the inference (slope t-test, slope CI, mean-Y versus individual-Y intervals). Practise interpreting every quantity in business words — what the slope means per unit, what r² means as a percentage, what a residual says about one observation. Memorise that r shares the slope's sign and that the prediction interval is the wider of the two. Because this is a heavy Part B topic across two weeks, work full past-style problems end to end, including reading numbers off an Excel-style output.
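For the inference layer, the slope test and slope confidence interval can be sketched as follows. The data are invented, the function name `slope_inference` is hypothetical, and the critical value is passed in by hand (as you would read it from t tables) rather than computed.

```python
import math


def slope_inference(xs, ys, t_crit):
    """t statistic for H0: beta1 = 0 and a confidence interval for beta1.

    t_crit is the two-tailed critical value with n - 2 degrees of freedom,
    supplied by the caller (for example, from t tables).
    """
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    ssx = sum((x - x_bar) ** 2 for x in xs)
    ssxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = ssxy / ssx
    b0 = y_bar - b1 * x_bar

    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    s_est = math.sqrt(sse / (n - 2))   # standard error of the estimate
    s_b1 = s_est / math.sqrt(ssx)      # standard error of the slope

    t_stat = b1 / s_b1                 # test statistic under H0: beta1 = 0
    ci = (b1 - t_crit * s_b1, b1 + t_crit * s_b1)
    return b1, s_b1, t_stat, ci


# Hypothetical data: n = 5, so df = 3 and the 5% two-tailed value is 3.1824.
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
T_CRIT = 3.1824

b1, s_b1, t_stat, ci = slope_inference(xs, ys, T_CRIT)
reject_h0 = abs(t_stat) > T_CRIT

print(f"b1 = {b1:.3f}, S_b1 = {s_b1:.4f}, t = {t_stat:.3f}")
print(f"95% CI for the slope: ({ci[0]:.3f}, {ci[1]:.3f})")
print("reject H0" if reject_h0 else "do not reject H0")
```

With only five made-up points, t is about 2.12, below the critical value of 3.18, so the slope is not significant at the 5% level and the interval includes zero. The test and the interval always agree in this way, which is a useful self-check in the exam.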
