University of Sydney · S1 2026 · FACULTY OF BUSINESS & ECONOMICS

QBUS5001 · Foundation In Data Analytics For Business

50% final exam · hurdle · 14 Chapters · 11-page Bible
Our own words - no uploaded lecturer files
Built to mirror S1 2026 · updated this semester
Chapter 9 of 11 · QBUS5001

Simple Linear Regression

Module 10 fits a straight line to two variables: the least-squares estimates b₁ = Σ(xᵢ−x̄)(yᵢ−ȳ)/Σ(xᵢ−x̄)² and b₀ = ȳ − b₁x̄, giving the fitted line Ŷ = b₀ + b₁X. You interpret the slope (expected change in Y per unit of X) and the intercept, decompose variation as SST = SSR + SSE, and report model fit with R² and the standard error of the regression (SER).

This module is where the descriptive covariance of Module 1 becomes a predictive tool, and it sets up the diagnostics and inference that follow in Module 11.

In this chapter

What this chapter covers

  • 01 Population model E(Y|X) = β₀ + β₁X and the error term
  • 02 Least-squares slope b₁ and intercept b₀
  • 03 Fitted values Ŷ and residuals e = Y − Ŷ
  • 04 Interpreting the slope and intercept in business terms
  • 05 Sums of squares: SST = SSR + SSE
  • 06 R² = SSR/SST as the proportion of variation explained
  • 07 Standard error of the regression (SER)
  • 08 Prediction and the danger of extrapolation
Worked example

Estimating a simple linear regression line

Q [6 marks]. A café records weekly advertising spend X (in $00s) and sales Y (in $000s) over five weeks: (2, 6), (3, 7), (5, 10), (6, 11), (9, 16). Estimate the least-squares regression line and interpret the slope.
  • [1 mark] Compute the means: x̄ = (2+3+5+6+9)/5 = 25/5 = 5; ȳ = (6+7+10+11+16)/5 = 50/5 = 10.
  • [1 mark] Compute Σ(x−x̄)(y−ȳ): (−3)(−4) + (−2)(−3) + (0)(0) + (1)(1) + (4)(6) = 12 + 6 + 0 + 1 + 24 = 43.
  • [1 mark] Compute Σ(x−x̄)²: 9 + 4 + 0 + 1 + 16 = 30.
  • [1 mark] Slope: b₁ = 43/30 = 1.4333.
  • [1 mark] Intercept: b₀ = ȳ − b₁x̄ = 10 − 1.4333×5 = 10 − 7.1667 = 2.8333. Line: Ŷ = 2.8333 + 1.4333X.
  • [1 mark] Interpret: each additional $100 of advertising spend (one unit of X) is associated with about $1,433 more in weekly sales (since Y is in $000s).
Ŷ = 2.8333 + 1.4333X. Each extra $100 of advertising spend is associated with about $1,433 in additional weekly sales.
Sia tip: Lay out a small table of (x−x̄), (y−ȳ) and their products before computing the slope. The sum of those products is the numerator of both b₁ and the sample covariance, so you are reusing Module 1's machinery. Always state the slope interpretation in the question's units.
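The table method in the worked example can be checked with a short Python sketch. This is only an illustration of the same arithmetic, not part of the exam method; the function name `least_squares` and the variable names are chosen here for the example.

```python
def least_squares(xs, ys):
    """Return (b0, b1) for the line Y-hat = b0 + b1 * X."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    # Numerator: sum of cross-products of deviations (the table's product column)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    # Denominator: sum of squared deviations of X
    sxx = sum((x - x_bar) ** 2 for x in xs)
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    return b0, b1


# Cafe data from the worked example: X in $00s, Y in $000s
ad_spend = [2, 3, 5, 6, 9]
sales = [6, 7, 10, 11, 16]

b0, b1 = least_squares(ad_spend, sales)
print(f"Y-hat = {b0:.4f} + {b1:.4f} X")
```

Rounded to four decimals, the printed coefficients should agree with the hand calculation (2.8333 and 1.4333).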
Glossary

Key terms

Least-squares estimates
The slope b₁ and intercept b₀ that minimise the sum of squared residuals; b₁ = Σ(x−x̄)(y−ȳ)/Σ(x−x̄)² and b₀ = ȳ − b₁x̄.
Residual (e)
The vertical gap between an observed value and the fitted line, eᵢ = yᵢ − ŷᵢ; least squares minimises the sum of their squares.
SST, SSR, SSE
Total (SST), explained/regression (SSR) and unexplained/error (SSE) sums of squares; they satisfy SST = SSR + SSE.
Coefficient of determination (R²)
R² = SSR/SST, the fraction of the variation in Y explained by the model; it ranges from 0 to 1 in simple regression.
Standard error of the regression (SER)
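SER = √(SSE/(n−k−1)), where k is the number of predictors (k = 1 in simple regression, so the divisor is n − 2). It measures the typical size of a residual in the units of Y; smaller values indicate a tighter fit around the line.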
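To see how these glossary quantities fit together, here is a minimal Python sketch that reuses the café data from the worked example. The numbers it produces are computed from that data, not quoted from the course materials, and the variable names are chosen for this illustration.

```python
from math import sqrt

# Cafe data from the worked example: X in $00s, Y in $000s
xs = [2, 3, 5, 6, 9]
ys = [6, 7, 10, 11, 16]
n = len(xs)
k = 1  # one predictor in simple linear regression

x_bar = sum(xs) / n
y_bar = sum(ys) / n
b1 = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
    (x - x_bar) ** 2 for x in xs
)
b0 = y_bar - b1 * x_bar

fitted = [b0 + b1 * x for x in xs]
residuals = [y - f for y, f in zip(ys, fitted)]

sst = sum((y - y_bar) ** 2 for y in ys)    # total variation in Y
ssr = sum((f - y_bar) ** 2 for f in fitted)  # variation explained by the line
sse = sum(e ** 2 for e in residuals)         # variation left unexplained

r_squared = ssr / sst
ser = sqrt(sse / (n - k - 1))

print(f"SST={sst:.4f}  SSR={ssr:.4f}  SSE={sse:.4f}")
print(f"R^2={r_squared:.4f}  SER={ser:.4f}")
```

For this dataset SST is 62, and SSR and SSE should add back up to it apart from floating-point rounding. Because Y is in $000s, the SER is in $000s too.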
FAQ

Simple Linear Regression FAQ

How do I interpret the intercept b₀?

It is the expected value of Y when X = 0. Often this is outside the range of the data (e.g. zero advertising), so treat it as a mathematical anchor for the line rather than a meaningful business prediction unless X = 0 is realistic.

What does R² tell me and what does it not?

R² is the proportion of variation in Y the model explains. It does not tell you whether the relationship is causal, whether the model assumptions hold, or whether predictions outside the data range are safe.

Why is extrapolation risky?

The estimated line is only supported by the range of X observed. Predicting Y for X values far outside that range assumes the linear relationship continues, which the data cannot justify and which often fails in practice.
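One way to make this habit concrete is to flag any prediction whose X falls outside the observed range. The sketch below uses the rounded café coefficients from the worked example; the helper `predict` and its return format are hypothetical, invented for this illustration.

```python
# Rounded coefficients and observed X range from the cafe worked example
B0, B1 = 2.8333, 1.4333
X_MIN, X_MAX = 2, 9  # smallest and largest advertising spend observed ($00s)


def predict(x_new, b0=B0, b1=B1, x_min=X_MIN, x_max=X_MAX):
    """Return (y_hat, is_extrapolation) for a new X value."""
    y_hat = b0 + b1 * x_new
    is_extrapolation = not (x_min <= x_new <= x_max)
    return y_hat, is_extrapolation


for x in (7, 20):
    y_hat, flagged = predict(x)
    note = "EXTRAPOLATION: outside observed X range" if flagged else "within observed X range"
    print(f"X={x}: Y-hat={y_hat:.2f} ({note})")
```

A spend of 7 ($700) sits inside the data, so the prediction is supported. A spend of 20 ($2,000) still returns a number, but the flag is a reminder that the data cannot confirm the line holds that far out.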

Study strategy

Exam move

Practise the slope-intercept calculation by hand on a five- or six-point dataset until the table method is second nature, then verify with Excel's Data Analysis → Regression so you can read the same output the exam shows. Always finish a regression answer with a slope interpretation in the variables' actual units — that sentence is reliably worth a mark and is where rushed answers fall short.
