University of Sydney · S1 2026 · FACULTY OF BUSINESS & ECONOMICS

QBUS5001 · Foundation In Data Analytics For Business

- one subject, every graph, every model, every mark

50% final exam · hurdle · 14 Chapters · 11-page Bible

Our own words - no uploaded lecturer files

Built to mirror S1 2026 · updated this semester

Chapter 9 of 11 · QBUS5001

Simple Linear Regression

Module 10 fits a straight line to two variables: the least-squares estimates b₁ = Σ(xᵢ−x̄)(yᵢ−ȳ)/Σ(xᵢ−x̄)² and b₀ = ȳ − b₁x̄, giving the fitted line Ŷ = b₀ + b₁X. You interpret the slope (expected change in Y per unit of X) and the intercept, decompose variation as SST = SSR + SSE, and report model fit with R² and the standard error of the regression (SER).

This module is where the descriptive covariance of Module 1 becomes a predictive tool, and it sets up the diagnostics and inference that follow in Module 11.
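For readers who want to see where the two formulas come from, here is a brief sketch of the standard least-squares derivation. The symbol S for the sum of squared residuals is introduced here for illustration only; it is not notation taken from the course materials.

```latex
\begin{aligned}
S(b_0, b_1) &= \sum_{i=1}^{n} \left(y_i - b_0 - b_1 x_i\right)^2 \\[4pt]
\frac{\partial S}{\partial b_0} = -2\sum_{i=1}^{n}\left(y_i - b_0 - b_1 x_i\right) = 0
  &\;\Longrightarrow\; b_0 = \bar{y} - b_1\bar{x} \\[4pt]
\frac{\partial S}{\partial b_1} = -2\sum_{i=1}^{n} x_i\left(y_i - b_0 - b_1 x_i\right) = 0
  &\;\Longrightarrow\; \sum_{i=1}^{n} x_i\left[(y_i - \bar{y}) - b_1 (x_i - \bar{x})\right] = 0 \\[4pt]
  &\;\Longrightarrow\; b_1 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sum_{i=1}^{n} (x_i - \bar{x})^2}
\end{aligned}
```

The last step uses the facts that Σ(yᵢ−ȳ) = 0 and Σ(xᵢ−x̄) = 0, so replacing xᵢ with (xᵢ−x̄) in front of each bracket leaves both sums unchanged.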

In this chapter

What this chapter covers

01 Population model E(Y|X) = β₀ + β₁X and the error term
02 Least-squares slope b₁ and intercept b₀
03 Fitted values Ŷ and residuals e = Y − Ŷ
04 Interpreting the slope and intercept in business terms
05 Sums of squares: SST = SSR + SSE
06 R² = SSR/SST as the proportion of variation explained
07 Standard error of the regression (SER)
08 Prediction and the danger of extrapolation

Worked example · free

Estimating a simple linear regression line

Q [6 marks]. A café records weekly advertising spend X (in $00s) and sales Y (in $000s) over five weeks: (2, 6), (3, 7), (5, 10), (6, 11), (9, 16). Estimate the least-squares regression line and interpret the slope.

1 mark · Compute the means: x̄ = (2+3+5+6+9)/5 = 25/5 = 5; ȳ = (6+7+10+11+16)/5 = 50/5 = 10.
1 mark · Compute Σ(x−x̄)(y−ȳ): (−3)(−4) + (−2)(−3) + (0)(0) + (1)(1) + (4)(6) = 12 + 6 + 0 + 1 + 24 = 43.
1 mark · Compute Σ(x−x̄)²: 9 + 4 + 0 + 1 + 16 = 30.
1 mark · Slope: b₁ = 43/30 = 1.4333.
1 mark · Intercept: b₀ = ȳ − b₁x̄ = 10 − 1.4333×5 = 10 − 7.1667 = 2.8333. Line: Ŷ = 2.8333 + 1.4333X.
1 mark · Interpret: each additional $100 of advertising spend (one unit of X) is associated with about $1,433 more in weekly sales (since Y is in $000s).

Ŷ = 2.8333 + 1.4333X. Each extra $100 of advertising spend is associated with about $1,433 in additional weekly sales.
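The hand calculation can be checked with a few lines of Python. This is a minimal sketch using only the five café observations from the question; the variable names (`s_xy`, `s_xx` and so on) are illustrative choices, not course notation.

```python
# Café data from the worked example: X in $00s of advertising, Y in $000s of sales.
x = [2, 3, 5, 6, 9]
y = [6, 7, 10, 11, 16]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Deviation table: one (x - x_bar, y - y_bar) pair per week.
rows = [(xi - x_bar, yi - y_bar) for xi, yi in zip(x, y)]
s_xy = sum(dx * dy for dx, dy in rows)   # sum of cross-products
s_xx = sum(dx ** 2 for dx, _ in rows)    # sum of squared x-deviations

b1 = s_xy / s_xx          # slope
b0 = y_bar - b1 * x_bar   # intercept

print(f"x_bar = {x_bar}, y_bar = {y_bar}")
print(f"S_xy = {s_xy}, S_xx = {s_xx}")
print(f"b1 = {b1:.4f}, b0 = {b0:.4f}")   # b1 = 1.4333, b0 = 2.8333
```

The printed sums (43 and 30) match the two intermediate marks, which makes it easy to spot which step went wrong if a hand answer disagrees.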

Sia tip: Lay out a small table of (x−x̄), (y−ȳ) and their products before computing the slope. The sum of products is the same numerator used in the sample covariance, so you are reusing Module 1's machinery. Always state the slope interpretation in the question's units.

Glossary

Key terms

Least-squares estimates: The slope b₁ and intercept b₀ that minimise the sum of squared residuals; b₁ = Σ(x−x̄)(y−ȳ)/Σ(x−x̄)² and b₀ = ȳ − b₁x̄.
Residual (e): The vertical gap between an observed value and the fitted line, eᵢ = yᵢ − ŷᵢ; least squares minimises the sum of their squares.
SST, SSR, SSE: Total (SST), explained/regression (SSR) and unexplained/error (SSE) sums of squares; they satisfy SST = SSR + SSE.
Coefficient of determination (R²): R² = SSR/SST, the fraction of the variation in Y explained by the model; it ranges from 0 to 1 in simple regression.
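Standard error of the regression (SER): SER = √(SSE/(n−k−1)), where k is the number of predictors, so SER = √(SSE/(n−2)) in simple regression (k = 1). It measures the typical size of a residual; smaller values indicate a tighter fit around the line.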
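To see these terms as numbers, the sketch below computes them for the café data from the worked example. The helper name `fit_summary` is hypothetical, and the divisor n − 2 is the simple-regression case of n − k − 1 with k = 1 predictor.

```python
from math import sqrt


def fit_summary(x, y):
    """Least-squares estimates and fit statistics for one predictor."""
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    s_xx = sum((xi - x_bar) ** 2 for xi in x)
    b1 = s_xy / s_xx
    b0 = y_bar - b1 * x_bar

    fitted = [b0 + b1 * xi for xi in x]
    sst = sum((yi - y_bar) ** 2 for yi in y)                # total variation
    ssr = sum((fi - y_bar) ** 2 for fi in fitted)           # explained by the line
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))  # left in the residuals

    return {
        "b0": b0,
        "b1": b1,
        "SST": sst,
        "SSR": ssr,
        "SSE": sse,
        "R2": ssr / sst,
        "SER": sqrt(sse / (n - 2)),  # n - k - 1 with k = 1
    }


summary = fit_summary([2, 3, 5, 6, 9], [6, 7, 10, 11, 16])
for name, value in summary.items():
    print(f"{name}: {value:.4f}")
```

For this dataset the output gives SST = 62, SSR ≈ 61.6333 and SSE ≈ 0.3667, so SST = SSR + SSE holds, with R² ≈ 0.9941 and SER ≈ 0.3496 (about $350 of weekly sales, since Y is in $000s).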

FAQ

Simple Linear Regression FAQ

How do I interpret the intercept b₀?

It is the expected value of Y when X = 0. Often this is outside the range of the data (e.g. zero advertising), so treat it as a mathematical anchor for the line rather than a meaningful business prediction unless X = 0 is realistic.

What does R² tell me and what does it not?

R² is the proportion of variation in Y the model explains. It does not tell you whether the relationship is causal, whether the model assumptions hold, or whether predictions outside the data range are safe.

Why is extrapolation risky?

The estimated line is only supported by the range of X observed. Predicting Y for X values far outside that range assumes the linear relationship continues, which the data cannot justify and which often fails in practice.
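One way to build the habit is to check every prediction against the observed range of X. The sketch below reuses the café line from the worked example; the function `predict` and its range check are illustrative assumptions, not a procedure prescribed by the course.

```python
def predict(x_new, b0, b1, x_min, x_max):
    """Return the fitted value and whether x_new lies outside the observed X range."""
    y_hat = b0 + b1 * x_new
    extrapolating = not (x_min <= x_new <= x_max)
    return y_hat, extrapolating


# Estimates from the café example (X in $00s, Y in $000s).
b1 = 43 / 30
b0 = 10 - b1 * 5
x_observed = [2, 3, 5, 6, 9]
lo, hi = min(x_observed), max(x_observed)

for x_new in (7, 20):
    y_hat, flag = predict(x_new, b0, b1, lo, hi)
    note = "extrapolation, treat with caution" if flag else "inside observed range"
    print(f"X = {x_new}: predicted Y = {y_hat:.2f} ({note})")
```

X = 7 ($700 of advertising) sits inside the observed range of 2 to 9, so the prediction of roughly 12.87 ($12,867 in sales) is supported by the data. X = 20 is far outside it, and the line still returns a number, which is exactly why the check has to be done by the analyst.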

Study strategy

Exam move

Practise the slope-intercept calculation by hand on a five- or six-point dataset until the table method is second nature, then verify with Excel's Data Analysis → Regression so you can read the same output the exam shows. Always finish a regression answer with a slope interpretation in the variables' actual units — that sentence is reliably worth a mark and is where rushed answers fall short.
