University of Sydney · S1 2026 · FACULTY OF BUSINESS & ECONOMICS

BUSS6002 · Data Science In Business

- one subject, every graph, every model, every mark
50% final exam · hurdle · 14 Chapters · 8-page Bible
Our own words - no uploaded lecturer files
Built to mirror S1 2026 · updated this semester
Chapter 5 of 11 · BUSS6002

Clustering & Customer Segmentation

Clustering is the unit's main unsupervised method: with no labels to predict, it partitions n data points into k groups that are similar within and different between. Its headline business use is customer segmentation (splitting a market into groups you can serve differently), and its headline algorithm is K-means, which alternates an assignment step and a centroid update step to drive down the within-cluster sum of squares (WCSS). The two facts the exam tests hardest are that the objective uses the squared Euclidean distance, and that K-means converges only to a local optimum, because the result depends on the starting centres; k itself is an input fixed in advance, not something the algorithm optimises. This is the Week 5 topic, examined by MCQ (compute a centroid or squared distance), short answer (write the objective, name the steps, argue convergence) and framework recall (segmentation bases and CLV).

In this chapter

What this chapter covers

  • 1. Supervised vs unsupervised: clustering learns structure with no labels
  • 2. Customer segmentation: four bases (demographic, geographic, psychographic, behavioural)
  • 3. Surrounding metrics: market size & value, market share, customer lifetime value (CLV)
  • 4. Squared Euclidean distance: the only distance K-means uses, and why it skips the root
  • 5. Centroid: the per-coordinate mean of a cluster's members (not a fixed data point)
  • 6. Clustering objective (WCSS): sum of squared point-to-centroid distances, minimised
  • 7. The K-means algorithm: init → assign → update → repeat until centres stop moving
  • 8. Convergence, local optima & choosing k: monotone non-increasing, elbow method
Worked example · free

One K-means iteration — assign, update & new WCSS (k = 2)

Q [4 marks]. Four customers are encoded as A=[1,1], B=[2,1], C=[6,5], D=[7,5] with k=2 and initial centres μ₁=[1,1], μ₂=[2,1]. Carry out one assignment step, one update step, and state what happens to the objective.
  • +1 Assignment: compute the squared distance of each point to each centre, then assign to the nearest. A: 0 vs 1 → C₁. B: 1 vs 0 → C₂. C: 41 vs 32 → C₂. D: 52 vs 41 → C₂. So C₁={A}, C₂={B,C,D}.
  • +1 Update μ₁ = mean of {A} = [1,1] (one member, unchanged).
  • +1 Update μ₂ = mean of {B,C,D} = [(2+6+7)/3, (1+5+5)/3] = [5, 3.67].
  • +1 The new WCSS (0 from C₁ plus ≈24.7 from C₂) is lower than the 73 obtained with the starting centres (0 + 0 + 32 + 41). Both steps can only reduce or hold the objective, so it is non-increasing and the algorithm converges. (Next iteration would flip B into C₁.)
C₁={A}, C₂={B,C,D}; μ₁=[1,1], μ₂=[5, 3.67]; WCSS falls (non-increasing) and K-means converges to a local optimum.
Sia tip — Always use the SQUARED distance for assignment (no square root needed), recompute each centre as the mean of its NEW members, and show the arithmetic — the working banks the marks even if one assignment slips.
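
The hand working above can be checked with a few lines of Python. This is a minimal sketch for this one example, not a general K-means routine: the helper names `sq_dist` and `mean` are made up here, and it assumes no cluster ends up empty (true for this data).

```python
# One K-means iteration on the worked example (pure Python, no libraries).
points = {"A": (1, 1), "B": (2, 1), "C": (6, 5), "D": (7, 5)}
centres = [(1, 1), (2, 1)]  # initial centres mu_1, mu_2 from the question


def sq_dist(x, mu):
    """Squared Euclidean distance: sum of squared coordinate differences."""
    return sum((xi - mi) ** 2 for xi, mi in zip(x, mu))


def mean(members):
    """Per-coordinate mean of a non-empty list of points (the centroid)."""
    n = len(members)
    return tuple(sum(coord) / n for coord in zip(*members))


# Assignment step: each point goes to the centre with the smallest squared distance.
clusters = [[] for _ in centres]
for name, x in points.items():
    nearest = min(range(len(centres)), key=lambda i: sq_dist(x, centres[i]))
    clusters[nearest].append(name)

# Update step: each centre becomes the mean of its NEW members.
new_centres = [mean([points[name] for name in members]) for members in clusters]

# WCSS measured against the updated centres.
wcss = sum(
    sq_dist(points[name], new_centres[i])
    for i, members in enumerate(clusters)
    for name in members
)

print(clusters)        # [['A'], ['B', 'C', 'D']]
print(new_centres)     # second centre is (5.0, 3.666...)
print(round(wcss, 1))  # 24.7
```

The output matches the marked solution: C₁={A}, C₂={B,C,D}, μ₂=[5, 3.67] and a WCSS of about 24.7.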
Glossary

Key terms

Unsupervised learning
Learning structure from data that has no labels (no target y); clustering is the main example.
Cluster
A group of points that are close to each other and far from points in other groups.
Centroid
The centre of a cluster, computed as the per-coordinate mean of its members; it need not be an actual data point.
Squared Euclidean distance
‖x − μ‖² = the sum of squared coordinate differences = (x − μ)ᵀ(x − μ); the closeness measure K-means uses.
Within-cluster sum of squares (WCSS)
The clustering objective f = Σᵢ Σ_{x∈Cᵢ} ‖x − μᵢ‖²; lower means tighter clusters.
K-means
An iterative algorithm that minimises WCSS by alternating an assignment step (nearest centroid) and an update step (recompute centroids as means).
Customer segmentation
Splitting a market into groups that are similar within and different between, so marketing can be tailored beyond the 'average customer'.
Customer lifetime value (CLV)
The present value of a customer's expected future net cash flows (revenue minus acquisition and retention costs).
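
To make the "present value" wording in the CLV definition concrete, here is a rough sketch. Everything specific in it is an assumption for illustration: the function name `clv`, yearly cash flows received at the end of each year, a constant discount rate, and the dollar figures. The unit may present CLV with a different formula or timing convention.

```python
def clv(net_cash_flows, discount_rate):
    """Present value of yearly net cash flows (assumed to arrive at year end).

    net_cash_flows[0] is the flow one year from now, and so on.
    """
    return sum(
        cash_flow / (1 + discount_rate) ** year
        for year, cash_flow in enumerate(net_cash_flows, start=1)
    )


# Hypothetical customer: $100 net cash flow per year for 3 years, 10% discount rate.
value = clv([100.0, 100.0, 100.0], 0.10)
print(round(value, 2))  # 248.69
```

The later a cash flow arrives, the less it adds, which is why the total is below the undiscounted $300.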
FAQ

Clustering & Customer Segmentation FAQ

Is clustering supervised or unsupervised?

Unsupervised — there is no label y to predict. K-means learns the grouping purely from the feature values, judged by how compact the clusters are, not by matching a known answer.

Why does K-means use the SQUARED distance instead of the plain distance?

Working with the squared distance skips the square root, which saves computation, and it makes the update step clean: the point that minimises the sum of squared distances to a set of points is exactly their mean, which is why the centroid is the mean. For deciding the nearest centre, the squared and plain distances give the same ranking anyway.
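
The "mean minimises the sum of squared distances" claim can be checked with one line of calculus. A sketch, writing C for the set of points in one cluster and |C| for how many there are (notation assumed here for illustration):

```latex
\frac{\partial}{\partial \mu} \sum_{x \in C} \lVert x - \mu \rVert^2
  = -2 \sum_{x \in C} (x - \mu) = 0
\quad\Longrightarrow\quad
\mu = \frac{1}{|C|} \sum_{x \in C} x
```

Setting the gradient to zero gives the per-coordinate mean, and because the sum of squares is convex in μ, that stationary point is the minimum.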

Does K-means always find the best possible clustering?

No. The objective can only fall or stay equal each iteration, so K-means always converges — but only to a LOCAL optimum that depends on the starting centres. The standard fix is to run it several times from different random seeds and keep the lowest-WCSS result.
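
A small demonstration of a local optimum, as a sketch under stated assumptions: the four rectangle-corner points and both sets of starting centres are invented for illustration (they are not from the unit), the two starts are hand-picked rather than random seeds, and `kmeans` is a bare-bones helper written for this example.

```python
def sq_dist(x, mu):
    """Squared Euclidean distance between two points."""
    return sum((xi - mi) ** 2 for xi, mi in zip(x, mu))


def kmeans(points, centres, max_iter=100):
    """Plain K-means from the given starting centres; returns (centres, wcss)."""
    centres = [tuple(float(c) for c in centre) for centre in centres]
    for _ in range(max_iter):
        # Assignment step: nearest centre by squared distance.
        clusters = [[] for _ in centres]
        for x in points:
            nearest = min(range(len(centres)), key=lambda i: sq_dist(x, centres[i]))
            clusters[nearest].append(x)
        # Update step: mean of the members (an empty cluster keeps its old centre).
        new_centres = [
            tuple(sum(coord) / len(members) for coord in zip(*members)) if members else centres[i]
            for i, members in enumerate(clusters)
        ]
        if new_centres == centres:  # centres stopped moving
            break
        centres = new_centres
    wcss = sum(min(sq_dist(x, c) for c in centres) for x in points)
    return centres, wcss


# Hypothetical data: corners of a wide, flat rectangle.
corners = [(0, 0), (0, 1), (4, 0), (4, 1)]

_, bad_wcss = kmeans(corners, [(2, 0), (2, 1)])   # splits top vs bottom, gets stuck
_, good_wcss = kmeans(corners, [(0, 0), (4, 0)])  # splits left vs right
best = min(bad_wcss, good_wcss)                   # keep the lowest-WCSS run

print(bad_wcss, good_wcss, best)  # 16.0 1.0 1.0
```

Both runs converge, but the first stops at a WCSS of 16 while the second reaches 1, which is the reason for restarting from several initialisations and keeping the best.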

Does K-means choose the number of clusters k for me?

No — k is an INPUT you fix in advance. K-means never discovers the right k. To choose k, plot WCSS against k and look for the 'elbow' (the point of diminishing returns), or use business judgement about how many segments are actionable.
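
The WCSS-against-k table behind an elbow plot can be sketched as follows, reusing the four customers from the worked example. The helper `kmeans_wcss` is hypothetical, and starting from the first k points is an assumption made only to keep the example deterministic.

```python
def sq_dist(x, mu):
    """Squared Euclidean distance between two points."""
    return sum((xi - mi) ** 2 for xi, mi in zip(x, mu))


def kmeans_wcss(points, k, max_iter=100):
    """Run K-means starting from the first k points; return the final WCSS."""
    centres = [tuple(float(c) for c in p) for p in points[:k]]
    for _ in range(max_iter):
        # Assignment step.
        clusters = [[] for _ in range(k)]
        for x in points:
            nearest = min(range(k), key=lambda i: sq_dist(x, centres[i]))
            clusters[nearest].append(x)
        # Update step (an empty cluster keeps its old centre).
        new_centres = [
            tuple(sum(c) / len(m) for c in zip(*m)) if m else centres[i]
            for i, m in enumerate(clusters)
        ]
        if new_centres == centres:
            break
        centres = new_centres
    return sum(min(sq_dist(x, c) for c in centres) for x in points)


customers = [(1, 1), (2, 1), (6, 5), (7, 5)]  # A, B, C, D
wcss_by_k = {k: kmeans_wcss(customers, k) for k in range(1, 5)}
print(wcss_by_k)  # {1: 42.0, 2: 1.0, 3: 0.5, 4: 0.0}
```

WCSS drops sharply from k=1 to k=2 and barely moves afterwards, so the elbow sits at k=2. Note that WCSS keeps falling all the way to 0 at k=n, which is why the lowest WCSS alone cannot choose k.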

Is the centroid always one of the data points?

No. The centroid is the average of its members, so it usually sits between them and is not an actual observation. It is also recomputed every iteration as the membership changes.

Does the objective strictly decrease every iteration?

It is non-increasing — it falls or stays the same. At convergence it holds steady. Claiming it 'always strictly decreases' is the subtly-wrong statement exams like to test.
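
To see "non-increasing, not strictly decreasing" in numbers, this sketch records the objective after every assignment step and every update step for three iterations of the worked example. The variable names are illustrative only.

```python
def sq_dist(x, mu):
    """Squared Euclidean distance between two points."""
    return sum((xi - mi) ** 2 for xi, mi in zip(x, mu))


points = [(1, 1), (2, 1), (6, 5), (7, 5)]  # A, B, C, D
centres = [(1.0, 1.0), (2.0, 1.0)]         # starting centres from the worked example
trace = []                                 # objective after each step

for _ in range(3):
    # Assignment step: label each point with its nearest centre.
    labels = [min(range(len(centres)), key=lambda i: sq_dist(x, centres[i])) for x in points]
    trace.append(sum(sq_dist(x, centres[j]) for x, j in zip(points, labels)))
    # Update step: move each centre to the mean of its members.
    for j in range(len(centres)):
        members = [x for x, lab in zip(points, labels) if lab == j]
        if members:
            centres[j] = tuple(sum(c) / len(members) for c in zip(*members))
    trace.append(sum(sq_dist(x, centres[j]) for x, j in zip(points, labels)))

print([round(v, 2) for v in trace])  # [73.0, 24.67, 9.56, 1.0, 1.0, 1.0]
```

The value never rises, and once the centres stop moving it simply repeats at 1.0, so the objective holds steady rather than strictly decreasing.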

Study strategy

Exam move

Lock in the two formulas first (centroid = per-coordinate mean, and WCSS = Σ Σ ‖x − μ‖²), because the easy MCQ marks (compute a centroid, compute a squared distance) are pure arithmetic once you know them, and the squared norm is where careless students drop marks. Then drill one full K-means iteration by hand: for each point compute the squared distance to every centre, assign to the smallest, recompute each centre as the mean of its new members, and show the WCSS fell. For the short-answer question be able to write the objective, name the assignment and update steps, and argue convergence precisely: the objective is non-increasing, and the algorithm converges to a LOCAL (not global) optimum because the result depends on the initial centres. Remember too that k is an input you fix in advance. Finally, keep the marketing framework warm: the four segmentation bases (demographic, geographic, psychographic, behavioural) and the surrounding metrics (market size, value, share, CLV), since these are quick recall marks students forget while drowning in the linear algebra.
