University of Sydney · FACULTY OF COMPUTER SCIENCE

COMP5318 · Machine Learning and Data Mining

- one subject, every graph, every model, every mark

Computer Science14 Chapters8-page Bible

Our own words - no uploaded lecturer files

Updated for this semester

Chapter 2 of 11 · COMP5318

k-Nearest Neighbour & Rule-Based Classifiers

This chapter covers k-Nearest Neighbour (k-NN) and the rule-based classifiers (1R and PRISM) from the Week 2 classification block of COMP5318 Machine Learning and Data Mining at the University of Sydney. You will learn how k-NN labels a new point by a majority vote of its closest neighbours, how to compute the distances that decide “closest”, and how the choice of k trades bias against variance. It is a dependable source of short numeric and explain questions in the final exam.

In this chapter

What this chapter covers

01Why k-NN is a lazy learner: it stores the data and defers all work to prediction time
02The four-step k-NN algorithm: distance, sort, take k nearest, majority vote (average for regression)
03Computing Euclidean (straight-line) and Manhattan (city-block) distances by hand
04Why you must scale features before measuring distance, so no large-range feature dominates
05How k controls the decision boundary: small k overfits, large k oversmooths, k=n predicts the majority
06Choosing k with validation error, and the rule of thumb k ≤ √(number of training examples)
07The speed and memory cost of k-NN, and why eager learners suit real-time prediction
08Rule-based baselines: 1R (one attribute) and PRISM (a covering rule set)

Worked example · free

Classify a point with k-NN (k=3, Euclidean)

Q [4 marks]. Five training points with two features and classes A/B: (1,1)→A, (2,3)→A, (5,4)→B, (6,2)→B, (3,5)→A. Using k=3 and Euclidean distance, classify the test point (4,4).

+1Compute the squared Euclidean distance from (4,4) to each training point (comparing squares is enough to rank, so no square roots are needed): (1,1)→3²+3²=18; (2,3)→2²+1²=5; (5,4)→1²+0²=1; (6,2)→2²+2²=8; (3,5)→1²+1²=2.
+1Sort ascending and take the three smallest: (5,4)=1, (3,5)=2, (2,3)=5 are the three nearest; (6,2)=8 and (1,1)=18 are further away.
+1Read off the labels of those three neighbours: (5,4)→B, (3,5)→A, (2,3)→A, giving the set {B, A, A}.
+1Take the majority vote: two A against one B → predict class A.

Class A. The three nearest neighbours are one B and two A, so the majority vote is A. (Note the answer depends on k: at k=1 the single nearest point (5,4) is B, so k=1 would instead predict B.)

Sia tip — Always scale your features first and write out the distance formula before the numbers. For ranking neighbours you can compare squared distances and skip the square root, but take the root if the question actually asks for a numeric distance.

Glossary

Key terms

k-Nearest Neighbour (k-NN): A classifier that labels a new point by a majority vote of the k closest training points under a distance measure; for regression it averages their target values.
Lazy learner: A method that does no training — it simply stores the training set and does all computation at prediction time. k-NN is the standard example; Naïve Bayes and decision trees are eager.
Euclidean distance: The straight-line distance between two points: the square root of the summed squared coordinate differences, √(Σ(aᵢ−bᵢ)²).
Manhattan distance: The city-block distance: the sum of the absolute coordinate differences, Σ|aᵢ−bᵢ| — the total of axis-aligned steps between the points.
Feature scaling: Rescaling features (for example min-max to [0,1] or the z-score) so that a feature with a large numeric range does not dominate the distance calculation.
Bias-variance effect of k: Small k gives a jagged boundary that overfits noise (low bias, high variance); large k smooths the boundary (high bias, low variance); at k=n every point is labelled with the global majority class.
1R: A rule-based baseline that builds a single rule on one attribute (a decision stump): each attribute value is assigned its majority class, and the attribute with the lowest total error is kept.
PRISM: A rule-based covering algorithm that builds a set of IF-THEN rules one class at a time, adding conditions to each rule to push it toward covering only examples of its class.

FAQ

k-Nearest Neighbour & Rule-Based Classifiers FAQ

Do I have to know both Euclidean and Manhattan distance for the exam?

Yes. Short distance calculations appear in the k-NN style of question, and the paper may specify either metric, so practise both. Euclidean squares each coordinate difference then takes a square root; Manhattan sums the absolute differences. Because the exam is closed book with only a non-programmable calculator, the numbers are kept small enough to work by hand — always show the working, since method marks are awarded even if one slip changes the final vote.

Why is k-NN a poor choice for real-time classification?

k-NN is a lazy learner: it does no training and instead compares each new point to every stored training point, so a single prediction over m training rows of n features costs about m×n distance computations and the whole training set must stay in memory. For a task needing thousands of classifications per second, an eager learner that precomputes a model — Naïve Bayes is the usual answer — predicts far faster. State the reason (lazy versus eager, prediction cost), not just the name.

Can AI help me with k-Nearest Neighbour in COMP5318?

Yes, as a study aid. Sia, the AskSia tutor, can explain k-NN step by step — how to set up a distance table, sort the neighbours, run the majority vote, and reason about how the choice of k changes the boundary — and it can generate fresh practice points so you build the method yourself. It will not do your graded assignments or exam for you, and it cannot promise a mark or a pass; the University also requires you to acknowledge any AI tools used in assessable work. Use it to understand the method, then practise unaided under exam conditions.

Studying with AI? Sia — free AI machine learning tutor works through COMP5318 step by step.

Study strategy

Exam move

Make the k-NN procedure mechanical first: given a test point, build a distance table, sort it, take the k nearest, and vote — rehearse this on small integer coordinates until it is automatic, since that is exactly how the final exam frames it. Layer the concepts on top: be able to write both distance formulas from memory, explain in one sentence why you scale features before measuring distance, and describe how the boundary and the two errors move as k grows (small k overfits, large k oversmooths, k=n predicts the majority). Keep the trade-off ready for the recurring “which classifier?” question — k-NN's slow prediction is its deciding weakness against an eager learner. Budget roughly one minute per mark, so a 4-mark classification takes about four to five minutes, and finish by drilling the 1R and PRISM baselines. The final exam is worth the majority of the unit and carries a minimum-mark hurdle, so this quick-scoring topic is worth banking cleanly; confirm the exact exam date on the Canvas exam timetable.

Keep going — explore the course

A+Everything unlocked

Unlocks this Bible + all 25 of your University of Sydney subjects - and 1,000+ Bibles across every Australian university.

Sia - your COMP5318 tutor, unlimited, worked the way the exam marks it

The full 8-page Bible + practice bank with worked solutions

Chrome extension - sync your LMS so Sia knows your deadlines

Bilingual EN / Chinese on every Bible and every Sia answer

$25/ month

30-day money-back · cancel in one tap · how it works