University of Melbourne · S1 2026 · FACULTY OF INFORMATION TECHNOLOGY

COMP90038 · Algorithms And Complexity

- one subject, every graph, every model, every mark
50% final exam · hurdle14 Chapters6-page Bible
Our own words - no uploaded lecturer files
Built to mirror S1 2026 · updated this semester
Chapter 8 of 10 · COMP90038

Space-Time Tradeoffs and Hashing

The transform-and-conquer idea that you can spend extra memory to buy time. Three techniques carry the chapter. Counting sort tallies keys drawn from a small universe into an occurrence array and emits them in order in O(n + |U|) = O(n) — and because it makes no comparisons it legitimately beats the Ω(n log n) comparison-sort lower bound. Horspool's algorithm precomputes a shift table so string matching can skip ahead instead of advancing one character at a time (Boyer-Moore, KMP and Rabin-Karp are explicitly not examinable here, but Horspool is). And hashing trades a table's worth of space for expected O(1) lookup: a hash function maps keys to buckets, with collisions resolved by chaining (a list per bucket) or open addressing (probe to the next free slot, e.g. linear probing). The chapter covers the load factor's effect on performance and universal hashing (choosing the hash function randomly so no input is reliably bad). Q1 hand-traces a linear-probing insert sequence, so the probe order and modular arithmetic must be exact.

In this chapter

What this chapter covers

  • 01Counting sort — O(n) and beating the comparison lower bound
  • 02Horspool's algorithm — shift tables (Boyer-Moore / KMP / Rabin-Karp NOT examinable)
  • 03Hash functions and collisions
  • 04Chaining vs open addressing (linear probing); the load factor
  • 05Universal hashing and the pigeonhole principle
Worked example · free

Worked example: linear probing

Q [4 marks]. Insert the keys 18, 41, 7, 29 into a hash table of size 11 using h(k) = k mod 11 and linear probing. Give the final table positions.
718294178910
  • +118: 18 mod 11 = 7 → slot 7 (free). 41: 41 mod 11 = 8 → slot 8 (free).
  • +27: 7 mod 11 = 7 → slot 7 occupied by 18 → probe to slot 8 → occupied by 41 → probe to slot 9 (free) → place 7 at slot 9.
  • +129: 29 mod 11 = 7 → slots 7, 8, 9 occupied → probe to slot 10 (free) → place 29 at slot 10.
Final positions: 18→7, 41→8, 7→9 (probed past 7, 8), 29→10 (probed past 7, 8, 9). Linear probing piled the three colliding keys into the next free slots, the clustering that makes its cost climb as the load factor grows.
Glossary

Key terms

Counting sort
A non-comparison sort for keys from a small integer universe: tally each key's occurrences, then emit keys in order. O(n + |U|) time and O(|U|) space; beats the Ω(n log n) comparison lower bound precisely because it never compares keys.
Hash function
A function mapping keys to bucket indices in a table, ideally spreading keys uniformly. Good hashing gives expected O(1) lookup; the cost of a bad function is many collisions and degraded performance.
Collision resolution
How a hash table handles two keys hashing to the same bucket. Chaining stores a list per bucket; open addressing (e.g. linear probing) places the colliding key in the next free slot by a probe sequence.
Load factor
The ratio of stored keys to table slots, α = n / m. Expected lookup cost rises with α (sharply for open addressing as α → 1), so tables are resized to keep α bounded; it is the key tuning parameter.
Universal hashing
Choosing the hash function at random from a carefully designed family so that, for any fixed input, the expected number of collisions is small. It removes the guarantee that a single fixed function can be defeated by an adversarial input.
FAQ

Space-Time Tradeoffs and Hashing FAQ

How can counting sort be faster than O(n log n) if that's the sorting lower bound?

The Ω(n log n) bound applies only to comparison sorts. Counting sort never compares keys — it indexes them directly into an occurrence array — so the bound does not apply. The catch is it only works when keys come from a small, bounded universe; otherwise the |U| term dominates.

Chaining or open addressing — which is better?

Chaining degrades gracefully as the load factor grows and tolerates α > 1, at the cost of pointer overhead and worse cache behaviour. Open addressing (e.g. linear probing) is cache-friendly and pointer-free but degrades sharply as α → 1 and needs careful deletion. The choice depends on load and memory layout.

Why bother with universal hashing?

Because any single fixed hash function has some set of keys that all collide, and an adversary (or just unlucky data) can hit it, turning O(1) lookups into O(n). Picking the function randomly from a universal family makes the expected collision count small for every input, restoring the O(1) expectation.

Study strategy

Exam move

Drill the linear-probing trace, because Q1 gives you a table size, a hash function and a key sequence and asks for the final positions — a one-mark trace where the probe order and the modular arithmetic must be exact, and one early misplacement cascades. Know counting sort's O(n) bound and exactly why it sidesteps the comparison lower bound (a favourite Q2 justification). Keep the chaining-vs-probing tradeoffs and the load factor's role ready as design-question reasoning, and remember the examinability line: Horspool yes, Boyer-Moore / KMP / Rabin-Karp no.

A+Everything unlocked
Unlocks this Bible + all 24 of your University of Melbourne subjects - and 1,000+ Bibles across every Australian university.
Sia - your COMP90038 tutor, unlimited, worked the way the exam marks it
The full 6-page Bible + practice bank with worked solutions
Chrome extension - sync your LMS so Sia knows your deadlines
Bilingual EN / Chinese on every Bible and every Sia answer
$25/ month
30-day money-back · cancel in one tap · how it works
Unlock the full COMP90038 Bible + 24 University of Melbourne subjects解锁完整 COMP90038 Bible + University of Melbourne 24 门科目
$25/mo