University of Sydney · FACULTY OF COMPUTER SCIENCE

COMP9120 · Database Management Systems

- one subject, every graph, every model, every mark

Computer Science14 Chapters8-page Bible

Our own words - no uploaded lecturer files

Updated for this semester

Chapter 9 of 12 · COMP9120

Schema Refinement & Normalisation

Schema refinement is the COMP9120 topic where a workable relational design becomes a good one: the University of Sydney's Database Management Systems unit teaches you to detect redundancy formally through functional dependencies, grade a relation by normal form (1NF through BCNF and 4NF), and split a bad relation into good ones without losing information. It sits between ER→relational mapping and physical design, and on the paper it is a self-contained method question where every closure and every split earns marks.

In this chapter

What this chapter covers

01Why redundancy is the enemy: update, insertion and deletion anomalies
02Functional dependencies X→Y, and why an instance can disprove but never prove one
03Armstrong's axioms (reflexivity, augmentation, transitivity) plus the derived union and decomposition rules
04Attribute closure X+ as the one algorithm that tests keys and derives FDs
05Superkeys, candidate keys (minimal superkeys), and prime vs non-prime attributes
06The nested normal forms 1NF ⊇ 2NF ⊇ 3NF ⊇ BCNF ⊇ 4NF and how to name the highest one
07The 3NF-vs-BCNF gap and multivalued dependencies with 4NF
08Lossless-join and dependency-preserving decomposition, and the BCNF decomposition algorithm

Worked example · free

Candidate key by closure, highest normal form, and a BCNF split

Q [8 marks]. Relation Enrol(StudentID, UnitCode, UnitName, Tutor, TutorRoom) records who tutors each student in each unit. FDs: (StudentID, UnitCode)→Tutor; UnitCode→UnitName; Tutor→TutorRoom. Find the candidate key, name the highest normal form, and give a lossless BCNF decomposition.

+1Forced key attributes: StudentID and UnitCode never appear on any right-hand side, so both must be in every candidate key.
+2Closure test: (StudentID, UnitCode)+ = {StudentID, UnitCode} → +Tutor → +UnitName (UnitCode→UnitName) → +TutorRoom (Tutor→TutorRoom) = all 5 attributes, so it is a superkey.
+1Minimality: StudentID+ = {StudentID} and UnitCode+ = {UnitCode, UnitName} — neither half reaches everything, so the key is minimal. Candidate key = (StudentID, UnitCode).
+1Highest normal form: the key is composite and UnitCode (a proper part) determines non-prime UnitName — a partial dependency — so the relation is 1NF only (not even 2NF).
+1Decompose on UnitCode→UnitName: pull out Unit(UnitCode, UnitName); the remainder keeps UnitCode as the link.
+2Decompose on Tutor→TutorRoom: pull out Tutor(Tutor, TutorRoom); what remains is Assigned(StudentID, UnitCode, Tutor) with key (StudentID, UnitCode). Every determinant is now a key of its table, so all relations are in BCNF.

Candidate key = (StudentID, UnitCode); the relation is in 1NF only (a partial dependency UnitCode→UnitName breaks 2NF). A lossless BCNF decomposition is Unit(UnitCode, UnitName), Tutor(Tutor, TutorRoom), and Assigned(StudentID, UnitCode, Tutor). Every shared attribute is a key of one side, so the join is lossless, and each original FD sits inside one table, so dependencies are preserved too.

Sia tip — Always finish with the minimality check: (StudentID, UnitCode) is a candidate key, but adding UnitName or Tutor to it gives a superkey that scores zero when the question asks for a candidate key.

Glossary

Key terms

Functional dependency (X→Y): A rule that whenever two tuples agree on all attributes in X they must also agree on all attributes in Y; it is decided by the meaning of the data, and an instance can only disprove one, never prove it.
Armstrong's axioms: A sound and complete rule set for deriving FDs: reflexivity (Y⊆X gives X→Y), augmentation (X→Y gives XZ→YZ) and transitivity (X→Y, Y→Z gives X→Z), plus the derived union and decomposition rules.
Attribute closure X+: Every attribute that X functionally determines under a set of FDs; X is a superkey exactly when X+ is all attributes, and X→Y holds exactly when Y is a subset of X+.
Candidate key: A minimal superkey — an attribute set whose closure is all attributes and from which removing any attribute stops it being a superkey. A relation may have several; one is chosen as the primary key.
Prime vs non-prime attribute: A prime attribute is a member of some candidate key; a non-prime attribute belongs to no candidate key. These labels drive the 2NF and 3NF tests.
BCNF: Boyce-Codd Normal Form: for every non-trivial FD X→Y the determinant X must be a superkey — 'every arrow comes out of a key'. It is stricter than 3NF, which also allows an FD when its right-hand side is prime.
Lossless-join decomposition: A split of R into R1 and R2 whose natural join returns exactly R with no spurious tuples; guaranteed when the shared attributes R1∩R2 form a key of at least one component.
Dependency-preserving decomposition: A split where the union of the FDs projected onto each component has the same closure as the original set, so every constraint can be checked on a single table without a join. It cannot always be achieved together with BCNF.

FAQ

Schema Refinement & Normalisation FAQ

What is the difference between 3NF and BCNF?

Both require that for every non-trivial functional dependency X→Y the determinant X should ideally be a superkey. BCNF demands exactly that with no exceptions. 3NF is slightly weaker: it also accepts an FD whose determinant is not a superkey as long as the right-hand side Y is a prime attribute (part of some candidate key). That loophole means a 3NF relation can still hold a little redundancy that BCNF removes. The practical catch is that forcing BCNF can sometimes cost dependency preservation, whereas a lossless, dependency-preserving 3NF decomposition is always achievable, so 3NF is occasionally the right stopping point.

How do I find a candidate key quickly?

Start with the shortcut: any attribute that never appears on the right-hand side of any FD must be in every candidate key, because nothing can determine it. Seed your key with those attributes, compute their attribute closure, and add attributes until the closure reaches every attribute (a superkey). Then check minimality by removing one attribute at a time — if the closure still reaches everything, that attribute was not needed. A minimal superkey is a candidate key; repeat with different seeds to find any others.

Can AI help me with schema refinement and normalisation in COMP9120?

Yes, as a study aid. Sia is an AI tutor that explains concepts step by step — it can walk you through an attribute-closure calculation, show why a partial dependency breaks 2NF, or trace a BCNF decomposition on a practice relation so you see the method. It does not sit your assessments or hand you assignment or exam answers, and it cannot promise any grade; the goal is to help you understand the technique so you can compute closures and decompositions yourself under exam conditions. Always follow the University of Sydney's academic-integrity rules on permitted tools.

Studying with AI? Sia — free AI data science tutor works through COMP9120 step by step.

Study strategy

Exam move

Make attribute closure your reflex, because every other question reduces to it. Practise a fixed order of attack on each relation: first mark the attributes that appear on no right-hand side (they are in every candidate key), then compute the closure of your candidate to confirm it reaches all attributes and is minimal, then name the highest normal form by testing each FD's left-hand side for superkey-ness (and its right-hand side for prime-ness), and finally decompose by peeling off one violating FD at a time as X∪Y and R−Y, checking that the shared attributes are a key of one side so the join is lossless. Since the normalisation question is marked step by step, always write the closure out in full and name the exact FD each split removes — a correct method with a small slip still earns most of the marks. Confirm the exact mark split and exam timetable on Canvas.

Keep going — explore the course

A+Everything unlocked

Unlocks this Bible + all 25 of your University of Sydney subjects - and 1,000+ Bibles across every Australian university.

Sia - your COMP9120 tutor, unlimited, worked the way the exam marks it

The full 8-page Bible + practice bank with worked solutions

Chrome extension - sync your LMS so Sia knows your deadlines

Bilingual EN / Chinese on every Bible and every Sia answer

$25/ month

30-day money-back · cancel in one tap · how it works