Is the MAT9004 exam open or closed book?

The MAT9004 final exam at Monash University is closed-book with no calculator and no notes permitted. However, a formula sheet is provided inside the exam paper, and unlimited blank A4 paper is allowed for working. This means you cannot bring this cheat sheet into the exam; use it for revision and to memorise the methods, because the provided formula sheet gives you the formulas but not the techniques.

How is MAT9004 assessed?

MAT9004 is assessed by two assignments worth 10 percent each, five quizzes worth 4 percent each (20 percent total), and a final exam worth 60 percent. The exam is a hurdle: you must score at least 45 percent on it to pass the unit. The exam runs for about 3 hours 10 minutes and covers all six topic areas.

What topics does the MAT9004 exam cover?

The exam covers six areas: single-variable calculus (differentiation, optimisation, integration), linear algebra (vectors, matrices, Gaussian elimination, determinants, eigenvalues and diagonalisation), multivariable calculus (partial derivatives, gradient, the Hessian determinant test), counting and combinatorics, probability including Bayes' theorem and named distributions, and graph theory and trees. Anything from lectures is examinable unless stated otherwise.

How do you classify a stationary point of a function of two variables?

First find the stationary points by solving the gradient equal to zero, that is f_x=0 and f_y=0. Then form the Hessian and compute D=f_xx·f_yy-(f_xy)^2 at each point. If D>0 the point is a local extremum, a minimum when f_xx>0 and a maximum when f_xx<0; if D<0 it is a saddle point; and if D=0 the test is inconclusive.

How do you diagonalise a matrix in MAT9004?

Find the eigenvalues as the roots of the characteristic equation det(A-lambda I)=0, then find an eigenvector for each by solving (A-lambda I)x=0. Place the eigenvectors as the columns of P and the eigenvalues on the diagonal of D, giving A=PDP^-1. The payoff is that A^n=PD^nP^-1, so high matrix powers and linear recurrences become easy to compute by hand.

What are the four selection types in counting?

Choosing r objects from n: ordered without repetition is n!/(n-r)!; ordered with repetition is n^r; unordered without repetition is the binomial coefficient C(n,r)=n!/(r!(n-r)!); and unordered with repetition is C(n+r-1,r). Deciding which to use means asking whether order matters and whether repeats are allowed.

How do you use Bayes' theorem in MAT9004?

Bayes' theorem reverses a conditional probability: Pr(A|B)=Pr(B|A)Pr(A)/Pr(B). First compute the total probability of the evidence using Pr(B)=Pr(B|A)Pr(A)+Pr(B|not A)Pr(not A), often via a two-branch tree diagram, then divide. A typical MAT9004 question gives prior rates and conditional rates and asks for the reversed conditional, for example the probability a part came from supplier A given that it was late.

What is the handshaking lemma and why does it matter?

The handshaking lemma states that in any graph the sum of the vertex degrees equals twice the number of edges. So the number of edges is half the degree sum, and the degree sum is always even, which means an even number of vertices have odd degree. It is the standard tool for deciding whether a graph with a given degree sequence can exist, for example there is no 3-regular graph on 7 vertices because the degree sum 21 is odd.

MAT9004

Mathematical Foundations for Data Science & AI

Q: How do you diagonalise a matrix in MAT9004?

Find the eigenvalues as the roots of the characteristic equation det(A-lambda I)=0, then find an eigenvector for each by solving (A-lambda I)x=0. Place the eigenvectors as the columns of P and the eigenvalues on the diagonal of D, giving A=PDP^-1. The payoff is that A^n=PD^nP^-1, so high matrix powers and linear recurrences become easy to compute by hand.

Q: What are the four selection types in counting?

Choosing r objects from n: ordered without repetition is n!/(n-r)!; ordered with repetition is n^r; unordered without repetition is the binomial coefficient C(n,r)=n!/(r!(n-r)!); and unordered with repetition is C(n+r-1,r). Deciding which to use means asking whether order matters and whether repeats are allowed.

Q: How do you use Bayes' theorem in MAT9004?

Bayes' theorem reverses a conditional probability: Pr(A|B)=Pr(B|A)Pr(A)/Pr(B). First compute the total probability of the evidence using Pr(B)=Pr(B|A)Pr(A)+Pr(B|not A)Pr(not A), often via a two-branch tree diagram, then divide. A typical MAT9004 question gives prior rates and conditional rates and asks for the reversed conditional, for example the probability a part came from supplier A given that it was late.

Q: What is the handshaking lemma and why does it matter?

The handshaking lemma states that in any graph the sum of the vertex degrees equals twice the number of edges. So the number of edges is half the degree sum, and the degree sum is always even, which means an even number of vertices have odd degree. It is the standard tool for deciding whether a graph with a given degree sequence can exist, for example there is no 3-regular graph on 7 vertices because the degree sum 21 is odd.

Monash University · Faculty of Science

Exam Revision
Sem 1 2026 · Side 1 of 2
All six topics · the continuous half

SIDE 1/2 CONTINUOUS · Single-variable calculus · Optimisation · Integration · Linear algebra · Eigenvalues · A=PDP⁻¹ · Multivariable calculus · Hessian test Revision sheet · closed-book exam Compiled by AskSia · mapped to the MAT9004 syllabus · asksia.ai/cheatsheet/monash-mat9004

0 · Exam Blueprintread first

One paper, six disjoint maths worlds. The final is 60% of the unit AND a hurdle — you must score ≥45/100 on the exam itself to pass, regardless of coursework.

Item	Detail
Weight	60% · hurdle ≥45%
Coursework	2 assign×10% + 5 quiz×4%
Duration	~3 h 10 min · e-exam
Materials	closed-book · NO calc
Formula sheet	provided in paper
Template	~36 questions, fixed slots

~36-slot pattern: ~31 short-answer (1.5–3 marks, answer is an integer or lowest-terms a/b, no spaces, no decimals) + 5 long-answer (6 marks: global extrema · Hessian classify · eigen/diagonalise · free-variable system · two-stage Bayes).

This is a REVISION sheet — you cannot bring it in. The exam gives its own formula sheet, so memorise the METHODS, not the formulas: the recipe wins marks, the formula is handed to you.

Sia → Everything must be hand-computable. Practise Gaussian elimination, char-poly eigenvalues and Bayes by hand until fluent — no calculator on the day.

1 · Derivatives · RulesArea 1 · L4–5

f'(a) = slope of tangent = lim_x→a (f(x)−f(a))/(x−a). |x| is not differentiable at 0.

f(x)	f'(x)
c (const)	0
xᵇ	b·x^(b−1)
aˣ	ln(a)·aˣ
eˣ	eˣ
ln x	1/x
logₐ x	1/(ln(a)·x)

Combining rules(c·f)'=c·f' · (f±g)'=f'±g'
product: (fg)' = f'g + fg'
chain: (f(g(x)))' = g'(x)·f'(g(x))

Common chains(e^(cx+d))' = c·e^(cx+d)
(ln(cx+d))' = c/(cx+d)
(a^(cx+d))' = c·ln(a)·a^(cx+d)

nth derivative f⁽ⁿ⁾ = differentiate n times (f''=f⁽²⁾). Worked tangent slope: for h(x)=x·e^(2x), h'=e^(2x)(1+2x) ⇒ h'(0)=1 (product rule, then evaluate).

1b · Function TypesL1–3

Convex: chord lies on/above plot ⇔ f''≥0. Concave: chord below ⇔ f''≤0. Lines are both.

Transform-plot trick: log-log linear ⇒ power law (slope −a); log-lin linear ⇒ exponential (slope ln a); lin-log ⇒ logarithmic.

Log & exponential rulesaˣaʸ=a^(x+y) · (aˣ)ʸ=a^(xy) · a⁰=1
logₐ(xy)=logₐx+logₐy · logₐ(xᵇ)=b logₐx
change of base: logₐx = log_b x / log_b a

Line through 2 points: m=(y₂−y₁)/(x₂−x₁), then y=mx+b with b=y₁−mx₁; zero at x=−b/m.

Injective = distinct inputs give distinct outputs; surjective = image fills codomain; bijective = both ⇔ has an inverse.

2 · Optimisation (1 var)Area 1 · L5–6 ★

Stationary point s: f'(s)=0. Sign of f' gives increase (f'>0) / decrease (f'<0).

2nd-derivative test (at stationary a)f''(a) > 0 ⇒ local min
f''(a) < 0 ⇒ local max
f''(a) = 0 ⇒ inconclusive (x³,x⁴,−x⁴)

First-deriv (sign) test: at stationary m, f' goes +→− ⇒ max; −→+ ⇒ min. Inflection = concavity changes.

Global extrema on [c,d]candidates = stationary pts + singular pts (f' undef) + endpoints c,d.
Evaluate f at ALL ⇒ largest = max, smallest = min.

Shortcut: local min of a convex f is the global min; local max of concave = global max.

Quadratic roots: x²+ax+b=0 ⇒ x=−a/2±√(a²/4−b); real iff a²≥4b. (x−u)(x−v)=x²−(u+v)x+uv.

Sia → The long-answer "global extrema on [a,b]" question forgets the endpoints at your peril — they are candidates too. Always tabulate f at every candidate.

2b · Worked · Extrema on intervalrenumbered

f'(t)=t³−5t²+6t=t(t−2)(t−3) on [−1,3]. Stationary t=0,2,3.

f''=3t²−10t+6: f''(0)=6>0 (min), f''(2)=−2<0 (max), f''(3)=3>0 (min). Compare f at {−1,0,2,3} ⇒ pick global max/min by value.

Cubic variant: f(x)=2x³−3x²−12x+4 on [−3,3] ⇒ stationary x=−1,2; global max (−1,11), global min at the endpoint (−3,−41). The endpoint wins — count it.

2c · Worked · Can min surfaceapplied

Volume πr²h = 128π ⇒ h=128/r². Surface f(r)=2π(r²+128/r); f'=2π(2r−128/r²)=0 ⇒ r³=64 ⇒ r=4. f''>0 (convex) ⇒ global min; h=128/16=8.

Page-layout variant: printable A=(x−4)(y−6) with xy=294 ⇒ A=318−6x−1176/x; A'=−6+1176/x²=0 ⇒ x=14, A''<0 (concave, so a max), y=21.

Convexity-on-interval trap: for f''=6ax−12 to be neither convex nor concave on (2,3), require f'' to change sign there ⇒ solve for the parameter range, don't just plug one point.

Constrained product, variants: max xy s.t. 2x+3y=60 ⇒ 150; s.t. x+3y=60 ⇒ 300. Same substitute-then-optimise recipe; check it's a max (f''<0), not a min.

Singular points (f' undefined, e.g. a corner like |x| at 0) are candidates too — don't only solve f'=0. The Extreme Value Theorem guarantees a continuous f on [c,d] attains both extrema.

2d · RSS / least squaresapplication

Residual sum of squaresRSS = Σᵢ (yᵢ − f(xᵢ))²

Fit f(x)=2x−1 to (2,1),(3,4),(5,2): preds 3,5,9; residuals −2,−1,−7 ⇒ RSS = 4+1+49 = 54. Squares are differentiable & punish big errors.

Workflow: ① f', solve f'=0 + find singular pts; ② classify with f'' (or sign change); ③ compare f at stationary/singular/boundary; use convexity if available.

3 · Integration · FTCArea 1 · L7

Antiderivative F: F'=f, unique up to +c. ∫f dx = F(x)+c.

f(x)	∫f dx
xᵃ (a≠−1)	x^(a+1)/(a+1)+c
x⁻¹	ln\|x\|+c
e^(ax)	(1/a)e^(ax)+c

Fundamental Theorem of Calculus∫ₐᵇ f(x)dx = F(b) − F(a)
G(x)=∫ₐˣ f(z)dz ⇒ G'=f

Linearity ∫(f+g)=∫f+∫g, ∫cf=c∫f. Additivity ∫ₐᶜ=∫ₐᵇ+∫_bᶜ (piecewise).

Rate ⇒ total: if E'(x)=rate, total change = ∫ rate dx.

Sia → A definite integral is signed area — area below the x-axis counts negative. Sketch first if signs are in doubt.

3b · Worked · Definite + FTCrenumbered

∫₀² (x³−6x²)dx = [x⁴/4 − 2x³]₀² = 4 − 16 = −12.

Rate→total (battery): cost falls at rate (x+1)⁻², start $5. D(4)=5+∫₀⁴ −(x+1)⁻²dx = 5 + [(x+1)⁻¹]₀⁴ = 5 + (1/5 − 1) = $4.20 = 21/5. Antiderivative chosen so D(0)=5.

Look-up integrals (used in probability): ∫xe⁻ˣdx = −e⁻ˣ(x+1)+c; ∫e^(−x²/2)dx is not elementary (the normal's normaliser uses erf — hence z-tables). ∫₂(6x²+6x−4)dx-type slots are routine power-rule.

3c · Σ / Π Notationslots 1–2

Σ_{x=a}^{b} f(x)=f(a)+…+f(b); Π is the product. Empty sum = 0, empty product = 1. e.g. Σ_{k=1}^{4} k² = 1+4+9+16 = 30; Π_{k=1}^{4} k = 4! = 24. ℕ={0,1,2,…} here (0 is natural). "iff" = if and only if.

4 · VectorsArea 2 · L8

ℝᵈ = column d-tuples; add & scale component-wise.

Dot product & normv·w = v₁w₁+…+v_dw_d
‖v‖ = √(v·v) = √(v₁²+…+v_d²)
orthogonal ⇔ v·w = 0

Linear comb. w=a₁v₁+…+aₙvₙ. Linearly dependent = one vᵢ is a combo of the rest. Pairwise-orthogonal nonzero vectors are independent.

Worked norm: ‖(3,12,−4)‖=√(9+144+16)=√169=13; ‖(−8,9,−12)‖=√289=17. Orthogonal solve: (2,−1,z)·(3,4,1)=0 ⇒ 6−4+z=0 ⇒ z=−2.

Line interval joining u,v = {αu+(1−α)v : α∈[0,1]}. Geometrically a vector = displacement (direction + length, no fixed position).

4b · Inverse FunctionsL2

g=f⁻¹ ⇔ g(f(x))=x and f(g(y))=y. f⁻¹(x) ≠ 1/f(x). To find f⁻¹: solve y=f(x) for x. f has an inverse iff bijective (injective + surjective).

Worked: f(x)=x²+4x on [0,∞), find f⁻¹(32): x²+4x=32 ⇒ x²+4x−32=0 ⇒ (x+8)(x−4)=0 ⇒ x=4 (take the non-negative root).

4c · Constrained Productsubstitution

Max xy s.t. 2x+5y=100: y=(100−2x)/5 ⇒ maximise (1/5)(100x−2x²); deriv 100−4x=0 ⇒ x=25, y=10 ⇒ xy=250. Substitute the constraint, reduce to one variable, then optimise — also a Lagrange-style multivariable framing.

5 · Matrices & SystemsArea 2 · L8–10 ★

Mult: A (m×n)·B (n×r) = (m×r); (AB)ᵢⱼ = row i of A · col j of B. AB ≠ BA in general. A(BC)=(AB)C.

Gaussian elimination — 3 valid row ops① swap two rows
② multiply a row by a nonzero k
③ add a multiple of one row to another

Row-reduce Ax=b to upper-triangular, then back-substitute. Apply ops top-to-bottom, NOT simultaneously.

Outcome	Signal
No solution	row 0=3 (contradiction)
Unique	full pivots
∞ many	row 0=0 ⇒ free var t

Free variable ⇒ write solution in vector form (point + t·direction). Fewer equations than variables ⇒ usually ∞ many.

6 · Determinant & InverseArea 2 · L10

2×2 det & inversedet[a b; c d] = ad − bc
A⁻¹ = (1/det A)·[d −b; −c a]

A invertible ⇔ det(A) ≠ 0. det(AB)=det(A)det(B); det(I)=1. Identity Iₙ: 1s on diagonal, AI=IA=A.

Solve via inverseA invertible ⇒ Ax=b has unique x = A⁻¹b

Worked: det[1,1;−1,1]=1−(−1)=2 ⇒ invertible. det[1,1;1,1]=0 ⇒ not invertible (repeated row). Singular-parameter Q: set ad−bc=0, solve.

If A is square but not invertible, Ax=b has either no solution or infinitely many (never a unique one).

6b · Worked · Ax=brenumbered

3x+y=2, 5x+2y=3 ⇒ det=1, A⁻¹=[2,−1;−5,3]; x=A⁻¹b = (2·2−1·3, −5·2+3·3) = (1,−1). Reuse A⁻¹ for many b.

6c · Worked · Gaussian elimrenumbered

Solve x+2y−z=6 ; −x+y+2z=3 ; x+y−z=8.

R2+R1, R3−R1: gives 3y+z=9 and −y=2 ⇒ y=−2; back-sub z=9−3y=15? recheck signs in your own working — the discipline is clear one column at a time, top-to-bottom, then back-substitute and verify in the original.

6d · Free-Variable Systemsparametrise

Underdetermined (fewer equations than unknowns) ⇒ a row collapses to 0=0 ⇒ one free variable t∈ℝ.

Express each pivot variable in terms of t, then write the solution set as point + t·direction (a line) — e.g. (x,y,z,w) = (a,b,c,0) + t(p,q,r,1). State "for all t∈ℝ"; two free variables ⇒ a plane.

Two-variable products (Hadamard) exist but are NOT the matrix product used here — always use the row·column rule. (kA)B = k(AB).

Worked 2×2 mult: [1,2;0,1]·[3;4] = [1·3+2·4; 0·3+1·4] = [11;4]. Mult is defined only when columns of A = rows of B.

A vector is an m×1 matrix. Identity Iₙ acts as 1: AIₙ=A. If BA=I then AB=I, so each is the other's inverse. A(B+D)=AB+AD distributes.

7 · Eigenvalues & EigenvectorsArea 2 · L11–12 ★

For square A: Ax = λx, x ≠ 0. x = eigenvector, λ = eigenvalue. Zero vector is never an eigenvector; λ=0 can be an eigenvalue.

Recipe① eigenvalues: det(A − λI) = 0 (char. poly)
② eigenvectors: solve (A − λI)x = 0
(always ∞ many — scalar multiples)

If v is an eigenvector so is cv (c≠0). Char-poly degree = n; roots = the eigenvalues.

DiagonalisationA = P D P⁻¹ · P = (v₁|…|vₙ), D = diag(λᵢ)
⇒ Aⁿ = P Dⁿ P⁻¹ (Dⁿ = diag raised to n)

Construct when n distinct eigenvalues. Use: linear recurrences/Markov processes aₙ=Vⁿa₀; long run ruled by the largest eigenvalue (=1 for stochastic); PageRank = eigenvector for λ=1.

Sia → P⁻¹ is usually not needed for the long-answer — examiners want eigenvalues, eigenvectors and the assembled P, D. Show det(A−λI)=0 explicitly.

7b · Worked · Diagonalise 2×2renumbered

A=[0,2;−1,3]: det(A−λI)=λ²−3λ+2=(λ−1)(λ−2) ⇒ λ=1,2.

λ=2: (A−2I)x=0 ⇒ v=[1,1]ᵀ. λ=1: v=[2,1]ᵀ. So P=[1,2;1,1], D=[2,0;0,1], A=PDP⁻¹.

Sanity: trace 0+3=3=1+2 ✓; det 0·3−2·(−1)=2=1·2 ✓.

7c · Worked · Aⁿ recurrencematrix powers

cₙ₊₁=4cₙ+4uₙ, uₙ₊₁=cₙ+4uₙ ⇒ A=[4,4;1,4]. det(A−λI)=λ²−8λ+12=(λ−2)(λ−6) ⇒ λ=2,6; eigenvectors [2,−1]ᵀ,[2,1]ᵀ. Long-run ratio → dominant eigenvalue 6.

7d · Diagonal Matriceswhy diag is easy

D (Dᵢⱼ=0 off-diagonal): Dⁿ raises each diagonal entry to n. That is the whole point of A=PDP⁻¹ — push the power onto D where it's trivial, then conjugate back.

Char-poly check: for a 2×2, det(A−λI)=λ² − (trace)λ + det. So λ²−3λ+2 came from trace 3, det 2. Sum of eigenvalues = trace; product = det — a fast sanity check.

7e · Worked · Eigenvector unknownshort slot

Given λ is an eigenvalue and an eigenvector of the form (x,1,z)ᵀ, solve (A−λI)v=0 row-by-row for x and z (a small linear system). Likewise "find the missing entry of A⁻¹": use A·A⁻¹=I and read off one equation.

7f · Eigen Recipe Recapstep list

Form A−λI; expand det(A−λI)=0 to the characteristic polynomial
Solve for the roots λ₁,…,λₙ (the eigenvalues)
For each λ, row-reduce (A−λI)x=0; the free variable gives the eigenvector (pick the neatest scalar multiple)
Distinct λ ⇒ assemble P=(v₁|…|vₙ), D=diag(λᵢ) ⇒ A=PDP⁻¹
For powers/recurrences quote Aⁿ=PDⁿP⁻¹

8 · Multivariable CalculusArea 3 · L13–16 ★

Partial derivative fₓ: differentiate, treat y as constant; f_y treats x as constant.

Gradient∇f = [fₓ ; f_y]
points in direction of steepest increase
∇f ⟂ the level set through the point

Level set of value c = {(x,y): f=c}; level curves never cross. Linear approx: f(x+Δx,y+Δy) ≈ f + fₓΔx + f_yΔy.

Stationary point∇f = 0 ⇔ fₓ = 0 AND f_y = 0

Types: local min, local max, saddle (max one way, min the other). For nice f, mixed partials equal: fₓy = f_yx.

9 · Hessian TestArea 3 · L15–16 ★

Hessian & discriminantH = [fₓₓ fₓy ; f_yx f_yy]
D = det H = fₓₓ·f_yy − (fₓy)²

D = det H	Verdict
D>0, fₓₓ>0	local min
D>0, fₓₓ<0	local max
D<0	saddle
D=0	inconclusive

Convex (2-var): fₓₓ≥0, f_yy≥0 AND det H≥0 everywhere ⇒ local min = global min.

Sia → Solve ∇f=0 by factoring, not expanding — e.g. fₓ=x(x−4y)=0 splits into x=0 or x=4y. Test every stationary point with its own H.

9b · Worked · Classify all stat. ptsrenumbered

f=2x+x²+x²y−2xy²: ∇f=(2+2x+2xy−2y², x²−4xy); H=[2+2y, 2x−4y; 2x−4y, −4x].

f_y=x(x−4y)=0. Cases give (0,±1),(−4,−1): det H=−16<0 ⇒ saddles; (−3,−3): det H>0, fₓₓ>0 ⇒ local min.

9c · Worked · ∇g & det Hshort slots

g=2xy(x²+y): ∇g at (2,1) = (26,24). Stationary of f=x−ln(x²+y²): ∇f=0 ⇒ (2,0). Given ∇f=(2xy²+2x, 2x²y−3y²) ⇒ det H(1,−1)=16.

9d · How the Islands Connectrecurring themes

Saddle points have no 1-var analogue — they're why the 2-var test needs det H, not just fₓₓ
Convexity ⇒ global extremum in BOTH 1-var (f''≥0) and 2-var (det H≥0 + fₓₓ≥0)
Matrix multiplication drives linear systems, Aⁿ eigen-powers, Markov/PageRank AND graph walk-counting (Aᵏ)
Gaussian elimination solves Ax=b AND finds eigenvectors via (A−λI)x=0
Integration is the engine for continuous probability (pdf normalisation, E, Var on side 2)

Formula Beltside 1

(xᵇ)'=bx^(b−1) · (eˣ)'=eˣ · chain g'·f'(g)
2nd-deriv: f''>0 min, <0 max, =0 ?
∫xᵃ=x^(a+1)/(a+1) · FTC F(b)−F(a)
det₂=ad−bc · Ax=b ⇒ x=A⁻¹b
det(A−λI)=0 · Aⁿ=PDⁿP⁻¹
det H = fₓₓf_yy−fₓy² (<0 saddle)

MAT9004

Mathematical Foundations for Data Science & AI

Monash University · Faculty of Science

Exam Revision
Sem 1 2026 · Side 2 of 2
Counting · probability · graphs

SIDE 2/2 DISCRETE · Counting · Selection types · Binomial · Inclusion–exclusion · Probability · Bayes · Distributions · Graphs · Trees · Euler/Hamilton Revision sheet · closed-book exam Compiled by AskSia · mapped to the MAT9004 syllabus · asksia.ai/cheatsheet/monash-mat9004

10 · Counting · Basic RulesArea 4 · L17

Product & sum rulesAND (sequence) ⇒ multiply: |S| = Π kⱼ
OR (disjoint cases) ⇒ add: |S| = Σ |Sⱼ|
complement: |good| = |S| − |bad|

Keyword test: "and"/"then" ⇒ ×; "or"/disjoint cases ⇒ +. n! = 1·2·…·n; 0! = 1. Sequences with kⱼ choices each ⇒ |S| = Π kⱼ.

11 · Four Selection TypesArea 4 · L17–18 ★

Draw r from n — ask: order matter? repeats allowed?

	No repeat	With repeat
Ordered	n!/(n−r)!	nʳ
Unordered	C(n,r)	C(n+r−1, r)

Binomial coefficientC(n,r) = "n choose r" = n!/(r!(n−r)!)

Permutation of all n = n!. Unordered-with-repetition ↔ multisets (stars & bars: r stars, n−1 bars). Ordered-no-repeat = n(n−1)…(n−r+1).

Sia → Decide the two yes/no questions before writing anything — "order? repeats?" picks the cell. The classic trap is reading "choose a committee" (unordered) as a permutation.

11b · Worked · Selectionsrenumbered

Password 10 symbols from 31 = ordered+repeat = 31¹⁰. All-distinct = 31·30·…·22 = 31!/21!.

Committee C(12,9)=C(12,3)=220. C(11,6)=462.

Buy 8 loaves, 6 types (unordered+repeat) = C(13,8)=C(13,5)=1287.

4-digit all-odd, no repeat: 5·4·3·2 = 120. 3-digit all-even, no repeat, no leading 0: 4·4·3 = 48.

Alphabetical-order passwords (letters must be in order) ⇒ one per unordered selection: no repeats = C(26,10); with repeats = C(35,10).

12 · Binomial TheoremArea 4 · L18

Expansion & identities(x+y)ⁿ = Σ_{r=0}^{n} C(n,r) x^(n−r) yʳ
row sum: Σ_r C(n,r) = 2ⁿ
Pascal: C(n,r)=C(n−1,r−1)+C(n−1,r)
symmetry: C(n,r)=C(n,n−r)

2ⁿ = total subsets of an n-set. Grid paths across m×k = a binomial coefficient (choose where the up-moves go).

"Express count as a power, find n": 32⁷ = (2⁵)⁷ = 2³⁵ ⇒ n=35; 27⁷ = (3³)⁷ = 3²¹ ⇒ n=21. (Rewrite the base as a prime power, then match exponents.) Symmetry C(n,r)=C(n,n−r) keeps the arithmetic small.

12b · Exactly-k & Positionschoose-then-fill

Count arrangements with exactly k of a special kind: choose the positions × fill them × fill the rest. e.g. a 10-symbol string with exactly 3 specials (5 each) and 7 letters (26 each) = C(10,3)·5³·26⁷.

"At least one" via complement: count = (total) − (none). e.g. fleets with ≥1 van = C(235,3) − C(171,3) (all fleets minus van-free fleets) — usually far easier than summing cases. Same idea as Pr(Ā)=1−Pr(A).

13 · Inclusion–Exclusion & PigeonholeArea 4 · L18

Two / three sets|A∪B| = |A|+|B| − |A∩B|
|A∪B∪C| = Σ singles − Σ pairs + |A∩B∩C|

None-of-N (general I–E): |U| − Σ singles + Σ pairs − Σ triples + … + (−1)ᴺ(all-N). Same structure as Pr(A∪B).

Pigeonholen items, m boxes, n > m ⇒ some box ≥ 2
generalised ⇒ some box ≥ ⌈n/m⌉

Use: "must two share…?" or "can all be distinct?" — e.g. 10 people, degrees in {0,…,9}: 0 and 9 can't coexist ⇒ two share a friend-count.

Worked I–E: |U|=100, |A|=30, |B|=40, |A∩B|=10 ⇒ satisfying neither = 100−30−40+10 = 40. (Same add-subtract pattern as Pr(A∪B) — the two topics share one identity.)

Three-set "none": |U|−Σ|Aᵢ|+Σ|Aᵢ∩Aⱼ|−|A₁∩A₂∩A₃|. Venn diagrams make the signs obvious. The general rule alternates sign: +singles? no — subtract singles, add pairs, subtract triples…

14 · Probability FoundationsArea 5 · L19

Sample space S; Pr:S→[0,1], Σ Pr(s)=1. Event A⊆S, Pr(A)=Σ_{x∈A}Pr(x).

Core rulesuniform: Pr(A) = |A|/|S| (pure counting)
complement: Pr(Ā) = 1 − Pr(A)
union: Pr(A∪B)=Pr(A)+Pr(B)−Pr(A∩B)
independent: Pr(A∩B)=Pr(A)·Pr(B)

Mutually exclusive ⇒ Pr(A∩B)=0 ⇒ Pr(A∪B)=Pr(A)+Pr(B). Independent repeated trials: Pr((s₁,s₂))=Pr(s₁)Pr(s₂).

Worked: Pr(A)=0.3, Pr(B)=0.4, Pr(A∪B)=0.5 ⇒ Pr(A∩B)=0.3+0.4−0.5=0.2; since 0.2≠0.3·0.4=0.12, A and B are not independent.

Each trial yields exactly one outcome; the unit's probability is discrete apart from the continuous distributions in L22.

15 · Conditional & BayesArea 5 · L20 ★

Conditional & multiplicationPr(A|B) = Pr(A∩B)/Pr(B)
⇒ Pr(A∩B) = Pr(A|B)·Pr(B)

Total probability & BayesPr(B) = Pr(B|A)Pr(A) + Pr(B|Ā)Pr(Ā)
Pr(A|B) = Pr(B|A)Pr(A) / Pr(B)

Independence ⇔ Pr(A)=Pr(A|B). Use a tree diagram; watch the base-rate fallacy (rare-disease tests).

Sia → The long-answer Bayes is two steps: total probability first (the denominator), then divide. Write both fractions — method marks survive an arithmetic slip.

15b · Worked · Two-stage Bayesrenumbered

Supplier A: 20% of parts, 20% late. B: 80%, 5% late.

Pr(late)=0.2·0.2+0.8·0.05 = 0.04+0.04 = 0.08. Pr(A|late)=0.04/0.08 = 1/2.

Commuter variant: walks 10% (late 60%), drives 90% (late 20%). Pr(walk|late)=0.06/(0.06+0.18)=1/4; Pr(drove|on-time)=0.72/0.76=18/19.

16 · Random VariablesArea 5 · L21 ★

RV X: S→ℝ. Distribution = list Pr(X=x). Independent ⇔ Pr(X=x,Y=y)=Pr(X=x)Pr(Y=y) for all x,y.

Expectation & varianceE[X] = Σ pᵢxᵢ (weighted average)
Var[X] = E[(X−μ)²] = E[X²] − μ²
σ = √Var[X]

AlgebraE[X+Y]=E[X]+E[Y] (even if dependent!)
E[kX]=kE[X] · Var[kX]=k²Var[X]
indep: E[XY]=E[X]E[Y], Var[X+Y]=Var[X]+Var[Y]

Linearity-of-expectation (indicator trick): E[Σ Xᵢ]=Σ E[Xᵢ] works even when the Xᵢ are dependent — e.g. expected #adjacent-odd-pairs in a 4-digit PIN = 3·(1/4) = 3/4.

16b · Worked · E & Varrenumbered

Die X (1–6): E=3.5. Coin game: 2H→+24, 1H→+2, 0H→−8 ⇒ E = 24·¼+2·½−8·¼ = 6+1−2 = 5.

Two indep uniforms: Var(2X+3Y)=4Var(X)+9Var(Y). X∈{3,7,11} probs ¼,¼,½ ⇒ E[X]=8. Var uses E[X²]−μ², not E[X]².

17 · DistributionsArea 5 · L22 ★

Dist	E	Var
Bernoulli(p)	p	p(1−p)
Binomial(n,p)	np	np(1−p)
Geometric(p)	(1−p)/p	(1−p)/p²
Unif {a..b}	(a+b)/2	((b−a+1)²−1)/12
Unif [a,b]	(a+b)/2	(b−a)²/12
Expon(λ)	1/λ	1/λ²
Normal(μ,σ)	μ	σ²

Binomial Pr(X=k)=C(n,k)pᵏ(1−p)^(n−k); geometric Pr(X=k)=p(1−p)ᵏ (k failures first). Z=(X−μ)/σ = standard normal score.

Recognise the model: single success/fail ⇒ Bernoulli; #successes in n trials ⇒ Binomial; #failures before first success ⇒ Geometric; sums/large counts ⇒ tend to Normal (CLT intuition).

17b · Continuous · PDFnormalise!

PDF requirementsf ≥ 0 everywhere AND ∫ f over domain = 1
Pr(a≤X≤b)=∫ₐᵇ f dx · Pr(X=a)=0

E[X]=∫ x f dx; Var=∫ x²f dx − (E[X])². Find k: h=kx(1−x) on (0,1) ⇒ ∫=k/6=1 ⇒ k=6. Variant kx²(1−x) ⇒ k=12. Always set ∫f=1 first, then compute moments.

Worked E,Var: h(x)=(x−1)² on (0,2): ∫h=1 ✓, E[X]=∫x h=1, Var=∫x²h−1=1/5. (Integration is the engine for continuous distributions — Area 1 feeding Area 5.)

Continuous uniform on [a,b]: f=1/(b−a). Exponential(λ) f=λe^(−λx), x≥0 — memoryless, the Poisson waiting time. ≤ and < are interchangeable since Pr(X=a)=0.

17c · Sum of Independent RVsconvolution

Distribution of Z=X+Y: sum probabilities over all (x,y) with x+y=z (enumerate the pairs). For a count of successes across n independent Bernoulli(p) trials, the sum is exactly Binomial(n,p).

Law of large numbers (intuition): the average of n iid trials → E[X] as n→∞.

18 · Graphs · VocabularyArea 6 · L23

Graph = vertices + edges (pairs). Simple = no loops, no parallel edges. Walk = vertex sequence joined by edges; length = #steps. Path = walk with all-distinct vertices.

Connected = every pair joined by a path. Cycle = walk start=end, length ≥3 (simple). k-regular = every vertex degree k.

Same graph can be drawn many ways — crossings/curvature don't matter.

19 · Degrees & HandshakingArea 6 · L23 ★

Handshaking lemmaΣ deg(v) = 2·|E|
⇒ #edges = (Σ deg)/2
⇒ an EVEN number of vertices have odd degree

Degree = #incident edges. Path: endpoints deg 1, internal deg 2; cycle: all deg 2; k-regular: all deg k. Σ deg odd ⇒ no graph; |E| from a valid sequence = Σ/2.

Feasibility tool: if Σ deg is odd, no such graph. No 3-regular graph on 7 vertices (Σ=21 odd).

Sia → "Can such a graph exist?" ⇒ reach for handshaking: sum the degrees, check it's even, and check it doesn't exceed the max possible (or break tree limits).

19b · Worked · Degree sequencerenumbered

(1,1,1,1,2,2,4): Σ=12 ⇒ 6 edges. (1,2,2,2,3,3,3): Σ=16 ⇒ 8 edges. (3,3,3,1,1): Σ=11 odd ⇒ impossible.

3-regular on 10 vertices ⇒ 10·3/2 = 15 edges. (3,3,3,3,4): Σ=16 ⇒ 8 edges. (3,2,1,0) on 4 vertices: Σ even but no such graph (deg-3 needs all others as neighbours, contradicting deg-0).

20 · TreesArea 6 · L24 ★

Tree = connected + no cycles. Equivalent: any two vertices joined by a unique path; deleting any edge disconnects it; adding any edge makes a cycle.

Edge counta tree on n vertices has exactly n − 1 edges

So 10 vertices + 8 edges can't be connected. Spanning tree = tree inside G with all vertices (delete one edge per cycle). Removable edges keeping connectivity = |E| − (|V|−1). A tree with a Hamilton path is a path.

20b · Counting Cycle Graphsrenumbered

How many distinct cycle graphs on {A,B,C,D}? Each vertex has degree 2; fix A's two neighbours (C(3,2)=3 ways) and each choice completes uniquely ⇒ 3 cycles. (Counting meets graphs.)

20c · Graph Drawingssame graph

Two drawings are the same graph if vertices & edges match — bends, crossings and positions are irrelevant. Deleting any vertex of a cycle leaves a connected path.

Walk vs path: a walk may repeat vertices; a path may not. If any walk A→B exists, a path A→B also exists. Digraph edges have direction; a multigraph allows loops/parallel edges. A cycle has length ≥3 in a simple graph.

21 · Adjacency MatrixArea 6 · L24 ★

A: aᵢⱼ=1 if vᵢ adjacent to vⱼ, else 0. Symmetric for undirected simple graphs. From A: #vertices = dimension; #edges = (count of 1s)/2; #components by tracing connectivity.

Walk counting — links to Area 2(Aᵏ)ᵢⱼ = number of walks of length k from vᵢ to vⱼ

Compute walk counts by matrix powers (same machinery as eigen/diagonalisation). Worked read-off: a 9×9 0/1 matrix with twelve 1s ⇒ 9 vertices, 6 edges; trace components.

(A²)ᵢᵢ = degree of vᵢ (number of length-2 closed walks = its neighbours). The diagonal of A is all 0 for a simple graph (no loops).

Sia → Aᵏ ties graphs back to linear algebra — if you can diagonalise A you can get walk counts for any k. Same toolbox, two topic islands.

22 · Euler vs HamiltonArea 6 · L24

Type	Visits	Exists iff
Euler circuit	every edge once, closed	connected & ALL degrees even
Euler trail	every edge once, open	connected & ≤2 odd-deg
Hamilton cycle	every vertex once	hard (no easy test)

Euler = edges, Hamilton = vertices. Königsberg's bridges: 4 odd-degree vertices ⇒ no Euler trail. Hamilton sufficient (Dirac): simple, n≥3, every degree ≥ n/2 ⇒ hamiltonian (not necessary). Hamiltonicity relates to P vs NP.

Mnemonic: Euler walks the edges (every bridge once); Hamilton visits the towns (every vertex once). Euler has a clean degree test; Hamilton has none easy — that's the whole difference.

A connected graph always has a spanning tree (delete one edge from each cycle until acyclic) — the backbone for counting removable edges.

22b · Worked · Tree feasibilityrenumbered

Can a 101-vertex graph with ten vertices of degree 11 be a tree? Σ deg ≥ 11·10 + 91·1 = 201 ⇒ |E| ≥ 201/2 ⇒ > 100 = n−1. No — a tree needs exactly 100 edges.

Cycle from edges: adjacency 1s at (2,3),(2,4),(3,5),(4,5) form the walk 2-3-5-4-2 = a cycle ⇒ not a tree (trees are acyclic).

22c · Distinct-degree trappigeonhole

Can 10 people all have a different number of friends? Degrees must lie in {0,…,9}, but degree 0 and degree 9 cannot coexist (the deg-9 person is friends with everyone, contradicting the deg-0 person) ⇒ only 9 usable values for 10 people ⇒ two must share. No.

22d · Removable Edgesspanning tree

Edges you can delete while staying connected = |E| − (|V|−1) (drop one per independent cycle until only a spanning tree remains). 3-regular on 10 vertices: 15 − 9 = 6 removable.

23 · Method-Trigger Checklistread the verb

Question says…	Reach for
"max/min on [a,b]"	stat pts + endpoints
"max/min/saddle f(x,y)"	∇f=0 then det H
"solve the system"	Gaussian elim
"not invertible"	set det = 0
"A=PDP⁻¹ / Aⁿ"	det(A−λI)=0
"how many ways"	order? repeat? ⇒ cell
"committee / choose"	C(n,r)
"given … find Pr(·\|·)"	total prob + Bayes
"E[·] / Var(·) from pdf"	normalise, ∫xf, ∫x²f
"# edges from degrees"	handshaking /2
"walks of length k"	(Aᵏ)ᵢⱼ
"can graph exist?"	parity / n−1 edges

23b · Counting → Probabilitythe bridge

Uniform probability is pure counting: Pr(A)=|A|/|S|, so every Area-4 selection formula feeds straight into Area 5. Binomial probabilities ARE C(n,k) weighted by pᵏ(1−p)^(n−k).

Independence test for RVs: one mismatched cell Pr(X=x,Y=y)≠Pr(X=x)Pr(Y=y) disproves independence; check before using Var[X+Y]=Var[X]+Var[Y].

Inclusion–exclusion ↔ Pr(A∪B) are the same identity; counting ↔ uniform probability via |A|/|S|; Aᵏ walk-counting ↔ eigen-powers. Six islands, one toolbox.

24 · Exam Disciplinedon't lose marks

Short-answer = integer or lowest-terms a/b, no spaces, no decimals
No calculator — keep arithmetic clean, check by substitution
Long-answer: show the working; method marks survive a wrong final number
Gaussian elim: ops top-to-bottom, never simultaneously; verify in the original system
Global extrema: include the endpoints
Bayes: compute the denominator (total prob) first
Hand-written parts: scan + upload in the 30 min after the e-exam
Counting: ask order? repeats? before picking a formula
"Is this graph possible?" ⇒ check degree parity and the n−1 edge bound
Eigen long-answer: show det(A−λI)=0 and assemble P, D explicitly

Sia → The formula sheet is given — so spend revision time on recipes, not memorising formulas: be able to run each method end-to-end, by hand, under time.

Formula Beltside 2

ordered: n!/(n−r)!, nʳ · unord: C(n,r), C(n+r−1,r)
(x+y)ⁿ=ΣC(n,r)x^(n−r)yʳ · Σ_r C(n,r)=2ⁿ
Pr(A|B)=Pr(B|A)Pr(A)/Pr(B)
Var[X]=E[X²]−μ² · Bin: E=np,Var=np(1−p)
pigeonhole ⌈n/m⌉ · uniform Pr=|A|/|S|
Σ deg=2|E| · tree: n−1 edges · (Aᵏ)ᵢⱼ=walks