Monash University · S1 2026 · FACULTY OF INFORMATION TECHNOLOGY

MAT9004 · Mathematical Foundations for Data Science and AI

- one subject, every graph, every model, every mark
50% final exam · hurdle · 14 Chapters · 4-page Bible
Our own words - no uploaded lecturer files
Built to mirror S1 2026 · updated this semester
Chapter 3 of 6 · MAT9004

Multivariable Calculus

Multivariable calculus is single-variable calculus done in two variables at once — the maths behind optimising a model with more than one parameter. You start with functions of two variables and their surfaces, read them through level sets and contour maps, then differentiate with partial derivatives (first and second order, including the cross-partials and Clairaut's symmetry). The two key objects are the gradient ∇f — the vector of first partials that points in the direction of steepest ascent — and the Hessian of second partials. Optimisation mirrors the 1-D method: find stationary points where ∇f = 0, then classify each with the Hessian determinant test (the flagship long-answer). The chapter closes with global optimisation and convexity, which is exactly why some machine-learning training problems are easy and others are hard.

In this chapter

What this chapter covers

  • 3.1 Functions of two variables and their surfaces
  • 3.2 Level sets and contour maps
  • 3.3 Partial derivatives
  • 3.4 Second-order and cross-partials (Clairaut)
  • 3.5 The gradient ∇f: direction of steepest ascent
  • 3.6 First-order approximation and the tangent plane
  • 3.7 Stationary points: ∇f = 0
  • 3.8 The Hessian determinant test
  • 3.9–3.10 Full worked classification and global optimisation
  • 3.11 Convexity: why training is easy or hard
Worked example · free

Worked example: partial derivatives and the gradient

Q [4 marks]. Let f(x, y) = x²y + 3y². (a) Find the partial derivatives f_x and f_y. (b) Evaluate the gradient ∇f at the point (1, 2).
  • +1 (a) Differentiate in x, treating y as constant: f_x = ∂/∂x (x²y + 3y²) = 2xy.
  • +1 (a) Differentiate in y, treating x as constant: f_y = ∂/∂y (x²y + 3y²) = x² + 6y.
  • +1 (b) Evaluate at (1, 2): f_x(1, 2) = 2·1·2 = 4; f_y(1, 2) = 1² + 6·2 = 13.
  • +1 (b) Write the gradient: ∇f(1, 2) = (4, 13).
f_x = 2xy and f_y = x² + 6y, so ∇f(1, 2) = (4, 13), the direction of steepest ascent at that point.
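If you want to check an answer like this outside the exam, a quick numerical comparison is one option. The sketch below is only an illustration: the function names and the step size h are assumptions made for this example, and the central-difference estimate is a sanity check, not part of the marked working.

```python
def f(x, y):
    """The function from the worked example: f(x, y) = x^2 y + 3 y^2."""
    return x**2 * y + 3 * y**2


def grad_f(x, y):
    """Analytic partials from the worked solution: (2xy, x^2 + 6y)."""
    return (2 * x * y, x**2 + 6 * y)


def numeric_grad(func, x, y, h=1e-6):
    """Central-difference estimate of (f_x, f_y); h is an assumed small step."""
    # Move only in x (y held fixed), then only in y (x held fixed).
    fx = (func(x + h, y) - func(x - h, y)) / (2 * h)
    fy = (func(x, y + h) - func(x, y - h)) / (2 * h)
    return (fx, fy)


exact = grad_f(1, 2)
approx = numeric_grad(f, 1, 2)
print("analytic:", exact)
print("numeric: ", tuple(round(v, 4) for v in approx))
```

The two results should agree to several decimal places; a mismatch usually means a term was dropped or a variable was not held constant.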
Glossary

Key terms

Partial derivative
The derivative of a multivariable function with respect to one variable while holding the others constant, written f_x or ∂f/∂x. It measures how f changes as you move along just one axis.
Gradient
The vector of first partial derivatives, ∇f = (f_x, f_y). Wherever it is non-zero it points in the direction of steepest ascent, and the stationary points are exactly where it vanishes: ∇f = 0.
Hessian
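The 2×2 matrix of second partial derivatives of a two-variable function, with f_xx and f_yy on the diagonal and the cross-partial f_xy off it. At a stationary point, the sign of its determinant together with the sign of f_xx classifies the point as a maximum, minimum or saddle: the multivariable analogue of the second-derivative test.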
Saddle point
A stationary point that is a maximum along one direction and a minimum along another, so it is neither. The Hessian test flags it when the Hessian determinant is negative.
Convexity
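A function is convex if the straight line joining any two points on its graph never dips below the graph, so the surface curves upward everywhere. Any local minimum of a convex function is automatically a global minimum, which is why convex training problems are easy to optimise and non-convex ones are not.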
FAQ

Multivariable Calculus FAQ

How do I take a partial derivative correctly?

Differentiate with respect to the named variable and treat every other variable as a constant. For f_x, treat y as a fixed number, so a pure-y term like 3y² differentiates to zero. The most common slip is forgetting to hold the other variable constant, or dropping a term that survives.

How does the Hessian test classify a stationary point?

At a stationary point where ∇f = 0, compute the Hessian determinant D = f_xx·f_yy − (f_xy)². If D > 0 and f_xx > 0 it is a local minimum; if D > 0 and f_xx < 0 a local maximum; if D < 0 it is a saddle point; and if D = 0 the test is inconclusive.
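The four cases above translate directly into a small decision function. This is a minimal sketch: the name classify and the four test functions are chosen here purely for illustration, and the second partials are typed in by hand for each one.

```python
def classify(fxx, fyy, fxy):
    """Apply the Hessian determinant test to second partials at a stationary point."""
    D = fxx * fyy - fxy**2
    if D > 0 and fxx > 0:
        return "local minimum"
    if D > 0 and fxx < 0:
        return "local maximum"
    if D < 0:
        return "saddle point"
    return "inconclusive"  # D == 0: the test gives no verdict


# Illustrative functions (assumed for this sketch), each with a stationary
# point at (0, 0); the tuple is (f_xx, f_yy, f_xy) evaluated there.
examples = {
    "x^2 + y^2": (2, 2, 0),
    "-x^2 - y^2": (-2, -2, 0),
    "x^2 - y^2": (2, -2, 0),
    "x^4 + y^4": (0, 0, 0),
}

for label, second_partials in examples.items():
    print(label, "->", classify(*second_partials))
```

Note that x⁴ + y⁴ does have a minimum at the origin, yet the test returns "inconclusive" because D = 0 there. Inconclusive means the test cannot decide, not that the point is a saddle.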

What is the gradient actually telling me?

The gradient ∇f at a point is a vector that points in the direction in which the function increases fastest, and its length is that maximum rate of increase. Setting ∇f = 0 finds the flat points — the candidate maxima, minima and saddles — which is the first step of two-variable optimisation.
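One way to see the steepest-ascent claim is to try every direction and compare. The sketch below relies on the standard fact that the rate of change of f along a unit vector u is the dot product ∇f · u; it reuses the gradient (4, 13) from the worked example, and the one-degree sampling grid is an arbitrary choice for illustration.

```python
import math

# Gradient of f(x, y) = x^2 y + 3 y^2 at the point (1, 2), from the worked example.
grad = (4.0, 13.0)


def directional_rate(gradient, angle_deg):
    """Rate of change of f along the unit vector at angle_deg: gradient . u."""
    t = math.radians(angle_deg)
    return gradient[0] * math.cos(t) + gradient[1] * math.sin(t)


# Sample one direction per degree and keep the one with the largest rate.
best_angle = max(range(360), key=lambda a: directional_rate(grad, a))
best_rate = directional_rate(grad, best_angle)

# Compare with the gradient's own direction and length.
grad_angle = math.degrees(math.atan2(grad[1], grad[0]))
grad_length = math.hypot(grad[0], grad[1])

print("best sampled angle:", best_angle, "degrees")
print("gradient angle:    ", round(grad_angle, 1), "degrees")
print("best sampled rate: ", round(best_rate, 3))
print("gradient length:   ", round(grad_length, 3))
```

The best sampled direction lands on the gradient's direction (to the nearest degree), and the rate achieved there matches the gradient's length.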

Why does convexity matter for AI?

Training a model means minimising a loss function. If the loss is convex, every local minimum is a global minimum, so gradient methods with a sensible step size cannot be trapped at a worse point. If it is non-convex there can be many local minima and saddle points, so optimisation can get stuck, which is exactly why the chapter ties convexity to whether training is easy or hard.
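To make the contrast concrete, here is a small gradient-descent sketch on two toy losses. Everything in it is an assumption made for illustration: the convex loss x² + y², the non-convex loss (x² − 1)² + y², the learning rate, the step count and the starting points are not taken from the unit materials.

```python
def gradient_descent(grad, start, lr=0.05, steps=500):
    """Plain gradient descent: repeatedly step against the gradient."""
    x, y = start
    for _ in range(steps):
        gx, gy = grad(x, y)
        x, y = x - lr * gx, y - lr * gy
    # Round for readability; adding 0.0 turns a rounded -0.0 into 0.0.
    return (round(x, 4) + 0.0, round(y, 4) + 0.0)


def grad_convex(x, y):
    """Gradient of the convex toy loss x^2 + y^2."""
    return (2 * x, 2 * y)


def grad_nonconvex(x, y):
    """Gradient of the non-convex toy loss (x^2 - 1)^2 + y^2."""
    return (4 * x * (x**2 - 1), 2 * y)


starts = [(0.5, 1.0), (-0.5, 1.0), (0.0, 1.0)]
convex_ends = [gradient_descent(grad_convex, s) for s in starts]
nonconvex_ends = [gradient_descent(grad_nonconvex, s) for s in starts]

for s, c, n in zip(starts, convex_ends, nonconvex_ends):
    print("start", s, "| convex ->", c, "| non-convex ->", n)
```

On the convex loss every start ends at the same point, the global minimum at the origin. On the non-convex loss the end point depends on where you begin: two starts reach different minima at (1, 0) and (−1, 0), and the start with x = 0 slides onto the saddle at the origin and stays there.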

Study strategy

Exam move

The flagship long-answer is a full two-variable classification, so drill the chain end to end: compute both first partials, set ∇f = 0 and solve the system for every stationary point, then build the Hessian and evaluate D = f_xx·f_yy − (f_xy)² at each point to classify it. Use Clairaut's symmetry (f_xy = f_yx) as a free check on your cross-partials. Lay the work out cleanly (partials, gradient equation, stationary points, Hessian, verdict) because each stage carries marks. Keep the gradient's meaning (direction of steepest ascent) and the saddle case (D < 0) at your fingertips, since the wording of the conclusion is itself worth a mark.
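The chain can be rehearsed on a practice function. The sketch below uses f(x, y) = x³ + y³ − 3xy, a function picked for this illustration and not taken from the unit's papers. The partials are derived by hand and typed in, and the stationary points come from solving y = x² and x = y² by hand; the code only checks each stage and reports the verdict.

```python
# Stage 1: first partials of the practice function f(x, y) = x^3 + y^3 - 3xy.
def fx(x, y):
    return 3 * x**2 - 3 * y


def fy(x, y):
    return 3 * y**2 - 3 * x


# Stage 2: stationary points, solved by hand from fx = 0 and fy = 0.
stationary_points = [(0, 0), (1, 1)]


# Stage 3: second partials. f_xy differentiates fx in y; f_yx differentiates fy in x.
def fxx(x, y):
    return 6 * x


def fyy(x, y):
    return 6 * y


def fxy(x, y):
    return -3


def fyx(x, y):
    return -3


results = {}
for (x, y) in stationary_points:
    assert fx(x, y) == 0 and fy(x, y) == 0   # gradient really is zero here
    assert fxy(x, y) == fyx(x, y)            # Clairaut check on the cross-partials
    D = fxx(x, y) * fyy(x, y) - fxy(x, y) ** 2
    if D > 0:
        verdict = "local minimum" if fxx(x, y) > 0 else "local maximum"
    elif D < 0:
        verdict = "saddle point"
    else:
        verdict = "inconclusive"
    results[(x, y)] = (D, verdict)
    print((x, y), "D =", D, "->", verdict)
```

Here D = 36xy − 9, so (0, 0) gives D = −9 (a saddle) and (1, 1) gives D = 27 with f_xx = 6 > 0 (a local minimum). Writing out each of those stages, in that order, mirrors the layout the paragraph above recommends.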

MAT9004 · Mathematical Foundations for Data Science and AI: independent study guide on the AskSia Library.