MAST90105 · Methods Of Mathematical Statistics
Bivariate Distributions and Correlation
When two variables move together, a joint law f(x,y) carries everything about the pair (X,Y). From it you recover the marginals by summing or integrating out the other variable, and the conditional distribution f(y|x) = f(x,y)/f_X(x) by renormalising a slice (defined wherever f_X(x) > 0). The chapter then builds the two summaries of co-movement: covariance Cov(X,Y) = E[XY] − E[X]E[Y], and its scale-free version, correlation ρ = Cov(X,Y)/(σ_X σ_Y), which always lies in [−1, 1]. It introduces the workhorse joint model, the bivariate normal, whose contours are ellipses tilted by ρ, and closes on the most-examined subtlety of the topic: zero correlation does not imply independence. Correlation measures only the linear part of dependence; two variables can be perfectly dependent through a non-linear relationship yet have ρ = 0. (When (X,Y) is jointly bivariate normal, zero correlation does imply independence, a special case students routinely over-generalise.)
What this chapter covers
- 4.1 Joint, marginal and conditional laws
- 4.2 Recovering marginals by summing/integrating out
- 4.3 Covariance: E[XY] − E[X]E[Y]
- 4.4 Correlation ρ and its [−1, 1] range
- 4.5 The bivariate normal and its elliptical contours
- 4.6 Zero correlation ≠ independence (and the BVN exception)
Worked example: covariance and correlation from a joint table
Joint table implied by the arithmetic in the steps, for X, Y ∈ {0, 1}: P(0,0) = 0.4, P(0,1) = 0.2, P(1,0) = 0.1, P(1,1) = 0.3.
- Marginals. P(X=1) = 0.1 + 0.3 = 0.4, so E[X] = 0.4; P(Y=1) = 0.2 + 0.3 = 0.5, so E[Y] = 0.5.
- E[XY]. Only the (1,1) cell contributes: E[XY] = 1·1·0.3 = 0.3.
- Covariance. Cov(X,Y) = E[XY] − E[X]E[Y] = 0.3 − (0.4)(0.5) = 0.10.
- Correlation. For a Bernoulli variable, Var(X) = 0.4(0.6) = 0.24 and Var(Y) = 0.5(0.5) = 0.25, so ρ = 0.10/√(0.24·0.25) = 0.10/0.2449 ≈ 0.408.
- Independence check. P(1,1) = 0.3 but P(X=1)P(Y=1) = 0.4·0.5 = 0.20 ≠ 0.3, so X and Y are NOT independent, consistent with the non-zero covariance.
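The steps above can be reproduced in a few lines of Python. This is a minimal sketch, not the chapter's own code: the joint table is not shown in the text, so the four cell probabilities are inferred from the arithmetic in the steps (P(0,0) = 0.4 is whatever is left after the other three cells). The helper names `marginal`, `expectation` and `variance` are illustrative.

```python
from math import sqrt, isclose

# Joint pmf inferred from the worked example's arithmetic (an assumption:
# the original table is not shown). Keys are (x, y) pairs.
joint = {(0, 0): 0.4, (0, 1): 0.2, (1, 0): 0.1, (1, 1): 0.3}

def marginal(joint, axis):
    """Sum the joint pmf over the other variable (axis 0 -> X, axis 1 -> Y)."""
    out = {}
    for pair, p in joint.items():
        out[pair[axis]] = out.get(pair[axis], 0.0) + p
    return out

def expectation(pmf):
    """Mean of a one-variable pmf given as {value: probability}."""
    return sum(value * p for value, p in pmf.items())

def variance(pmf):
    """Variance of a one-variable pmf given as {value: probability}."""
    mean = expectation(pmf)
    return sum((value - mean) ** 2 * p for value, p in pmf.items())

p_x, p_y = marginal(joint, 0), marginal(joint, 1)
e_x, e_y = expectation(p_x), expectation(p_y)
e_xy = sum(x * y * p for (x, y), p in joint.items())
cov = e_xy - e_x * e_y
rho = cov / sqrt(variance(p_x) * variance(p_y))

# Independence requires every cell to equal the product of its marginals.
independent = all(
    isclose(p, p_x[x] * p_y[y], abs_tol=1e-12) for (x, y), p in joint.items()
)

print(f"E[X]={e_x:.2f}, E[Y]={e_y:.2f}, E[XY]={e_xy:.2f}")
print(f"Cov={cov:.2f}, rho={rho:.3f}, independent={independent}")
```

The printed values match the hand calculation: covariance 0.10, correlation 0.408, and a failed factorisation check.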
Key terms
- Joint distribution: f(x,y) gives the probability (or density) of the pair (X,Y) taking specified values. Everything about the pair (marginals, conditionals, covariance) is derived from it.
- Marginal distribution: The distribution of one variable alone, obtained by summing or integrating the joint over the other: f_X(x) = ∑_y f(x,y) in the discrete case, or ∫ f(x,y) dy in the continuous case. It discards all information about the partner variable.
- Covariance: Cov(X,Y) = E[(X−μ_X)(Y−μ_Y)] = E[XY] − E[X]E[Y], the signed measure of linear co-movement. Positive means they tend to move together; its scale depends on the units of X and Y.
- Correlation: ρ = Cov(X,Y)/(σ_X σ_Y), the unit-free covariance, always in [−1, 1]. ρ = ±1 means an exact linear relationship; ρ = 0 means no linear association, not necessarily no association.
- Bivariate normal: The joint normal model for (X,Y) with elliptical density contours whose tilt is set by ρ. Its marginals and conditionals are all normal, and when (X,Y) is jointly normal, ρ = 0 does imply independence, which is not true of joint laws in general.
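For reference, the standard form of the bivariate normal density makes the independence claim easy to see. The parameter names μ_X, μ_Y, σ_X, σ_Y follow the notation used in the key terms, and the formula assumes |ρ| < 1.

```latex
f(x,y) = \frac{1}{2\pi\sigma_X\sigma_Y\sqrt{1-\rho^2}}
\exp\!\left\{-\frac{1}{2(1-\rho^2)}\left[
\frac{(x-\mu_X)^2}{\sigma_X^2}
- \frac{2\rho\,(x-\mu_X)(y-\mu_Y)}{\sigma_X\sigma_Y}
+ \frac{(y-\mu_Y)^2}{\sigma_Y^2}
\right]\right\}
```

Setting ρ = 0 removes the cross term, and the density splits into the product of a normal density in x and a normal density in y, which is exactly the factorisation that defines independence.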
Bivariate Distributions and Correlation FAQ
Does zero correlation mean two variables are independent?
No. This is the most over-generalised result in the topic. Correlation captures only linear association, so two variables related through a curve can have ρ = 0 while being completely dependent. For example, take Y = X² with X symmetric about zero (and with a finite third moment): then E[X] = 0 and E[X³] = 0, so Cov(X,Y) = E[X³] − E[X]E[X²] = 0 even though Y is a function of X. Independence is the stronger condition that the joint law factors into the marginals. The standard exception is the bivariate normal: if (X,Y) is jointly normal, zero correlation does force independence.
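A small exact calculation makes the counterexample concrete. This sketch assumes one particular symmetric choice, X uniform on {−1, 0, 1}, which is an illustration rather than anything specified in the text. It uses `fractions.Fraction` so the zero covariance is exact rather than a rounding artefact.

```python
from fractions import Fraction

# Illustrative choice (an assumption): X uniform on {-1, 0, 1}, and Y = X**2.
third = Fraction(1, 3)
joint = {(x, x * x): third for x in (-1, 0, 1)}  # only cells with positive mass

e_x = sum(x * p for (x, y), p in joint.items())
e_y = sum(y * p for (x, y), p in joint.items())
e_xy = sum(x * y * p for (x, y), p in joint.items())
cov = e_xy - e_x * e_y

# Marginal probabilities needed for one factorisation check.
p_x0 = sum(p for (x, y), p in joint.items() if x == 0)
p_y1 = sum(p for (x, y), p in joint.items() if y == 1)
p_joint = joint.get((0, 1), Fraction(0))  # X = 0 forces Y = 0, so this cell is empty

print(cov)                   # 0
print(p_joint, p_x0 * p_y1)  # 0 versus 2/9
```

The covariance is exactly zero, yet the cell (X=0, Y=1) has probability 0 while the product of its marginals is 2/9, so the factorisation fails and the variables are dependent.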
How do I check independence from a joint table?
Verify that every joint probability equals the product of its marginals: P(X=x, Y=y) = P(X=x)P(Y=y) for all cells. A single cell that fails the factorisation proves dependence. A quick screen: if the covariance is non-zero the variables are automatically dependent, but a zero covariance still requires the full factorisation check because of the curve case above.
What does the sign and size of correlation tell me?
The sign gives the direction of linear co-movement — positive for ‘move together’, negative for ‘move oppositely’. The magnitude, between 0 and 1, gives the strength of the linear relationship, with ±1 being an exact line. Because ρ is unit-free it is comparable across variable pairs, unlike covariance whose size depends on the measurement scales.
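The unit-free property can be checked numerically. In this sketch the data are made up purely for illustration (they do not come from the chapter), and `covariance` and `correlation` are small hand-written helpers using the population (divide by n) convention. The same heights are expressed in metres and in centimetres.

```python
from math import sqrt, isclose

def covariance(xs, ys):
    """Population covariance: the mean of (x - mean_x)(y - mean_y)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / n

def correlation(xs, ys):
    """Covariance divided by the product of the two standard deviations."""
    return covariance(xs, ys) / sqrt(covariance(xs, xs) * covariance(ys, ys))

# Made-up illustrative data (not from the chapter): heights in metres, weights in kg.
heights_m = [1.60, 1.70, 1.75, 1.80, 1.90]
weights_kg = [55.0, 68.0, 70.0, 80.0, 85.0]
heights_cm = [100 * h for h in heights_m]  # same quantity, different units

cov_m = covariance(heights_m, weights_kg)
cov_cm = covariance(heights_cm, weights_kg)
rho_m = correlation(heights_m, weights_kg)
rho_cm = correlation(heights_cm, weights_kg)

print(isclose(cov_cm, 100 * cov_m))  # covariance scales with the units
print(isclose(rho_cm, rho_m))        # correlation does not
```

Changing the units multiplies the covariance by 100 but leaves the correlation unchanged, which is why ρ can be compared across variable pairs and covariance cannot.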
Exam move
Practise the full pipeline from a small joint table: marginals, E[XY], covariance, correlation, then the factorisation independence check. That exact sequence is the standard exam item. Burn in the headline distinction: a non-zero correlation proves dependence, but zero correlation does not prove independence (except inside the bivariate normal). Have the Y = X² counterexample ready to quote. Sketch the bivariate-normal ellipse and note how the direction of its tilt follows the sign of ρ and its elongation grows with |ρ|, so a conceptual question about the joint normal resolves to a picture rather than a memorised fact.