ECON20003 · Quantitative Methods 2
Dummy Dependent-Variable Models: LPM, Logit & Probit
This chapter handles regressions where the outcome Y is binary (0/1). The linear probability model (LPM) is just OLS on the 0/1 outcome: simple, but it can predict probabilities outside [0,1] and is inherently heteroskedastic. Logit and probit fix the out-of-range problem by squeezing a linear index Z = β₀ + ΣβⱼXⱼ through an S-shaped function: logit uses P = 1/(1 + e^(−Z)) and probit uses the standard normal CDF, P = Φ(Z). The sign of a coefficient gives the direction of the effect, but the marginal effect on probability is not constant; it depends on where you are on the curve.
What this chapter covers
- 01 Binary Y ∈ {0,1}; linear index Z = β₀ + ΣβⱼXⱼ; P = P(Y=1|X)
- 02 LPM: P = Z, ∂P/∂Xⱼ = βⱼ; problems = P̂ outside [0,1], inherent heteroskedasticity
- 03 Logit: P = 1/(1 + e^(−Z)); ∂P/∂Xⱼ = f(Z)·βⱼ
- 04 Probit: P = Φ(Z); ∂P/∂Xⱼ = φ(Z)·βⱼ
- 05 Sign of βⱼ = direction; magnitude of marginal effect is non-constant
- 06 Reading a predicted probability off a fitted Z
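The three links listed above can be compared side by side. The following is a minimal sketch using only the Python standard library; the function names are illustrative, and feeding the same Z into all three is purely for comparison (in practice each model is estimated separately and its coefficients sit on a different scale).

```python
import math

def lpm_link(z: float) -> float:
    """LPM: the linear index itself is read as the probability."""
    return z

def logit_link(z: float) -> float:
    """Logit: P = 1 / (1 + e^(-Z))."""
    return 1.0 / (1.0 + math.exp(-z))

def probit_link(z: float) -> float:
    """Probit: P = Phi(Z), the standard normal CDF, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Illustrative index values only; they do not come from any fitted model.
for z in (-3.0, 0.0, 0.5, 3.0):
    print(f"Z={z:+.1f}  LPM={lpm_link(z):+.3f}  "
          f"logit={logit_link(z):.3f}  probit={probit_link(z):.3f}")
```

At Z = ±3 the LPM value falls outside [0,1], while the logit and probit values stay strictly inside it.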
Predicted probability from a logit model
- 2 marks: Compute the linear index at X = 6: Z = −2 + 0.5 × 6 = −2 + 3 = 1.
- 1 mark: Apply the logit link: P = 1/(1 + e^(−Z)) = 1/(1 + e^(−1)).
- 1 mark: Substitute e^(−1) ≈ 0.3679: P = 1/(1 + 0.3679) = 1/1.3679.
- 1 mark: Evaluate: P ≈ 0.731.
- 1 mark: Interpret: a 6-year customer has about a 73% predicted probability of renewing. The positive coefficient 0.5 confirms renewal probability rises with tenure, though by a non-constant amount.
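The arithmetic in these steps can be checked in a few lines. This is a quick sketch, assuming the fitted index is Z = −2 + 0.5X as used in the working; the names B0, B1 and predicted_probability are illustrative.

```python
import math

# Coefficients taken from the worked example: Z = -2 + 0.5 * X
B0, B1 = -2.0, 0.5

def predicted_probability(x: float) -> float:
    """Logit predicted probability P(Y=1 | X=x) for the fitted index above."""
    z = B0 + B1 * x                      # step 1: linear index
    return 1.0 / (1.0 + math.exp(-z))    # step 2: logit link

p = predicted_probability(6)
print(round(p, 3))  # 0.731
```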
Key terms
- Linear probability model (LPM)
- Ordinary least squares applied to a 0/1 outcome, so the slope βⱼ is read directly as a change in probability. Its drawbacks are predicted probabilities that can fall outside [0,1] and built-in heteroskedasticity.
- Logit model
- A binary-outcome model using the logistic link P = 1/(1 + e^(−Z)), which keeps predicted probabilities in (0,1). Its marginal effect ∂P/∂Xⱼ = f(Z)·βⱼ, where f(Z) = P(1 − P) is the logistic density, varies with where you sit on the S-curve.
- Probit model
- A binary-outcome model using the standard normal CDF as the link, P = Φ(Z). It behaves very similarly to logit; both bound probabilities to (0,1) and have non-constant marginal effects.
- Marginal effect
- The change in the predicted probability of Y = 1 for a one-unit change in a predictor. In LPM it is constant (the coefficient); in logit/probit it depends on the current value of the index Z, peaking near P = 0.5.
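To see the non-constant marginal effect numerically, the sketch below evaluates f(Z)·β for logit (with f(Z) = P(1 − P), the logistic density) and φ(Z)·β for probit at a few index values. The coefficient BETA = 0.5 is a hypothetical value chosen for illustration, not an estimate from any data.

```python
import math

def logistic_cdf(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))

def normal_pdf(z: float) -> float:
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def logit_marginal_effect(z: float, beta: float) -> float:
    """dP/dX = f(Z) * beta, with f(Z) = P(1 - P) for the logistic link."""
    p = logistic_cdf(z)
    return p * (1.0 - p) * beta

def probit_marginal_effect(z: float, beta: float) -> float:
    """dP/dX = phi(Z) * beta, with phi the standard normal density."""
    return normal_pdf(z) * beta

BETA = 0.5  # hypothetical coefficient, for illustration only

for z in (-2.0, 0.0, 2.0):
    print(f"Z={z:+.1f}  logit ME={logit_marginal_effect(z, BETA):.4f}  "
          f"probit ME={probit_marginal_effect(z, BETA):.4f}")
```

Both effects are largest at Z = 0 (where P = 0.5) and shrink symmetrically as Z moves into either tail.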
Dummy Dependent-Variable Models: LPM, Logit & Probit FAQ
Why prefer logit or probit over the linear probability model?
The LPM can predict probabilities below 0 or above 1, which is nonsensical, and it carries inherent heteroskedasticity. Logit and probit squeeze the linear index through an S-shaped link that keeps predictions strictly between 0 and 1, at the cost of coefficients that no longer read directly as probability changes.
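The out-of-range problem is easy to reproduce. The sketch below fits an LPM by simple OLS on a small made-up dataset (the x and y values are invented for illustration, not taken from the course) and prints the fitted "probabilities" at the extremes of x.

```python
# Hypothetical toy data: y switches from 0 to 1 as x rises.
x = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Simple-regression OLS formulas: slope = Sxy / Sxx, intercept = y_bar - slope * x_bar
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
slope = s_xy / s_xx
intercept = y_bar - slope * x_bar

# LPM fitted values are read as predicted probabilities.
fitted = [intercept + slope * xi for xi in x]

print(f"P-hat at x=0: {fitted[0]:.3f}")   # below 0
print(f"P-hat at x=9: {fitted[-1]:.3f}")  # above 1
```

Even on this tidy dataset the fitted line gives a negative prediction at the low end and a value above 1 at the high end.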
Can I read a logit coefficient as the change in probability?
No. Only the coefficient's sign and statistical significance can be read directly. The actual change in probability (the marginal effect) depends on the value of the index Z and is largest around P = 0.5, so to quantify an effect you must compute predicted probabilities at specific X values or report a marginal effect.
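One way to quantify an effect is to difference predicted probabilities at specific X values. The sketch below does this for a one-unit increase in X from three starting points, reusing the coefficients from this chapter's worked example (Z = −2 + 0.5X); the helper names are illustrative.

```python
import math

B0, B1 = -2.0, 0.5  # fitted logit index from the worked example

def predicted_probability(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-(B0 + B1 * x)))

def one_unit_change(x: float) -> float:
    """Change in predicted probability when X rises from x to x + 1."""
    return predicted_probability(x + 1) - predicted_probability(x)

for start in (4, 6, 10):
    print(f"X: {start} -> {start + 1}  change in P = {one_unit_change(start):.4f}")
```

The same coefficient of 0.5 produces the largest change starting at X = 4 (where P = 0.5) and progressively smaller changes further up the curve.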
Exam move
Memorise the two link functions and practise computing a predicted probability from a given fitted index — it is a near-certain exam task. Be ready to state the LPM's two flaws and to explain in words why logit/probit marginal effects are non-constant.