ECON20003 · Quantitative Methods 2
Multiple Regression, the General F-test & Specification
Multiple regression generalises the simple regression line to several predictors, where each slope is a ceteris-paribus partial effect. You learn three layers of inference: the overall F-test (are any slopes non-zero?), the individual t-tests (which specific predictors matter?), and the general (partial) F-test (does a group of predictors add explanatory power?). Specification issues are central: adjusted R², which penalises adding junk regressors; omitted-variable bias versus irrelevant variables; the RESET test for functional-form error; and multicollinearity, diagnosed by the variance inflation factor (VIF).
What this chapter covers
- 01 Partial slope = ceteris-paribus effect of one predictor
- 02 Overall F: F = (R²/k)/((1−R²)/(n−k−1)), df = (k, n−k−1)
- 03 Adjusted R²: R̄² = 1 − (1−R²)(n−1)/(n−k−1)
- 04 General (partial) F-test for a group of coefficients
- 05 Omitted-variable bias vs irrelevant variables; the RESET test
- 06 Multicollinearity: VIFⱼ = 1/(1 − R²ⱼ), concern above 5 or 10
- 07 Functional forms: log-log (elasticity), log-lin (% change), lin-log
Overall F-test and adjusted R² from a regression printout
Setting, as used in the solution steps: a house-price regression with n = 50 observations, k = 4 predictors and R² = 0.40.
- (1 mark) State the hypotheses: H₀: β₁ = β₂ = β₃ = β₄ = 0 (no predictor matters) versus H₁: at least one slope is non-zero.
- (1 mark) Identify degrees of freedom: numerator df = k = 4, denominator df = n − k − 1 = 50 − 4 − 1 = 45.
- (2 marks) Compute the overall F: F = (R²/k)/((1−R²)/(n−k−1)) = (0.40/4)/(0.60/45) = 0.10/0.013333 = 7.5.
- (1 mark) Decision rule: reject H₀ if F > F₀.₀₅,₄,₄₅ ≈ 2.58. Since 7.5 > 2.58, reject H₀.
- (2 marks) Adjusted R²: R̄² = 1 − (1−R²)(n−1)/(n−k−1) = 1 − (0.60)(49)/(45) = 1 − 0.6533 = 0.3467 ≈ 0.35.
- (1 mark) Conclude in context: the model is jointly significant (at least one predictor helps explain house price), and after penalising for the four predictors about 35% of price variation is explained.
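The arithmetic in the steps above can be checked with a short Python sketch. The function names `overall_f` and `adjusted_r2` are illustrative only and do not come from any library; the inputs (R² = 0.40, n = 50, k = 4) are the ones used in the worked solution.

```python
def overall_f(r2, n, k):
    """Overall F statistic: (R^2 / k) / ((1 - R^2) / (n - k - 1))."""
    return (r2 / k) / ((1 - r2) / (n - k - 1))


def adjusted_r2(r2, n, k):
    """Adjusted R^2: 1 - (1 - R^2)(n - 1) / (n - k - 1)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


# Values taken from the worked solution (not new data)
r2, n, k = 0.40, 50, 4

f_stat = overall_f(r2, n, k)
adj = adjusted_r2(r2, n, k)

print(f"df = ({k}, {n - k - 1})")     # numerator and denominator df
print(f"F = {f_stat:.2f}")            # compare with the 5% critical value
print(f"adjusted R2 = {adj:.4f}")
```

This should reproduce the hand calculation: df = (4, 45), F = 7.50 and adjusted R² = 0.3467. If SciPy happens to be installed, `scipy.stats.f.ppf(0.95, 4, 45)` is one way to look up the 5% critical value instead of using a table.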
Key terms
- Partial (ceteris-paribus) effect
- A multiple-regression slope βⱼ is the expected change in Y per one-unit increase in Xⱼ, holding all other predictors constant — not the raw bivariate relationship.
- Overall F-test
- Tests H₀ that all slopes are simultaneously zero using F = MSR/MSE = (R²/k)/((1−R²)/(n−k−1)). A significant F means the model as a whole has explanatory power.
- Adjusted R²
- R̄² = 1 − (1−R²)(n−1)/(n−k−1), a version of R² that penalises adding predictors. It can fall when a useless regressor is added, making it the fairer measure for comparing models of different sizes.
- Variance inflation factor (VIF)
- VIFⱼ = 1/(1 − R²ⱼ), where R²ⱼ comes from regressing predictor j on the others. Values above 5 (or 10) flag serious multicollinearity, which inflates standard errors and destabilises coefficient signs.
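To see what the 5 and 10 rules of thumb mean in terms of R²ⱼ, here is a minimal sketch of the VIF formula. The helper name `vif` and the auxiliary R² values are hypothetical, chosen only for illustration.

```python
def vif(r2_j):
    """VIF for predictor j, given R^2_j from regressing X_j on the other predictors."""
    if not 0 <= r2_j < 1:
        raise ValueError("R^2_j must lie in [0, 1)")
    return 1 / (1 - r2_j)


# Hypothetical auxiliary R^2 values (illustration only, not real data)
for r2_j in (0.0, 0.5, 0.8, 0.9):
    print(f"R2_j = {r2_j:.1f}  ->  VIF = {vif(r2_j):.1f}")
```

A VIF of 5 corresponds to R²ⱼ = 0.80 and a VIF of 10 to R²ⱼ = 0.90. Because the VIF multiplies the variance of the coefficient estimate, the standard error is inflated by the square root of the VIF.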
Multiple Regression, the General F-test & Specification FAQ
What's the difference between R² and adjusted R²?
R² never falls when you add a predictor, so it rewards stuffing the model with variables. Adjusted R² subtracts a penalty for each extra predictor and can decrease, so it is the right tool for comparing models with different numbers of regressors.
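A small numerical sketch makes the penalty concrete. The starting point (R² = 0.40, n = 50, k = 4) matches the worked example in this chapter; the fifth predictor and its R² of 0.401 are hypothetical, assumed purely for illustration.

```python
def adjusted_r2(r2, n, k):
    """Adjusted R^2: 1 - (1 - R^2)(n - 1) / (n - k - 1)."""
    return 1 - (1 - r2) * (n - 1) / (n - k - 1)


n = 50
base = adjusted_r2(0.400, n, k=4)    # four predictors
bigger = adjusted_r2(0.401, n, k=5)  # hypothetical fifth predictor, R^2 barely rises

print(f"k = 4: R2 = 0.400, adjusted R2 = {base:.4f}")
print(f"k = 5: R2 = 0.401, adjusted R2 = {bigger:.4f}")
print("adjusted R2 fell:", bigger < base)
```

R² rises slightly (0.400 to 0.401), yet adjusted R² falls from about 0.347 to about 0.333, because the gain in fit is too small to offset the lost degree of freedom.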
How do I spot multicollinearity, and does it bias my coefficients?
Typical symptoms are a high R² and a significant overall F alongside individually insignificant t-statistics and coefficients with unexpected signs; confirm with the VIF (concern above 5 or 10). Multicollinearity does not bias the estimates, but it inflates their standard errors, making individual effects hard to detect. Remedies include dropping or combining collinear predictors, or gathering more data.
Exam move
Practise reading a full lm summary top to bottom: the F-statistic and its p-value (overall significance), each coefficient's t-statistic and p-value (individual effects), R² and adjusted R² (fit), and the VIF block (collinearity). Drill the three F-tests (the overall F, the single-coefficient F that is equivalent to the t-test, and the general/partial F) so you know which one a question is asking for.
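One way to keep the three F-tests straight is to treat them as a single formula with different restrictions. The sketch below uses the R² form of the general F-test, F = ((R²ᵤ − R²ᵣ)/q)/((1 − R²ᵤ)/(n − k − 1)), which assumes the restricted and unrestricted models share the same dependent variable and sample. The function name `partial_f` and the group-test numbers are hypothetical; the overall-F inputs are those of the worked example.

```python
def partial_f(r2_unrestricted, r2_restricted, n, k, q):
    """General F-test for q restrictions.

    k is the number of slopes in the unrestricted model,
    so the denominator df is n - k - 1 and the numerator df is q.
    """
    numerator = (r2_unrestricted - r2_restricted) / q
    denominator = (1 - r2_unrestricted) / (n - k - 1)
    return numerator / denominator


# Overall F as a special case: the restricted model has only an intercept
# (R^2 = 0), and all k slopes are restricted, so q = k.
f_overall = partial_f(0.40, 0.0, n=50, k=4, q=4)

# Hypothetical group test: dropping 2 of the 4 predictors lowers R^2 to 0.30.
f_group = partial_f(0.40, 0.30, n=50, k=4, q=2)

print(f"overall F = {f_overall:.2f}, df = (4, 45)")
print(f"group F   = {f_group:.2f}, df = (2, 45)")
```

With q = k and a restricted R² of zero the formula collapses to the overall F of 7.5. With q = 1 it gives the square of that coefficient's t-statistic, which is why the individual t-test and the single-restriction F-test always agree.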