University of Melbourne · S1 2026 · FACULTY OF BUSINESS & ECONOMICS

ECON20003 · Quantitative Methods 2

- one subject, every graph, every model, every mark
50% final exam · hurdle · 14 Chapters · 7-page Bible
Our own words - no uploaded lecturer files
Built to mirror S1 2026 · updated this semester
Chapter 10 of 12 · ECON20003

Cross-Validation, Ridge & LASSO Regression

Cross-Validation, Ridge & LASSO Regression is the conceptual machine-learning corner of the subject, focused on the bias-variance trade-off and overfitting. A model that is too flexible fits the training data beautifully but predicts new data badly. k-fold cross-validation estimates out-of-sample error by training on k − 1 folds, testing on the held-out fold, rotating through all k folds and averaging; comparing that average across candidate values is how you tune the penalty strength λ. Ridge regression (an L2 penalty) shrinks coefficients toward zero but never exactly to zero, while LASSO (an L1 penalty) shrinks and can set some coefficients exactly to zero, performing variable selection.
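The fold rotation described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the subject's prescribed method: the function name k_fold_splits and the 10-row, 5-fold setup are hypothetical, and the rows are split in order, whereas real work would usually shuffle them first.

```python
def k_fold_splits(n, k):
    """Yield (train_indices, test_indices) for each of k folds over n rows."""
    indices = list(range(n))
    # Fold sizes differ by at most one when n is not divisible by k.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = indices[start:start + size]                  # held-out fold
        train = indices[:start] + indices[start + size:]    # the other k - 1 folds
        yield train, test
        start += size

# Hypothetical setup: 10 observations, 5 folds.
splits = list(k_fold_splits(10, 5))
for train, test in splits:
    print("test rows:", test, "| training rows:", len(train))
```

Every row is held out exactly once, so each observation contributes to the error estimate as genuinely unseen data.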

In this chapter

What this chapter covers

  • 01 Bias-variance trade-off and overfitting
  • 02 k-fold cross-validation to estimate out-of-sample error and tune λ
  • 03 Ridge (L2): minimise SSE + λΣβⱼ²; shrinks toward 0, never exactly 0
  • 04 LASSO (L1): minimise SSE + λΣ|βⱼ|; shrinks AND selects (some β̂ⱼ = 0)
  • 05 Choosing λ by cross-validation
Worked example · free

Tuning λ with 5-fold cross-validation

Q [6 marks]. A LASSO model is tuned over candidate penalties. For a particular λ, 5-fold cross-validation produces mean squared prediction errors of 12, 10, 11, 13 and 9 on the five held-out folds. Compute the cross-validated error for this λ, and explain how you would use it to choose the final λ.
  • [1 mark] Recall the procedure: for each fold, train on the other four folds and record the prediction error on the held-out fold; the cross-validated error is the average across folds.
  • [2 marks] Sum the five fold errors: 12 + 10 + 11 + 13 + 9 = 55.
  • [1 mark] Average over k = 5 folds: CV error = 55/5 = 11.
  • [1 mark] Repeat for every candidate λ to get one CV error per λ.
  • [1 mark] Choose the λ with the lowest cross-validated error (some practitioners instead use the one-standard-error rule: the largest λ whose CV error is within one standard error of the minimum, which gives a simpler model). Larger λ shrinks coefficients more and, for LASSO, sets more of them to zero.
The cross-validated error for this λ is (12 + 10 + 11 + 13 + 9)/5 = 55/5 = 11; you pick the λ that minimises this CV error across the candidate grid.
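The arithmetic in this worked example can be checked with a short Python sketch. The five fold errors come from the question; the candidate grid below (the λ values 0.01, 0.1 and 1.0, and the fold errors for the other two rows) is invented purely to illustrate the "repeat and pick the minimum" step.

```python
def cv_error(fold_errors):
    """Average the held-out prediction errors across the k folds."""
    return sum(fold_errors) / len(fold_errors)

# Fold errors from the worked example (one lambda, 5 folds).
example_folds = [12, 10, 11, 13, 9]
print(cv_error(example_folds))  # 11.0

# Hypothetical grid: the lambda values and the other two rows are made up.
hypothetical_grid = {
    0.01: [14, 13, 15, 16, 12],
    0.1: example_folds,
    1.0: [15, 14, 16, 17, 13],
}

# One CV error per candidate lambda, then pick the smallest.
cv_by_lambda = {lam: cv_error(errs) for lam, errs in hypothetical_grid.items()}
best_lambda = min(cv_by_lambda, key=cv_by_lambda.get)
print(cv_by_lambda, best_lambda)
```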
Sia tip: Cross-validation exists to estimate out-of-sample performance, so never tune λ on the training error, which always favours the smallest penalty and hence the most complex model. Remember the one practical difference that earns marks: ridge shrinks every coefficient toward zero but keeps them all, whereas LASSO can zero coefficients out and thereby selects variables.
Glossary

Key terms

Bias-variance trade-off
Simple models have high bias (they systematically miss) but low variance; flexible models have low bias but high variance (they overfit noise). Good prediction balances the two to minimise total out-of-sample error.
k-fold cross-validation
Split the data into k folds, train on k − 1 and test on the held-out fold, rotate through all folds, and average the test errors. It estimates out-of-sample error and is used to tune the penalty λ.
Ridge regression (L2)
Minimises SSE + λΣβⱼ², adding a penalty on the squared coefficients. It shrinks all coefficients toward zero to reduce variance but never sets any exactly to zero, so it does not select variables.
LASSO regression (L1)
Minimises SSE + λΣ|βⱼ|, penalising the absolute coefficients. It both shrinks coefficients and can set some exactly to zero, performing automatic variable selection alongside regularisation.
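Written out in full, the two penalised objectives from the glossary look as follows. The notation here is an assumption rather than the subject's own: n observations, p predictors x_{ij}, and an intercept β₀ left out of the penalty, which is the usual convention but worth checking against your lecture slides. Setting λ = 0 in either line recovers ordinary least squares.

```latex
\begin{align*}
\hat{\beta}^{\text{ridge}} &= \arg\min_{\beta_0,\,\beta}\;
  \sum_{i=1}^{n}\Big(y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j\Big)^2
  + \lambda \sum_{j=1}^{p} \beta_j^2 \\
\hat{\beta}^{\text{LASSO}} &= \arg\min_{\beta_0,\,\beta}\;
  \sum_{i=1}^{n}\Big(y_i - \beta_0 - \sum_{j=1}^{p} x_{ij}\beta_j\Big)^2
  + \lambda \sum_{j=1}^{p} \lvert\beta_j\rvert
\end{align*}
```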
FAQ

Cross-Validation, Ridge & LASSO Regression FAQ

What's the key difference between ridge and LASSO?

Both add a penalty on the size of the coefficients to control overfitting, but ridge (L2) shrinks every coefficient toward zero without eliminating any, while LASSO (L1) can drive some coefficients exactly to zero, so LASSO doubles as a variable-selection method. The penalty strength λ is tuned by cross-validation in both.
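The contrast is easiest to see in the simplest possible case. The sketch below assumes a single predictor with no intercept, where both penalised problems have closed-form solutions: ridge divides by a larger denominator, while LASSO subtracts a threshold and stops at zero. The function names and the toy data are hypothetical, chosen only for illustration.

```python
def _sums(x, y):
    """Return (sum of x*y, sum of x^2) for one predictor with no intercept."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy, sxx

def ridge_slope(x, y, lam):
    """Minimiser of sum((y - b*x)^2) + lam * b^2."""
    sxy, sxx = _sums(x, y)
    return sxy / (sxx + lam)  # shrinks toward 0, but stays nonzero for finite lam

def lasso_slope(x, y, lam):
    """Minimiser of sum((y - b*x)^2) + lam * |b| (soft-thresholding)."""
    sxy, sxx = _sums(x, y)
    shrunk = max(abs(sxy) - lam / 2, 0.0)  # exactly 0 once lam >= 2 * |sxy|
    sign = 1.0 if sxy >= 0 else -1.0
    return sign * shrunk / sxx

# Hypothetical toy data, made up for illustration only.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]

for lam in [0, 10, 100, 120]:
    print(lam, round(ridge_slope(x, y, lam), 4), round(lasso_slope(x, y, lam), 4))
```

At λ = 0 both reduce to the least-squares slope. As λ grows the ridge slope keeps shrinking without reaching zero, whereas the LASSO slope hits exactly zero once the penalty is large enough.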

Why do we need cross-validation at all?

Because training error never gets worse (and usually improves) as a model becomes more complex, it cannot tell you whether the model will predict new data well. Cross-validation simulates out-of-sample testing by holding each fold out in turn, giving a more honest error estimate with which to pick the λ (or model) that generalises best.
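A small sketch makes the first point concrete. It assumes a one-predictor ridge fit with no intercept on invented toy data; the function names are hypothetical. The training SSE is smallest at λ = 0 and rises as the penalty grows, so choosing λ by training error would always pick no penalty at all.

```python
def ridge_slope(x, y, lam):
    """Minimiser of sum((y - b*x)^2) + lam * b^2 for one predictor, no intercept."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / (sxx + lam)

def training_sse(x, y, b):
    """Sum of squared errors on the same data used to fit the slope."""
    return sum((yi - b * xi) ** 2 for xi, yi in zip(x, y))

# Hypothetical toy data, made up for illustration only.
x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]

lambdas = [0, 1, 10, 100]
sse = [training_sse(x, y, ridge_slope(x, y, lam)) for lam in lambdas]
for lam, err in zip(lambdas, sse):
    print(lam, round(err, 4))

# Training error alone would always pick the smallest penalty in the grid.
best_by_training = lambdas[sse.index(min(sse))]
print(best_by_training)
```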

Study strategy

Exam move

This chapter is conceptual with no R-printout reading, so focus on crisp definitions: state the bias-variance trade-off, explain the k-fold procedure, and contrast ridge versus LASSO in one sentence each. Be ready to compute a simple averaged CV error and to say how λ is chosen.
