STAT7038
May 29, 2026
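I got hold of an exam-prep bible for a course I'm taking! Help me break the course down and tell me what to focus on for revision; my final is coming up.
The more detail the better: formulas, definitions, core exam points, I need them all. Please save me.
- Working only from your two documents (the STAT7038 Regression Modelling bible and cheatsheet), I'll break the course down into a final-exam sprint revision map: work through it in order and you should stay on track.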
- The course's end goal: understand the supplied R output (you read it, you don't run it), then plug the numbers into the right procedure to run tests, build intervals, check diagnostics and select models, and write a conclusion in context. [1]

[1] Source: asksia-bible-stat7038-bilingual.pdf
It's the exam-prep window. Use the chapters to build your one A4 typed sheet: the boxed formulas, the 5-step test ritual, the CI-vs-PI rule, the R-output map. The blueprint overleaf shows the weights, the format and exactly what the sheet should carry.
★ The KILLER HOOK - what you may bring, what they give you
The final permits ONE A4-size sheet of notes, DOUBLE-SIDED, TYPED OR PRINTED. The exam itself supplies the calculator (hp300s+), the R outputs, the statistical tables (t, F, normal) and scribble paper. So do not waste sheet space on table values or on R syntax - you read R, you don't run it. Spend the sheet on the boxed formulas, the decision rules, the CI/PI distinction, and a worked template for each test. This guide is written to populate exactly that sheet.
How this book was built - and the two-layer rule
Standard statistical results - the SLR model, least squares, the ANOVA identity, the t/F sampling distributions - are universal facts, stated plainly. The unit's own framing and the lecturer's specific datasets are paraphrased and re-numbered; every worked example here uses our own numbers, never copied from slides or past papers. The course follows Kutner, Applied Linear Regression Models (4th ed.). Verify dates and weights against your own Canvas (wattle.anu.edu.au), as details can shift between cohorts.
THE EXAM BLUEPRINT - FINAL 70% . 5 JUNE
70% final, open to one typed sheet. Online Quiz 5% . In-tutorial Quiz 10% . Assignment 15% . Final 70%.
Your mark is built from four pieces, but the final exam dominates at 70%. Both quizzes are redeemable (the exam mark replaces them if higher); the assignment is non-redeemable. So the whole game is the final - and it is an open-sheet, R-output-supplied paper.
At a glance: 70% final exam . one A4 typed sheet, 2-sided . Weeks 1-12 examinable.
The four assessment pieces: Final examination (MCQ + short calc + written), 70%.

[2] Source: asksia-bible-stat7038-bilingual.pdf
Regression Modelling - ONE LINE THROUGH A CLOUD OF POINTS, AND EVERYTHING YOU CAN INFER, TEST AND PREDICT FROM IT.
ANU STAT7038 . bilingual visual close-reading . LaTeX-typeset formulas . one printed A4 allowed . the full linear-regression workflow (diagnostics/selection). Bilingual edition: taught in English with Chinese alongside; exam points and terminology keep the original English terms.
The final exam is 70% of your mark. The good news: you may bring ONE A4 double-sided, typed or printed notes sheet, and the exam supplies the calculator, the R outputs and the statistical tables. So success is not about memory - it is about reading R output and driving the method on fresh numbers. This book teaches exactly that, and helps you build that one compliant sheet.
Independent study companion. Not affiliated with or endorsed by the Australian National University. Corrections: takedowns@asksia.ai
PREFACE - HOW TO USE THIS BOOK: Read R output, drive the method
Open-sheet exam - it tests whether you can execute & interpret, not recall.
This is not a transcript of the lecture slides. It is a self-contained course in every technique STAT7038 examines - each model stated plainly, each estimator derived to a formula, each test shown on a worked example with real arithmetic, and the matching R summary() / anova() output read line by line. The exam is open to one typed A4 sheet and supplies the calculator, the R printouts and the statistical tables - so the examiner cannot test what you remember, only whether you can do regression under time pressure. That is what these pages drill.
1 . LEARN - You haven't seen the lecture yet. Read a chapter top to bottom. Each concept is an AHA-unit: the equation or picture → a plain explainer → the method in numbered steps → a fully worked example → the trap. The diagrams are original schematics of the standard statistics - learn the idea cold.
2 . DRILL - You've done the lecture and the lab. Cover the worked steps and re-derive each answer with the supplied calculator. Then read the paired R output and tick off every number - Estimate, Std. Error, t value, Pr(>|t|), the F-statistic. That reading speed is the exam.
3 . EXAM

[3] Source: asksia-bible-stat7038-bilingual.pdf
The four assessment pieces (continued):
- Final examination: 5 June, 2pm . 15 min reading + 180 min
- In-tutorial Quiz (redeemable): 10% . Wk 7 . topic: simple linear regression
- Assignment (non-redeemable, R): 15% . Wk 11 . due 21 May, 5pm
- Online Quiz (redeemable, Canvas): 5% . Wk 5 . no extensions
✓ The strategy this dictates - the recurring chains
Every exam item is a procedure on supplied numbers. Drill the chains: $S_{xy}/S_{xx}\rightarrow b_1,\ b_0$; $SSE\rightarrow MSE\rightarrow se(b_j)\rightarrow t\rightarrow$ decision; $SST=SSR+SSE\rightarrow F,\ R^2$; $x_h\rightarrow$ CI (mean) or PI (new obs). Show every line for the short-answer written parts - method marks are real. Put each chain, once, on your sheet.
What "R supplied" means for your sheet (supplied, so don't cram it → what you must be able to do / read):
- t, F, normal tables → pick the right critical value & df
- summary(lm) printout → read off b, se, t, p; recover MSE, n
- anova(lm) printout → read SSR, SSE, df; form F = MSR/MSE
- hp300s+ calculator → $S_{xx}$, $S_{xy}$, $b_1$, CIs by hand
★ The exam format - open one sheet, calculator & tables supplied
Three question styles: multiple-choice, short-answer calculation, and short-answer written. Covers all lectures & tutorials, Weeks 1-12. Permitted: one A4 double-sided typed/printed notes sheet. Supplied in the paper: hp300s+ calculator, R outputs, statistical tables, scribble paper. Significance level 5% unless stated; log means natural log.
- The exam conditions matter most: you may bring only one A4 sheet of notes, double-sided, typed or printed; the exam room supplies the calculator (hp300s+), the R output, the statistical tables (t/F/normal) and scribble paper. So don't waste A4 space on table values or R syntax. Save it for the boxed formulas, the decision rules, the CI vs PI distinction, and a template chain for each question type. [1][3][4][27]

[4] Source: asksia-bible-stat7038-bilingual.pdf
The STAT7038 final is closed-book but allows one A4 page of your own notes plus a calculator, with t/F/normal tables supplied. So fill the A4 with formulae and cut-offs (left column), and spend revision time on the procedures and R-output reading that the note cannot do for you (right column).
MARKS 4×15 . 5 STEPS PER TEST . A4 NOTE ALLOWED . read R, NEVER RUN IT
On your A4 note - bring these, don't memorise blind:
- SS & estimator formulae: $S_{xx}$, $S_{xy}$; $b_1=S_{xy}/S_{xx}$, $b_0=\bar y-b_1\bar x$; $MSE=SSE/(n-2)$.
- se / variance formulae: $\mathrm{Var}(b_1)=\sigma^2/S_{xx}$, $se(b_1)=\sqrt{MSE/S_{xx}}$; the CI/PI root forms.
- Diagnostic cut-offs …
You must execute / read by hand:
- The 5-step hypothesis test: hypotheses → statistic → critical value/df (or p) → decision → conclusion IN CONTEXT - every test.
- Fill an ANOVA table from summary: recover SSR/SSE/df from R output; F = MSR/MSE; don't round intermediates.

[27] Source: asksia-cheatsheet-stat7038.pdf
STAT7038 Regression Modelling . AUSTRALIAN NATIONAL UNIVERSITY . RSFAS . EXAM REVISION Sem 1 2026 . SIDE 1 OF 2: SLR . estimation . inference . R output
0 . Exam Blueprint (READ FIRST)
* Final = 70% . 180 min (+15 read) . MCQ + short-answer calculation + written, all of weeks 1-12. Also: online quiz 5%, in-tutorial quiz (SLR) 10%, assignment 15%.
* The exam permits ONE A4 double-sided typed or printed notes sheet - and a calculator, R outputs & statistical tables are SUPPLIED. So this sheet IS a compliant memory aid: spend zero space on tables/R syntax, max out formulae, decision rules, cut-offs & method recipes. Every hypothesis test shows all 5: (1) hypotheses, (2) test statistic, (3) critical value w/ df (or p), (4) decision, (5) conclusion in context. $\alpha$ = 5% unless stated; log = natural log.
> The killer combo: read the supplied R output, pull the numbers, plug into the formula on this sheet, run the 5-step test. Memorise the recipes & cut-offs, not the tables.
1 . Building Blocks
NOTATION: Parameter $(\beta_1,\sigma^2)$ = fixed unknown; estimator $(b_1,\hat\sigma^2)$ = random, from the sample; estimate = its realised value. The sampling distribution is the estimator's distribution over repeated samples.
CORE SUMS (MEMORISE)
$\bar x=\frac{1}{n}\sum x_i$ . $S_{xx}=\sum(x_i-\bar x)^2$ . $S_{yy}=\sum(y_i-\bar y)^2$ . $S_{xy}=\sum(x_i-\bar x)(y_i-\bar y)$
$s_x^2=S_{xx}/(n-1)$ . $\mathrm{Cov}=S_{xy}/(n-1)$ . $r=S_{xy}/\sqrt{S_{xx}S_{yy}}=S_{xy}/((n-1)s_xs_y)$
EXPECTATION / VARIANCE RULES
$E(aX+b)=aE(X)+b$ . $\mathrm{Var}(aX+b)=a^2\mathrm{Var}(X)$
$\mathrm{Var}(\sum a_iY_i)=\sum a_i^2\mathrm{Var}(Y_i)+2\sum_{i<j}a_ia_j\mathrm{Cov}(Y_i,Y_j)$ (independent → Cov terms vanish)
$r\in[-1,1]$ measures linear association only. These E/Var rules drive every variance derivation below (e.g. $\mathrm{Var}(b_1)$ treats the $y_i$ as the only random part, since the $x_i$ are fixed and the $\varepsilon_i$ carry all the randomness).
Trap: correlation ≠ causation; a strong r can be driven by one outlier or a lurking variable; r = 0 means no linear link, not "no relationship".
1b . The 5 SS Identities (KEEP HANDY)
Everything in inference is built from five sums; learn how each is recovered from the others:
$SST=S_{yy}=\sum(y_i-\bar y)^2$ (df $n-1$)
$SSR=b_1S_{xy}=b_1^2S_{xx}=R^2\cdot SST$ (df 1)
$SSE=SST-SSR=\sum e_i^2$ (df $n-2$)
$MSE=SSE/(n-2)$ . $MSR=SSR/1$
$R^2=SSR/SST$ . $r=\pm\sqrt{R^2}$ (sign of $b_1$)
If an R table hides one cell, back it out: e.g. $SSE=SST(1-R^2)$, or $SSR=F\cdot MSE$.
Degrees of freedom always add: df_total $(n-1)$ = df_reg + df_err. In SLR that's $(n-1)=1+(n-2)$; in MLR $(n-1)=(p-1)+(n-p)$. A quick df check catches most table-fill slips.
The "p" you divide by is the number of estimated parameters including the intercept - count the rows in the coefficient table. Get p wrong and every df, MSE and cut-off downstream is wrong too. In SLR p = 2.
2 . SLR Model + LINE
SIMPLE LINEAR REGRESSION
$y_i=\beta_0+\beta_1x_i+\varepsilon_i,\ \varepsilon_i\overset{iid}{\sim}N(0,\sigma^2)\ \Rightarrow\ y_i\overset{ind}{\sim}N(\beta_0+\beta_1x_i,\ \sigma^2)$
$E(y\mid x)=\beta_0+\beta_1x$ (mean response)
Four assumptions - LINE: Linearity of $E(y\mid x)$; Independence of $\varepsilon_i$; Normality of $\varepsilon_i$; Equal variance $\mathrm{Var}(\varepsilon_i)=\sigma^2$. The $x_i$ are fixed, measured without error.
Interpret: $\beta_1$ = expected change in y per 1-unit increase in x; $\beta_0$ = expected y at x = 0 (often a meaningless extrapolation).
3 . Least-Squares Estimation
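The "back out a hidden cell" advice in the SS identities above can be sketched in a few lines. This is a minimal illustration, not the course's own method: the function name `fill_slr_anova` and the input numbers (n = 8, SST = 56, R² = 0.9) are made up for the example, and it only covers the SLR case where p = 2.

```python
def fill_slr_anova(n, sst, r2):
    """Rebuild an SLR ANOVA table from n, SST and R^2 (hypothetical helper)."""
    ssr = r2 * sst            # SSR = R^2 * SST
    sse = sst * (1 - r2)      # SSE = SST * (1 - R^2)
    df_reg, df_err = 1, n - 2  # SLR: (n - 1) = 1 + (n - 2)
    msr = ssr / df_reg
    mse = sse / df_err
    return {
        "SSR": ssr, "SSE": sse, "SST": sst,
        "df_reg": df_reg, "df_err": df_err,
        "MSR": msr, "MSE": mse, "F": msr / mse,
    }


# Made-up inputs, for illustration only.
table = fill_slr_anova(n=8, sst=56.0, r2=0.9)
for name, value in table.items():
    print(f"{name:>6}: {value:.4f}")
```

The same identities run in reverse as a check: `F * MSE` should give back SSR, and the two df entries should sum to n - 1.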
- Your material states that the final is worth 70% and gives the date as 5 June (check this against your own Canvas, since the material itself warns that details can change between cohorts). [1]
0) What you most need to know right now: the "shape" of the final = recurring "chain questions"
- Four chains you should be able to write out with your eyes closed (the material says explicitly that every exam item is a procedure): [3]
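- Chain A (fit the line by hand): $S_{xy}/S_{xx}\rightarrow b_1,\ b_0$
- Chain B (inference, t-test): $SSE\rightarrow MSE\rightarrow se(b_j)\rightarrow t=b_j/se(b_j)\rightarrow \text{decision}\rightarrow \text{conclusion in context}$
- Chain C (ANOVA / $F$ / $R^2$): $SST=SSR+SSE\rightarrow F=MSR/MSE,\ R^2$
- Chain D (intervals at $x_h$): given $x_h$, decide whether the question wants a CI (mean response) or a PI (new observation), then plug into the structure under the square root (the key is the "$+1$" in the PI)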
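The four chains can be run end to end as a short script. This is a sketch under stated assumptions: the eight data points are invented for illustration, and the critical value 2.447 is the tabulated 0.975 quantile of t on 6 df, hard-coded here because the exam supplies the tables. In the exam you would do the same arithmetic on the calculator.

```python
import math

# Hypothetical data (n = 8), made up for illustration only.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [3, 5, 4, 7, 8, 8, 11, 10]
n = len(x)
xbar = sum(x) / n
ybar = sum(y) / n

# Chain A: Sxy / Sxx -> b1, b0
sxx = sum((xi - xbar) ** 2 for xi in x)
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
b1 = sxy / sxx
b0 = ybar - b1 * xbar

# Chain C: SST = SSR + SSE -> F, R^2
sst = sum((yi - ybar) ** 2 for yi in y)
ssr = b1 * sxy
sse = sst - ssr
mse = sse / (n - 2)
f_stat = (ssr / 1) / mse
r2 = ssr / sst

# Chain B: SSE -> MSE -> se(b1) -> t -> decision
se_b1 = math.sqrt(mse / sxx)
t_stat = b1 / se_b1
T_CRIT = 2.447  # from the t table: 0.975 quantile, n - 2 = 6 df
reject_h0 = abs(t_stat) > T_CRIT

# Chain D: CI for the mean response vs PI for a new observation at xh
xh = 5
y_hat = b0 + b1 * xh
core = 1 / n + (xh - xbar) ** 2 / sxx
ci_half = T_CRIT * math.sqrt(mse * core)        # mean response
pi_half = T_CRIT * math.sqrt(mse * (1 + core))  # new observation: the "+1"

print(f"b1={b1:.4f}, b0={b0:.4f}, t={t_stat:.3f}, F={f_stat:.3f}, R2={r2:.4f}")
print(f"CI: {y_hat:.3f} +/- {ci_half:.3f}   PI: {y_hat:.3f} +/- {pi_half:.3f}")
```

With these numbers $S_{xx}=42$ and $S_{xy}=46$, so $b_1=46/42\approx 1.095$. Two checks are worth keeping on the sheet: in SLR $t^2=F$, and the PI is always wider than the CI at the same $x_h$.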
1) The "skeleton" of the whole course: master SLR (simple linear regression) first; everything after it is the same thing in new clothes
- Your bible splits the course logically into three blocks: SLR first (the foundation and core), then MLR (multiple regression, matrix form), then diagnostics and model selection. [6]

[6] Source: asksia-bible-stat7038-bilingual.pdf
From one line, to many predictors
Simple linear regression first (the foundation every later topic builds on), then multiple regression.
Contents (Ch . Topic . Core methods):
Part 1 . Simple linear regression - the core of the exam
- 1 . Simple linear regression: the model . LINE assumptions . least squares . LSE properties
- 2 . Inference in SLR: ANOVA . F-test & R² . t-tests & CIs . CI vs PI
Part 2 . Multiple regression & model building
- 3 . Diagnostics: residual plots . transformations . outliers, leverage, influence
- 4 . Matrix form & multiple regression: $X\beta$ . hat matrix . partial coefficients
- 5 . Inference in MLR: overall F vs partial t . adjusted R² . sequential SS
- 6 . Qualitative covariates & multicollinearity: dummies . interactions . VIF
- 7 . Model selection: Cp . AIC/BIC . PRESS . stepwise
Part 3 . Walk in ready
- 8 . Glossary & formula map: every term . what's on the tables, what's on your sheet
- 9 . Practice bank & worked solutions: the recurring exam templates, re-numbered
Why this order
STAT7038 spends Weeks 1-5 on simple linear regression - the model, least squares, inference and diagnostics - then Weeks 6-11 generalise to multiple regression. We keep that order because every multiple-regression result is the SLR result in matrix clothing: master the t-test, the ANOVA identity and the CI/PI distinction on one predictor, and the rest is bookkeeping. This volume covers Chapters 1 and 2 - the simple-linear core - in full depth.
- Suggested revision priorities:
- First priority: SLR "estimation + inference + CI/PI + the ANOVA identity + reading R output". This is the foundation for everything that follows. [6][8]

[8] Source: asksia-bible-stat7038-bilingual.pdf
✓ Recover n from the output
"on 6 degrees of freedom" means $df_E=n-2=6$, so $n=8$. The F line "on 1 and 6 DF" confirms it. From n you can rebuild any SE the printout hides.
summary stars
*** / ** / * flag significance at .001 / .01 / .05. Handy, but in a written answer quote the p-value or compare to the critical value - don't just cite stars.
"The whole of SLR inference is recoverable from six numbers on one printout. Learn to read it cold and the calculation questions become transcription with arithmetic." (WHY R-OUTPUT READING IS THE HIGHEST-YIELD EXAM SKILL)
DIAGNOSTICS - DO THE ASSUMPTIONS HOLD? (WEEKS 5-6 . HEAVILY EXAMINED)
The fit is only as good as LINE
Residual plots are how you check, not just assume, the four conditions.
Least squares always returns a line - even through data that has no business being modelled linearly. Inference (the t- and F-tests, the CIs and PIs) is only valid when the error assumptions hold. Diagnostics are the plots and statistics that interrogate them. Recall the four LINE assumptions:
- L: Linearity of $E[y\mid x]$
- I: Independent errors
- N: Normal errors
- E: Equal variance
AHA 1 . The residuals-vs-fitted plot
The single most useful diagnostic. Plot each residual $e_i=y_i-\hat y_i$ against its fitted value $\hat y_i$. Under the assumptions the residuals are a structureless cloud about the $e=0$ line. Read it for two things at once: curvature (a failure of Linearity) and changing spread (a failure of Equal variance).
[Figure: residual e vs fitted value; panel "GOOD"]
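The "recover n from the output" tip in excerpt [8] can be sketched as a small parser. This is only an illustration: the two footer lines are a hypothetical, abbreviated printout written in the layout that R's `summary()` of an `lm` fit uses, and the numbers in them are made up. The idea is to read the df, deduce p and n, and square the residual standard error to get MSE.

```python
import re

# Hypothetical, abbreviated summary(lm) footer; values are made up.
footer = (
    "Residual standard error: 0.9677 on 6 degrees of freedom\n"
    "F-statistic: 53.8 on 1 and 6 DF"
)

rse = re.search(
    r"Residual standard error: ([\d.]+) on (\d+) degrees of freedom", footer
)
s = float(rse.group(1))      # residual standard error = sqrt(MSE)
df_err = int(rse.group(2))   # n - p

f_line = re.search(r"F-statistic:\s+([\d.]+) on (\d+) and (\d+) DF", footer)
df_reg = int(f_line.group(2))  # p - 1

p = df_reg + 1   # parameters including the intercept
n = df_err + p   # sample size
mse = s ** 2     # MSE hidden behind the residual standard error

print(f"p={p}, n={n}, MSE={mse:.4f}")
```

The second df on the F line should always equal the residual df; if it does not, the table has been misread.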
- Second priority: Diagnostics (residual plots, Cook's distance, leverage, and telling outliers apart from influential points). The material flags this as "heavily examined". [8]
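To make the diagnostic quantities concrete, here is a minimal sketch that computes residuals, leverages and Cook's distances for a simple linear regression. The data are invented, and the formulas used are the standard SLR ones rather than anything quoted from the course material: $h_{ii}=1/n+(x_i-\bar x)^2/S_{xx}$ and $D_i=\frac{e_i^2}{p\,MSE}\cdot\frac{h_{ii}}{(1-h_{ii})^2}$ with $p=2$. No cut-off values are assumed here; take those from your own notes.

```python
# Hypothetical data (n = 8), made up for illustration only.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [3, 5, 4, 7, 8, 8, 11, 10]
n, p = len(x), 2  # p counts the intercept and the slope

xbar = sum(x) / n
ybar = sum(y) / n
sxx = sum((xi - xbar) ** 2 for xi in x)
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
b1 = sxy / sxx
b0 = ybar - b1 * xbar

# Residuals e_i = y_i - yhat_i: what the residuals-vs-fitted plot displays.
fitted = [b0 + b1 * xi for xi in x]
resid = [yi - fi for yi, fi in zip(y, fitted)]

# Leverage depends only on x: points far from xbar have high leverage.
leverage = [1 / n + (xi - xbar) ** 2 / sxx for xi in x]

# Cook's distance combines residual size with leverage.
mse = sum(e ** 2 for e in resid) / (n - p)
cooks_d = [
    (e ** 2 / (p * mse)) * h / (1 - h) ** 2 for e, h in zip(resid, leverage)
]

for xi, e, h, d in zip(x, resid, leverage, cooks_d):
    print(f"x={xi}  e={e:+.3f}  h={h:.3f}  D={d:.3f}")
```

Two built-in checks: least-squares residuals sum to zero, and the leverages sum to p. A point can have a large residual (outlier) with low leverage, or high leverage with a small residual; Cook's distance is large only when both combine to move the fit.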
- Third priority: MLR + multicollinearity + selection (VIF / Cp / AIC / BIC / PRESS / stepwise / sequential vs partial SS). [5][18][22]

[5] Source: asksia-bible-stat7038-bilingual.pdf
Glossary (term: one-line meaning):
- … near-linear dependence among predictors ($X^TX$ near-singular); inflates SEs, flips signs.
- Variance Inflation Factor: $VIF_j=1/(1-R_j^2)$; flag >5 (concerning) / >10 (serious); $\sqrt{VIF_j}$ inflates $se(b_j)$.
- Mallows' Cp: $C_p=SSE_p/MSE_{full}-(n-2p)$; want $C_p\approx p$ and small.
- AIC / BIC: $AIC=n\log(SSE/n)+2p$; BIC uses penalty $p\log n$ (heavier, so smaller models). Minimise.
- PRESS: $PRESS=\sum\left(e_i/(1-h_{ii})\right)^2$; leave-one-out predictive error; minimise.
- Stepwise selection: forward/backward/both by AIC (R step()); data-driven, so final p-values are over-optimistic.
- Hierarchy / marginality: keep lower-order terms ($x$, main effects) whenever a higher-order term ($x^2$, interaction) is retained.
- Parsimony / bias-variance: too few terms means bias (underfit); too many means variance (overfit). Prefer the simplest adequate model.
- Nested-model comparison term: $[SSE(R)-SSE(F)]/q$.
GLOSSARY - FORMULA SHEET & R-READING MAP
What goes on your A4 note vs what you execute
The STAT7038 final is closed-book but allows one A4 page of your own notes plus a calculator, with t/F/normal tables supplied. So fill the A4 with formulae and cut-offs (left) and spend revision on the procedures and R-reading the note cannot do for you (right).

[18] Source: asksia-cheatsheet-stat7038.pdf
Remedies: drop a redundant predictor; centre x (kills x vs x² collinearity); combine variables; collect better-spread data; ridge.
Condition number $\kappa=\sqrt{\lambda_{\max}/\lambda_{\min}}$ of the scaled $X^TX$; $\kappa>30$ signals collinearity problems - a global check to pair with the per-predictor VIFs. VIF pinpoints which predictor; $\kappa$ flags the system as a whole.
Trap: high VIFs from deliberately included higher-order/interaction terms are expected, not a fault (centring reduces them). Multicollinearity inflates se's & destabilises coefficients but does NOT bias $\hat y$ or $R^2$ within the data range.
25 . Model Selection (WK 11)
Smaller is better for Cp/AIC/BIC/PRESS; larger for adj-R².
MALLOWS' CP: $C_p=SSE_p/MSE_{full}-(n-2p)$; good: $C_p\approx p$ (low bias) & small.
AIC / BIC: $AIC=n\log(SSE/n)+2p$; $BIC=n\log(SSE/n)+p\log n$ (heavier, so smaller models).
PRESS (LEAVE-ONE-OUT): $PRESS=\sum\left(e_i/(1-h_{ii})\right)^2$; minimise.
PRESS measures out-of-sample prediction (each point predicted from a fit that excludes it), so it rewards genuine predictive power rather than in-sample fit.
Procedures: best-subset (regsubsets, feasible only for a modest number of predictors); forward (start null, add most significant); backward (start full, drop least significant); stepwise "both" - R step(), AIC-driven; read the trace, take the lowest-AIC move, stop when <none> is best.
Bias-variance: too few terms means biased (underfit); too many means inflated variance (overfit). Prefer the simplest adequate model (Occam). All the criteria are just different penalties balancing fit against complexity, so they need not agree - report which criterion you used.
Reading a step() trace: each block lists candidate add/drop moves with the resulting AIC; R takes the lowest-AIC move and stops when <none> tops the list. BIC's heavier penalty ($p\log n$, for $n\ge 8$) almost always lands on a smaller model than AIC, so quoting which criterion you used matters.
Trap: a stepwise-selected model's p-values/CIs are over-optimistic (selection inflates significance); AIC vs BIC can pick different models. Validate; don't treat the selected model as confirmed truth.
Trap List (SIDE 2):
- CI(mean) vs PI(new): the "+1" makes the PI wider
- seq SS is order-dependent; t = partial
- F significant but no t significant → multicollinearity (VIF)
- keep lower-order terms if the interaction is significant
- cut-offs are large-sample; look for a gap, not a threshold
- no extrapolation
- don't over-trust step()
Side 2 covers: MLR & DIAGNOSTICS . matrix form $(X^TX)^{-1}X^Ty$ . hat matrix . seq vs partial SS . nested F . dummies & interactions . leverage/Cook/DFFITS.
15 . Matrix Form

[22] Source: asksia-cheatsheet-stat7038.pdf
Side 1 covers: SLR & INFERENCE . the model & LINE . least squares $b_1=S_{xy}/S_{xx}$ . Gauss-Markov . ANOVA . F & R² . t-tests & CIs . CI vs PI . reading.
SIDE 2 OF 2: MLR . diagnostics . selection . VIF . Cp/AIC/BIC
STACKED MODEL (WK 5): $y=X\beta+\varepsilon,\ \varepsilon\sim N(0,\sigma^2I)$; $X$ is $n\times p$ (a column of 1s + predictors), $\beta$ is $p\times 1$.
LS SOLUTION (from $X^TX\hat\beta=X^Ty$): $\hat\beta=(X^TX)^{-1}X^Ty$
FITTED & HAT MATRIX: $\hat y=X\hat\beta=Hy$, $H=X(X^TX)^{-1}X^T$; $H$ is symmetric ($H^T=H$) & idempotent ($H^2=H$); $\mathrm{tr}(H)=p$; diagonals = leverages $h_{ii}$.
RESIDUALS: $e=(I-H)y$; $\mathrm{Var}(e)=\sigma^2(I-H)$, so $\mathrm{Var}(e_i)=\sigma^2(1-h_{ii})$.
COVARIANCE OF ESTIMATES: $\mathrm{Var}(\hat\beta)=\sigma^2(X^TX)^{-1}$, estimated by $MSE\,(X^TX)^{-1}$. The jth diagonal of $MSE\,(X^TX)^{-1}$ is $se(b_j)^2$; off-diagonals are $\mathrm{Cov}(b_j,b_k)$. R: vcov(lm) prints it; the square roots of the diagonal are the se's. Read matrix output, don't invert by hand.
16 . MLR Model (WK 6-7): p-1 predictors, p parameters
$y_i=\beta_0+\beta_1x_{i1}+\dots+\beta_{p-1}x_{i,p-1}+\varepsilon_i$
$\hat\sigma^2=MSE=SSE/(n-p)$; residual SE on $n-p$ df.
"Partial" coefficients: $\beta_j$ = expected change in y per 1-unit increase in $x_j$, holding all other predictors constant. This conditional meaning is why a coefficient's sign can differ from the simple pairwise correlation of $x_j$ with y (see §24).
16b . Read vcov / matrix output (EXAM CAN GIVE IT)
You may be handed $MSE\,(X^TX)^{-1}$ (the vcov matrix). To extract:
- $se(b_j)=\sqrt{\text{jth diagonal entry}}$
- $\mathrm{Cov}(b_j,b_k)$ = the $(j,k)$ off-diagonal
- $\mathrm{corr}(b_j,b_k)=\mathrm{Cov}(b_j,b_k)/(se(b_j)\,se(b_k))$
Worked: if diag(vcov) = (4, 0.25, 0.09) then $se(b_0)=2$, $se(b_1)=0.5$, $se(b_2)=0.3$. A t-test of $\beta_1$: $t=b_1/0.5$. No matrix inversion by hand - just read entries.
- Suggested revision priorities:
-
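The selection and collinearity criteria named in the third-priority item are all simple functions of quantities you read off the supplied output. Below is a minimal Python sketch, assuming the formula-sheet forms $VIF_j = 1/(1-R_j^2)$, $C_p = SSE_p/MSE_{full} - (n-2p)$, $AIC = n\log(SSE/n) + 2p$ and $BIC = n\log(SSE/n) + p\log n$, with $p$ counting parameters including the intercept. The model names, SSE values and $n$ are hypothetical and chosen only for illustration.

```python
import math


def vif(r2_j):
    """VIF_j = 1 / (1 - R_j^2), with R_j^2 from regressing x_j on the other predictors."""
    return 1.0 / (1.0 - r2_j)


def mallows_cp(sse_p, mse_full, n, p):
    """C_p = SSE_p / MSE_full - (n - 2p); p counts parameters including the intercept."""
    return sse_p / mse_full - (n - 2 * p)


def aic(sse, n, p):
    """AIC = n * log(SSE / n) + 2p (natural log)."""
    return n * math.log(sse / n) + 2 * p


def bic(sse, n, p):
    """BIC = n * log(SSE / n) + p * log(n); heavier penalty than AIC once log(n) > 2."""
    return n * math.log(sse / n) + p * math.log(n)


# Hypothetical candidates: model name -> (p, SSE), all fitted to the same n = 30 points.
n = 30
candidates = {"reduced": (3, 120.0), "full": (5, 110.0)}
p_full, sse_full = candidates["full"]
mse_full = sse_full / (n - p_full)

for name, (p, sse) in candidates.items():
    print(name,
          "Cp =", round(mallows_cp(sse, mse_full, n, p), 2),
          "AIC =", round(aic(sse, n, p), 2),
          "BIC =", round(bic(sse, n, p), 2))

# R_j^2 = 0.80 sits right at the "concerning" VIF cut-off of 5.
print("VIF at R_j^2 = 0.80:", round(vif(0.80), 2))
```

A useful sanity check: for the full model, $C_p$ always equals $p$, because $SSE_{full}/MSE_{full} = n - p$. Smaller is better for all three selection criteria, and they need not pick the same model.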
2) Core definitions you must know reflexively (the kind where one written sentence earns the marks)
-
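2.1 Parameter / estimator / estimate
- Parameters ($\beta_j$, $\sigma^2$) are fixed unknowns; estimators ($b_j$, $\hat\sigma^2$) are random statistics computed from the sample; an estimate is one realised value of an estimator. [7]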
-
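2.2 Sampling distribution
- The distribution of an estimator over repeated samples; it is the source of the estimator's standard error. [7]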
-
2.3 SLR model (simple linear regression)
- Model: $y_i=\beta_0+\beta_1 x_i+\varepsilon_i$, with errors usually assumed normal, $\varepsilon_i \sim N(0,\sigma^2)$. [7]
- Conditional mean: $E(y\mid x)=\beta_0+\beta_1 x$ (the mean response is a straight line). [7]
-
2.4 The four LINE assumptions (the foundation for valid inference)
- L: $E(y\mid x)$ is linear in $x$
- I: the errors are independent
- N: the errors are normally distributed
- E: equal variance (the error variance does not change with $x$) [8][27]
- Key exam phrasing: least squares will always fit a line, but the t-tests, F-tests, CIs and PIs are reliable only when the error assumptions hold. [8]
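To make the exam phrasing above concrete, here is a small Python sketch that fits a straight line to data that are exactly quadratic. It assumes the standard SLR least-squares formulas $b_1 = S_{xy}/S_{xx}$ and $b_0 = \bar y - b_1\bar x$; the data and the helper name `fit_line` are hypothetical. Least squares happily returns a line, but the residual signs show the systematic curve that a residuals-vs-fitted plot would reveal.

```python
def fit_line(x, y):
    """Return (b0, b1) for the least-squares line through (x, y)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return b0, b1


# Hypothetical data: y is exactly x^2, so a straight line is the wrong model.
x = [-2, -1, 0, 1, 2]
y = [xi ** 2 for xi in x]

b0, b1 = fit_line(x, y)
residuals = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
signs = "".join("+" if e > 0 else "-" for e in residuals)

print("fitted line: y =", b0, "+", b1, "* x")
print("residuals:", residuals)
print("sign pattern along x:", signs)  # positive at both ends, negative in the middle
```

The residuals still sum to zero, as they do for any least-squares fit with an intercept, yet the ordered sign pattern is a clear failure of the Linearity assumption. That is why the fit alone is never evidence that inference is valid.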
-
3) Must-know formula belt (compress these onto your A4)
-
3.1 The three core sums of squares and cross-products
- $S_{xx}=\sum (x_i-\bar x)^2$
- $S_{yy}=\sum (y_i-\bar y)^2$
- $S_{xy}=\sum (x_i-\bar x)(y_i-\bar y)$ [7][27]
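Exam questions often hand you raw sums ($\sum x$, $\sum y$, $\sum x^2$, $\sum xy$) rather than the data. The sketch below, in Python, assumes the usual shortcut identities $S_{xx}=\sum x^2-(\sum x)^2/n$, $S_{yy}=\sum y^2-(\sum y)^2/n$ and $S_{xy}=\sum xy-(\sum x)(\sum y)/n$, and checks them against the definitions above. The data and the function name `s_quantities` are hypothetical.

```python
import math


def s_quantities(n, sum_x, sum_y, sum_x2, sum_y2, sum_xy):
    """Convert raw sums into (Sxx, Syy, Sxy) using the shortcut forms."""
    sxx = sum_x2 - sum_x ** 2 / n
    syy = sum_y2 - sum_y ** 2 / n
    sxy = sum_xy - sum_x * sum_y / n
    return sxx, syy, sxy


# Hypothetical data, used only to show that both routes agree.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

# Route 1: the definitions (deviations from the means).
xbar, ybar = sum(x) / n, sum(y) / n
sxx_def = sum((xi - xbar) ** 2 for xi in x)
syy_def = sum((yi - ybar) ** 2 for yi in y)
sxy_def = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))

# Route 2: raw sums, as they would be supplied in an exam question.
sxx, syy, sxy = s_quantities(
    n,
    sum(x),
    sum(y),
    sum(xi * xi for xi in x),
    sum(yi * yi for yi in y),
    sum(xi * yi for xi, yi in zip(x, y)),
)

# Sample correlation from the three S-quantities.
r = sxy / math.sqrt(sxx * syy)
print(sxx, syy, sxy, round(r, 4))
```

Both routes give the same three numbers, so on the exam you can go straight from the supplied sums to the S-quantities without reconstructing any deviations.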
-
3.2 Least-squares estimates (SLR)
- Slope: $b_1=\dfrac{S_{xy}}{S_{xx}}$
- Intercept: $b_0=\bar y-b_1\bar x$ [4][12][24]
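The two estimates above can be computed from four summary numbers alone. This is a minimal Python sketch of that calculation; the function name `slr_estimates` and the input values are hypothetical.

```python
def slr_estimates(sxy, sxx, xbar, ybar):
    """Return (b0, b1): slope b1 = Sxy / Sxx, intercept b0 = ybar - b1 * xbar."""
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    return b0, b1


# Hypothetical summary quantities, as they might be read off a question sheet.
xbar, ybar = 3.0, 4.0
b0, b1 = slr_estimates(sxy=6.0, sxx=10.0, xbar=xbar, ybar=ybar)

print("b1 =", round(b1, 3))
print("b0 =", round(b0, 3))

# The fitted line always passes through the centroid (xbar, ybar).
print("fitted value at xbar:", round(b0 + b1 * xbar, 3), "vs ybar:", ybar)
```

Computing the slope first matters: the intercept formula depends on $b_1$, so a slip in the slope propagates to $b_0$ and to every fitted value.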
-
3.3 Fitted values, residuals, and the error-variance estimate
- Fitted value: $\hat y_i=b_0+b_1x_i$
- Residual: $e_i=y_i-\hat y_i$
- $SSE=\sum e_i^2$
- $MSE=\hat\sigma^2=\dfrac{SSE}{n-2}$ (divide by $n-2$, not $n$: two degrees of freedom are spent estimating $\beta_0$ and $\beta_1$) [12][27]
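The chain from data to $MSE$ in this subsection can be sketched in a few lines of Python. The data and the function name `fit_and_residuals` are hypothetical; the formulas are the ones listed above, plus the two identities $\sum e_i = 0$ and $\sum x_i e_i = 0$ that every least-squares fit with an intercept satisfies.

```python
def fit_and_residuals(x, y):
    """Fit the SLR line and return (fitted, residuals, SSE, MSE)."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b1 = sxy / sxx
    b0 = ybar - b1 * xbar
    fitted = [b0 + b1 * xi for xi in x]
    residuals = [yi - fi for yi, fi in zip(y, fitted)]
    sse = sum(e * e for e in residuals)
    mse = sse / (n - 2)  # n - 2, not n: two parameters were estimated
    return fitted, residuals, sse, mse


# Hypothetical data for illustration only.
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
fitted, residuals, sse, mse = fit_and_residuals(x, y)

print("fitted:   ", [round(f, 2) for f in fitted])
print("residuals:", [round(e, 2) for e in residuals])
print("SSE =", round(sse, 4), " MSE =", round(mse, 4))

# Two identities that hold for every least-squares fit with an intercept.
print("sum of residuals:", round(sum(residuals), 10))
print("sum of x * residual:", round(sum(xi * e for xi, e in zip(x, residuals)), 10))
```

If a hand calculation gives residuals that do not sum to zero (up to rounding), the slope or intercept was computed incorrectly.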
-
3.4 $R^2$ / ANOVA 恒等式(考表格补全与 F 检验)
- $SST=SSR+SSE$(df 也要加:$n-1=1+(n-2)$)[27]Source: asksia-cheatsheet-stat7038.pdfSTAT7038 Regression Modelling AUSTRALIAN NATIONAL UNIVERSITY . RSFAS EXAM REVISION Sem 1 2026 . SIDE 1 OF 2 SLR · estimation . inference SIDE 1/2 R output READ FIRST 0 . Exam Blueprint * Final = 70% . 180 min (+15 read) . MCQ + short-answer calculation + written, all of weeks 1-12. Also: online quiz 5%, in- tutorial quiz (SLR) 10%, assignment 15%. * The exam permits ONE A4 double-sided typed or printed notes sheet - and a calculator, R outputs & statistical tables are SUPPLIED. So this sheet IS a compliant memory aid: spend zero space on tables/R syntax, max out formulae, decision rules, cut-offs & method recipes. Every hypothesis test shows all 5: (1) hypotheses, (2) test statistic, (3) critical value w/ df (or p), (4) decision, (5) conclusion in context. a = 5% unless stated; log = natural log. - - SIA > The killer combo: read the supplied R output, pull the numbers, plug into the formula on this sheet, run the 5-step test. Memorise the recipes & cut-offs, not the tables. 1 . Building Blocks NOTATION Parameter (ß1, 02) = fixed unknown; estimator (b1, 02) = random, from the sample; estimate = its realised value. The sampling distribution is the estimator's distribution over repeated samples. CORE SUMS (MEMORISE) x == (1/n) ΣΧΙ · Sxx = Σ(Χi -x) 2 S_yy = Σ( yi-y)2 · Sxγ = Σ (Xi -X) (γi -y) Sx2 = Sxx/(n-1) . Cov = Sxy/(n-1) r = Sxy/V(SxxS_yy) = Sxy/((n-1) SxSy) EXPECTATION / VARIANCE RULES E(aX+b) = aE(X)+b Var(aX+b) = a2Var(X) Var (Σai Υ1)=Σa12Var (Υ1)+2_{i<j}aia; Cov [Yi,Υ;) (independent - Cov terms vanish) re [-1,1] measures linear association only. These E/Var rules drive every variance derivation below (e. g. Var(b,) treats the y, as the only random part, since the x, are fixed and the & carry all the randomness). Trap: correlation # causation; a strong r can be driven by one outlier or a lurking variable; r = 0 => no linear link, not "no relationship". 1b . 
The 5 SS Identities KEEP HANDY Everything in inference is built from five sums; learn how each is recovered from the others: SST = S_yy = E(yi-y)2 (df n-1) SSR = b1 . Sxy = b12Sxx = R2 . SST (df 1) SSE = SST - SSR = Ee:2 (df n-2) MSE = SSE/(n-2) . MSR = SSR/1 R2 = SSR/SST . r = ±VR2 (sign of bi) If an R table hides one cell, back it out: e. g. SSE = SST(1-R2) , or SSR = F·MSE. Degrees of freedom always add: df_total (n-1) = df_reg + df_err. In SLR that's (n-1) = 1 + (n-2); in MLR (n-1) = (p-1) + (n-p). A quick df check catches most table-fill slips. The "p" you divide by is the number of estimated parameters including the intercept - count the rows in the coefficient table. Get p wrong and every df, MSE and cut-off downstream is wrong too. In SLR p = 2. 2 . SLR Model + LINE SIMPLE LINEAR REGRESSION V1 = Be + BiXi + 81 , 81 ~iid N(0, 02) " yı ~ind N(Be+ß1X1 , 02) E(y|x) = Be + B1x (mean response) Four assumptions - LINE: Linearity of E(y|x); Independence of &; Normality of &; Equal variance Var(s)=o2. The x, are fixed, measured without error. Interpret: [] = expected change in y per 1-unit + in x; Bo = expected y at x = 0 (often a meaningless extrapolation). 3 . Least-Squares Estimation
- $R^2=\dfrac{SSR}{SST}=1-\dfrac{SSE}{SST}$ [21][27]
- In SLR: $MSR=SSR/1=SSR$ and $F=\dfrac{MSR}{MSE}$; moreover $F=t^2$, so the overall F-test and the slope t-test agree exactly (same two-sided p-value). In MLR the overall F is not equivalent to any single coefficient's t. [21][24]
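The cheatsheet's worked example [21] (n = 20, SST = 400, R² = 0.64) can be checked with a few lines of arithmetic. The sketch below is only an illustration of the identities above; the function name `slr_anova_from_r2` is made up for this note, and the critical value 4.41 is the tabulated $F_{1,18}(0.95)$ that the exam tables would supply.

```python
import math

def slr_anova_from_r2(n, sst, r2):
    """Rebuild the SLR ANOVA quantities from n, SST and R^2 (hypothetical helper)."""
    ssr = r2 * sst          # SSR = R^2 * SST
    sse = sst - ssr         # SSE = SST - SSR
    msr = ssr / 1           # regression df is 1 in SLR, so MSR = SSR
    mse = sse / (n - 2)     # error df is n - 2 in SLR
    f_stat = msr / mse      # overall F on (1, n - 2) df
    abs_t = math.sqrt(f_stat)  # |t| for the slope, because F = t^2 in SLR
    return {"SSR": ssr, "SSE": sse, "MSR": msr, "MSE": mse, "F": f_stat, "abs_t": abs_t}

result = slr_anova_from_r2(n=20, sst=400, r2=0.64)
F_CRIT_1_18 = 4.41  # F_{1,18}(0.95), read from the supplied table
reject_h0 = result["F"] > F_CRIT_1_18

for name, value in result.items():
    print(f"{name}: {value:.3f}")
print("Reject H0: beta1 = 0?", reject_h0)
```

In the exam you do the same steps by hand; the point is the order: SSR and SSE first, then the mean squares, then F and the cross-check $|t|=\sqrt{F}$.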
3.5 Standard errors (se) and the t-test skeleton
- The usual reading: $t=\dfrac{b}{se(b)}$. The material states it outright: the t value is Estimate / Std. Error, and it tests $H_0:\beta_j=0$. [16][20]
- The "hidden quantity" recoveries you will need most often:
- $se = b/t$
- $MSE=(\text{residual SE})^2$
- $df_E=n-p$ (in SLR $p=2$; in MLR $p$ is the number of estimated parameters, including the intercept) [20][25][27]
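The three recoveries above can be sketched as a small helper. Everything numeric here is invented for illustration (an estimate of 2.5 with t value 5.0, and a residual SE of 2.0 on 10 degrees of freedom); the function name `recover_hidden` is also hypothetical. The extra line SSE = MSE × df_E follows directly from the definition MSE = SSE / df_E.

```python
def recover_hidden(estimate, t_value, resid_se, df_error, p=2):
    """Back out quantities a summary printout does not show directly.

    p is the number of estimated parameters including the intercept
    (p = 2 in SLR).
    """
    se = estimate / t_value   # se = b / t, since t = b / se
    mse = resid_se ** 2       # MSE = (residual SE)^2
    n = df_error + p          # df_E = n - p, so n = df_E + p
    sse = mse * df_error      # SSE = MSE * df_E
    return se, mse, n, sse

# Illustrative numbers only, not taken from any real output.
se, mse, n, sse = recover_hidden(estimate=2.5, t_value=5.0, resid_se=2.0, df_error=10)
print(f"se = {se}, MSE = {mse}, n = {n}, SSE = {sse}")
```

With n and MSE in hand, any standard error the printout hides can be rebuilt from the formulas on the sheet.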
4) "Every test is marked on all 5 steps": use the same template every time
- Your cheatsheet stresses that every hypothesis test is marked on all 5 steps; skip one and you lose the method marks for it: [16][27]
- 1) State $H_0/H_a$ clearly (in symbols)
- 2) Write the test statistic (e.g. $t=b/se$ or $F=MSR/MSE$)
- 3) Give the critical value with its degrees of freedom (or the p-value; the tables are supplied in the exam)
- 4) Make the decision: reject / fail to reject
- 5) Conclude in context: say it in plain words using the question's variable names (do not stop at "reject $H_0$") [16][10]
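The five steps can be walked through on a slope test. This is a minimal sketch under stated assumptions: the summary statistics (n = 12, Sxx = 200, Sxy = 150, Syy = 160) are illustrative, the function name `slope_t_test` is hypothetical, and the critical value 2.228 is the tabulated $t_{10}(0.975)$ you would read from the supplied table rather than compute.

```python
import math

def slope_t_test(n, sxx, sxy, syy, t_crit):
    """Two-sided t-test of H0: beta1 = 0 in SLR from summary sums."""
    b1 = sxy / sxx                 # least-squares slope
    sse = syy - b1 * sxy           # SSE = SST - SSR, with SSR = b1 * Sxy
    mse = sse / (n - 2)            # error df is n - 2
    se_b1 = math.sqrt(mse / sxx)   # se(b1) = sqrt(MSE / Sxx)
    t_stat = b1 / se_b1
    reject = abs(t_stat) > t_crit
    return b1, se_b1, t_stat, reject

T_CRIT_10 = 2.228  # t_{10}(0.975), from the supplied table
b1, se_b1, t_stat, reject = slope_t_test(n=12, sxx=200, sxy=150, syy=160, t_crit=T_CRIT_10)

print("1) H0: beta1 = 0  vs  Ha: beta1 != 0")
print(f"2) t = b1 / se(b1) = {b1:.3f} / {se_b1:.4f} = {t_stat:.3f}")
print(f"3) critical value t_(10)(0.975) = {T_CRIT_10}")
print("4) decision:", "reject H0" if reject else "fail to reject H0")
print("5) conclusion: state, in the question's own variables, whether x has a significant linear effect on y")
```

Step 5 is left as a prompt on purpose: the code can produce the decision, but the contextual sentence has to name the actual response and predictor from the question.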
5) CI vs PI: the final's "classic trap"; pick the right one the moment you read the question
- What they share: the same point prediction $\hat y_h=b_0+b_1x_h$; both intervals are centred on $\hat y_h$. [20]
- The only difference is the standard error (the term under the square root): [20][24]
- CI for the mean response, CI (mean):
$$\hat y_h \pm t_{n-2,\,1-\alpha/2}\ \hat\sigma\sqrt{\frac{1}{n}+\frac{(x_h-\bar x)^2}{S_{xx}}}$$
- PI for a single new observation, PI (new):
$$\hat y_h \pm t_{n-2,\,1-\alpha/2}\ \hat\sigma\sqrt{1+\frac{1}{n}+\frac{(x_h-\bar x)^2}{S_{xx}}}$$
- 选用规则(材料用“匹配措辞”反复强调):
- “average / mean response / 平均值” ⇒ CI
- “predict one new / individual / 一个新个体” ⇒ PI(因为有那个不可消除的 $+1$)[16]Source: asksia-cheatsheet-stat7038.pdf13b · Log Interpretation BACK- TRANSFORM Fit log y = bo + byx. On the original scale a 1-unit rise in x multiplies y by e^{b1} (not "+b,"). A coefficient difference of b2 between groups = y differs by a factor e^{b2}. Back-transform a CI by exponentiating the endpoints: if log-scale CI = (L, U) then the multiplicative-factor CI = (e^L, e^U). Trap: never just exponentiate the point estimate and call it the mean - that's the median on the original scale. Quick read: a log-y coefficient of 0. 05 = a 5% increase per unit x (since e^{0. 05}=1. 051); for small coefficients the percentage = 100-b1. This shortcut is handy for interpreting the supplied R output fast - but state it as "approximately" and use e^{b} for exact factors. 14 . 5-Step Test Recipe USE ON EVERY Q 1. Hypotheses Ho / Ha (state in symbols) 2. Statistic - t = b/se, F = MSR/MSE, etc. 3. Critical value w/ df (or p-value) from the supplied tables 4. Decision - reject / fail to reject Ho 5. Conclusion in context - plain words + the variable a = 5% unless stated. Show working - no rounding of intermediates. 14b . Exam-Day Tactics SCORING · MCQ first (fast points), then short-calc, then written. · Quote the metric / cut-off, not the adjective - markers reward the number + the rule. · State the sampling distribution & df explicitly (~t_{n-2},~F_{1,n-2}). · For "which interval?" - read for "average" (CI) vs "a new/individual" (PI). Every written answer ends in context with the variable name. · Redeemable quizzes: a strong exam can lift them, so the exam carries it all.[20]Source: asksia-cheatsheet-stat7038.pdfasksia. ai/cheatsheet/ anu-stat7038 · side 1/2 AskSia CHEATSHEET SERIES 9 · CI (mean) vs PI * CLASSIC TRAP (new) At x = x_h both intervals share the same point estimate ŷ_h = bo + b1x_h; they differ only in the SE. 
CI FOR THE MEAN E(Y|X_H) @_h + t_{n-2} (1-a/2) · ôv(1/n + (x_h-x] 2/Sxx) PI FOR A NEW OBS Y_NEW ŷ_h + t_{n-2} (1-a/2) · ôv(1 + 1/n + (x_h-x]2/Sxx) Why PI is wider: it carries the extra o2 of the new point's own error &_new - the "+1" under the root - on top of the uncertainty in the estimated mean. Both are narrowest at x_h = x and flare as you move away (the (x_h-x)2 term). R: predict ( . . . , interval="confidence" ) vs "prediction". As n->oo the CI shrinks to a point (you know the mean) but the PI stays finite - it can never beat the irreducible o of a single new draw. SIA > Match the wording: "average value for . . . " ++ CI; "predict one new . . . " ++ PI. Forgetting the "+1", or swapping them, loses marks. Never extrapolate beyond the observed x range. 8 . Reading summary(Im) SUPPLIED IN EXAM Estimate Std. Error t value Pr(>|t|) (Int) b0 se0 t0 p0 x b1 se1 t1 p1 Resid SE: VMSE on n-2 df Mult R2: R2 . Adj R2: R2adj F: F on 1 and n-2 DF, p: p Read off: Estimate/Std. Error = bj, se(b }); t value = bj/se(bj) (tests ßj=0); Pr(>|t|) = two-sided p. Stars *** ** *=. 001/. 01/. 05. The bottom three lines bundle the global picture: residual SE + its df give MSE and n-p; the two R2 values compare raw vs size-penalised fit; the F-line is the overall test. A small p on the F-line but big p's on every coefficient = suspect multicollinearity (Side 2). RECOVER HIDDEN QUANTITIES se = b/t . MSE = (resid SE)2 . df_E = n-p 8b . Fill the ANOVA Table RECURRING EXAM TASK
- Extra reminder: the material explicitly says never to extrapolate beyond the observed range of $x$. [20] (Source: asksia-cheatsheet-stat7038.pdf)
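To make the CI-vs-PI rule concrete, here is a minimal sketch that reproduces the cheatsheet's own worked example ($\hat y_h=40$, $\hat\sigma=2$, $n=12$, $S_{xx}=80$, $\bar x=10$, $x_h=14$, $t_{10}(0.975)=2.228$). The function name `mean_ci_and_pi` and its argument names are my own illustrative choices, not something from the course; in the exam you do this arithmetic on the calculator.

```python
import math

def mean_ci_and_pi(y_hat, sigma_hat, n, x_h, x_bar, sxx, t_crit):
    """Return (ci, pi) at x_h for a simple linear regression fit.

    t_crit is the t_{n-2}(1 - alpha/2) value read from the supplied tables.
    """
    core = 1.0 / n + (x_h - x_bar) ** 2 / sxx    # part shared by both standard errors
    se_mean = sigma_hat * math.sqrt(core)         # SE for the mean response E(Y | x_h)
    se_new = sigma_hat * math.sqrt(1.0 + core)    # the extra "+1" for one new observation
    ci = (y_hat - t_crit * se_mean, y_hat + t_crit * se_mean)
    pi = (y_hat - t_crit * se_new, y_hat + t_crit * se_new)
    return ci, pi

# Numbers taken from the cheatsheet's worked CI-vs-PI example.
ci, pi = mean_ci_and_pi(y_hat=40.0, sigma_hat=2.0, n=12,
                        x_h=14.0, x_bar=10.0, sxx=80.0, t_crit=2.228)
print("95% CI for the mean:", tuple(round(v, 2) for v in ci))
print("95% PI for a new observation:", tuple(round(v, 2) for v in pi))
```

Both intervals are centred on 40; only the term under the root changes, which is why the PI is always the wider of the two.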
-
6) Reading R output: the highest-yield skill, and where your time is best spent
- The material puts it bluntly: nearly all of SLR inference can be recovered from a handful of numbers on one summary printout, so practise until you can read it cold. [8] (Source: asksia-bible-stat7038-bilingual.pdf)
-
6.1 Fields you must be able to read from summary(lm)
- Coefficients table: read $b$, $se$, $t=b/se$, and the p-value (Pr(>|t|) is two-sided). [20] (Source: asksia-cheatsheet-stat7038.pdf)
- "Residual standard error: … on … degrees of freedom":
- $MSE=(\text{resid SE})^2$
- $df_E=n-p$, so you can back out $n$ from the df (a common small trap). For example, in SLR "on 6 degrees of freedom" means $n-2=6$, so $n=8$. [8] (Source: asksia-bible-stat7038-bilingual.pdf) [25] (Source: asksia-cheatsheet-stat7038.pdf)
- "F-statistic: … on … and … DF": the overall test (in SLR it agrees with the slope's t-test, $F=t^2$, with identical p-values). [8] (Source: asksia-bible-stat7038-bilingual.pdf) [21] (Source: asksia-cheatsheet-stat7038.pdf)
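The back-calculations in this subsection can be sketched in a few lines. The numbers below are the cheatsheet's worked example ($n=20$, $SST=400$, $R^2=0.64$); the helper names `fill_slr_anova` and `n_from_resid_df` are hypothetical, chosen only for illustration.

```python
import math

def fill_slr_anova(n, sst, r2):
    """Complete the SLR ANOVA quantities from n, SST and R^2."""
    ssr = r2 * sst                 # R^2 = SSR / SST
    sse = sst - ssr                # ANOVA identity: SST = SSR + SSE
    df_e = n - 2                   # SLR has p = 2 parameters
    msr = ssr / 1                  # regression df is 1 in SLR, so MSR = SSR
    mse = sse / df_e
    f_stat = msr / mse
    return {
        "SSR": ssr, "SSE": sse, "df_E": df_e, "MSE": mse, "F": f_stat,
        "abs_t": math.sqrt(f_stat),      # |t| for the slope, since F = t^2 in SLR
        "resid_SE": math.sqrt(mse),      # the "Residual standard error" line
    }

def n_from_resid_df(df_e, p=2):
    """Recover n from the residual df printed by summary (df_E = n - p)."""
    return df_e + p

out = fill_slr_anova(n=20, sst=400.0, r2=0.64)
print({k: round(v, 3) for k, v in out.items()})
print("n recovered from 'on 6 degrees of freedom':", n_from_resid_df(6))
```

The same relations run in reverse on a printout: square the residual SE to get MSE, add $p$ to the residual df to get $n$, and take the square root of the F-statistic to check the slope's $|t|$.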
-
6.2 How to use the stars (*** ** *)
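- *** / ** / * flag significance at the .001 / .01 / .05 levels. They are a handy quick read, but in a written answer quote the p-value or compare the statistic with the critical value rather than citing the stars alone. [8] (Source: asksia-bible-stat7038-bilingual.pdf)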
7) Diagnostics: in the exam, residual plots appear as checkbox questions plus explanation questions
- Core idea: least squares always returns a line, but you must use diagnostics to check whether the LINE assumptions (Linearity, Independence, Normality, Equal variance) actually hold. [8] (Source: asksia-bible-stat7038-bilingual.pdf)
- The most frequently examined plot-to-conclusion mapping (the material states it explicitly): [8] (Source: asksia-bible-stat7038-bilingual.pdf) [17] (Source: asksia-cheatsheet-stat7038.pdf)
- Curved smoother on Residuals vs Fitted ⇒ non-linearity
- Funnel widening ⇒ non-constant variance
- Q-Q tails off the line ⇒ non-normality
- One large Cook's D ⇒ influential point
- Random band, points on the Q-Q line ⇒ assumptions OK
- Common traps (named in the material):
- R automatically labels the 3 most extreme points, but being labelled does not make a point an outlier. Judge against the cut-offs and the gap to the next point; a Q-Q-flagged point with |studentised residual| < 2 is not an outlier. [17] (Source: asksia-cheatsheet-stat7038.pdf)
- If you see a curved-then-funnel pattern, fix the curvature first (wrong functional form) before addressing the variance. [28] (Source: asksia-cheatsheet-stat7038.pdf)
-
8) Multiple regression (MLR): focus on what it adds beyond SLR
- The key re-skin is the matrix form: $y=X\beta+\varepsilon$; the least-squares solution is $\hat\beta=(X^TX)^{-1}X^Ty$; the fitted values are $\hat y=Hy$ with $H=X(X^TX)^{-1}X^T$, whose diagonal entries are the leverages $h_{ii}$. [22] (Source: asksia-cheatsheet-stat7038.pdf)
- Error-variance estimate: $MSE=SSE/(n-p)$ (note the df is $n-p$, which is no longer $n-2$ once there is more than one predictor). [22] (Source: asksia-cheatsheet-stat7038.pdf)
- Interpreting a "partial coefficient" (one sentence earns the marks in the exam):
- $\beta_j$: the expected change in $y$ per 1-unit increase in $x_j$, holding all other predictors constant. [22] (Source: asksia-cheatsheet-stat7038.pdf)
- This is why a coefficient's sign can differ from the direction of the simple pairwise correlation (which foreshadows multicollinearity and confounding). [22] (Source: asksia-cheatsheet-stat7038.pdf)
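As a rough illustration of the matrix formulas in this section, the sketch below builds $\hat\beta$, the hat matrix $H$, the leverages and MSE for a tiny design with an intercept and one predictor ($p=2$). The data are made up purely for illustration, and the pure-Python helpers (`transpose`, `matmul`, `inverse_2x2`) are hypothetical stand-ins; in the exam you only read such quantities off the supplied R output.

```python
def transpose(a):
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    bt = transpose(b)
    return [[sum(u * v for u, v in zip(row, col)) for col in bt] for row in a]

def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

# Hypothetical toy data: n = 5 observations, intercept + one predictor (p = 2).
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
X = [[1.0, xi] for xi in x]      # design matrix: column of 1s, then x
Y = [[yi] for yi in y]           # response as a column vector

Xt = transpose(X)
XtX_inv = inverse_2x2(matmul(Xt, X))
beta_hat = matmul(matmul(XtX_inv, Xt), Y)     # (X'X)^{-1} X'y
H = matmul(matmul(X, XtX_inv), Xt)            # hat matrix X (X'X)^{-1} X'
y_fit = matmul(H, Y)                          # fitted values H y
leverages = [H[i][i] for i in range(len(x))]  # diagonal entries h_ii

n, p = len(x), len(X[0])
sse = sum((yi - fi[0]) ** 2 for yi, fi in zip(y, y_fit))
mse = sse / (n - p)                           # error-variance estimate on n - p df

print("beta_hat:", [round(b[0], 4) for b in beta_hat])
print("leverages:", [round(h, 4) for h in leverages])
print("sum of leverages (should equal p):", round(sum(leverages), 6))
print("MSE:", round(mse, 4))
```

The leverages sum to $p$ because $\mathrm{tr}(H)=p$, which is a quick sanity check whenever leverage values are supplied.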
-
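9) Sequential vs Partial: the topic most likely to tie you in knots, but the exam pattern is very fixed
- Your material gives the single most important sentence:
- The standard exam question: test the joint contribution of a group of coefficients (extra-SS / nested F):
- Formula skeleton (full vs reduced):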
$$F=\frac{(SSE_R-SSE_F)/q}{SSE_F/(n-p_F)}\sim F_{q,\ n-p_F}$$
where $q$ is the number of coefficients dropped to get from the full model to the reduced model, and $p_F$ is the number of parameters in the full model. [26] (Source: asksia-cheatsheet-stat7038.pdf)
- The shortcut holds only when the tested terms are last and consecutive in the sequential table: sum the last few sequential SS (and their df) to form the numerator. Otherwise refit with the terms last, or use a full-vs-reduced comparison. [26] (Source: asksia-cheatsheet-stat7038.pdf) [30] (Source: asksia-cheatsheet-stat7038.pdf)
- Formula skeleton (full vs reduced): $F=\dfrac{[SSE(R)-SSE(F)]/q}{SSE(F)/(n-p_F)}\sim F_{q,\,n-p_F}$
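The full-vs-reduced F test can be sketched numerically as below. This is a minimal illustration, not something you run in the exam: it reuses the cheatsheet's own worked numbers ($SSE(R)=150$, $SSE(F)=90$, $q=2$, $n=30$, $p_F=4$) and the tabulated critical value $F_{2,26}(.95)=3.37$; the function name `nested_f` is a hypothetical helper.

```python
def nested_f(sse_reduced, sse_full, q, n, p_full):
    """Extra-sum-of-squares F statistic for a full vs reduced comparison.

    Returns the F statistic and its (numerator, denominator) degrees of freedom.
    """
    numerator = (sse_reduced - sse_full) / q   # SSR(extra) / q
    mse_full = sse_full / (n - p_full)         # MSE of the full model
    return numerator / mse_full, (q, n - p_full)


# Worked numbers quoted in the cheatsheet excerpt (section 20b).
f_stat, df = nested_f(sse_reduced=150, sse_full=90, q=2, n=30, p_full=4)

# Critical value read from the supplied F table, not computed here.
critical = 3.37  # F_{2,26}(0.95)

print(f"F = {f_stat:.2f} on {df} df; reject H0: {f_stat > critical}")
```

Because the statistic exceeds the table value, the conclusion in context is that the dropped predictors jointly add significantly and should be kept.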
10) Multicollinearity: in the exam, spotting the symptoms matters more than reciting the definition
- Typical symptoms (given directly in the materials): the overall F is significant, yet none of the individual t-tests is, and coefficient signs flip erratically. [11][20]

[11] Source: asksia-bible-stat7038-bilingual.pdf
  - M4 (MCQ, concept): Centring $x$ before forming $x^2$ in a polynomial model mainly: (A) changes $R^2$; (B) reduces the multicollinearity between $x$ and $x^2$; (C) removes outliers; (D) changes the fitted $\hat y$.
  - M5 (read R output): From the print-out above, recover the missing t value and significance verdict for $x_2$ at 5%.
  - M6 (read R output): From the same output, is $x_1$ significant at 5%? State $n$ and the residual df, and give the two-sided p-value verdict.
  - M7 (MCQ, concept): On a log-y scale a fitted coefficient for a 0/1 group is $b_2=0.41$. The multiplicative effect of the group on the original y scale is about: (A) $+0.41$; (B) $e^{0.41}\approx 1.51\times$; (C) $0.41\times$; (D) none.
  - M8 (MCQ, concept): A step() trace shows <none> has the lowest AIC of all add/drop moves. This means: (A) keep adding; (B) stop, the current model is selected; (C) drop a term; (D) AIC failed.
  - Answer key M1 to M8 (rapid-fire):
    - M1. (B) $\pm0.90$. In SLR $R^2=r^2$, so $r=\pm\sqrt{0.81}=\pm0.90$; the sign matches the slope, which isn't given here.
    - M2. (B) prediction interval. One new individual $\Rightarrow$ PI (carries the $+1$ of the new error). "Average for this x" would be the CI for the mean.
    - M3. (B) multicollinearity. The signature: joint F significant, marginal t's not; check VIFs and the predictor correlation matrix.
    - M4. (B). Centring kills the artificial correlation between $x$ and $x^2$; it does not change $R^2$ or the fitted values, just the conditioning of $X^TX$.
    - M5. $t=b/se=-1.2/0.4=-3.0$. With 47 df, $|-3.0|>t_{47}(0.975)\approx 2.01$, so $x_2$ is significant (reject $\beta_2=0$).
    - M6. Residual df $=47=n-p=n-3\Rightarrow n=50$. For $x_1$, $t=2.00$, $p=0.054>0.05\Rightarrow$ not significant at 5% (only borderline).
    - M7. (B) $e^{0.41}\approx 1.51\times$. On a log scale an additive coefficient becomes a multiplicative factor on the original scale.
    - M8. (B) stop. When <none> is best, no add/drop move lowers AIC, so step() halts and returns the current model.
  - At a glance: 4×15 marks per paper; 5 steps per test; CI/PI: match the wording; read the R output.
  - Marker's note (STAT7038 final): same habit across every part: state the formula, show the substitution, conclude in context. The five-step test, the labelled ANOVA entries, the cut-off vs the gap, the fitted line written per group: each earns marks on method even if one arithmetic step slips. A bare number, or "reject $H_0$" with no context sentence, throws that away.

[20] Source: asksia-cheatsheet-stat7038.pdf (asksia.ai/cheatsheet/anu-stat7038, side 1/2)
  - 9 · CI (mean) vs PI (new) (classic trap): at $x=x_h$ both intervals share the same point estimate $\hat y_h=b_0+b_1x_h$; they differ only in the SE.
    - CI for the mean $E(Y\mid x_h)$: $\hat y_h\pm t_{n-2}(1-\alpha/2)\,\hat\sigma\sqrt{\dfrac{1}{n}+\dfrac{(x_h-\bar x)^2}{S_{xx}}}$
    - PI for a new observation $Y_{new}$: $\hat y_h\pm t_{n-2}(1-\alpha/2)\,\hat\sigma\sqrt{1+\dfrac{1}{n}+\dfrac{(x_h-\bar x)^2}{S_{xx}}}$
    - Why the PI is wider: it carries the extra $\sigma^2$ of the new point's own error $\varepsilon_{new}$ (the "+1" under the root) on top of the uncertainty in the estimated mean. Both are narrowest at $x_h=\bar x$ and flare as you move away (the $(x_h-\bar x)^2$ term). R: predict(..., interval="confidence") vs "prediction". As $n\to\infty$ the CI shrinks to a point (you know the mean) but the PI stays finite: it can never beat the irreducible $\sigma$ of a single new draw.
    - Match the wording: "average value for ..." $\Rightarrow$ CI; "predict one new ..." $\Rightarrow$ PI. Forgetting the "+1", or swapping them, loses marks. Never extrapolate beyond the observed $x$ range.
  - 8 · Reading summary(lm) (supplied in exam): the coefficient table has columns Estimate, Std. Error, t value, Pr(>|t|), with rows (Intercept): $b_0$, $se_0$, $t_0$, $p_0$ and x: $b_1$, $se_1$, $t_1$, $p_1$. Below it: residual SE $=\sqrt{MSE}$ on $n-2$ df; Multiple $R^2$; Adjusted $R^2$; F on 1 and $n-2$ DF with its p-value.
    - Read off: Estimate and Std. Error give $b_j$ and $se(b_j)$; t value $=b_j/se(b_j)$ (tests $\beta_j=0$); Pr(>|t|) is the two-sided p. Stars ***, **, * mark .001, .01, .05.
    - The bottom three lines bundle the global picture: the residual SE and its df give MSE and $n-p$; the two $R^2$ values compare raw vs size-penalised fit; the F-line is the overall test. A small p on the F-line but big p's on every coefficient: suspect multicollinearity (side 2).
    - Recover hidden quantities: $se=b/t$; $MSE=(\text{resid SE})^2$; $df_E=n-p$.
  - 8b · Fill the ANOVA table (recurring exam task)
- VIF (variance inflation factor):
- $VIF_j=\dfrac{1}{1-R_j^2}$, where $R_j^2$ is the $R^2$ from regressing $X_j$ on the other predictors; rule-of-thumb cut-offs: $>5$ is concerning, $>10$ is serious. [5]
- Interpretation: multicollinearity inflates $se(b_j)$ by a factor of $\sqrt{VIF_j}$, which shrinks the t statistic and lowers significance. [5]

[5] Source: asksia-bible-stat7038-bilingual.pdf (glossary)
  - Near-linear dependence among predictors ($X^TX$ near-singular); inflates SEs, flips signs.
  - Variance Inflation Factor: $VIF_j=1/(1-R_j^2)$; flag $>5$ (concerning) / $>10$ (serious); $\sqrt{VIF_j}$ inflates $se(b_j)$.
  - Mallows' $C_p$: $C_p=SSE_p/MSE_{full}-(n-2p)$; want $C_p\approx p$ and small.
  - AIC / BIC: $AIC=n\log(SSE/n)+2p$; BIC uses the penalty $p\log n$ (heavier $\Rightarrow$ smaller models). Minimise.
  - PRESS: $PRESS=\sum\left(e_i/(1-h_{ii})\right)^2$; leave-one-out predictive error; minimise.
  - Stepwise selection: forward/backward/both by AIC (R step()); data-driven, so final p-values are over-optimistic.
  - Hierarchy / marginality: keep lower-order terms ($x$, main effects) whenever a higher-order term ($x^2$, interaction) is retained.
  - Parsimony / bias-variance: too few terms $\Rightarrow$ bias (underfit); too many $\Rightarrow$ variance (overfit). Prefer the simplest adequate model.
  - Formula sheet and R-reading map (what goes on your A4 note vs what you execute): the STAT7038 final is closed-book but allows one A4 page of your own notes plus a calculator, with t/F/normal tables supplied. So fill the A4 with formulae and cut-offs (left) and spend revision on the procedures and R-reading the note cannot do for you (right).
- Important distinction (so you don't write it wrong): multicollinearity increases variance and instability, but the materials stress that it does not bias $\hat y$ or $R^2$ within the data range (it does not mean "the model is useless"). [18]
- Common remedies (listed in the materials): drop a redundant predictor, centre, combine variables, collect better-spread data, ridge (a mention is enough). [18]

[18] Source: asksia-cheatsheet-stat7038.pdf (asksia.ai/cheatsheet/anu-stat7038, side 2/2; STAT7038 Regression Modelling, Australian National University, RSFAS, exam revision Sem 1 2026)
  - Remedies: drop a redundant predictor; centre $x$ (kills $x$ vs $x^2$ collinearity); combine variables; collect better-spread data; ridge.
  - Condition number $\kappa=\sqrt{\lambda_{max}/\lambda_{min}}$ of the scaled $X^TX$; $\kappa>30$ signals collinearity problems, a global check to pair with the per-predictor VIFs. VIF pinpoints which predictor; $\kappa$ flags the system as a whole.
  - Trap: high VIFs from deliberately included higher-order/interaction terms are expected, not a fault (centring reduces them). Multicollinearity inflates se's and destabilises coefficients but does NOT bias $\hat y$ or $R^2$ within the data range.
  - 25 · Model Selection (WK 11): smaller is better for $C_p$/AIC/BIC/PRESS; larger for adjusted $R^2$.
    - Mallows' $C_p=SSE_p/MSE_{full}-(n-2p)$; good: $C_p\approx p$ (low bias) and small.
    - $AIC=n\log(SSE/n)+2p$; $BIC=n\log(SSE/n)+p\log n$ (heavier $\Rightarrow$ smaller models).
    - PRESS (leave-one-out): $PRESS=\sum\left(e_i/(1-h_{ii})\right)^2$; minimise. PRESS measures out-of-sample prediction (each point predicted from a fit that excludes it), so it rewards genuine predictive power rather than in-sample fit.
    - Procedures: best-subset (regsubsets, feasible only for a modest number of predictors); forward (start null, add most significant); backward (start full, drop least significant); stepwise "both" (R step(), AIC-driven): read the trace, take the lowest-AIC move, stop when <none> is best.
    - Bias-variance: too few $\Rightarrow$ biased (underfit); too many $\Rightarrow$ inflated variance (overfit). Prefer the simplest adequate model (Occam). All the criteria are just different penalties balancing fit against complexity, so they need not agree; report which criterion you used.
    - Reading a step() trace: each block lists candidate add/drop moves with the resulting AIC; R takes the lowest-AIC move and stops when <none> tops the list. BIC's heavier penalty ($p\log n$, for $n\ge 8$) almost always lands on a smaller model than AIC, so quoting which criterion you used matters.
    - Trap: a stepwise-selected model's p-values/CIs are over-optimistic (selection inflates significance); AIC vs BIC can pick different models. Validate; don't treat the selected model as confirmed truth.
  - Trap list (side 2): CI (mean) vs PI (new): the "+1" makes the PI wider; sequential SS are order-dependent; t = partial; F significant but no t significant $\Rightarrow$ multicollinearity (VIF); keep lower-order terms if the interaction is significant; cut-offs are large-sample; gap, not threshold; no extrapolation; don't over-trust step().
  - Side 2 topics: MLR and diagnostics: matrix form $(X^TX)^{-1}X^Ty$, hat matrix, sequential vs partial SS, nested F, dummies and interactions, leverage/Cook/DFFITS. Revision aid; check the current class summary for exam conditions.
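The VIF rule and its $\sqrt{VIF_j}$ effect on the standard error can be checked with a few lines of arithmetic. This is only a sketch: the $R_j^2$ values are hypothetical inputs chosen for illustration, and `vif` and `flag` are made-up helper names that encode the 5 / 10 rule-of-thumb cut-offs quoted in the materials.

```python
import math


def vif(r2_j):
    """Variance inflation factor from R_j^2 (X_j regressed on the other predictors)."""
    return 1.0 / (1.0 - r2_j)


def flag(v):
    """Apply the rule-of-thumb cut-offs quoted in the materials (>5, >10)."""
    if v > 10:
        return "serious"
    if v > 5:
        return "concerning"
    return "ok"


# Hypothetical R_j^2 values, for illustration only.
for r2 in (0.50, 0.85, 0.95):
    v = vif(r2)
    # sqrt(VIF) is the factor by which se(b_j) is inflated.
    print(f"R_j^2={r2:.2f}  VIF={v:.2f}  se inflated x{math.sqrt(v):.2f}  -> {flag(v)}")
```

Reading it the exam way: quote the VIF value and the cut-off you compared it with, rather than just the adjective.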
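11) Model selection: you only need to remember direction + formulas + how to read step()
- You need to know which criteria are "smaller is better" and which are "larger is better": smaller is better for $C_p$, AIC, BIC and PRESS; larger is better for adjusted $R^2$.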
- Formulas (at minimum, be able to copy them onto the A4 and substitute numbers): [5][18]
- Mallows' $C_p = \dfrac{SSE_p}{MSE_{full}}-(n-2p)$ (want $C_p\approx p$ and small)
- $AIC = n\log(SSE/n)+2p$
- $BIC = n\log(SSE/n)+p\log n$ (heavier penalty ⇒ favours smaller models)
- $PRESS=\sum\left(\dfrac{e_i}{1-h_{ii}}\right)^2$ (leave-one-out prediction error; smaller is better)
- Reading a step() trace (stated explicitly in the materials): at each step take the add/drop move with the lowest AIC; stop when <none> is best. [18][11]
- Important trap (the materials warn about this): the p-values and CIs of a stepwise-selected model are over-optimistic (do not treat the model as "confirmed truth"). [5][18]
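The selection criteria above are plain arithmetic once SSE, $n$ and $p$ are read off the output. The sketch below applies the three formulas to hypothetical candidate models (the SSE values, model names and helper functions are all assumptions made for illustration) and shows how AIC and BIC can disagree, and how a backward step() trace would be read.

```python
import math


def mallows_cp(sse_p, mse_full, n, p):
    return sse_p / mse_full - (n - 2 * p)


def aic(sse, n, p):
    return n * math.log(sse / n) + 2 * p


def bic(sse, n, p):
    return n * math.log(sse / n) + p * math.log(n)


# Hypothetical candidates: name -> (SSE, p), with p counting the intercept.
n = 30
candidates = {"X1": (150.0, 2), "X1+X2": (100.0, 3), "X1+X2+X3": (90.0, 4)}
mse_full = 90.0 / (n - 4)  # MSE of the largest (full) model

for name, (sse, p) in candidates.items():
    print(f"{name:10s} Cp={mallows_cp(sse, mse_full, n, p):6.2f} "
          f"AIC={aic(sse, n, p):6.2f} BIC={bic(sse, n, p):6.2f}")

# Smaller is better for all three criteria.
best_aic = min(candidates, key=lambda k: aic(candidates[k][0], n, candidates[k][1]))
best_bic = min(candidates, key=lambda k: bic(candidates[k][0], n, candidates[k][1]))
print("AIC picks:", best_aic, "| BIC picks:", best_bic)

# Reading one block of a backward step() trace that starts from the full model:
# each move is listed with the AIC it would give; stop when "<none>" is lowest.
trace = {"<none>": aic(90.0, n, 4), "- X3": aic(100.0, n, 3)}
next_move = min(trace, key=trace.get)
print("step() would:", "stop" if next_move == "<none>" else f"apply {next_move}")
```

With these made-up numbers AIC keeps the three-predictor model while BIC's heavier penalty prefers the two-predictor one, which is why an answer should always state which criterion was used.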
12) Interpreting log-y (the cheatsheet highlights it separately, so it is likely to be examined)
- If you fit $\log y=b_0+b_1x$:
- On the original scale: each 1-unit increase in $x$ multiplies $y$ by $e^{b_1}$ (it does not "add $b_1$"). [16]
- A coefficient difference of $b_2$ between two groups ⇒ on the original scale the groups differ by a factor of $e^{b_2}$. [16]
- Quick approximation: $b_1=0.05$ corresponds to roughly a 5% increase in $y$ per unit of $x$ (write "approximately"; use $e^{b}$ for the exact factor). [16]

[16] Source: asksia-cheatsheet-stat7038.pdf
  - 13b · Log Interpretation (back-transform): fit $\log y=b_0+b_1x$. On the original scale a 1-unit rise in $x$ multiplies $y$ by $e^{b_1}$ (not "$+b_1$"). A coefficient difference of $b_2$ between groups means $y$ differs by a factor $e^{b_2}$. Back-transform a CI by exponentiating the endpoints: if the log-scale CI is $(L,U)$ then the multiplicative-factor CI is $(e^L,e^U)$. Trap: never just exponentiate the point estimate and call it the mean; that is the median on the original scale. Quick read: a log-y coefficient of 0.05 is about a 5% increase per unit $x$ (since $e^{0.05}=1.051$); for small coefficients the percentage is approximately $100\,b_1$. This shortcut is handy for interpreting the supplied R output fast, but state it as "approximately" and use $e^{b}$ for exact factors.
  - 14 · 5-Step Test Recipe (use on every question): 1. Hypotheses $H_0$ / $H_a$ (state in symbols). 2. Statistic: $t=b/se$, $F=MSR/MSE$, etc. 3. Critical value with df (or p-value) from the supplied tables. 4. Decision: reject / fail to reject $H_0$. 5. Conclusion in context: plain words plus the variable. $\alpha=5\%$ unless stated. Show working; no rounding of intermediates.
  - 14b · Exam-Day Tactics (scoring): MCQ first (fast points), then short-calc, then written. Quote the metric / cut-off, not the adjective: markers reward the number plus the rule. State the sampling distribution and df explicitly ($\sim t_{n-2}$, $\sim F_{1,n-2}$). For "which interval?", read for "average" (CI) vs "a new/individual" (PI). Every written answer ends in context with the variable name. Redeemable quizzes: a strong exam can lift them, so the exam carries it all.
- 把 log 尺度上的 CI $(L,U)$ 反变换:$(e^L,e^U)$。[16]Source: asksia-cheatsheet-stat7038.pdf13b · Log Interpretation BACK- TRANSFORM Fit log y = bo + byx. On the original scale a 1-unit rise in x multiplies y by e^{b1} (not "+b,"). A coefficient difference of b2 between groups = y differs by a factor e^{b2}. Back-transform a CI by exponentiating the endpoints: if log-scale CI = (L, U) then the multiplicative-factor CI = (e^L, e^U). Trap: never just exponentiate the point estimate and call it the mean - that's the median on the original scale. Quick read: a log-y coefficient of 0. 05 = a 5% increase per unit x (since e^{0. 05}=1. 051); for small coefficients the percentage = 100-b1. This shortcut is handy for interpreting the supplied R output fast - but state it as "approximately" and use e^{b} for exact factors. 14 . 5-Step Test Recipe USE ON EVERY Q 1. Hypotheses Ho / Ha (state in symbols) 2. Statistic - t = b/se, F = MSR/MSE, etc. 3. Critical value w/ df (or p-value) from the supplied tables 4. Decision - reject / fail to reject Ho 5. Conclusion in context - plain words + the variable a = 5% unless stated. Show working - no rounding of intermediates. 14b . Exam-Day Tactics SCORING · MCQ first (fast points), then short-calc, then written. · Quote the metric / cut-off, not the adjective - markers reward the number + the rule. · State the sampling distribution & df explicitly (~t_{n-2},~F_{1,n-2}). · For "which interval?" - read for "average" (CI) vs "a new/individual" (PI). Every written answer ends in context with the variable name. · Redeemable quizzes: a strong exam can lift them, so the exam carries it all.
- 陷阱:只指数化点估计并称为“均值”是错的——材料提醒那更像是原尺度上的中位数概念。[16]Source: asksia-cheatsheet-stat7038.pdf13b · Log Interpretation BACK- TRANSFORM Fit log y = bo + byx. On the original scale a 1-unit rise in x multiplies y by e^{b1} (not "+b,"). A coefficient difference of b2 between groups = y differs by a factor e^{b2}. Back-transform a CI by exponentiating the endpoints: if log-scale CI = (L, U) then the multiplicative-factor CI = (e^L, e^U). Trap: never just exponentiate the point estimate and call it the mean - that's the median on the original scale. Quick read: a log-y coefficient of 0. 05 = a 5% increase per unit x (since e^{0. 05}=1. 051); for small coefficients the percentage = 100-b1. This shortcut is handy for interpreting the supplied R output fast - but state it as "approximately" and use e^{b} for exact factors. 14 . 5-Step Test Recipe USE ON EVERY Q 1. Hypotheses Ho / Ha (state in symbols) 2. Statistic - t = b/se, F = MSR/MSE, etc. 3. Critical value w/ df (or p-value) from the supplied tables 4. Decision - reject / fail to reject Ho 5. Conclusion in context - plain words + the variable a = 5% unless stated. Show working - no rounding of intermediates. 14b . Exam-Day Tactics SCORING · MCQ first (fast points), then short-calc, then written. · Quote the metric / cut-off, not the adjective - markers reward the number + the rule. · State the sampling distribution & df explicitly (~t_{n-2},~F_{1,n-2}). · For "which interval?" - read for "average" (CI) vs "a new/individual" (PI). Every written answer ends in context with the variable name. · Redeemable quizzes: a strong exam can lift them, so the exam carries it all.
-
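The log-y rules above can be checked numerically with a minimal sketch. The coefficients 0.05 and 0.41 are the ones quoted in the materials; the log-scale interval `(0.01, 0.09)` and the helper names are hypothetical, chosen only for illustration.

```python
import math

def multiplicative_effect(b):
    """Exact factor by which y is multiplied per 1-unit rise in x when log(y) is modelled."""
    return math.exp(b)

def back_transform_ci(lower, upper):
    """Exponentiate both endpoints of a log-scale CI to get a CI for the factor."""
    return math.exp(lower), math.exp(upper)

b1 = 0.05                                   # slope on the log-y scale (from the cheatsheet's quick read)
factor = multiplicative_effect(b1)          # exact multiplicative factor e^{b1}
exact_pct = 100 * (factor - 1)              # exact percentage change per unit x
approx_pct = 100 * b1                       # small-coefficient shortcut, "approximately" only

b2 = 0.41                                   # 0/1 group coefficient (from the bible's MCQ)
group_factor = multiplicative_effect(b2)    # group differs by this factor on the original scale

log_ci = (0.01, 0.09)                       # hypothetical log-scale CI for b1
factor_ci = back_transform_ci(*log_ci)      # CI for the multiplicative factor

print(f"factor per unit x: {factor:.4f} (exact {exact_pct:.2f}%, shortcut {approx_pct:.0f}%)")
print(f"group factor: {group_factor:.3f}")
print(f"factor CI: ({factor_ci[0]:.4f}, {factor_ci[1]:.4f})")
```

The gap between the exact percentage and the shortcut is small here, which is why the shortcut is acceptable only for small coefficients and only when labelled as approximate.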
13) How to lay out your double-sided A4 sheet (a table of contents you can copy directly)
A4 front (SLR core + reading R output)
- Formula belt: $S_{xx}$, $S_{xy}$; $b_1=S_{xy}/S_{xx}$, $b_0=\bar y-b_1\bar x$; $SST=SSR+SSE$, $MSE=SSE/(n-2)$; $R^2=SSR/SST=1-SSE/SST$; $F=t^2$ in SLR; $t=b/se(b)$ with $se(b_1)=\sqrt{MSE/S_{xx}}$. [24][27]
- The 5-step test template (a fixed format that secures the method marks): hypotheses, test statistic, critical value with df (or p-value), decision, conclusion in context. [16]
- Recovering hidden quantities from summary(lm) output: $se=b/t$; $MSE=(\text{residual SE})^2$; $df_E=n-p$; back out $n$ from the residual df. [8][25]
- CI (mean) vs PI (new): both square-root formulas, $\hat y_h\pm t_{n-2}(1-\alpha/2)\,\hat\sigma\sqrt{1/n+(x_h-\bar x)^2/S_{xx}}$ for the mean response and $\hat y_h\pm t_{n-2}(1-\alpha/2)\,\hat\sigma\sqrt{1+1/n+(x_h-\bar x)^2/S_{xx}}$ for a new observation, plus the "match the wording" rule ("average" means CI, "one new individual" means PI). Write the TRAP in capitals: the PI has the $+1$. [20]
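The CI-vs-PI distinction can be sketched in a few lines. The inputs follow the cheatsheet's worked example as far as it can be read from the excerpt ($b_0=5$, $b_1=2.5$, $n=12$, $S_{xx}=80$, $\bar x=10$, $x_h=14$, $t_{10}(0.975)=2.228$); taking $\hat\sigma=2$ is an assumption based on a garbled symbol in that excerpt, and the function name is hypothetical.

```python
import math

def interval_half_widths(sigma_hat, n, x_h, x_bar, s_xx, t_crit):
    """Return (CI half-width, PI half-width) at x_h for a simple linear regression."""
    core = 1.0 / n + (x_h - x_bar) ** 2 / s_xx      # uncertainty in the estimated mean
    ci_half = t_crit * sigma_hat * math.sqrt(core)
    pi_half = t_crit * sigma_hat * math.sqrt(1.0 + core)  # the "+1" is the new point's own error
    return ci_half, pi_half

b0, b1 = 5.0, 2.5          # fitted intercept and slope
sigma_hat = 2.0            # residual SE (assumed reading of the excerpt)
n, s_xx, x_bar = 12, 80.0, 10.0
x_h = 14.0
t_crit = 2.228             # t critical value on n - 2 = 10 df, read from tables

y_hat = b0 + b1 * x_h      # both intervals share this centre
ci_half, pi_half = interval_half_widths(sigma_hat, n, x_h, x_bar, s_xx, t_crit)
ci = (y_hat - ci_half, y_hat + ci_half)
pi = (y_hat - pi_half, y_hat + pi_half)

print(f"y_hat = {y_hat:.1f}")
print(f"95% CI for the mean: ({ci[0]:.2f}, {ci[1]:.2f})")
print(f"95% PI for a new obs: ({pi[0]:.2f}, {pi[1]:.2f})")
```

Both intervals centre on the same fitted value; only the term under the square root changes, which is why forgetting the $+1$ silently turns a PI into a CI.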
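The "recover hidden quantities" recipe for summary(lm) output is plain arithmetic, sketched below. The estimate $-1.2$ with $t=-3.0$ and the residual df of 47 with $p=3$ come from the bible's M5/M6 answers; the residual SE of 2.0 is a hypothetical value, and the function is an illustration, not part of any R or Python library.

```python
def recover_from_summary(estimate, t_value, resid_se, resid_df, n_params):
    """Back out quantities a regression printout may hide.

    n_params counts every estimated coefficient, including the intercept.
    """
    se = estimate / t_value          # because t = estimate / se
    mse = resid_se ** 2              # residual SE is sqrt(MSE)
    n = resid_df + n_params          # because df_E = n - p
    sse = mse * resid_df             # because MSE = SSE / df_E
    return {"se": se, "mse": mse, "n": n, "sse": sse}

recovered = recover_from_summary(
    estimate=-1.2,     # coefficient from the printout
    t_value=-3.0,      # its t value
    resid_se=2.0,      # hypothetical residual standard error
    resid_df=47,       # "on 47 degrees of freedom"
    n_params=3,        # intercept + two predictors
)

for name, value in recovered.items():
    print(f"{name} = {value}")
```

Counting `n_params` correctly matters most: it is the number of rows in the coefficient table, and an error there propagates into every df and cut-off.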
A4 back (MLR / diagnostics / multicollinearity / selection)
- Diagnostics: a plot-to-conclusion lookup table (curved smoother means non-linearity; funnel means non-constant variance) plus the cut-off idea: judge a flagged point against the threshold and its gap from the other points, since a point R labels automatically is not necessarily an outlier. [17]
- MLR: $y=X\beta+\varepsilon$; $\hat\beta=(X^TX)^{-1}X^Ty$; $MSE=SSE/(n-p)$; the "partial" meaning of a coefficient (the expected change in $y$ per 1-unit increase in $x_j$, holding all other predictors constant). [22]
- Multicollinearity: the VIF definition $VIF_j=1/(1-R_j^2)$, the thresholds (above 5 concerning, above 10 serious), and the typical symptom (overall F significant while the individual t-tests are not). [5][11]
- Selection: the direction and formula for each of $C_p$, AIC, BIC and PRESS (smaller is better; larger is better only for adjusted $R^2$); how to read a step() trace (stop when <none> has the lowest AIC). [18][11]
- Sequential vs partial in one sentence: reordering the predictors changes the sequential SS in anova(), but leaves the summary() coefficients and t-tests unchanged. [30]
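The VIF and selection-criterion formulas listed above can be sketched as small functions. The formulas are the ones quoted in the materials; every number below (`n = 50`, the two candidate models and their SSE values, $R_j^2=0.9$) is hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math

def vif(r2_j):
    """Variance inflation factor from the R^2 of regressing x_j on the other predictors."""
    return 1.0 / (1.0 - r2_j)

def aic(sse, n, p):
    """AIC = n log(SSE/n) + 2p, with p counting the intercept."""
    return n * math.log(sse / n) + 2 * p

def bic(sse, n, p):
    """BIC = n log(SSE/n) + p log(n); heavier penalty than AIC once log(n) > 2."""
    return n * math.log(sse / n) + p * math.log(n)

def mallows_cp(sse_p, mse_full, n, p):
    """Mallows' Cp = SSE_p / MSE_full - (n - 2p); a good model has Cp close to p."""
    return sse_p / mse_full - (n - 2 * p)

n = 50
sse_small, p_small = 200.0, 3      # hypothetical candidate model A
sse_full, p_full = 195.0, 4        # hypothetical full model B
mse_full = sse_full / (n - p_full)

aic_small, aic_full = aic(sse_small, n, p_small), aic(sse_full, n, p_full)
bic_small, bic_full = bic(sse_small, n, p_small), bic(sse_full, n, p_full)
cp_small = mallows_cp(sse_small, mse_full, n, p_small)
cp_full = mallows_cp(sse_full, mse_full, n, p_full)

print(f"VIF at R_j^2 = 0.9: {vif(0.9):.1f}")
print(f"AIC  small {aic_small:.2f}  full {aic_full:.2f}")
print(f"BIC  small {bic_small:.2f}  full {bic_full:.2f}")
print(f"Cp   small {cp_small:.2f}  full {cp_full:.2f}")
```

With these made-up numbers the extra predictor lowers SSE too little to pay for its penalty, so AIC and BIC both prefer the smaller model; the full model's $C_p$ equals $p$ by construction, so that value alone says nothing about its quality.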
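The claim above that the two extra-SS routes "always give the same number" can be checked with a few lines of arithmetic. The sketch below uses made-up sequential sums of squares (every value is illustrative and hypothetical, not taken from the course materials) and assumes a full model with four slopes plus an intercept.

```python
# Hypothetical sequential (Type I) SS, listed in fit order x4, x1, x2, x3.
# All numbers are illustrative only; they do not come from the course notes.
seq_ss = {"x4": 120.0, "x1": 60.0, "x2": 40.0, "x3": 20.0}
sse_full = 90.0          # residual SS of the full model (illustrative)
n, p_full = 30, 5        # 4 slopes + intercept (assumed)

tested = ["x2", "x3"]    # terms under test: last and consecutive in the fit order
q = len(tested)

# Route 1: sum the last q sequential lines of the anova table.
extra_ss_table = sum(seq_ss[name] for name in tested)

# Route 2: SSE(reduced) - SSE(full), rebuilding SSE(reduced) from SST.
sst = sum(seq_ss.values()) + sse_full
ssr_reduced = sum(ss for name, ss in seq_ss.items() if name not in tested)
sse_reduced = sst - ssr_reduced
extra_ss_diff = sse_reduced - sse_full

# Partial F: (extra SS / q) / MSE of the full model, on (q, n - p_full) df.
mse_full = sse_full / (n - p_full)
f_stat = (extra_ss_table / q) / mse_full

print(f"extra SS (table) = {extra_ss_table}, extra SS (difference) = {extra_ss_diff}")
print(f"F = {f_stat:.3f} on ({q}, {n - p_full}) df")
```

In the exam the same comparison is done by hand from the supplied anova() table; the code only confirms that both routes lead to the same numerator.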
14) Finally, a "lowest-effort" sprint order (matched to the exam style in your materials)
- (1) First, drill cold reading of R output: on seeing a summary you should immediately read off $b, \mathrm{se}, t, p, \mathrm{MSE}, \mathrm{df}, n$. [8] Source: asksia-bible-stat7038-bilingual.pdf
  Recover n from the output. "on 6 degrees of freedom" means $df_E = n - 2 = 6$, so $n = 8$. The F line "on 1 and 6 DF" confirms it. From $n$ you can rebuild any SE the printout hides.
  summary stars. *** / ** / * flag significance at .001 / .01 / .05. Handy, but in a written answer quote the p-value or compare to the critical value; don't just cite stars.
  "The whole of SLR inference is recoverable from six numbers on one printout. Learn to read it cold and the calculation questions become transcription with arithmetic." (Why R-output reading is the highest-yield exam skill.)
  Diagnostics: do the assumptions hold? (Weeks 5-6, heavily examined.) The fit is only as good as LINE. Residual plots are how you check, not just assume, the four conditions.
  Least squares always returns a line, even through data that has no business being modelled linearly. Inference (the t- and F-tests, the CIs and PIs) is only valid when the error assumptions hold. Diagnostics are the plots and statistics that interrogate them. Recall the four LINE assumptions:
  L: linearity of $E[Y \mid X]$
  I: independent errors
  N: normal errors
  E: equal variance
  Aha 1: the residuals-vs-fitted plot. The single most useful diagnostic. Plot each residual $e_i = y_i - \hat{y}_i$ against its fitted value $\hat{y}_i$. Under the assumptions the residuals are a structureless cloud about the $e = 0$ line.
  Read it for two things at once: curvature (a failure of Linearity) and changing spread (a failure of Equal variance). [20] Source: asksia-cheatsheet-stat7038.pdf
  9. CI (mean) vs PI (new): classic trap. At $x = x_h$ both intervals share the same point estimate $\hat{y}_h = b_0 + b_1 x_h$; they differ only in the SE.
  CI for the mean $E(Y \mid x_h)$: $\hat{y}_h \pm t_{n-2}(1-\alpha/2)\,\hat\sigma\sqrt{\dfrac{1}{n} + \dfrac{(x_h-\bar{x})^2}{S_{xx}}}$
  PI for a new observation $Y_{\text{new}}$: $\hat{y}_h \pm t_{n-2}(1-\alpha/2)\,\hat\sigma\sqrt{1 + \dfrac{1}{n} + \dfrac{(x_h-\bar{x})^2}{S_{xx}}}$
  Why the PI is wider: it carries the extra $\sigma^2$ of the new point's own error $\varepsilon_{\text{new}}$ (the "+1" under the root) on top of the uncertainty in the estimated mean. Both are narrowest at $x_h = \bar{x}$ and flare as you move away (the $(x_h-\bar{x})^2$ term). R: predict(..., interval="confidence") vs "prediction". As $n \to \infty$ the CI shrinks to a point (you know the mean) but the PI stays finite: it can never beat the irreducible $\sigma$ of a single new draw.
  Match the wording: "average value for ..." ⇒ CI; "predict one new ..." ⇒ PI. Forgetting the "+1", or swapping them, loses marks. Never extrapolate beyond the observed x range.
  8. Reading summary(lm) (supplied in exam). Layout of the printout:
  Columns: Estimate, Std. Error, t value, Pr(>|t|)
  (Intercept): $b_0$, $\mathrm{se}_0$, $t_0$, $p_0$
  x: $b_1$, $\mathrm{se}_1$, $t_1$, $p_1$
  Residual standard error: $\sqrt{\mathrm{MSE}}$ on $n-2$ df
  Multiple R-squared: $R^2$; Adjusted R-squared: $R^2_{\text{adj}}$
  F-statistic: $F$ on 1 and $n-2$ DF, p-value: $p$
  Read off: Estimate and Std. Error give $b_j$ and $\mathrm{se}(b_j)$; t value $= b_j/\mathrm{se}(b_j)$ (tests $\beta_j = 0$); Pr(>|t|) is the two-sided p. Stars *** / ** / * = .001 / .01 / .05. The bottom three lines bundle the global picture: the residual SE and its df give MSE and $n-p$; the two $R^2$ values compare raw vs size-penalised fit; the F-line is the overall test. A small p on the F-line but big p's on every coefficient ⇒ suspect multicollinearity (Side 2).
  Recover hidden quantities: $\mathrm{se} = b/t$; $\mathrm{MSE} = (\text{resid SE})^2$; $df_E = n - p$.
  8b. Fill the ANOVA table (recurring exam task)
- (2) Run through the "four big chains" once a day (fitted line, t-test, ANOVA/F/$R^2$, CI/PI). [3] Source: asksia-bible-stat7038-bilingual.pdf
  5 June, 2pm · 15 min reading + 180 min
  In-tutorial Quiz (redeemable): 10%, Wk 7, topic: simple linear regression
  Assignment (non-redeemable, R): 15%, Wk 11, due 21 May, 5pm
  Online Quiz (redeemable, Canvas): 5%, Wk 5, no extensions
  The strategy this dictates: the recurring chains. Every exam item is a procedure on supplied numbers. Drill the chains: $S_{xy}/S_{xx} \to b_1, b_0$; $\mathrm{SSE} \to \mathrm{MSE} \to \mathrm{se}(b_j) \to t \to$ decision; $\mathrm{SST} = \mathrm{SSR} + \mathrm{SSE} \to F, R^2$; $x_h \to$ CI (mean) or PI (new obs). Show every line for the short-answer written parts; method marks are real. Put each chain, once, on your sheet.
  What "R supplied" means for your sheet (supplied, so don't cram ⇒ what you must be able to do or read):
  t, F, normal tables ⇒ pick the right critical value and df
  summary(lm) printout ⇒ read off $b$, se, $t$, $p$; recover MSE, $n$
  anova(lm) printout ⇒ read SSR, SSE, df; form $F = \mathrm{MSR}/\mathrm{MSE}$
  hp300s+ calculator ⇒ $S_{xx}$, $S_{xy}$, $b_1$, CIs by hand
  The exam format: one open sheet, calculator and tables supplied. Three question styles: multiple-choice, short-answer calculation, and short-answer written. Covers all lectures and tutorials, Weeks 1-12. Permitted: one A4 double-sided typed/printed notes sheet. Supplied in the paper: hp300s+ calculator, R outputs, statistical tables, scribble paper. Significance level 5% unless stated; log means natural log.
- (3) Next, drill the diagnostic "checkbox" questions: for each of the three plots (R-v-F, Q-Q, Cook), know the single core defect it reveals. [17] Source: asksia-cheatsheet-stat7038.pdf
  11. Worked: CI vs PI (same fit). From col 4: $\hat\sigma = 2$, $n = 12$, $S_{xx} = 80$, $\bar{x} = 10$. Predict at $x_h = 14$.
  $\hat{y}_h = 5 + 2.5 \cdot 14 = 40$; $(x_h - \bar{x})^2 = 16$; $1/n = 0.0833$.
  95% CI (mean): $40 \pm 2.228 \cdot 2 \cdot \sqrt{0.0833 + 16/80} = 40 \pm 2.228 \cdot 2 \cdot 0.532 = (37.6,\ 42.4)$
  95% PI (new): $40 \pm 2.228 \cdot 2 \cdot \sqrt{1 + 0.0833 + 0.2} = 40 \pm 2.228 \cdot 2 \cdot 1.133 = (34.95,\ 45.05)$
  The PI is far wider: the "+1" dominates the root. Both centre on 40.
  12. Diagnostics: the plots (WK 4, every Q).
  Residuals vs Fitted (and vs each x): checks linearity (no curve in the smoother) and constant variance (even band). Curve = wrong functional form; funnel/megaphone = heteroscedasticity.
  Normal Q-Q (internally studentised residuals): checks normality. On the line ⇒ normal; S-shape = skew; heavy/light tails = kurtosis. Tails matter less in large $n$ (CLT).
  Scale-Location ($\sqrt{|\text{std resid}|}$ vs fitted): another homoscedasticity check; rising trend ⇒ increasing variance.
  Residuals vs Leverage / Cook's D plot: flags influence (Side 2). Independence ⇒ residuals vs order/time.
  Trap: R auto-labels the 3 most extreme points; labelled ≠ outlier. Judge against the cut-offs and the gap; a Q-Q-flagged point with |studentised| < 2 is not an outlier.
  12b. Reading 3 plots (checkbox task). The exam shows Residuals-vs-Fitted, Q-Q and Cook's D and asks for the single best verdict. Map symptom ⇒ conclusion (what you see ⇒ conclusion): curved smoother (R-v-F) ⇒ non-linearity; funnel widening
- (4) Finally, use selection / VIF / extra-SS F to speed up and to avoid traps. [18] Source: asksia-cheatsheet-stat7038.pdf
  Remedies: drop a redundant predictor; centre x (kills the $x$ vs $x^2$ collinearity); combine variables; collect better-spread data; ridge.
  Condition number $\kappa = \sqrt{\lambda_{\max}/\lambda_{\min}}$ of the scaled $X^{T}X$; $\kappa > 30$ signals collinearity problems. It is a global check to pair with the per-predictor VIFs: VIF pinpoints which predictor; $\kappa$ flags the system as a whole.
  Trap: high VIFs from deliberately included higher-order/interaction terms are expected, not a fault (centring reduces them). Multicollinearity inflates se's and destabilises coefficients but does NOT bias $\hat{y}$ or $R^2$ within the data range.
  25. Model selection (WK 11). Smaller is better for $C_p$ / AIC / BIC / PRESS; larger is better for adj-$R^2$.
  Mallows' $C_p$: $C_p = \mathrm{SSE}_p/\mathrm{MSE}_{\text{full}} - (n - 2p)$; good: $C_p \approx p$ (low bias) and small.
  AIC / BIC: $\mathrm{AIC} = n\log(\mathrm{SSE}/n) + 2p$; $\mathrm{BIC} = n\log(\mathrm{SSE}/n) + p\log n$ (heavier penalty ⇒ smaller models).
  PRESS (leave-one-out): $\mathrm{PRESS} = \sum_i \left(e_i/(1-h_{ii})\right)^2$; minimise. PRESS measures out-of-sample prediction (each point predicted from a fit that excludes it), so it rewards genuine predictive power rather than in-sample fit.
  Procedures: best-subset (regsubsets, feasible only for a modest number of predictors); forward (start null, add the most significant); backward (start full, drop the least significant); stepwise "both" via R's step(), AIC-driven: read the trace, take the lowest-AIC move, stop when <none> is best.
  Bias-variance: too few predictors ⇒ biased (underfit); too many ⇒ inflated variance (overfit). Prefer the simplest adequate model (Occam). All the criteria are just different penalties balancing fit against complexity, so they need not agree; report which criterion you used.
  Reading a step() trace: each block lists candidate add/drop moves with the resulting AIC; R takes the lowest-AIC move and stops when <none> tops the list. BIC's heavier penalty ($p\log n$, for $n \ge 8$) almost always lands on a smaller model than AIC, so quoting which criterion you used matters.
  Trap: a stepwise-selected model's p-values/CIs are over-optimistic (selection inflates significance); AIC and BIC can pick different models. Validate; don't treat the selected model as confirmed truth.
  Trap list (Side 2):
  CI (mean) vs PI (new): "+1" ⇒ PI wider
  sequential SS are order-dependent; t = partial
  F significant, no t significant ⇒ multicollinearity (VIF)
  keep lower-order terms if the interaction is significant
  cut-offs are large-sample; look for the gap, not the threshold
  no extrapolation; don't over-trust step()
  Revision aid: check the current class summary for exam conditions. Cheatsheet layout (one A4, typed memory aid; ANU RSFAS, Sem 1 2026): side 1 covers SLR (estimation, inference); side 2 covers MLR and diagnostics (15. Matrix form: $(X^{T}X)^{-1}X^{T}y$, hat matrix, seq vs partial SS, nested F, dummies and interactions, leverage/Cook/DFFITS).
  [26] Source: asksia-cheatsheet-stat7038.pdf
  19. Polynomial regression (WK 11). Centre x (use $x - \bar{x}$) before forming powers to kill the artificial collinearity between $x$ and $x^2$. Test the highest-order term first (it's the last sequential line); only after dropping it re-test the next.
  19b. Back-fill summary.lm (recurring task). Given a partial MLR summary, recover the blanks ($n = 25$, $p = 4$):
  missing $t$ = Estimate / Std. Error
  missing se = Estimate / $t$
  resid SE $= \sqrt{\mathrm{MSE}}$ on $n - p = 21$ df
  overall F on $p - 1 = 3$ and $n - p = 21$ DF
  $R^2_{\text{adj}} = 1 - (1 - R^2)(n-1)/(n-p)$
  Worked: $b_2 = 1.8$, se $= 0.6$ ⇒ $t = 3.0$, $p < .01$ (vs $t_{21}(0.975) = 2.08$; reject $\beta_2 = 0$). If $R^2 = 0.70$ ⇒ $R^2_{\text{adj}} = 1 - 0.30 \cdot 24/21 = 0.657$. Then recover SSE from the resid SE: $\mathrm{SSE} = (\text{resid SE})^2 (n-p) = \mathrm{MSE} \cdot 21$. And the overall $F = [R^2/(p-1)] / [(1-R^2)/(n-p)] = (0.70/3)/(0.30/21) = 0.233/0.0143 = 16.3$ on 3, 21 df; compare to $F_{3,21}(0.95) = 3.07$ ⇒ the model is useful overall.
  20. Sequential vs partial SS (WK 8, trap). Sequential (Type I): R's anova(lm) splits SSR one term at a time, in entry order:
  $\mathrm{SSR} = \mathrm{SSR}(X_1) + \mathrm{SSR}(X_2 \mid X_1) + \mathrm{SSR}(X_3 \mid X_1, X_2) + \cdots$
  Each line: 1 df (factor: levels − 1), $F = \mathrm{MS}/\mathrm{MSE} \sim F_{df,\,n-p}$. The F on the last sequential line equals $t^2$ for that coefficient, with the same p.
  Extra-SS / nested (partial) F, full vs reduced:
  $F = \dfrac{[\mathrm{SSE}(R) - \mathrm{SSE}(F)]/q}{\mathrm{SSE}(F)/(n - p_F)} = \dfrac{\mathrm{SSR(extra)}/q}{\mathrm{MSE}(F)} \sim F_{q,\,n-p_F}$
  $H_0$: the $q$ extra coefficients are all 0. Read the numerator straight off the sequential table if the $q$ terms are last and consecutive (sum their sequential SS and df).
  Trap: a sequential SS for $X_2$ is $\mathrm{SSR}(X_2 \mid X_1)$, which is order-dependent. To test a non-final variable, re-order it last or use a partial F. The summary t-test is the partial test (all others in); it equals the sequential F only for the last term. Different orders ⇒ different per-line p's, itself a multicollinearity tell.
  20b. Worked: nested F (our numbers). Full ($X_1, X_2, X_3$): $\mathrm{SSE}(F) = 90$, $n = 30$, $p_F = 4$. Drop $X_2, X_3$: $\mathrm{SSE}(R) = 150$, $q = 2$.
  $F = [(150 - 90)/2] / [90/26] = 30/3.462 = 8.67 \sim F_{2,26}$; vs $F_{2,26}(0.95) = 3.37$ ⇒ reject $H_0$ ⇒ $X_2, X_3$ jointly add significantly. Keep them.
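The worked numbers quoted in the sprint list above (the CI-vs-PI example and the summary back-fill) can be re-derived in a few lines. This is a minimal sketch, assuming the garbled "d = 2" in the source stands for $\hat\sigma = 2$ and that the fitted line is $\hat{y} = 5 + 2.5x$, as the worked line $5 + 2.5 \cdot 14 = 40$ suggests; the critical value 2.228 is typed in from the t table, as it would be in the exam.

```python
import math

# --- CI (mean) vs PI (new obs) at x_h, using the worked numbers quoted above ---
b0, b1 = 5.0, 2.5              # fitted line, read from "5 + 2.5*14 = 40"
sigma_hat = 2.0                # assumption: the source's "d = 2" is sigma-hat
n, sxx, xbar = 12, 80.0, 10.0
x_h = 14.0
t_crit = 2.228                 # t_{10}(0.975), taken from the tables

y_hat = b0 + b1 * x_h
core = 1.0 / n + (x_h - xbar) ** 2 / sxx              # shared part under the root
ci_half = t_crit * sigma_hat * math.sqrt(core)        # mean response
pi_half = t_crit * sigma_hat * math.sqrt(1.0 + core)  # new observation: the "+1"
ci = (y_hat - ci_half, y_hat + ci_half)
pi = (y_hat - pi_half, y_hat + pi_half)

# --- Back-filling a partial MLR summary (n = 25, p = 4, R^2 = 0.70) ---
n2, p2, r2 = 25, 4, 0.70
r2_adj = 1 - (1 - r2) * (n2 - 1) / (n2 - p2)
f_overall = (r2 / (p2 - 1)) / ((1 - r2) / (n2 - p2))

print(f"y_hat = {y_hat:.1f}")
print(f"95% CI = ({ci[0]:.2f}, {ci[1]:.2f}), 95% PI = ({pi[0]:.2f}, {pi[1]:.2f})")
print(f"adj R^2 = {r2_adj:.3f}, overall F = {f_overall:.1f}")
```

Both intervals share the same centre and the same `core` term; the only difference is the extra 1 under the root, which is why the PI is always the wider of the two.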
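Next, send me the following 3 pieces of information and I can turn the material above into a concrete list of which problems to drill each day (and help you compress the A4 layout to a density you can actually copy out):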
-
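(Optional) If you like, I can also remember your "three weakest areas + most common traps" (for example the $+1$ in CI/PI, recovering se/b/t, seq vs partial), and each time you come back I will take you through a quick review in that order.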