MAST20034
Jun 8, 2026
-
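Sure. Here are the key revision points for the MAST20034 final, broken down in this order: how the exam is set → what you need to memorise → how to answer → the core points of each chapter → how to make the final push before the exam.
-
First, the single most important takeaway:
-
This is not an arithmetic course; it is a critical-reasoning course.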
- The final is worth 60%, runs 3 hours, and consists entirely of short-answer reasoning questions. Calculators are not allowed and no question requires calculation. You may bring in 2 double-sided A4 pages, i.e. 4 sides, of notes. [1]
[1] Source: asksia-bible-mast20034-bilingual.pdf
2. REVISE. You've done the week. Use the tables and the chapter-end recall checklists to self-test: can you list the four observational designs, name three sampling biases, give the five graphics principles, recite the Hill criteria? The checklists are written to be lifted almost verbatim onto your four-side notes sheet.
3. APPLY. You're building your notes sheet or sitting the paper. Run the name-the-concept decoder (Ch 14) on every prompt: read the cue → name the design / bias / method → write the because. With four sides of notes carried in and no calculator, your edge is reasoning discipline, not recall under pressure.
AskSia Library · MAST20034 · Bilingual
Read this first: the assessment shape, and the bring-in rule. MAST20034 is assessed by four pieces: 5 revision quizzes (5%), 4 short assignments (20%, each a tight 200-word critique with hard word penalties), a group project (15%, study design/critique + a Week 11 presentation), and the 60% final exam. The final is in-person, short-answer reasoning, three hours. You may bring in up to two A4 pages double-sided (four sides) of your own notes, and calculators are not permitted (there are no questions that need one). So your notes sheet should carry definitions, taxonomies, decision rules and checklists, never arithmetic. Always confirm the current weights, dates and exam conditions on your own LMS, as details shift between cohorts.
How this book was built: the two-layer rule. The framework canon here is standard, widely published statistical-literacy theory: the PPDAC investigation cycle (Wild & Pfannkuch), EDA (Tukey), the standard study-design and sampling taxonomies, validity & precision principles, NHST + confidence-interval logic, the WEIRD-bias critique, the Bradford Hill criteria, and data-feminism / data-ethics (D'Ignazio & Klein). These are non-copyrightable canon, stated plainly. The course's own case-study stems and tutorial examples are paraphrased and re-authored with our own scenarios; we never reproduce a case study's specific data. Book status quoted and honoured (four-side bring-in, no calculator, short-answer). Verify on your LMS.
THE EXAM BLUEPRINT · 60% FINAL · EVERY MARK IS A "BECAUSE"
Where every mark lives. One 60% short-answer final: reasoning only, no calculator, four sides of your own notes.
TL;DR. Sixty percent is a short-answer reasoning final: no calculator, no calculations, no software, with four sides of your own notes carried in. Its make-or-break skill is "name the concept, then justify the call": you are handed a graph, a study or a piece of statistical output and asked to critique it and say how to fix it. Master the taxonomies and decision rules in this book and you hold the keys to the whole paper.
60% final exam (3 hr).
[2] Source: asksia-bible-mast20034-bilingual.pdf
THE COMPLETE EXAM BIBLE · Critical Thinking with Data · MAST20034 · The University of Melbourne · Bilingual edition
Don't compute, critique. Name the design, spot the bias, read the output, justify every call. Four sides of notes, no calculator: every mark comes from explaining your reasoning. English leads and Chinese accompanies; exam points and terms keep their original English wording.
The final is 60% of your mark, short-answer reasoning only: no calculator, no calculations, no software, with four sides of your own notes carried in. You are handed a graph, a study or a piece of output and asked to name what is good, name what is wrong, and say how to fix it. As the marking criteria put it, "explaining your reasoning and choices is typically more important than any answer." Every mark is a because. This book is a decision-tree, taxonomy and checklist machine built to win exactly that.
Independent study companion. Not affiliated with or endorsed by the University of Melbourne. Corrections: takedowns@asksia.ai
PREFACE · HOW TO USE THIS BOOK
Reasoning, not arithmetic. The exam pays for the "because", and so does this book.
TL;DR. This is not a copy of the lecture slides or a formula dump: MAST20034 has no calculator and no calculations, so there is nothing to crunch. It is a self-contained bank of the concepts, taxonomies, decision rules and critique checklists the course examines: each idea defined plainly (markers reward defined terms), drawn as an original schematic where a picture helps, and tied to the exam's one move: name the concept, then justify the call. The same pages serve you three ways across the twelve teaching weeks.
1. LEARN. You haven't done the week's lecture yet. Read a chapter top to bottom. Each concept opens with a plain-English definition, lands a schematic or a decision table, then a worked short-answer that shows how to reason (the "because"), not how to compute. Meet PPDAC, confounding, the bias dartboard or Bradford Hill here cold.
[5] Source: asksia-bible-mast20034-bilingual.pdf
FINAL · 60% · SHORT-ANSWER REASONING
The exam-morning decoder. If the question says X, reach for Y, and say these three things.
TL;DR. The final hands you a scenario, a graph, or a piece of statistical output and asks you to name the concept and justify it. There is no calculator and no arithmetic: every mark is a because. This page is the lookup table: read the cue words in the stem, reach for the matching concept, then deliver the three reasons that bank the marks. Memorise the column on the right; that is the answer.
What the exam asks here. The 60% final is 3 hours, short-answer only (no MCQ, no essay), no calculator, and you bring in 4 sides of your own notes. The marking criteria are explicit: "explaining your reasoning and choices is typically more important than any answer." Dot-points are fine; no marks for grammar/spelling. So this whole chapter trains the one move the exam pays for: name the concept → give the because.
12.1 The cue → concept → because table
Each row is a question species you have already met in this book. The left column is what the stem sounds like; the middle is the framework to invoke (with the chapter); the right is the 3-part skeleton. Say all three and you have earned the reasoning marks.
DESIGN & CAUSATION
- If the question says "Choose / justify a study design" or "how would you investigate ...", reach for the study-design tree + validity (Ch 3-4). Say these 3 things: (1) Can you intervene? Experiment (RCT) vs observational. (2) Pick the type with a because: rare outcome → case-control, many outcomes → cohort, snapshot → cross-sectional, populations → ecological. (3) Name the design tools that protect validity (randomise / compare / control).
- If the question says "Is it causal?", "does X cause Y?" or "can we conclude ...", reach for confounding + Bradford Hill (Ch 4, Ch 10). Say these 3 things: (1) Correlation ≠ causation: observational data give association only. (2) Name a plausible confounder (linked to both exposure and outcome). (3) Argue Hill, especially temporality (cause first) + dose-response gradient; an RCT would strengthen it by removing confounders.
GRAPHS & OUTPUT
- If the question says "Critique this graph" or "two good features & one improvement", reach for the 5 graphics principles (Ch 2). Say these 3 things: (1) Two good features, each tied to a principle (standard form / common scale / clear encoding / shows data / simple). (2) One real issue (no title, abbreviated labels, panels on different scales). (3) A specific fix that addresses that issue; vague fixes score zero.
[10] Source: asksia-bible-mast20034-bilingual.pdf
4 sides of own notes (bring-in) · No calculators permitted · Short-answer reasoning
The four assessment pieces (component, weight, when: format):
- Final exam, 60%, exam period: in-person, 3 hr; short-answer reasoning; no calculator / no calculations; 4-side own-notes bring-in.
- 4 short assignments, 20%, across the semester: individual; 200-word critique each (APA 7); hard word penalties; pick a case study.
- Group project, 15%, with a Wk 11 presentation: team report + presentation + peer/contribution review; design or critique a study.
- 5 revision quizzes, 5%, Wks 2/4/6/9/12: online (LMS); low-stakes checks on lecture content.
FIG 0.1 PPDAC cycle: 1 Problem, define the question; 2 Plan, design how to get data; 5 Conclusion, answer in context.
[16] Source: asksia-cheatsheet-mast20034.pdf
17. The "Because" Rule: how to bank marks
Every answer = a chain: NAME the concept/framework → APPLY it to the context → BECAUSE ... (the reason wins the mark).
- Marks are per correct, sufficiently detailed reason: one detailed reason often = full marks; restating the definition earns nothing.
- Dot points are fine; no grammar/spelling marks; 3 hours.
- No question needs a calculator and none asks you to recall case-study details. If you're computing, you've misread it: they want the reasoning. Spend the words on the because.
18. Top Traps to Avoid: the marks lost most
- "P = Pr(H0 true)"; "large P proves H0"; choosing one-sided after the data.
- Calling a natural experiment an experiment; "association proves cause".
- Cohort vs case-control mix-up; stratified vs quota; cluster vs stratified.
- Hill as a tick-box; ignoring temporality.
- "Significant" = "important"; trusting a result because n is huge.
The core of your 4-side bring-in notes; confirm on the MAST20034 exam-info page. Name it. Apply it. Because ... asksia.ai/cheatsheet/unimelb-mast20034 · side 2/2 · AskSia Cheat Sheet Series · Restricted bring-in · No calculator
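The "Is it causal?" decoder row asks for a plausible confounder linked to both exposure and outcome. The exam itself needs no computation, but a small simulation can make that row concrete for revision. What follows is only a sketch under invented assumptions: the probabilities, the function names `simulate` and `risk_difference`, and the binary confounder are all hypothetical and come from no case study. The exposure is built to have no effect on the outcome, yet a raw association still appears because the confounder drives both.

```python
import random


def simulate(n=20_000, seed=1):
    """Return (confounder, exposure, outcome) triples.

    Assumption (invented for illustration): the outcome depends ONLY on the
    confounder, never on the exposure, so any raw link is spurious.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.random() < 0.5                   # hypothetical confounder
        x = rng.random() < (0.7 if z else 0.2)   # exposure is likelier when z
        y = rng.random() < (0.6 if z else 0.1)   # outcome is likelier when z
        rows.append((z, x, y))
    return rows


def risk_difference(rows):
    """Outcome rate among the exposed minus the rate among the unexposed."""
    exposed = [y for _, x, y in rows if x]
    unexposed = [y for _, x, y in rows if not x]
    return sum(exposed) / len(exposed) - sum(unexposed) / len(unexposed)


if __name__ == "__main__":
    rows = simulate()
    print(f"raw risk difference: {risk_difference(rows):.3f}")
    for z in (True, False):
        stratum = [r for r in rows if r[0] == z]
        print(f"within confounder={z}: {risk_difference(stratum):.3f}")
```

Comparing like with like (within each level of the confounder) makes the apparent effect shrink towards zero. In exam terms, that is the because behind "associated with, never causes".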
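- What the marking criteria value most is not which terms you wrote down, but whether you explained your reasons clearly:
- First, name the concept
- Then define it
- Then apply it to the scenario in the question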
- Finally, write the because (why it matters / what consequence it leads to). [4]
[4] Source: asksia-bible-mast20034-bilingual.pdf
PPDAC cycle figure: 3 Data, collect / clean / store; 4 Analysis, explore + model + test. Iterative: conclusions raise new problems → cycle repeats.
The course's engine in one picture: Problem → Plan → Data → Analysis → Conclusion, the investigation cycle that frames almost every critique prompt. Most exam answers are really a question about one node: was the Plan a sound design? Were the Data well sampled? Is the Conclusion licensed by the design? Learn to walk it from memory.
Examinable scope = the 12-week reasoning spine: objectivity & data · good graphics · study design · observational studies & confounding · reporting & critiquing claims · qualitative methods · frameworks for inference · analysis & modelling · sampling & WEIRD bias · accumulating research (meta-analysis & Hill) · big data · context. Research prompts touch only the whole-class case studies and never demand recalled details.
What the exam is really testing (the cue you get → the move it rewards):
- A graph / figure → critique it: name two good features + one specific fix (the graphics principles).
- A described study → name the design, then say what conclusion is legal (causation vs association).
- An association → find the confounder / lurking variable and explain how it could fake the link.
- Statistical output / a CI / P → interpret it in context, without the classic misreads.
- A sampling scenario → name the method & the bias (incl. WEIRD) and why size won't cure it.
The one habit that wins this exam. For every prompt, name the concept first, then write the because. "Two variables move together" → confounding / correlation ≠ causation; "who got picked" → a sampling / selection bias; "is this graph any good" → the five graphics principles; "does X cause Y across studies" → Bradford Hill; "what does this P-value mean" → the interpretation rules. The decoder in Ch 14 lists every cue.
The single highest-value habit. You may write in dot-points or sentences, and there are no marks for grammar or spelling, so spend every word on the reasoning. Practise answering in the shape the markers reward: (1) name the concept, (2) define it in a line, (3) apply it to the scenario, (4) state the consequence or fix. Four sentences, full marks. "Explaining your reasoning and choices is typically more important than any answer."
[6] Source: asksia-bible-mast20034-bilingual.pdf
Sampling glossary (fragment): every unit (and every set of n) equally likely: the baseline. Cluster: randomly pick groups, survey all within. ... stratum; guarantees subgroup coverage. ... prone.
How to spend a glossary term in the exam. Never just name it. Define → apply → because. E.g. "This is convenience sampling (define); here it over-represents metro students (apply); so a bigger sample won't fix the bias (because)." That three-move sentence is what the rubric pays for.
REVISION · SHORT-ANSWER BANK · ALL CHAPTERS · EXAM REHEARSAL
Practice bank: every mark is a because. Twelve short-answer species, each reasoned out the way the marker wants.
TL;DR. The final is short-answer reasoning only: you name a concept, then reason it out, then say because ... The rubric is explicit: "explaining your reasoning and choices is typically more important than any answer." So marks are awarded per correct, sufficiently detailed reason, not per fact recalled. Each card below gives you the skeleton (define → reason → because), the marking note (what actually scores), and the trap that zeroes a vague answer.
What the exam asks here: the format you are rehearsing. The 60% final is short-answer, no calculator, no calculations, no software operation. You may bring 4 sides of your own notes. Two question species recur: (1) "explain a concept / apply critical thinking to a context" and (2) "use critical thinking on a whole-class example" (anchored to a shared case, but you are never asked to recall its data). They may hand you statistical output or a graph to interpret: you read and explain it, you never compute it. Every card here is one rep of that move.
P.1 How to read each card: the marking model
Markers do not reward the verb "explain"; they reward the linkage. A "4-mark, 2+2" item almost always means 2 marks for a precise definition and 2 marks for two distinct, consequence-level reasons. A reason that merely restates the definition earns nothing. The skeleton below is the spine of every answer.
1 Name the concept / framework. Markers reward defined terms: say convenience sampling, confounder, Type I error by name before you reason.
2 Define it precisely. One sentence that would let a stranger identify it; vagueness ("just picking people") loses the definition marks.
[8] Source: asksia-bible-mast20034-bilingual.pdf
3 Reason in the context given. Tie the concept to this scenario, not a generic textbook one.
4 Close each point with a because. State the consequence (the bias it induces, the assumption it breaks, the conclusion it licenses); this is the mark-bearing clause.
5 Count your reasons against the marks. If it says [4: 2+2], deliver a definition and two separate consequence-level reasons.
The universal sentence frame. "This is [named concept], which means [definition]. Here it matters because [consequence #1], and also because [consequence #2]." Drop any scenario into that frame and you have structured an answer that the rubric can find marks in.
The four ways students throw away marks:
- Restating, not reasoning: "it is biased because it is not random" only renames the definition; say what the bias does.
- One reason in two coats: a "2+2" item needs two different reasons, not one reason phrased twice.
- Trying to compute: there is no calculator and no calc question; if you start arithmetic you have misread the task.
- Over-claiming causation: for observational data the only legal verb is "is associated with", never "causes".
STUDY-PRODUCTION SPECIES · Drills 1-6: design, sampling, confounding & graphs
TL;DR. These six rehearse the "how the data were produced" family: choose & justify a design, pick a sampling method, name the confounder, identify exposure/outcome, and critique a graph (two good features + one specific fix). Reason the choice against its alternative; that contrast is where the marks live.
[11] Source: asksia-bible-mast20034-bilingual.pdf
(b) Big-data critique (because): a huge n cannot fix bias, since the data cover only the people already using the app (selection bias); at massive n everything looks "significant", so effect size and provenance matter more than P. Add a data-ethics flag: privacy / consent of the logged users.
What earns the marks. A justified qual choice (the "why" logic + convergence) + a big-data critique naming that size ≠ representativeness, with an ethics flag.
Trap. Dismissing qualitative as "unscientific"; equating large n with representative; forgetting consent/provenance for found data.
Recall checklist: the decision rules for the bank
- Every answer: name → define → reason in context → because (consequence). Count your reasons against the marks.
- Design/sampling: justify the choice against its alternative; non-probability methods are biased, and size won't cure it.
- Confounder: must link to both exposure and outcome; observational → "associated with", never "causes".
- Graph critique: two good features, each tied to a principle; one issue + a fix that matches it.
- CI: the interval is random, the parameter is fixed. P-value: Pr(data | H0), not Pr(H0 | data); significant ≠ important.
- Type I/II: false positive / false negative; power = 1 − β; small n → low power; rare condition → base-rate false positives.
- Forest plot: diamond against the null line + heterogeneity + a publication-bias caveat. Hill: weight of evidence, temporality first.
- Diagnostic plots: funnel shape → non-constant variance; curved QQ → non-normal → transform / use a method with fewer assumptions.
- Qual vs quant: why vs what; convergence is the qual stopping rule. Big data: size ≠ unbiased; effect size & ethics over P.
[16] Source: asksia-cheatsheet-mast20034.pdf
-
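The big-data critique and the recall checklist above both warn that at massive n everything looks "significant", so significance is not importance. The exam never asks for this arithmetic, but seeing the formula once may help the warning stick. The sketch below assumes a two-sample z-test with equal group sizes and a known common standard deviation; the 0.02 difference, the sample sizes, and the function names are hypothetical illustration values, not course data.

```python
import math


def two_sided_p(mean_diff, sd, n_per_group):
    """Two-sided P-value for a two-sample z-test of a difference in means.

    Assumptions: equal group sizes, known common standard deviation,
    and a normally distributed test statistic.
    """
    standard_error = sd * math.sqrt(2 / n_per_group)
    z = mean_diff / standard_error
    # For a standard normal Z, P(|Z| >= |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


def standardised_effect(mean_diff, sd):
    """Effect size in standard-deviation units; it does not depend on n."""
    return mean_diff / sd


if __name__ == "__main__":
    diff, sd = 0.02, 1.0  # a deliberately trivial difference
    for n in (100, 10_000, 1_000_000):
        effect = standardised_effect(diff, sd)
        p = two_sided_p(diff, sd, n)
        print(f"n per group={n:>9,}  effect={effect:.2f}  P={p:.3g}")
```

The effect size stays at 0.02 standard deviations in every row while P falls as n grows, which is why an answer about big data should lead with effect size and provenance before P.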
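1. First, understand what this final is actually testing
-
The engine of this course is the PPDAC cycle:
- Problem → Plan → Data → Analysis → Conclusion
- Many questions are really asking:
- Is the research question defined correctly?
- Is the design sound?
- Is the sample biased?
- Can the output be interpreted this way?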
- Does the conclusion go beyond what the data license? [4] [9] [13]
[9] Source: asksia-bible-mast20034-bilingual.pdf
The PPDAC cycle: the spine of the whole unit, and a one-glance map of how an exam scenario hangs together. Every question lives somewhere on Problem → Plan → Data → Analysis → Conclusion. Locating the stage tells you which concept the marker wants.
Concepts to recall: the whole-book checklist
- Context first (Ch1): data are value-laden; ask who/why/what/how/when; critique ≠ criticism (always offer a constructive fix).
- Graphics (Ch2): the 5 principles; match graph to variable types; two good features + one specific improvement.
- Design (Ch3): validity = randomise/compare/control (kills bias); precision = replicate/stratify/balance (kills variability); they are independent axes.
- Observational (Ch4): cohort = group-by-exposure, case-control = group-by-outcome; confounder links to both; correlation ≠ causation.
- Reporting (Ch5): centre / spread / trend / outliers; report the CI + level, and the statistic + P, not P alone.
- Qualitative (Ch6): "why" not "what"; bottom-up vs top-down coding; convergence as the stopping rule.
- Inference (Ch7): the interval is random, μ is fixed; P = Pr(data or more extreme | H0); a large P does not prove H0; Type I/II and power.
- Modelling (Ch8): signal + noise; "all models wrong, some useful"; parsimony; read residual/QQ plots: interpret, never fit.
- Sampling (Ch9): frame vs sample; a big sample won't fix bias; 4 random + 4 non-random methods; WEIRD; reproducibility crisis.
- Accumulating research (Ch10): forest plot (null line + diamond); Hill criteria (temporality + gradient); publication bias.
- Big data (Ch11): significance ≠ importance at scale; provenance, ethics, scepticism toward AI findings.
And always, the iron rule: name the concept, then give the because. Good luck.
[13] Source: asksia-bible-mast20034-bilingual.pdf
EX 12.1 Turning a fact into a because (worked short-answer): name → consequence → because
Stem (AskSia-invented): "A wellbeing app is evaluated by surveying users who clicked an in-app pop-up. Comment on the sample."
Weak (no marks): "It is a convenience sample." A label, no reasoning.
Strong (banks the marks): "This is a convenience / volunteer sample, because only users already engaged enough to click respond, so it suffers self-selection bias and likely over-states satisfaction (because dissatisfied users have churned and are missing). A larger pop-up sample would not fix this, because it repeats the same biased method at scale."
Read-out: three clauses, three becauses: name the method → name the consequence in context → pre-empt the "bigger sample" trap. (Scenario AskSia-invented; no figures to compute.)
12.4 The 3-hour timing plan
Three hours for short-answer reasoning is generous; the risk is over-writing early questions, not running out of ideas. Budget by marks, and leave a critique-polish pass at the end.
1 First 10 min: survey & map. Read every question; pencil the decoder row next to each (design? critique? interpret? sample?). Spot the high-mark items.
2 Bulk: answer by mark weight. Roughly a minute per mark; a 4-mark design question wants two detailed becauses, a 2-mark "two good features" wants exactly two. Do not pad.
3 Discipline: never compute. If you feel an arithmetic urge, you have misread; the answer is an interpretation, not a number.
4 Last 20 min: the because audit. Re-read each answer and check every claim ends in a reason tied to the scenario; add the missing because, the missing caveat (significance ≠ importance), the missing specific fix.
FIG 12.1 PPDAC cycle: 1 Problem, define the question; 2 Plan, design how to get data; 3 Data, collect / clean / store; 4 Analysis, explore + model + test; 5 Conclusion, answer in context. Iterative: conclusions raise new problems → cycle repeats.
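The worked example EX 12.1 claims that a larger pop-up sample would not fix self-selection bias because it repeats the same biased method at scale. For revision only (the exam never asks you to run anything), that claim can be checked with a toy simulation. Everything here is an assumption made for illustration: satisfaction is scored 1 to 5, the chance of clicking the pop-up is taken to be score / 5, and the function names are hypothetical.

```python
import random


def make_population(size=200_000, seed=7):
    """Hypothetical user base with satisfaction scores 1..5, uniformly spread."""
    rng = random.Random(seed)
    return [rng.randint(1, 5) for _ in range(size)]


def popup_sample(population, n, seed=11):
    """Collect n responses when happier users are likelier to click the pop-up.

    Assumption: a user with score s responds with probability s / 5.
    """
    rng = random.Random(seed)
    sample = []
    while len(sample) < n:
        score = rng.choice(population)
        if rng.random() < score / 5:
            sample.append(score)
    return sample


def mean(values):
    return sum(values) / len(values)


def bias_by_sample_size(sizes=(400, 4_000, 40_000)):
    """Map each sample size to (sample mean - true population mean)."""
    population = make_population()
    truth = mean(population)
    return {n: mean(popup_sample(population, n)) - truth for n in sizes}


if __name__ == "__main__":
    for n, bias in bias_by_sample_size().items():
        print(f"n={n:>6}: sample mean overshoots the truth by {bias:.2f}")
```

Under these assumptions the overshoot stays roughly constant as n grows: a bigger sample shrinks random error, not bias, which is the because the strong answer spends its last sentence on.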
-
The final's question types mostly come down to these few categories: [4][5][6]
- Given a study or scenario
  - Identify the study design, whether causation can be inferred, and where the bias lies
- Given a graph
  - Name two good features and one weakness, and say how to fix it
- Given statistical output, a CI, or a P-value
  - Explain what it does and does not show
- Given a sampling scenario
  - Name the sampling method, say who is excluded, and state the direction of the bias
- Given an observational association
  - Find the confounder and explain why correlation does not imply causation
- Given a big data claim, an AI claim, a meta-analysis, or a public claim
  - Critique it in terms of effect size, representativeness, ethics, publication bias, and the Hill criteria [3][9][11][14]
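The sampling point above ("a bigger sample won't fix bias") is easier to believe once you have watched it happen. Here is a minimal study-time sketch, not something the exam will ever ask you to run. Everything in it is an assumption made up for illustration: the population of 100,000 satisfaction scores, their Normal(50, 15) shape, and the rule that a user's chance of volunteering is proportional to their score. `simulate` is a hypothetical helper name.

```python
import random
from statistics import mean


def simulate(n, seed=0):
    """Compare a simple random sample with a self-selected one of the same size.

    Returns (true_mean, srs_error, volunteer_error), where each error is
    sample mean minus population mean.
    """
    rng = random.Random(seed)

    # Assumed population: 100,000 satisfaction scores centred on 50.
    population = [rng.gauss(50, 15) for _ in range(100_000)]
    true_mean = mean(population)

    # Simple random sample: every unit is equally likely to be drawn.
    srs = rng.sample(population, n)

    # Self-selected sample: happier users are more likely to respond
    # (weight proportional to score, floored at zero).
    weights = [max(score, 0.0) for score in population]
    volunteers = rng.choices(population, weights=weights, k=n)

    return true_mean, mean(srs) - true_mean, mean(volunteers) - true_mean


if __name__ == "__main__":
    for n in (100, 1_000, 10_000):
        _, srs_err, vol_err = simulate(n)
        print(f"n={n:>6}  SRS error={srs_err:+.2f}  volunteer error={vol_err:+.2f}")
```

As n grows, the random-sample error wobbles toward zero (that is sampling error), while the volunteer error settles on a fixed positive value (that is bias). In an exam answer you would say exactly that in words: who is missing, and which way the estimate leans.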
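For the big data question type, the claim "with a huge n, everything is significant" can be checked with a few lines of arithmetic. This is a hedged sketch under stated assumptions: two groups of equal size, a known standard deviation of 1 in each group, and a fixed observed difference of 0.01 standard deviations, so a plain two-sided z-test applies. `two_sided_p` is a hypothetical name, and the group sizes are invented.

```python
import math


def two_sided_p(diff_in_sd, n_per_group):
    """Two-sided z-test P-value for a difference between two group means.

    Assumes both groups have standard deviation 1, so the standard error
    of the difference is sqrt(2 / n_per_group).
    """
    standard_error = math.sqrt(2.0 / n_per_group)
    z = diff_in_sd / standard_error
    # For a standard Normal, P(|Z| >= |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


if __name__ == "__main__":
    tiny_effect = 0.01  # the same practically negligible difference throughout
    for n in (100, 10_000, 1_000_000, 10_000_000):
        print(f"n per group={n:>10,}  P={two_sided_p(tiny_effect, n):.3g}")
```

The difference never changes here; only n does. That is why the safe move for this question type is to judge effect size and practical importance rather than P.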
-
2. The single most important answer template: this is what the course is really testing
-
Almost every question can be answered with this skeleton: (1) name the concept, (2) define it in one line, (3) apply it to the scenario, (4) state the consequence or the fix. [4][6][8][11][16]
-
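The universal full-marks sentence frame:
- "This is [named concept], which means [definition]. Here it matters because [consequence #1], and also because [consequence #2]."
-
Broken into steps, the frame reads:
- This is [the concept]
- Its definition is ...
- It matters in this question because ...
- and also because ...
- The consequence is ... / so you cannot ... / so the better approach is ...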
-
What markers actually pay for:
- An accurate definition
- Specific reasons
- A clearly stated consequence
- Reasoning tied to the scenario in the question
- Substance, not generic recitation [6][7][8]
-
This subject has one iron rule, and you need to remember it:
- name the concept, then write the because
- That is: identify the concept by name first, then give the reason. [1][4][9]
-
3. This subject has no "key formulas", but it does have "key definitions" and "key sentences"
-
Because the exam involves no calculation, you do not need to prepare any mathematical derivations.
-
What you actually need to memorise:
- Definitions
- Taxonomies
- Decision trees
- Correction sentences for common misreadings
- Answer templates for each question type [1][2][3][7][16]
-
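The only things that come close to a "formula" are definitional expressions like the one below, and even these are explained in words: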
-
P-value definition
- $P = \Pr(\text{data or more extreme} \mid H_0)$
- Meaning: the probability of observing the current data, or data more extreme, given that the null hypothesis $H_0$ is true.[9][11][15]
- Do not say:
- "$P$ is the probability that $H_0$ is true"
- "A large $P$ proves that $H_0$ is true"[8][16]
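For intuition only (the exam itself involves no calculation), the definition $P = \Pr(\text{data or more extreme} \mid H_0)$ can be made concrete with a small sketch. The coin-flip scenario, the figures (16 heads in 20 flips), and the function names are illustrative assumptions, not taken from the course materials.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def one_sided_p_value(observed: int, n: int, p_null: float = 0.5) -> float:
    """Pr(X >= observed | H0): the observed count or anything more extreme."""
    return sum(binom_pmf(k, n, p_null) for k in range(observed, n + 1))


# Hypothetical data: 16 heads in 20 flips; H0 says the coin is fair.
p_value = one_sided_p_value(observed=16, n=20)
print(f"P-value = {p_value:.4f}")  # prints "P-value = 0.0059"
```

The result is a statement about the data, computed on the assumption that $H_0$ is true. It is not the probability that $H_0$ is true, which is exactly the misreading to avoid.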
-
Definition of power
- $$\text{Power} = 1 - \beta$$
- Meaning: the probability that the test detects an effect when a real effect exists; $\beta$ is the Type II error rate.[11][15]
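As a rough illustration of $\text{Power} = 1 - \beta$ (again for intuition, not something the exam asks you to compute), the sketch below finds the rejection threshold for a one-sided test of a fair coin and then asks how often that rule would reject when the coin is actually biased. The 70% heads rate, the 5% level, the sample sizes, and all function names are assumptions chosen for the example.

```python
from math import comb


def upper_tail(c: int, n: int, p: float) -> float:
    """Pr(X >= c) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(c, n + 1))


def critical_value(n: int, alpha: float = 0.05, p_null: float = 0.5) -> int:
    """Smallest c such that rejecting when X >= c keeps Pr(Type I) <= alpha."""
    for c in range(n + 2):
        if upper_tail(c, n, p_null) <= alpha:
            return c
    return n + 1  # unreachable: the tail beyond n is empty, so its probability is 0


def power(n: int, p_true: float, alpha: float = 0.05, p_null: float = 0.5) -> float:
    """Pr(reject H0 | the true proportion is p_true) = 1 - beta."""
    return upper_tail(critical_value(n, alpha, p_null), n, p_true)


# Hypothetical scenario: the coin truly lands heads 70% of the time.
for n in (20, 80):
    print(f"n = {n}: power = {power(n, p_true=0.7):.2f}")
```

With the same true effect, the larger sample gives the higher power, which is why a small study can easily miss a real effect.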
-
Type I / Type II error
- Type I: a false positive. The null hypothesis is true, but you reject it.
- Type II: a false negative. The null hypothesis is false, but you fail to reject it.[11][15]
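The two error types can be seen side by side in a small simulation. This is a sketch under stated assumptions: a 20-flip coin study, a hypothetical rule of rejecting $H_0$ (fair coin) at 15 or more heads, a hypothetical true heads rate of 0.7 for the "$H_0$ is false" case, and an arbitrary random seed.

```python
import random

N_FLIPS = 20
REJECT_AT = 15  # reject H0 (fair coin) when heads >= 15; one-sided level is about 0.02


def rejects_h0(p_heads: float, rng: random.Random) -> bool:
    """Run one study of N_FLIPS flips and report whether it rejects H0."""
    heads = sum(rng.random() < p_heads for _ in range(N_FLIPS))
    return heads >= REJECT_AT


def rejection_rate(p_heads: float, n_studies: int = 10_000, seed: int = 1) -> float:
    """Share of simulated studies that reject H0 when the true rate is p_heads."""
    rng = random.Random(seed)
    return sum(rejects_h0(p_heads, rng) for _ in range(n_studies)) / n_studies


type_1_rate = rejection_rate(0.5)      # H0 true, yet rejected: false positive
type_2_rate = 1 - rejection_rate(0.7)  # H0 false, yet not rejected: false negative
print(f"Type I rate  = {type_1_rate:.3f}")
print(f"Type II rate = {type_2_rate:.3f}")
```

In an exam answer the numbers do not matter; what earns the mark is naming which error is possible in the scenario and stating its consequence in context.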
-
Note, though: the point of this course is not to have you calculate these quantities, but to have you explain what they mean in context.[15][16]
-
4. Final exam revision priorities, chapter by chapter
-
1) Overall framework: PPDAC + context first
-
PPDAC is the spine of the whole course:[4][9][10][13]
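- Problem: what the research question is
- Plan: how the study is designed
- Data: how the data are collected, cleaned and stored
- Analysis: how the data are explored, modelled and tested
- Conclusion: how far the conclusion can legitimately go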
-
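The worked pop-up example (EX 12.1) claims that a larger sample does not fix self-selection bias. The exam itself needs no computation, but a small simulation can make that claim concrete. This is only a sketch under invented assumptions: the 0-10 satisfaction score, the rule that the chance of clicking the pop-up equals satisfaction / 10, and the function name `simulate_popup_survey` are all hypothetical and do not come from the course material.

```python
import random
import statistics


def simulate_popup_survey(n_users, seed=0):
    """Return (true_mean, responder_mean) for a hypothetical app.

    Assumption (invented for illustration): satisfaction is scored 0-10 and
    the chance that a user clicks the pop-up grows with their satisfaction.
    """
    rng = random.Random(seed)
    population = [rng.uniform(0, 10) for _ in range(n_users)]
    # Self-selection: probability of responding is satisfaction / 10,
    # so dissatisfied users are mostly missing from the sample.
    responders = [s for s in population if rng.random() < s / 10]
    return statistics.mean(population), statistics.mean(responders)


# Growing the user base shrinks the noise, but the gap between the
# responder mean and the true mean does not close.
for n in (1_000, 100_000):
    true_mean, responder_mean = simulate_popup_survey(n)
    print(f"n={n}: true mean {true_mean:.2f}, responder mean {responder_mean:.2f}")
```

In an answer you would still write this in words: the method is biased, so repeating it at scale gives a more precise estimate of the wrong quantity.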
You need to be able to explain:
-
What "context first" means:
- Data are not neutral, so first ask who / why / what / how / when.
- Critique is not pure fault-finding; it means:
- pointing out the problem
- offering a constructive fix [9][12]
-
[9] Source: asksia-bible-mast20034-bilingual.pdf
The PPDAC cycle is the spine of the whole unit, and a one-glance map of how an exam scenario hangs together: every question lives somewhere on Problem → Plan → Data → Analysis → Conclusion. Locating the stage tells you which concept the marker wants.
-
[12] Source: asksia-bible-mast20034-bilingual.pdf
Trap (confounder question): offering a variable that affects only the outcome (that is not a confounder; a confounder must link to both exposure and outcome); claiming causation.
Q5 GRAPH CRITIQUE, PRAISE [2 marks: 1 mark each]
- Prompt (paraphrased): given a multi-panel scatterplot, identify two good features in terms of communicating information.
- Model reasoning, the because skeleton: name two concrete features, each tied to a graphics principle: (1) a scatterplot is the standard form for two numerical variables; (2) colour and symbol double-encode the groups (accessible / redundant coding); or panels separate groups instead of overplotting; or a clear title + source gives context.
- What earns the marks: two specific features, each tied to a named principle, 1 mark each. Specificity is everything.
- Trap: generic praise ("nice colours", "easy to read") with no principle; praising a feature that isn't actually in the chart.
Q6 GRAPH CRITIQUE, FIX [3 marks: 1 for the issue + 2 for a specific fix]
- Prompt (paraphrased): identify one feature that could be improved and suggest a specific improvement.
- Model reasoning, the because skeleton: (1) Issue: name one real fault, e.g. panels are on different y-axis scales, so the eye mis-reads the comparison. (2) Fix (because): "set every panel to a common axis range so the heights are directly comparable"; the fix must address the named issue. Other valid pairs: no title → add a title stating data + context + source; cryptic labels → spell out the full variable names.
- What earns the marks: 1 mark for naming a real issue + 2 marks for a fix that directly resolves that exact issue. The fix must match the fault.
- Trap: a fix that doesn't address the issue you named (the rubric zeroes it); naming an issue but giving no concrete fix; "fixing" something that was fine.
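The confounder trap above (a confounder must link to both exposure and outcome) can be illustrated with a short simulation. This is a sketch under invented assumptions, not course material: the variable names, the effect sizes and the function `simulate_confounding` are hypothetical. In this setup Z drives both X and Y, while X has no effect on Y at all.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation, written out to avoid version-specific helpers."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate_confounding(n=20_000, seed=1):
    """Z drives both X and Y; X has no effect on Y in this invented setup."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.choice([0, 1])        # hypothetical binary confounder
        x = 2 * z + rng.gauss(0, 1)   # exposure depends on Z
        y = 2 * z + rng.gauss(0, 1)   # outcome depends on Z, not on X
        rows.append((z, x, y))
    overall = pearson([r[1] for r in rows], [r[2] for r in rows])
    # Correlation of X and Y within each level of the confounder.
    within = {
        level: pearson([r[1] for r in rows if r[0] == level],
                       [r[2] for r in rows if r[0] == level])
        for level in (0, 1)
    }
    return overall, within


overall, within = simulate_confounding()
print("overall correlation:", round(overall, 2))
print("within each level of Z:", {k: round(v, 2) for k, v in within.items()})
```

The overall association between X and Y appears only because Z links to both; once Z is held fixed, the association largely disappears. A variable that affected only Y could not produce this pattern, which is why it would not count as a confounder.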
-
How to write it in the exam
- "This issue sits at the Plan/Data stage of PPDAC, because the main problem is how the sample was recruited."
- The reasoning behind it: locate the PPDAC stage first, then give the because (here, how the sample was recruited).
-
2) Graphics critique
-
This is a high-frequency question. You usually need to write:
- two strengths
- one issue
- one specific fix [4][5][12]
-
[4] Source: asksia-bible-mast20034-bilingual.pdf
The course's engine in one picture: Problem → Plan → Data → Analysis → Conclusion, the investigation cycle that frames almost every critique prompt. Most exam answers are really a question about one node: was the Plan a sound design? Were the Data well sampled? Is the Conclusion licensed by the design? Learn to walk it from memory.
Examinable scope = the 12-week reasoning spine: objectivity & data · good graphics · study design · observational studies & confounding · reporting & critiquing claims · qualitative methods · frameworks for inference · analysis & modelling · sampling & WEIRD bias · accumulating research (meta-analysis & Hill) · big data · context. Research prompts touch only the whole-class case studies and never demand recalled details.
What the exam is really testing (the cue you get → the move it rewards):
- A graph / figure → critique it: name two good features + one specific fix (the graphics principles).
- A described study → name the design and say what conclusion is legal (causation vs association).
- An association → find the confounder / lurking variable and explain how it could fake the link.
- Statistical output / a CI / P → interpret it in context, without the classic misreads.
- A sampling scenario → name the method and the bias (incl. WEIRD) and why size won't cure it.
The one habit that wins this exam: for every prompt, name the concept first, then write the because. "Two variables move together" → confounding / correlation ≠ causation; "who got picked" → a sampling / selection bias; "is this graph any good" → the five graphics principles; "does X cause Y across studies" → Bradford Hill; "what does this P-value mean" → the interpretation rules. The decoder in Ch 14 lists every cue.
The single highest-value habit: you may write in dot-points or sentences, and there are no marks for grammar or spelling, so spend every word on the reasoning. Practise answering in the shape the markers reward: (1) name the concept, (2) define it in a line, (3) apply it to the scenario, (4) state the consequence or fix. Four sentences, full marks. "Explaining your reasoning and choices is typically more important than any answer."
-
[5] Source: asksia-bible-mast20034-bilingual.pdf
FINAL · 60% · SHORT-ANSWER REASONING. The exam-morning decoder: if the question says X, reach for Y, and say these three things.
TL;DR. The final hands you a scenario, a graph, or a piece of statistical output and asks you to name the concept and justify it. There is no calculator and no arithmetic: every mark is a because. This page is the lookup table: read the cue words in the stem, reach for the matching concept, then deliver the three reasons that bank the marks. Memorise the column on the right; that is the answer.
What the exam asks here: the 60% final is 3 hours, short-answer only (no MCQ, no essay), no calculator, and you bring in 4 sides of your own notes. The marking criteria are explicit: "explaining your reasoning and choices is typically more important than any answer." Dot-points are fine; no marks for grammar/spelling. So this whole chapter trains the one move the exam pays for: name the concept → give the because.
12.1 The cue → concept → because table
Each row is a question species you have already met in this book. The left column is what the stem sounds like; the middle is the framework to invoke (with the chapter); the right is the 3-part skeleton. Say all three and you have earned the reasoning marks.
DESIGN & CAUSATION
- Cue: "Choose / justify a study design"; "how would you investigate ...". Concept: study-design tree + validity (Ch3-4). Say: (1) Can you intervene? → experiment (RCT) vs observational. (2) Pick the type with a because: rare outcome → case-control, many outcomes → cohort, snapshot → cross-sectional, populations → ecological. (3) Name the design tools that protect validity (randomise/compare/control).
- Cue: "Is it causal?"; "does X cause Y?"; "can we conclude ...". Concept: confounding + Bradford Hill (Ch4, Ch10). Say: (1) Correlation ≠ causation: observational data give association only. (2) Name a plausible confounder (linked to both exposure and outcome). (3) Argue Hill, esp. temporality (cause first) + dose-response gradient; an RCT would strengthen it by removing confounders.
GRAPHS & OUTPUT
- Cue: "Critique this graph"; "two good features & one improvement". Concept: 5 graphics principles (Ch2). Say: (1) Two good features, each tied to a principle (standard form / common scale / clear encoding / shows data / simple). (2) One real issue (no title, abbreviated labels, panels on different scales). (3) A specific fix that addresses that issue; vague fixes score zero.
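The decoder table is in effect a lookup from cue words to concepts. For anyone who finds it easier to rehearse the table as a data structure, here is a toy sketch. The concept labels are copied from the table; the choice of single cue words, the dictionary layout and the function name `decode` are assumptions made for illustration only.

```python
import re

# Cue word -> concept to reach for. The concepts are quoted from the decoder
# table; reducing each cue to one keyword is a simplification for this sketch.
DECODER = {
    "design": "Study-design tree + validity (Ch3-4)",
    "cause": "Confounding + Bradford Hill (Ch4, Ch10)",
    "causal": "Confounding + Bradford Hill (Ch4, Ch10)",
    "graph": "5 graphics principles (Ch2)",
}


def decode(stem):
    """Return the concepts whose cue word appears as a whole word in the stem."""
    words = set(re.findall(r"[a-z]+", stem.lower()))
    concepts = []
    for cue, concept in DECODER.items():
        if cue in words and concept not in concepts:
            concepts.append(concept)
    return concepts


print(decode("Critique this graph: two good features and one improvement."))
print(decode("Does X cause Y?"))
```

Matching on whole words keeps "because" from triggering the "cause" row. The lookup only picks the row; the marks still come from writing the three becauses in the right-hand column.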
-
The five graphics principles are the core. The keywords given in the material are: [3][9][12]
- standard form: the graph type matches the variable types
- common scale: the same scale is used when comparing
- clear encoding: the encoding is clear, and colours/symbols have explicit meanings
- shows data: the data themselves are actually displayed
- simple: clean, not cluttered
-
[3] Source: asksia-bible-mast20034-bilingual.pdf
Decoder table, remaining rows:
- Cue: "Interpret this CI / P-value / output / forest plot". Concept: inference reading rules (Ch7, Ch8, Ch10). Say: (1) State what it shows in context (CI excludes 0 → evidence of an effect; small P → strong evidence against H₀). (2) Add the correct caveat (the interval is random, μ is fixed; a large P does not prove H₀). (3) Comment on strength / meaning: significant ≠ important.
SAMPLING & TRUST
- Cue: "Is this sample OK?"; "what's wrong with how they recruited?". Concept: sampling bias + WEIRD (Ch1, Ch9). Say: (1) Ask who is missing: frame / selection / non-response / volunteer gap. (2) Name the method and its bias (convenience → people similar to each other). (3) State that a bigger sample will NOT fix bias, because it repeats the mistake at scale; consider WEIRD over-sampling.
- Cue: "Too good to be true"; "a surprising significant result"; "just barely p < 0.05". Concept: reproducibility + p-hacking (Ch9-10). Say: (1) Publication bias: novel/significant results are over-published, inflating effects. (2) Watch for p-hacking / HARKing (one-sided chosen after the data, multiple looks). (3) Ask for replication, a CI / effect size, and pre-registration before trusting it.
- Cue: "Big data / an AI claim"; "with millions of records ...". Concept: ethics + validity at scale (Ch1, Ch11). Say: (1) Huge n → everything is significant, so judge effect size and practical importance, not P. (2) Apply the context questions (who/why/what/how) + data ethics (consent, fairness, stewardship). (3) Be sceptical of AI: check provenance and the missing data.
How to use the table under pressure: underline the verb and the noun in the stem first ("choose a design", "critique this graph", "interpret the output"). That two-word cue picks the row; the right column is your paragraph. Then convert each of the three things into a sentence that ends in a because. You are never asked to compute, so resist the urge.
-
EXAM MORNING · THE DECODER. Building the notes you carry in: your 4 sides, the 'because' rule, and the 3 hours.
TL;DR. You may bring four sides of your own notes and there is no calculator, so do not waste space on formulas. Fill the four sides with decision trees, checklists, and crisp definitions: the machinery that turns a cue into a justified answer. This page lays out what to put on each side, the one rule that wins short-answer marks, a timing plan for the three hours, and the closing concepts-to-recall list.
12.2 The 4-side notes plan
The exam is reasoning, not recall of numbers, so your sheet is a reasoning toolkit. A good layout maps one side to each job of the decoder above. Trees and checklists earn marks; a wall of formulae does not (there is nothing to calculate).
(The excerpt breaks off at the table header "Side / What goes on it".)
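The decoder's big-data row says that with a huge n everything becomes significant, so you should judge effect size rather than P. The exam never asks you to compute this, but a few lines of arithmetic show why the rule holds. The sketch below is an illustration under stated assumptions: two groups of equal size, a standard deviation of 1 in each, and a z-test for the difference in means. The function name `two_sided_p` and the chosen effect of 0.01 standard deviations are hypothetical.

```python
import math


def two_sided_p(effect_in_sd, n_per_group):
    """Two-sided P-value of a z-test for a difference between two group means.

    Assumes the observed difference equals `effect_in_sd` standard deviations
    and that both groups have standard deviation 1, so the standard error of
    the difference is sqrt(2 / n_per_group).
    """
    z = effect_in_sd * math.sqrt(n_per_group / 2)
    # For a standard normal Z, P(|Z| >= z) = erfc(z / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


# The same practically negligible effect, at growing sample sizes.
for n in (100, 10_000, 1_000_000):
    print(f"n per group = {n}: P = {two_sided_p(0.01, n):.3g}")
```

The effect stays the same tiny size throughout; only the sample size changes, yet P moves from clearly non-significant to extremely small. That is the reasoning behind "significant ≠ important": P reflects the sample size as well as the effect, so a written answer should comment on the effect size and its practical meaning.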
-
Common ways to phrase a strength
- A scatterplot suits two numerical variables, because it is the standard form. [12]
- Colour and symbol double-encode the groups, because that is clearer and more accessible. [12]
- Separating groups into panels reduces overplotting. [12]
- A clearly stated title, source and context help interpretation. [12]
-
Common ways to phrase a weakness
- The panels use different y-axes, so they cannot be compared directly by eye.[12]
- There is no title.[12]
- The labels are too heavily abbreviated.[12]
- The scales are inconsistent.[12]
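The "different y-axes" weakness pairs with the fix of putting every panel on one common axis range. The following is a minimal sketch of that idea in plain Python, not something the exam asks you to write: the function name `common_axis_range`, the `padding` parameter, and the panel values are all hypothetical and chosen only for illustration.

```python
def common_axis_range(panels, padding=0.05):
    """Return one (low, high) range that covers the y-values of every panel.

    `panels` maps a panel name to its list of y-values (assumed non-empty).
    `padding` widens the range by a fraction of its span on each side.
    """
    all_values = [y for values in panels.values() for y in values]
    low, high = min(all_values), max(all_values)
    margin = (high - low) * padding
    return (low - margin, high + margin)


# Hypothetical y-values for three panels of a multi-panel scatterplot.
panels = {
    "Group A": [2.0, 3.5, 4.0],
    "Group B": [10.0, 12.0, 18.0],
    "Group C": [5.0, 6.0, 7.5],
}

# Every panel would be drawn with this same range, so heights are comparable.
shared_range = common_axis_range(panels, padding=0.0)
print(shared_range)
```

If the panels were drawn with matplotlib, passing `sharey=True` to `plt.subplots` should give a similar result without computing the range by hand.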
-
Fix-sentence templates
-
Graph-question traps
-
3) Study design + validity / precision
-
This topic comes up extremely often.
-
First, you need to be able to distinguish:
-
Within observational studies, you need to be able to recognise:
- cohort: group by exposure, then look at the outcome[5][9]
- case-control: group by outcome, then look back at exposure[5][9]
- cross-sectional: a snapshot at a single point in time[5]
- ecological: population- or group-level data[5]
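The four observational designs can be read as a small decision rule driven by cue words in the stem. The sketch below is only a memory aid under stated assumptions: the function name `classify_observational_design`, its parameters, and the order in which the cues are checked are hypothetical, not an official rule from the unit.

```python
def classify_observational_design(grouped_by=None, unit="individual", snapshot=False):
    """Map cue words from an exam stem to an observational design name.

    grouped_by: "exposure", "outcome", or None if the stem does not say.
    unit: "individual" or "population" (the level the data are recorded at).
    snapshot: True if everything is measured at a single point in time.
    The check order is an assumption made for this illustration.
    """
    if unit == "population":
        return "ecological"        # population- or group-level data
    if snapshot:
        return "cross-sectional"   # one point in time
    if grouped_by == "exposure":
        return "cohort"            # group by exposure, then look at outcome
    if grouped_by == "outcome":
        return "case-control"      # group by outcome, then look back at exposure
    return "unclear"               # re-read the stem for more cue words


print(classify_observational_design(grouped_by="exposure"))
print(classify_observational_design(grouped_by="outcome"))
print(classify_observational_design(snapshot=True))
print(classify_observational_design(unit="population"))
```

In an actual answer the design name is only the first step; the marks come from the "because" that follows it.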
-
Core tools for validity (they reduce bias):[5][9]
- randomise
- compare
- control
-
Core tools for precision (they reduce variability):
- replicate
- stratify
- balance
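The two lists above address different problems, and a small simulation can make that concrete. The sketch below is only an illustration under stated assumptions: the population of scores, the hypothetical biased sampler that reaches only the top half of the population, and the function name `simulate` are all made up for this example and do not come from the course materials.

```python
import random
import statistics


def simulate(n, biased, reps=200, seed=0):
    """Run `reps` imaginary studies of size n.

    Returns (average estimate, spread of the estimates).
    The population and the biased sampler are hypothetical.
    """
    rng = random.Random(seed)
    # Made-up population with a true mean close to 50.
    population = [rng.gauss(50, 10) for _ in range(5_000)]
    # A biased sampler can only reach the top half of the population.
    pool = sorted(population)[len(population) // 2:] if biased else population
    estimates = [
        statistics.fmean([rng.choice(pool) for _ in range(n)])
        for _ in range(reps)
    ]
    return statistics.fmean(estimates), statistics.stdev(estimates)


if __name__ == "__main__":
    for biased in (False, True):
        for n in (20, 1000):
            centre, spread = simulate(n, biased)
            print(f"biased={biased!s:5} n={n:4d} "
                  f"average estimate={centre:6.2f} spread={spread:5.2f}")
```

In this sketch, a larger n shrinks the spread of the estimates in both settings, but the biased sampler stays well above the true mean however large n gets. Replication buys precision; it does not buy validity.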
-
And keep this in mind:
-
How to answer in the exam
- First decide which design fits the scenario
- Then explain why the alternative designs do not fit
- Then explain how this design protects validity[5][11]
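The design choice can be sketched as a tiny lookup, which may help when condensing it onto a notes sheet. This is a hedged illustration, not an official marking scheme: the function name `choose_design`, the cue strings, and the exact wording of each "because" are assumptions made for this example, built from the cue-to-design pairs quoted in the source (rare outcome, many outcomes, snapshot, populations).

```python
# Hypothetical cue -> (design, because) table for observational studies.
OBSERVATIONAL_RULES = {
    "rare outcome": (
        "case-control",
        "it starts from cases and controls instead of waiting for outcomes to occur",
    ),
    "many outcomes": (
        "cohort",
        "grouping by exposure lets several outcomes be followed",
    ),
    "snapshot": (
        "cross-sectional",
        "exposure and outcome are measured at a single point in time",
    ),
    "populations": (
        "ecological",
        "the data describe whole populations rather than individuals",
    ),
}


def choose_design(can_assign_exposure, cue):
    """Return (design, because) for a scenario.

    Step 1: can the exposure be assigned (practically and ethically)?
    Step 2: if not, pick the observational type that matches the cue.
    """
    if can_assign_exposure:
        return (
            "experiment (RCT)",
            "the exposure can be assigned, so randomisation can protect validity",
        )
    return OBSERVATIONAL_RULES[cue]


if __name__ == "__main__":
    design, because = choose_design(False, "rare outcome")
    print(f"Use a {design} design because {because}.")
```

The table only names the design. In an exam answer the marks come from tying the "because" to the specific scenario and contrasting it with the alternative you rejected.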
-
Classic sentence patterns
- “Because the exposure cannot be assigned ethically, this should be an observational study rather than an experiment.”[5]
- “A case-control design is efficient for a rare outcome, because it starts with cases and controls rather than waiting for enough outcomes to occur.”[5]
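The efficiency argument for case-control designs can be made concrete with back-of-envelope numbers. The figures below (an outcome affecting 1 person in 10,000, a target of 100 cases, one control per case) are hypothetical and chosen only for illustration. The exam itself needs no arithmetic, so this is for intuition rather than something to reproduce in an answer.

```python
def cohort_size_for_cases(target_cases, one_in):
    """People a cohort would need to follow to expect about `target_cases`
    outcomes, when the outcome affects roughly 1 person in `one_in`."""
    return target_cases * one_in


def case_control_size(cases, controls_per_case=1):
    """Total participants when recruiting known cases plus matched controls."""
    return cases * (1 + controls_per_case)


if __name__ == "__main__":
    # Hypothetical scenario: outcome affects 1 in 10,000 people.
    print("cohort:", cohort_size_for_cases(100, 10_000), "people followed")
    print("case-control:", case_control_size(100), "people recruited")
```

Under these assumed numbers the cohort has to follow a million people to see about 100 outcomes, while the case-control study recruits 200. That gap is the "because" behind the sentence pattern above.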
-
4) Causation, correlation, confounding, Bradford Hill
-
This is the core of the core for the final.
-
The single most important sentence:
- correlation $\ne$ causation[4][9][16]
-
For observational data, you can usually only say:
- is associated with
You should not casually say:
- causes[8][11][16]
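A short simulation can show why observational data license "is associated with" but not "causes". The sketch below is a toy model under loud assumptions: the binary confounder, all probabilities, and the function names `simulate_confounding`, `risk_difference` and `stratum` are invented for illustration. By construction the outcome depends only on the confounder and never on the exposure.

```python
import random
import statistics


def simulate_confounding(n=20_000, seed=1):
    """Generate (confounder, exposure, outcome) rows.

    The confounder raises the chance of both exposure and outcome.
    The exposure has NO effect on the outcome in this toy model.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        confounder = rng.random() < 0.5
        exposure = rng.random() < (0.8 if confounder else 0.2)
        outcome = rng.random() < (0.6 if confounder else 0.1)  # no exposure term
        rows.append((confounder, exposure, outcome))
    return rows


def risk_difference(rows):
    """Outcome rate among the exposed minus the rate among the unexposed."""
    exposed = [outcome for _, exposure, outcome in rows if exposure]
    unexposed = [outcome for _, exposure, outcome in rows if not exposure]
    return statistics.fmean(exposed) - statistics.fmean(unexposed)


def stratum(rows, value):
    """Keep only the rows where the confounder equals `value`."""
    return [row for row in rows if row[0] == value]


if __name__ == "__main__":
    rows = simulate_confounding()
    print("crude risk difference:", round(risk_difference(rows), 3))
    for value in (True, False):
        diff = risk_difference(stratum(rows, value))
        print(f"within confounder={value}:", round(diff, 3))
```

With these assumed probabilities the expected crude risk difference is 0.5 - 0.2 = 0.3, even though the exposure does nothing. Within each level of the confounder the difference is close to zero. The crude association is real, but it is produced entirely by the third variable.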
-
Definition of a confounder
- A confounder is a variable that is associated with both the exposure and the outcome; it can fake or distort the relationship between the two.[4][9][11]
Type l/ll: 假阳/假阴;power = 1-β;小 n→低 power;罕见情况→基础率导致的假阳性。 · Qual vs quant: why vs what; convergence is the qual stopping rule. Big data: size # unbiased; effect size & ethics over P. 定性 vs 定量:为何 vs 是什么;收敛是定性的停止规则。Big data: 大≠无偏;effect size 与伦理胜过 P。 ● 每个答案:点名 →定义→在情境中给理由→ because(后果)。按分值数理由。 ● 设计/抽样:对照其替代方案来论证选择;非概率方法是有偏的 -- 规模治不了它。 ● Confounder: 必须同时关联暴露与结局;观察性→与 . . . . . . 相关,绝不是 导致。 ● 图表批判:两个好特征对应一条原则;一个问题+一个与之匹配的修复。 ● CI:随机的是区间,固定的是参数。P-value: Pr(datalHo),而非 Pr(Holdata);显著 ≠重要。 ● Type I/ll: 假阳/假阴;power = 1-β;小n→低 power;罕见情况→ 基础率导致的假阳性。 ● Forest plot: 菱形对零线+异质性+发表偏倚的注意。Hill:证据的权重,时序性优先。 ● 诊断图:漏斗形→方差非恒定;QQ 弯曲→非正态→变换/用假设更少的方法。 ● 定性 vs 定量:为何 vs 是什么;收敛是定性的停止规则。Big data:大≠无偏;effect size 与伦理胜过 P。 AskSia Library · MAST20034 · 双语 Bilingual EXAM MORNING . THE DECODER EXAM MORNING . THE DECODER[12]Source: asksia-bible-mast20034-bilingual.pdfTrap. offering a variable that affects only the outcome (that's not a confounder - a confounder must link to both); claiming causation. 陷阱。给出一个只影响结局(outcome)的变量(那不是 confounder -- confounder 必须同时连向两端);声称存在因 果。 Q5 GRAPH CRITIQUE - PRAISE [2: 1 mark each] Prompt (paraphrased). Given a multi-panel scatterplot, identify two good features in terms of communicating information. 题目(转述)。给定一张多面板散点图,从信息传达的角度识别两个优点。 - Model reasoning - the because skeleton. Name two concrete features, each tied to a 范例推理 -- because骨架。 graphics principle: (1) a scatterplot is the standard form for two numerical variables; (2) colour and symbol double-encode the groups (accessible / redundant coding); or panels separate groups instead of overplotting; or a clear title + source gives context. What earns the marks. two specific features, each named to a principle - 1 mark each. Specificity is everything. 什么能得分。两个具体的特征,每个都挂到一条原则上 -- 各1分。具体性就是一切。 Trap. generic praise ("nice colours", "easy to read") with no principle; praising a feature that isn't actually in the chart. 陷阱。没有原则的笼统夸赞(“颜色好看”“易读”);夸了一个图里其实没有的特征。 AskSia Library . MAST20034 . 
XXia Bilingual 06 GRAPH CRITIQUE - FIX [3: 1 issue + 2 specific fix] Prompt (paraphrased). Identify one feature that could be improved and suggest a specific improvement. 题目(转述)。识别一个可改进之处,并提出一个具体的改进。 Model reasoning - the because skeleton. 范例推理 -- because骨架。 (1) Issue: name one real fault - e. g. panels are on different y-axis scales, so the eye mis-reads the comparison. (2) Fix (because): "set every panel to a common axis range so the heights are directly comparable" - the fix must address the named issue. (Other valid pairs: no title - add a title stating data + context + source; cryptic labels - spell out the full variable names. ) What earns the marks. 1 mark to name a real issue + 2 marks for a fix that directly resolves that exact issue. The fix must match the fault. 什么能得分。1分给点名一个真实问题+2 分给一个直接解决那个确切问题的修正。修正必须对得上故障。 Trap. a fix that doesn't address the issue you named (rubric zeroes it); naming an issue but giving no concrete fix; "fixing" something that was fine. 陷阱。一个不针对你所点名问题的修正(评分会归零);只点名问题却不给具体修正;“修”了本来没问题的东西。 AskSia Library . MAST20034 . XXia Bilingual REVISION . SHORT - ANSWER BANK REVISION . SHORT - ANSWER BANK INTERPRET -AND - CRITIQUE SPECIES Drills 7-12: inference, errors, causation & ethics 演练 7-12: inference、误差、causation 与伦理
-
When you answer a confounder question, you must:
- Name one specific variable
- Explain how it is linked to the exposure
- Explain how it is linked to the outcome
- Then say how it makes the observed relationship unreliable [11][12] (Source: asksia-bible-mast20034-bilingual.pdf)
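To see why a confounder has to be linked to both ends, here is a minimal simulation sketch in Python. Everything in it is a made-up assumption for illustration: the hot-day confounder, the effect sizes, and the function names `pearson` and `simulate` come from neither the course materials nor any library. The exposure has no effect on the outcome at all, yet the two still move together because both depend on the confounder.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation, written out so no extra library is needed."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate(n=20000, seed=1):
    """Hypothetical scenario: a hot day raises both the exposure and the outcome."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        hot = rng.random() < 0.5                         # assumed confounder
        exposure = (2.0 if hot else 0.0) + rng.gauss(0, 1)
        outcome = (2.0 if hot else 0.0) + rng.gauss(0, 1)  # depends on hot only, NOT on exposure
        rows.append((hot, exposure, outcome))
    return rows


rows = simulate()

# Ignoring the confounder: exposure and outcome look associated.
overall = pearson([x for _, x, _ in rows], [y for _, _, y in rows])

# Holding the confounder fixed: the association should largely disappear.
within_hot = pearson([x for h, x, _ in rows if h], [y for h, _, y in rows if h])
within_cool = pearson([x for h, x, _ in rows if not h], [y for h, _, y in rows if not h])

print(f"overall correlation:     {overall:.2f}")
print(f"within hot days only:    {within_hot:.2f}")
print(f"within cool days only:   {within_cool:.2f}")
```

Under these assumed settings the overall correlation is clearly positive, while within each level of the confounder it sits near zero. That is the four-step answer in miniature: name the variable, link it to the exposure, link it to the outcome, and conclude that the observed relationship cannot be read as causal.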
-
The big trap
-
Bradford Hill criteria
- The material specifically emphasises:
- temporality: the cause must come before the effect
- dose-response gradient: the outcome changes in step with the level of exposure [1][4][5][11] (Source: asksia-bible-mast20034-bilingual.pdf) [16] (Source: asksia-cheatsheet-mast20034.pdf)
- You need to understand:
- Hill is not a tick-box checklist; meeting several criteria does not automatically prove causation [16] (Source: asksia-cheatsheet-mast20034.pdf)
- It is a framework for weighing the evidence that supports a causal interpretation [11] (Source: asksia-bible-mast20034-bilingual.pdf)
-
考试标准答法
- “This is observational, so the data support association, not causation. A plausible confounder is X, because it is related to both the exposure and the outcome. Bradford Hill would also ask whether the exposure came first and whether there is a dose-response pattern.” [5][11] Source: asksia-bible-mast20034-bilingual.pdf
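To see why a confounder can manufacture an association, here is a minimal simulation sketch in Python. Everything in it is an assumption made for illustration: the probabilities, the variable names (`z` for the confounder, `x` for the exposure, `y` for the outcome) and the helper `risk_difference` are invented, and the exam itself never asks you to compute anything. By construction the exposure has no effect on the outcome, yet the crude comparison still shows a gap.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical data-generating process: z drives both x and y; x never affects y.
rows = []
for _ in range(50_000):
    z = rng.random() < 0.5                   # confounder present?
    x = rng.random() < (0.8 if z else 0.2)   # exposure depends only on z
    y = rng.random() < (0.6 if z else 0.1)   # outcome depends only on z
    rows.append((z, x, y))


def risk_difference(data):
    """Outcome rate among the exposed minus outcome rate among the unexposed."""
    exposed = [y for _, x, y in data if x]
    unexposed = [y for _, x, y in data if not x]
    return sum(exposed) / len(exposed) - sum(unexposed) / len(unexposed)


crude = risk_difference(rows)
within = {level: risk_difference([r for r in rows if r[0] == level])
          for level in (False, True)}

print(f"crude difference:        {crude:.2f}")
for level, diff in within.items():
    print(f"difference when z={level}: {diff:.2f}")
```

Under these assumed probabilities the crude gap should sit near 0.3 in expectation, while the gap inside each level of `z` should sit near zero. That is the "associated with, not causes" pattern: the link between exposure and outcome runs entirely through the third variable.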
-
5) Sampling + WEIRD + a bigger sample won't fix bias
-
This area is a favourite exam topic.
-
You need to distinguish:
-
Core idea:
-
The iron rule to memorise:
- a bigger sample will NOT fix bias [3][9][11][13][14] Source: asksia-bible-mast20034-bilingual.pdf
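The rule that a bigger sample will not fix bias can be illustrated with a small simulation. This is only a sketch under stated assumptions: the population of satisfaction scores, the response mechanism (happier users are more likely to answer a pop-up) and the function names `random_sample_mean` and `volunteer_sample_mean` are all hypothetical, loosely modelled on the wellbeing-app scenario in the source material. It is not something the exam would ask you to calculate.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: satisfaction scores on a 0-10 scale.
population = [rng.uniform(0, 10) for _ in range(200_000)]
true_mean = statistics.fmean(population)


def random_sample_mean(n):
    """Mean of a simple random sample drawn from the whole population."""
    return statistics.fmean(rng.sample(population, n))


def volunteer_sample_mean(n):
    """Mean of n volunteers, assuming happier users are more likely to respond."""
    responders = []
    while len(responders) < n:
        score = rng.choice(population)
        if rng.random() < score / 10:  # assumed response probability
            responders.append(score)
    return statistics.fmean(responders)


results = {n: (random_sample_mean(n), volunteer_sample_mean(n))
           for n in (100, 10_000)}

print(f"population mean: {true_mean:.2f}")
for n, (srs, volunteer) in results.items():
    print(f"n={n:>6}: random sample {srs:.2f}, volunteer sample {volunteer:.2f}")
```

As n grows, the random-sample mean settles closer to the population mean, because sampling error shrinks. The volunteer mean stays too high by roughly the same amount at every n, because the recruitment method is unchanged and the same mistake is simply repeated at scale.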
-
The sampling taxonomy you need to know:
- The materials indicate you should memorise 4 random + 4 non-random methods [7][9] Source: asksia-bible-mast20034-bilingual.pdf
- However, the current excerpts do not spell out all eight names, so the only ones I can explicitly confirm here are:
- convenience sampling [6][13] Source: asksia-bible-mast20034-bilingual.pdf
- volunteer / self-selection [13] Source: asksia-bible-mast20034-bilingual.pdf
- cluster sampling [6] Source: asksia-bible-mast20034-bilingual.pdf
- stratified sampling[6]
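- If you like, in my next message I can organise these into a full "sampling methods summary table", but the excerpts I have here do not list all eight methods explicitly.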
-
WEIRD
- This is a frequently examined critique point under sampling & trust.[1][3][9]
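- Core idea: the sample may be over-concentrated in groups that are not representative (WEIRD stands for Western, Educated, Industrialised, Rich, Democratic), so the conclusions cannot be readily generalised.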
-
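How to write a sampling answer
- Who has been excluded?
- Who is the sample tilted towards?
- In which direction will the result be skewed?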
- Why does a large sample not help either?[3][13][14]
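To make the "a bigger sample won't fix bias" point concrete, here is a minimal simulation sketch. It is not from the course materials, and the exam never asks you to compute anything like it. Everything in it is an assumption invented for illustration: the population of 100,000 app users, the 30% share of engaged users, the satisfaction scores, and the click probabilities inside the hypothetical `popup_sample_mean` function.

```python
import random
import statistics


def make_population(size, rng):
    """Build a hypothetical user base of (engaged, satisfaction score) pairs."""
    population = []
    for _ in range(size):
        engaged = rng.random() < 0.30  # assumed: 30% of users are engaged
        # Assumed: engaged users are happier on average than the rest.
        score = rng.gauss(8.0, 1.0) if engaged else rng.gauss(5.0, 1.0)
        population.append((engaged, score))
    return population


def random_sample_mean(population, n, rng):
    """Simple random sample: every user is equally likely to be chosen."""
    chosen = rng.sample(population, n)
    return statistics.fmean([score for _, score in chosen])


def popup_sample_mean(population, n, rng):
    """Volunteer sample: engaged users are far more likely to click the pop-up."""
    respondents = []
    for engaged, score in population:
        p_click = 0.50 if engaged else 0.05  # assumed click probabilities
        if rng.random() < p_click:
            respondents.append(score)
        if len(respondents) == n:
            break
    return statistics.fmean(respondents)


rng = random.Random(42)
population = make_population(100_000, rng)
true_mean = statistics.fmean([score for _, score in population])

results = {}
for n in (100, 1_000, 10_000):
    results[n] = (
        random_sample_mean(population, n, rng),
        popup_sample_mean(population, n, rng),
    )
    print(f"n={n:>6}  random={results[n][0]:.2f}  "
          f"popup={results[n][1]:.2f}  population={true_mean:.2f}")
```

Under these assumptions, the random-sample mean settles closer to the population mean as n grows (that is sampling error shrinking), while the pop-up mean stays too high by roughly the same amount at every n (that is bias). In an answer you would only write the reasoning: the recruitment method is unchanged, so a larger n repeats the same mistake at scale.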
-
Power template
- “This is a convenience/volunteer sample, which means participants selected themselves rather than being randomly sampled. Here that matters because more engaged users are over-represented, so satisfaction is likely overstated. A larger sample would not fix this, because it repeats the same biased recruitment method at scale.”[6][13]
-
6) Inference: CI, P-value, significance, importance
-
This part comes up very often, and it is also where deliberate traps are easiest to set.
-
CI (confidence interval)
- You need to be able to say:
- It is the interval that is random, not the parameter
- The parameter (for example $\mu$) is fixed[3][9][11]
Type l/ll: 假阳/假阴;power = 1-β;小 n→低 power;罕见情况→基础率导致的假阳性。 · Qual vs quant: why vs what; convergence is the qual stopping rule. Big data: size # unbiased; effect size & ethics over P. 定性 vs 定量:为何 vs 是什么;收敛是定性的停止规则。Big data: 大≠无偏;effect size 与伦理胜过 P。 ● 每个答案:点名 →定义→在情境中给理由→ because(后果)。按分值数理由。 ● 设计/抽样:对照其替代方案来论证选择;非概率方法是有偏的 -- 规模治不了它。 ● Confounder: 必须同时关联暴露与结局;观察性→与 . . . . . . 相关,绝不是 导致。 ● 图表批判:两个好特征对应一条原则;一个问题+一个与之匹配的修复。 ● CI:随机的是区间,固定的是参数。P-value: Pr(datalHo),而非 Pr(Holdata);显著 ≠重要。 ● Type I/ll: 假阳/假阴;power = 1-β;小n→低 power;罕见情况→ 基础率导致的假阳性。 ● Forest plot: 菱形对零线+异质性+发表偏倚的注意。Hill:证据的权重,时序性优先。 ● 诊断图:漏斗形→方差非恒定;QQ 弯曲→非正态→变换/用假设更少的方法。 ● 定性 vs 定量:为何 vs 是什么;收敛是定性的停止规则。Big data:大≠无偏;effect size 与伦理胜过 P。 AskSia Library · MAST20034 · 双语 Bilingual EXAM MORNING . THE DECODER EXAM MORNING . THE DECODER
- When interpreting, you should:
- State what the interval corresponds to in the specific context.
- Avoid saying "the parameter has a 95% probability of lying in this already-computed interval".
- The current excerpts do not spell out this incorrect statement in full, but they repeatedly stress that "the interval is random, the parameter is fixed". [3][11][15]
- You need to be able to say:
-
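To make "the interval is random, the parameter is fixed" concrete, here is a small simulation sketch. It is for intuition only: the exam asks for interpretation, never computation. The values `mu=10.0`, `sigma=2.0`, `n=50` and the 1.96 multiplier (a Normal approximation) are illustrative assumptions, not taken from the course notes.

```python
import random
import statistics


def ci_95(sample):
    """Approximate 95% CI for the mean: estimate +/- 1.96 * standard error."""
    n = len(sample)
    mean = statistics.fmean(sample)
    se = statistics.stdev(sample) / n ** 0.5
    return mean - 1.96 * se, mean + 1.96 * se


def coverage(mu=10.0, sigma=2.0, n=50, repeats=2000, seed=1):
    """Fraction of repeated-sample intervals that contain the fixed mu."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(repeats):
        # mu never changes; only the sample (and so the interval) does.
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        lo, hi = ci_95(sample)
        hits += lo <= mu <= hi
    return hits / repeats


if __name__ == "__main__":
    print(coverage())  # close to 0.95: a property of the procedure
```

The "95%" describes how often the procedure captures $\mu$ across repeated samples. Any single computed interval either contains $\mu$ or it does not, which is why the "95% probability for this interval" wording is wrong.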
P-value
- Correct definition:
- $P = \Pr(\text{data or more extreme} \mid H_0)$ [9][11][15]
- You need to be able to say:
- Small $P$: relatively strong evidence against $H_0$. [3]
- Large $P$: does not prove that $H_0$ is true. [3][9][16]
- Correct definition:
-
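The definition $P = \Pr(\text{data or more extreme} \mid H_0)$ can be illustrated with a hypothetical coin example: 15 heads in 20 flips, with $H_0$: the coin is fair. The example, the function names, and the choice of "at least as far from the expected count" as the meaning of "more extreme" are assumptions made for this sketch; nothing like this needs to be computed in the exam.

```python
from math import comb


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success chance p."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def two_sided_p(observed, n, p0=0.5):
    """Pr(count at least as far from n*p0 as the observed count | H0: p = p0)."""
    expected = n * p0
    gap = abs(observed - expected)
    return sum(
        binom_pmf(k, n, p0)
        for k in range(n + 1)
        if abs(k - expected) >= gap
    )


if __name__ == "__main__":
    print(two_sided_p(15, 20))
```

Note what is being conditioned on: the calculation assumes $H_0$ is true and asks how surprising the data are. It says nothing about the probability that $H_0$ itself is true, which is the trap "$P = \Pr(H_0 \text{ true})$".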
significant $\ne$ important
- Statistical significance is not the same as practical importance. [3][11][16]
- Especially with large samples / big data, almost anything can come out significant, so look at:
- effect size
- context
- practical importance [3][11][14]
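To build intuition for why "significant" stops meaning much at scale, here is a minimal sketch in Python. It is an illustration only, not something the exam asks you to compute. The assumptions are mine: two groups with standard deviation 1, an observed difference fixed at a hypothetical 0.01 SD, and a simple two-sample z-test. The function name and the numbers are made up for the example.

```python
import math

def two_sided_p(diff_in_sd: float, n_per_group: int) -> float:
    """Two-sided P-value of a two-sample z-test on standardized data (SD = 1)."""
    # Standard error of a difference in means when both groups have SD = 1.
    standard_error = math.sqrt(2.0 / n_per_group)
    z = diff_in_sd / standard_error
    # For a standard Normal Z, Pr(|Z| >= |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))

# Hypothetical effect: the groups differ by 1% of a standard deviation.
TINY_EFFECT = 0.01

for n in (100, 10_000, 1_000_000):
    print(f"n per group = {n:>9,}: P = {two_sided_p(TINY_EFFECT, n):.3g}")
```

The effect size is identical in every row; only n changes. That is why the answer should judge effect size, context and practical importance rather than P alone.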
-
Three-step method for interpreting output
-
Example sentence
- “The CI excludes 0, so there is evidence of an effect in this context. But significance is not the same as importance, so we should still judge the effect size and whether it matters in practice.” [3][11]
-
7)Type I / Type II / power
-
This part usually comes wrapped in a scenario question.
-
Definitions
- Type I error: false positive
- Type II error: false negative
- power: $1-\beta$, where $\beta$ is the Type II error rate [11][15]
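The three definitions fit together in one picture, sketched below in Python for intuition only (the exam never asks for this arithmetic). The setup is an assumption of mine, not something from the course notes: a two-sided one-sample z-test with known SD, a Type I error rate of 0.05, and a hypothetical true effect of half a standard deviation. `power_two_sided_z` is an illustrative name.

```python
from statistics import NormalDist

def power_two_sided_z(effect_in_sd: float, n: int, alpha: float = 0.05) -> float:
    """Power of a two-sided one-sample z-test when the truth sits effect_in_sd SDs from H0."""
    std_normal = NormalDist()
    # Cut-off for the test statistic, fixed by the chosen Type I error rate (alpha).
    critical = std_normal.inv_cdf(1 - alpha / 2)
    # How far a real effect shifts the centre of the test statistic.
    shift = effect_in_sd * n ** 0.5
    # Probability the statistic lands beyond either cut-off when the effect is real.
    return std_normal.cdf(shift - critical) + std_normal.cdf(-shift - critical)

for n in (10, 30, 100):
    power = power_two_sided_z(effect_in_sd=0.5, n=n)
    print(f"n = {n:>3}: power = {power:.2f}, beta (Type II rate) = {1 - power:.2f}")
```

With the effect held fixed, a small n leaves a large chance of a false negative. When the true effect is zero, the same function returns alpha, which is the Type I error rate.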
-
Connections you need to know
- Small sample $\to$ low power [11]
- Rare conditions invite the base-rate trap: when the condition is rare, false positives can make up a large share of the positive results [11]
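The base-rate point can be sketched with a few lines of Python. This is for intuition only, since the exam wants the reasoning, not the numbers. The sensitivity, specificity and prevalence values below are hypothetical, chosen by me to make the contrast visible, and the function name is illustrative.

```python
def false_share_of_positives(prevalence: float, sensitivity: float, specificity: float) -> float:
    """Fraction of all positive results that are false positives."""
    # People who have the condition and test positive.
    true_positives = prevalence * sensitivity
    # People who do not have the condition but test positive anyway.
    false_positives = (1 - prevalence) * (1 - specificity)
    return false_positives / (true_positives + false_positives)

# Hypothetical test: 99% sensitivity, 95% specificity.
for prevalence in (0.001, 0.2):
    share = false_share_of_positives(prevalence, sensitivity=0.99, specificity=0.95)
    print(f"prevalence = {prevalence}: {share:.0%} of positives are false")
```

The test is the same in both rows. Only the rarity of the condition changes, and that alone decides whether most positives are false alarms.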
-
How to answer in the exam
- Don't just name the error type
- Tie it to the scenario and say:
- what the mistaken conclusion would be
- what the consequence is
- why this scenario cares especially about this kind of error [11][15]
-
8) Qualitative methods
-
This part matters too, and many people tend to overlook it.
-
Core contrast:
- quantitative: answers "what / how much"
- qualitative: answers "why" [11][15]
-
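Sources of qualitative data: interviews (depth), focus groups (interaction), observation (what people do vs. what they say), documents/artefacts (already exist); each carries its own risks. [15]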
-
coding
- bottom-up: inductive; codes emerge from the data [15]
- top-down: deductive; codes come from prior theory [15]
-
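convergence: the qualitative stopping rule (analogous to power / sample size); stop when new data no longer add new themes. [15]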
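As a rough illustration of convergence as a stopping rule, the sketch below reads coded interviews in order and stops once several consecutive interviews add no new theme. The function name `interviews_until_convergence`, the `patience` threshold and the toy themes are all assumptions made up for this example; the course material only says to stop when new data no longer add new themes, and nothing like this has to be computed in the exam.

```python
def interviews_until_convergence(coded_interviews, patience=2):
    """Read interviews in order and stop after `patience` consecutive
    interviews that add no new theme (hypothetical rule for illustration).

    Returns (interviews_read, themes_seen, converged).
    """
    seen = set()
    quiet_run = 0  # consecutive interviews with nothing new
    for count, themes in enumerate(coded_interviews, start=1):
        new_themes = set(themes) - seen
        seen |= new_themes
        quiet_run = 0 if new_themes else quiet_run + 1
        if quiet_run >= patience:
            return count, seen, True
    # Ran out of interviews before the rule was satisfied.
    return len(coded_interviews), seen, False


# Toy data: each set holds the themes coded from one interview (invented).
toy_interviews = [
    {"cost", "time"},
    {"cost", "trust"},
    {"time"},
    {"trust", "cost"},
    {"privacy"},
]

if __name__ == "__main__":
    print(interviews_until_convergence(toy_interviews, patience=2))
```

In the toy data the fifth interview would have raised a theme ("privacy") that the rule never reaches, a reminder that convergence is a judgement about diminishing returns rather than a guarantee that nothing is left to find.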
-
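Rigour: credibility (≈ internal validity), transferability (≈ external validity), transparency, and purposeful data collection. [15]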
-
Common traps
- Don't call qualitative work "unscientific" [11][15]
- For a "why" question, don't answer "just run a bigger survey" [15]
-
9) Modelling / diagnostics / parsimony
-
From the excerpts available so far, this part is mainly about being able to read and critique model output, not compute it.
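Although the exam never asks you to fit anything, a small simulation can make the reading skill concrete for revision. The sketch below builds data as signal + noise, fits a straight line, and compares the residual spread for low versus high fitted values, which is the numeric counterpart of spotting a funnel in a residuals-vs-fitted plot. Every name and number here (`simulate`, `residual_spread_by_half`, the intercept 2, slope 3, and the noise levels) is an assumption chosen for illustration, not something taken from the course.

```python
import random
import statistics


def simulate(n=200, seed=0, funnel=True):
    """Generate y = signal + noise; with funnel=True the noise SD grows with x."""
    rng = random.Random(seed)
    xs = [rng.uniform(1, 10) for _ in range(n)]
    ys = []
    for x in xs:
        signal = 2.0 + 3.0 * x            # systematic part of the model
        sd = 0.5 * x if funnel else 2.0   # non-constant vs constant spread
        ys.append(signal + rng.gauss(0, sd))
    return xs, ys


def residual_spread_by_half(xs, ys):
    """Fit a least-squares line, then return the residual SD for the
    lower and upper halves of the fitted values."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum(
        (x - mx) ** 2 for x in xs
    )
    intercept = my - slope * mx
    fitted = [intercept + slope * x for x in xs]
    residuals = [y - f for y, f in zip(ys, fitted)]
    pairs = sorted(zip(fitted, residuals))  # order by fitted value
    half = len(pairs) // 2
    low = [r for _, r in pairs[:half]]
    high = [r for _, r in pairs[half:]]
    return statistics.stdev(low), statistics.stdev(high)


if __name__ == "__main__":
    for funnel in (True, False):
        low_sd, high_sd = residual_spread_by_half(*simulate(funnel=funnel))
        print(f"funnel={funnel}: low SD={low_sd:.2f}, high SD={high_sd:.2f}")
```

When the two spreads differ markedly, the residuals have structure (non-constant variance), and the exam-style response is to name that violation and suggest a matching fix such as a transformation, rather than to redo any arithmetic.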
-
核心:
- 模型是 signal + noise[9]Source: asksia-bible-mast20034-bilingual.pdfThe PPDAC cycle - the spine of the whole unit, and a one-glance map of how an exam scenario hangs together: every question lives somewhere on Problem - Plan - Data - Analysis - Conclusion. Locating the stage tells you which concept the marker wants. PPDAC 循环 -- 整个单元的脊柱,也是一张让你一眼看清考试情景如何拼接的地图:每道题都栖身于 Problem → Plan → Data → Analysis → Conclusion 中的某处。定位到阶段,就知道评分者想要哪个概念。 AskSia Library . MAST20034 . XXia Bilingual ★ Concepts to recall - the whole-book checklist 要回忆的概念 -- 全书清单 · Context first (Ch1): data are value-laden; ask who/why/what/how/when; critique # criticism (always offer a constructive fix). 情境优先(第1章):数据带有价值色彩;问 谁/为何/什么/如何/何时;critique ≠ criticism (永远附上一个建设性 修复)。 · Graphics (Ch2): the 5 principles; match graph to variable types; two good features + one specific improvement. I 图表(第2章):5条原则;图与变量类型匹配;两个好特征+一个具体改进。 · Design (Ch3): validity = randomise/compare/control (kills bias); precision = replicate/stratify/balance (kills variability); they are independent axes. 设计(第3章): validity = 随机化/比较/控制(杀 bias); precision = 重复/分层/平衡(杀 variability);二者是独 立的轴。 · Observational (Ch4): cohort=group-by-exposure, case-control=group-by-outcome; confounder links to both; correlation # causation. 观察性(第4章):cohort=按暴露分组,case-control=按结局分组;confounder 同时关联两者;相关 ≠ 因果。 I 报告(第5章):中心/离散/趋势/离群点;报告 Cl+水平、以及统计量 +P,而非只报P。 · Qualitative (Ch6): "why" not "what"; bottom-up vs top-down coding; convergence as the stopping rule. 定性(第6章):“为何”而非“是什么”;自下而上 vs 自上而下编码;convergence 作为停止规则。 推断(第7章):随机的是区间,μ 是固定的;P= Pr(data or more extreme | Ho); 大P 不证明 Ho; Type l/ll 与 power. · Modelling (Ch8): signal+noise; "all models wrong, some useful"; parsimony; read residual/QQ plots - interpret, never fit. 建模(第8章):信号+噪声;“所有模型都是错的,有些有用”;简约性;读残差/QQ图 -- 解读,绝不拟合。 · Sampling (Ch9): frame vs sample; a big sample won't fix bias; 4 random + 4 non-random methods; WEIRD; reproducibility crisis. 
I 抽样(第9章):抽样框 vs样本;大样本修不了偏倚;4种随机+4种非随机方法;WEIRD;可重复性危机。 I 累积(第10章):森林图(零线+菱形);Hill 准则(时序性+梯度);发表偏倚。 · Big data (Ch11): significance # importance at scale; provenance, ethics, scepticism toward Al findings. Big data (第11章):在大规模下显著 ≠重要;来源、伦理、对 AI发现的怀疑。 而且永远 -- 铁律:点名概念,然后给 because。祝你好运。 AskSia Library · MAST20034 · 双语 Bilingual ● 情境优先(第1章):数据带有价值色彩;问 谁/为何/什么/如何/何时;critique ≠ criticism (永远附上一个建设性 修复)。 · 图表(第2章):5条原则;图与变量类型匹配;两个好特征+一个具体改进。 ● 设计(第3章): validity =随机化/比较/控制(杀 bias); precision= 重复/分层/平衡(杀 variability);二者是独 立的轴。 ● 观察性(第4章):cohort=按暴露分组,case-control=按结局分组;confounder 同时关联两者;相关 ≠因果。 ● 报告(第5章):中心/离散/趋势/离群点;报告 CI+水平、以及统计量+P,而非只报P。 ● 定性(第6章):“为何”而非“是什么”;自下而上 vs 自上而下编码;convergence(收敛)作为停止规则。 推断(第7章): 随机的是区间,μ 是固定的;P= Pr(data or more extreme | Ho); 大 P 不证明 Ho; Type l/ll 与 power.[14]Source: asksia-bible-mast20034-bilingual.pdfGLM 连接:连续→identity/Normal;二元→logit/Bernouli (优势比);计数→log/Poisson。认出来,别去算。 · Diagnostics: residuals-vs-fitted = funnel(variance)/curve(nonlinearity); QQ tails = non-Normal. Any structure = a problem. I 诊断:残差对拟合=漏斗形(方差)/弯曲(非线性);QQ 尾部=非正态。任何结构=一个问题。 修复点名的违例:变换 · 加项 · 假设更少的检验。 · Parsimony over complexity (compare via AIC/BIC - don't compute); over-fitting is a fault, not a virtue. 一 简约性胜过复杂性(用AIC/BIC比较 -- 别去算);过拟合是缺陷,而非美德。 两个拒绝:不向数据之外做 extrapolation(外推);相关 ≠因果 -- 点出那个混杂因素。 · Model = signal + noise (模型=信号+噪声);“所有模型都是错的,有些有用” -- 评判有用性+简约性,而非 真伪。 ● 从三个轴读一个系数:符号 · 大小 · 显著性(CI 不含 0/小P=真实,而非噪声)。 ● 预测变量:数值型→斜率;分类型→相对基线的组间偏移;两者皆有→调整后的效应。 ● GLM links: 连续型→identity/Normal;二元→logit/Bernoulli (odds ratios);计数→log/Poisson。会识别,不计 算。 ● 诊断:residuals-vs-fitted = 漏斗形(方差)/曲线(非线性);QQ 尾部=非正态。任何结构=一个问题。 ● 修正点名的那个违背:transform · 加一项 · 假设更少的检验。 · 简约性(Parsimony)优于复杂性(通过 AIC/BIC 比较 -- 不计算);over-fitting 是缺点,不是优点。 ● 两条拒绝:不在数据之外做 extrapolation; correlation ≠ causation––点名 confounder。 AskSia Library · MAST20034 · 双语 Bilingual WEEK 9 . SAMPLING WEEK 9 . SAMPLING CH 9 . 
REASONING, NOT COMPUTING Population, frame & sample - and why a big sample can't fix a bad one population、frame 与 sample -- 以及为何大样本救不了一个坏样本 A sample is a guess about a population; the method decides if the guess is honest 样本是对总体的一次猜测;方法决定这猜测是否诚实 TL;DR. Almost no study measures everyone, so we measure a sample and reason about the population. Two different things can go wrong, and the exam lives in the gap between them: sampling error is the harmless luck-of-the-draw wobble that shrinks as the sample grows, while bias is a systematic lean baked in by the method - and a bigger sample only repeats that mistake on a larger scale. The whole skill is naming who is missing from the sample and which way that tilts the answer. TL;DR. 几乎没有研究能测量所有人,所以我们测量一个 sample 并就 population 推理。两种不同的东西可能出错,而考试 就活在它们之间的缝隙里:sampling error(抽样误差)是无害的、抽签运气式的抖动,随样本增大而收缩,而 bias 是由方 法烤进去的一种系统性偏斜 -- 更大的样本只是把那个错误放在更大的规模上重演一遍。整个技能就是点名样本里谁缺席了, 以及那朝哪个方向歪了答案。 ★ What the exam asks here 考试在这里问什么 Sampling seeds the most-rehearsed reasoning question on the 60% final - a 3-hour, short-answer paper with no calculator and no calculations, where you bring in four sides of your own notes. The released sample question asks you to define a sampling method and say why it is (not) recommended, and the tutorial species asks which sampling method would you use here and why. You will also be handed a scenario and asked to identify who is excluded and the direction of the resulting bias. The marking rule is explicit: "explaining your reasoning and choices is typically more important than any answer. " Every mark is a because - carry the taxonomy and the bias checklist, and spend the words on the consequence, not the label. Sampling 在60% 期末上孕育出最常演练的推理题 -- 一份 3小时、简答、无计算器、无计算的卷子,你带入四面自己 的笔记。发布的样题要你定义一种 sampling 方法并说出它为何(不)被推荐,教程题种则问此处你会用哪种 sampling 方法、为什么。你也会被递给一个情景,要你识别谁被排除,以及由此产生的bias的方向。评分规则明确:“阐释你的 推理与选择,通常比任何答案本身更重要。”每一分都是一个because -- 带上分类法和 bias 清单,把字花在后果上,而 非标签上。 9. 
1 The core vocabulary - unit, population, frame, sample 9. 1核心词汇 -- unit、population、frame、sample Definition. A unit is one thing you study (a person, a tree, a transaction). The population is all the units you want to talk about. A census measures every unit in the population. The sampling frame is the actual list of units you can draw from - the electoral roll, a class list, a customer database. The sample is the subset you measure. Write each term as a defined object, because markers reward the defined term before the reasoning.
- all models are wrong, some are useful [9][14]
[9] Source: asksia-bible-mast20034-bilingual.pdf
The PPDAC cycle is the spine of the whole unit, and a one-glance map of how an exam scenario hangs together: every question lives somewhere on Problem → Plan → Data → Analysis → Conclusion. Locating the stage tells you which concept the marker wants.
★ Concepts to recall: the whole-book checklist
· Context first (Ch1): data are value-laden; ask who/why/what/how/when; critique ≠ criticism (always offer a constructive fix).
· Graphics (Ch2): the 5 principles; match graph to variable types; two good features + one specific improvement.
· Design (Ch3): validity = randomise/compare/control (kills bias); precision = replicate/stratify/balance (kills variability); they are independent axes.
· Observational (Ch4): cohort = group-by-exposure, case-control = group-by-outcome; confounder links to both; correlation ≠ causation.
· Reporting (Ch5): centre/spread/trend/outliers; report the CI + level, and the statistic + P, not P alone.
· Qualitative (Ch6): "why" not "what"; bottom-up vs top-down coding; convergence as the stopping rule.
· Inference (Ch7): the interval is random, μ is fixed; P = Pr(data or more extreme | H0); a large P does not prove H0; Type I/II and power.
· Modelling (Ch8): signal + noise; "all models wrong, some useful"; parsimony; read residual/QQ plots: interpret, never fit.
· Sampling (Ch9): frame vs sample; a big sample won't fix bias; 4 random + 4 non-random methods; WEIRD; reproducibility crisis.
· Accumulating evidence (Ch10): forest plot (null line + diamond); Hill criteria (temporality + gradient); publication bias.
· Big data (Ch11): significance ≠ importance at scale; provenance, ethics, scepticism toward AI findings.
And always, the iron rule: name the concept, then give the because. Good luck.
[14] Source: asksia-bible-mast20034-bilingual.pdf
· Model = signal + noise; "all models are wrong, some are useful": judge usefulness + parsimony, not truth.
· Read a coefficient on three axes: sign · size · significance (CI excludes 0 / small P = real, not noise).
· Predictors: numeric → slope; categorical → shift between groups relative to the baseline; both → adjusted effect.
· GLM links: continuous → identity/Normal; binary → logit/Bernoulli (odds ratios); count → log/Poisson. Recognise, don't compute.
· Diagnostics: residuals-vs-fitted = funnel (variance) / curve (nonlinearity); QQ tails = non-Normal. Any structure = a problem.
· Fix the named violation: transform · add a term · a test with fewer assumptions.
· Parsimony over complexity (compare via AIC/BIC, don't compute); over-fitting is a fault, not a virtue.
· Two refusals: no extrapolation beyond the data; correlation ≠ causation, so name the confounder.
WEEK 9 · SAMPLING (CH 9 · REASONING, NOT COMPUTING)
Population, frame & sample, and why a big sample can't fix a bad one
A sample is a guess about a population; the method decides if the guess is honest
TL;DR. Almost no study measures everyone, so we measure a sample and reason about the population. Two different things can go wrong, and the exam lives in the gap between them: sampling error is the harmless luck-of-the-draw wobble that shrinks as the sample grows, while bias is a systematic lean baked in by the method, and a bigger sample only repeats that mistake on a larger scale. The whole skill is naming who is missing from the sample and which way that tilts the answer.
★ What the exam asks here
Sampling seeds the most-rehearsed reasoning question on the 60% final: a 3-hour, short-answer paper with no calculator and no calculations, where you bring in four sides of your own notes. The released sample question asks you to define a sampling method and say why it is (not) recommended, and the tutorial variant asks which sampling method you would use here and why. You will also be handed a scenario and asked to identify who is excluded and the direction of the resulting bias. The marking rule is explicit: "explaining your reasoning and choices is typically more important than any answer." Every mark is a because: carry the taxonomy and the bias checklist, and spend the words on the consequence, not the label.
9.1 The core vocabulary: unit, population, frame, sample
Definition. A unit is one thing you study (a person, a tree, a transaction). The population is all the units you want to talk about. A census measures every unit in the population. The sampling frame is the actual list of units you can draw from: the electoral roll, a class list, a customer database. The sample is the subset you measure. Write each term as a defined object, because markers reward the defined term before the reasoning.
- aim for parsimony; a more complex model is not a better one [14]
-
Diagnostic plots you need to be able to read:
- residuals-vs-fitted funnel shape → non-constant variance [11][14]
[11] Source: asksia-bible-mast20034-bilingual.pdf
(b) Big-data critique (because): a huge n cannot fix bias; the data are only the people already using this app (selection bias); at massive n everything looks "significant", so effect size and provenance matter more than P. Add the data-ethics flag: privacy and consent of the logged users.
What earns the marks. A justified qual choice (the "why" logic + convergence) + a big-data critique naming that size ≠ representativeness, with an ethics flag.
Trap. Dismissing qualitative as "unscientific"; equating large n with representative; forgetting consent/provenance for found data.
★ Recall checklist: the decision rules for the bank
· Every answer: name → define → justify in context → because (the consequence). Count your reasons against the marks available.
· Design/sampling: justify the choice against its alternative; non-probability methods are biased, and size won't cure it.
· Confounder: must link to both exposure and outcome; observational → "associated with", never "causes".
· Graph critique: two good features tied to a principle; one issue + a fix that matches it.
· CI: the interval is random, the parameter is fixed. P-value: Pr(data | H0), not Pr(H0 | data); significant ≠ important.
· Type I/II: false positive / false negative; power = 1 − β; small n → low power; rare condition → base-rate false positives.
· Forest plot: diamond against the null line + heterogeneity + a note on publication bias. Hill: weight of evidence, temporality first.
· Diagnostic plots: funnel → non-constant variance; bent QQ → non-Normal → transform / use a method with fewer assumptions.
· Qual vs quant: why vs what; convergence is the qual stopping rule. Big data: size ≠ unbiased; effect size & ethics over P.
- residuals-vs-fitted curvature → nonlinearity [14]
- QQ plot tails drifting or bending away from the line → non-Normal [11][14]
-
When a plot shows a problem, the directions for a fix are:
- transform
- add a term
- use a method with fewer assumptions [11][14]
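The exam never asks you to compute any of this, so the following is only for building intuition about the funnel rule and the "transform" fix. It is a minimal sketch on made-up data: the simulated `x` and `y`, the helper name `spread_ratio`, and the half-and-half comparison of residual spread are all assumptions chosen for illustration, not a method taken from the course notes.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: the noise is multiplicative, so on the raw scale the
# scatter around the trend widens as y grows (the funnel shape).
x = np.linspace(1, 10, 400)
y = np.exp(0.3 * x + rng.normal(0, 0.3, size=x.size))


def spread_ratio(x, y):
    """Fit a straight line, then divide the residual spread in the upper
    half of the fitted values by the spread in the lower half.
    Near 1: roughly constant variance. Well above 1: a funnel."""
    slope, intercept = np.polyfit(x, y, 1)
    fitted = slope * x + intercept
    resid = y - fitted
    upper = fitted > np.median(fitted)
    return resid[upper].std() / resid[~upper].std()


raw_ratio = spread_ratio(x, y)          # raw scale: funnel expected
log_ratio = spread_ratio(x, np.log(y))  # after the "transform" fix
print(f"raw scale: {raw_ratio:.2f}, log scale: {log_ratio:.2f}")
```

In an answer you would still write the reasoning in words: name the pattern (funnel), name the violated assumption (constant variance), then name the matching fix (transform).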
-
Also remember:
-
10) Big data / ethics / AI skepticism
-
This is the modern statistical-literacy topic, and it is a favourite for testing critical thinking.
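Two claims sit at the centre of this topic: a bigger sample does not remove bias, and at a large enough n even a trivial difference becomes "significant". The sketch below illustrates both on an invented population; the 30% app-user share, the 5-point gap, the 0.05-point difference and the helper `z_statistic` are hypothetical values and names chosen only for illustration, and nothing like this is computed in the exam.

```python
import math
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical population of 1,000,000 people: 30% use the app,
# and app users score about 5 points higher on average.
N = 1_000_000
is_user = rng.random(N) < 0.3
score = rng.normal(50, 10, N) + 5 * is_user
true_mean = score.mean()
users_only = score[is_user]  # all that an app-log ("found data") study can see

# Claim 1: random-sample error shrinks with n; app-only error does not.
errors = {}
for n in (100, 10_000, 200_000):
    random_sample = rng.choice(score, n, replace=False)
    app_sample = rng.choice(users_only, n, replace=False)
    errors[n] = (abs(random_sample.mean() - true_mean),
                 abs(app_sample.mean() - true_mean))
    print(f"n={n:>7}: random error {errors[n][0]:.2f}, "
          f"app-only error {errors[n][1]:.2f}")


def z_statistic(diff, sd, n_per_group):
    """z for a difference between two group means (equal sd, equal n)."""
    return diff / (sd * math.sqrt(2 / n_per_group))


# Claim 2: the same tiny 0.05-point difference, judged at two sample sizes.
z_small = z_statistic(0.05, 10, 100)
z_huge = z_statistic(0.05, 10, 2_000_000)
print(f"z at n=100 per group: {z_small:.2f}; at n=2,000,000: {z_huge:.2f}")
```

The app-only error stays at roughly the size of the built-in lean however large n gets, while the z statistic for a practically negligible difference grows with n. That is why the answer should judge effect size and provenance rather than P.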
-
Key points:
- a huge n does not mean unbiased [3][11]
[3] Source: asksia-bible-mast20034-bilingual.pdf
"Interpret this CI / P-value / output / forest plot"
Inference reading rules · Ch7, Ch8, Ch10
(1) State what it shows in context (CI excludes 0 → evidence of an effect; small P → strong evidence against H0). (2) Add the correct caveat (the interval is random, μ is fixed; a large P does not prove H0). (3) Comment on strength / meaning: significant ≠ important.
SAMPLING & TRUST
"Is this sample OK?"; "what's wrong with how they recruited?"
Sampling bias + WEIRD · Ch1, Ch9
(1) Ask who is missing: frame / selection / non-response / volunteer gap. (2) Name the method and its bias (convenience → people similar to each other). (3) State that a bigger sample will NOT fix bias; it repeats the mistake at scale. Consider WEIRD over-sampling.
"Too good to be true"; "a surprising significant result"; "just barely p < 0.05"
Reproducibility + p-hacking · Ch9-10
(1) Publication bias: novel/significant results are over-published, inflating effects. (2) Watch for p-hacking / HARKing (one-sided test chosen after the data, multiple looks). (3) Ask for replication, a CI / effect size, and pre-registration before trusting it.
"Big data / an AI claim"; "with millions of records ..."
Ethics + validity at scale · Ch1, Ch11
(1) Huge n → everything is significant, so judge effect size & practical importance, not P. (2) Apply the context questions (who/why/what/how) + data ethics (consent, fairness, stewardship). (3) Be sceptical of AI: check provenance and the missing data.
✓ How to use the table under pressure
Underline the verb and the noun in the stem first ("choose a design", "critique this graph", "interpret the output"). That two-word cue picks the row; the right column is your paragraph. Then convert each of the three things into a sentence that ends in a because. You are never asked to compute, so resist the urge.
EXAM MORNING · THE DECODER: BUILDING THE NOTES YOU CARRY IN
Your 4 sides, the 'because' rule, and the 3 hours
TL;DR. You may bring four sides of your own notes and there is no calculator, so do not waste space on formulas. Fill the four sides with decision trees, checklists, and crisp definitions: the machinery that turns a cue into a justified answer. This page lays out what to put on each side, the one rule that wins short-answer marks, a timing plan for the three hours, and the closing concepts-to-recall list.
12.2 The 4-side notes plan
The exam is reasoning, not recall of numbers, so your sheet is a reasoning toolkit. A good layout maps one side to each job of the decoder above. Trees and checklists earn marks; a wall of formulae does not (there is nothing to calculate).
Side | What goes on it
- Significant does not mean important. [3][11][14]
- Follow up by asking about:
- provenance (where the data came from)
- consent
- privacy
- fairness
- stewardship [3][11]
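To make "a huge n does not mean unbiased" concrete, here is a minimal simulation sketch. It is for intuition only, since the exam asks for no calculation. The population (mean 50, sd 10) and the selection rule (values above 50 are always recorded, the rest only 30% of the time) are invented assumptions for illustration, not anything from the course materials.

```python
import random
import statistics

TRUE_MEAN = 50.0  # assumed population mean (illustrative)
TRUE_SD = 10.0    # assumed population sd (illustrative)

def random_sample_mean(n, seed=0):
    """Mean of a simple random sample: only sampling error is present."""
    rng = random.Random(seed)
    return statistics.fmean(rng.gauss(TRUE_MEAN, TRUE_SD) for _ in range(n))

def biased_sample_mean(n, seed=0):
    """Mean of a sample where high values are more likely to be recorded."""
    rng = random.Random(seed)
    kept = []
    while len(kept) < n:
        x = rng.gauss(TRUE_MEAN, TRUE_SD)
        # Hypothetical selection rule: units above the mean are always
        # recorded; the others are recorded only 30% of the time.
        if x > TRUE_MEAN or rng.random() < 0.3:
            kept.append(x)
    return statistics.fmean(kept)

if __name__ == "__main__":
    for n in (100, 10_000, 100_000):
        print(n, round(random_sample_mean(n), 2), round(biased_sample_mean(n), 2))
```

As n grows, the random-sample mean settles near 50, while the selected-sample mean settles near a different, wrong value. The wobble shrinks but the lean stays, which is exactly the "because" to write: more data collected the same way repeats the same mistake at scale.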
-
Standard critique of an AI claim or big-data claim:
- Who was recorded, and who was not?
- Does the data suffer from selection bias?
- Is the effect size actually important in practice?
- Is it ethically compliant? [3][11]
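The "effect size over P" point can be sketched numerically. This is not exam material (nothing there is computed). It is a rough illustration under simplifying assumptions: a two-group comparison with a known common sd, an observed difference taken as given, and a two-sided z-test. The numbers (difference 0.1, sd 10) are made up.

```python
import math

def two_sided_p(diff, sd, n_per_group):
    """Two-sided z-test P-value for a difference in means, known common sd."""
    se = sd * math.sqrt(2.0 / n_per_group)   # standard error of the difference
    z = diff / se
    return math.erfc(abs(z) / math.sqrt(2.0))  # equals 2 * (1 - Phi(|z|))

def cohens_d(diff, sd):
    """Standardised effect size: it does not depend on n at all."""
    return diff / sd

if __name__ == "__main__":
    diff, sd = 0.1, 10.0  # illustrative values: a practically tiny effect
    for n in (100, 10_000, 10_000_000):
        print(n, cohens_d(diff, sd), two_sided_p(diff, sd, n))
```

The effect size stays at 0.01 throughout, while the P-value moves from clearly non-significant to vanishingly small purely because n grew. That is why, at scale, the critique should weigh effect size and practical importance rather than P.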
-
11) Meta-analysis / forest plot / accumulating research
-
The current excerpts support mastering at least the following: [3][11]
- Know how to read a forest plot:
- the null line (line of no effect)
- the diamond
- heterogeneity
- publication bias
- The Bradford Hill criteria are an important framework for weighing evidence accumulated across studies. [1][9][11]
- You need to be able to read a forest plot:
-
If the question asks whether several studies together can support a causal claim, you should:
- check whether the results are consistent
- check temporality
- check for a gradient (dose-response)
- check for publication bias[3]Source: asksia-bible-mast20034-bilingual.pdf
"Interpret this CI / P-value / output / forest plot" → Inference reading rules · Ch7, Ch8, Ch10
(1) State what it shows in context (CI excludes 0 → evidence of an effect; small P → strong evidence against $H_0$).
(2) Add the correct caveat (the interval is random, μ is fixed; a large P does not prove $H_0$).
(3) Comment on strength / meaning: significant ≠ important.
SAMPLING & TRUST
"Is this sample OK?"; "what's wrong with how they recruited?" → Sampling bias + WEIRD · Ch1, Ch9
(1) Ask who is missing: frame / selection / non-response / volunteer gap.
(2) Name the method and its bias (convenience → people similar to each other).
(3) State that a bigger sample will NOT fix bias; it repeats the mistake at scale. Consider WEIRD over-sampling.
"Too good to be true"; "a surprising significant result"; "just barely p < 0.05" → Reproducibility + p-hacking · Ch9-10
(1) Publication bias: novel/significant results are over-published, inflating effects.
(2) Watch for p-hacking / HARKing (one-sided test chosen after the data, multiple looks).
(3) Ask for replication, a CI / effect size, and pre-registration before trusting it.
"Big data / an AI claim"; "with millions of records..." → Ethics + validity at scale · Ch1, Ch11
(1) Huge n → everything is significant → judge effect size and practical importance, not P.
(2) Apply the context questions (who/why/what/how) + data ethics (consent, fairness, stewardship).
(3) Be sceptical of AI: check provenance and the missing data.
✓ How to use the table under pressure
Underline the verb and the noun in the stem first ("choose a design", "critique this graph", "interpret the output"). That two-word cue picks the row; the right column is your paragraph. Then convert each of the three things into a sentence that ends in a because. You are never asked to compute, so resist the urge.
EXAM MORNING · THE DECODER
BUILDING THE NOTES YOU CARRY IN
Your 4 sides, the 'because' rule, and the 3 hours
TL;DR. You may bring four sides of your own notes and there is no calculator, so do not waste space on formulas. Fill the four sides with decision trees, checklists, and crisp definitions: the machinery that turns a cue into a justified answer. This page lays out what to put on each side, the one rule that wins short-answer marks, a timing plan for the three hours, and the closing concepts-to-recall list.
12.2 The 4-side notes plan
The exam is reasoning, not recall of numbers, so your sheet is a reasoning toolkit. A good layout maps one side to each job of the decoder above. Trees and checklists earn marks; a wall of formulae does not (there is nothing to calculate).
Side | What goes on it
[11]Source: asksia-bible-mast20034-bilingual.pdf
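The decoder's "huge n → everything is significant" row can be made concrete with a tiny calculation. The sketch below is illustrative only and is not part of the course material (the exam itself requires no computation): it assumes a two-sided z-test with known standard deviation, and the function name `p_value_for_mean_shift` and the chosen numbers are hypothetical.

```python
from statistics import NormalDist


def p_value_for_mean_shift(effect, sigma, n):
    """Two-sided z-test P-value when the sample mean sits `effect` away from H0.

    Assumes a known population standard deviation `sigma` (a simplification).
    """
    z = effect / (sigma / n ** 0.5)          # standardised distance from H0
    return 2 * (1 - NormalDist().cdf(abs(z)))


if __name__ == "__main__":
    # The same practically negligible effect (0.01 standard deviations)
    # evaluated at three very different sample sizes.
    for n in (100, 10_000, 1_000_000):
        print(n, p_value_for_mean_shift(effect=0.01, sigma=1.0, n=n))
```

The effect never changes, only n does, yet the P-value falls from far above 0.05 to essentially zero. That is the reasoning behind "judge effect size and practical importance, not P".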
-
5. The one-sentence definitions most worth memorising for this subject
-
Learn the following well enough to write each one out directly:
-
Confounder
- A variable that is associated with both the exposure and the outcome and may distort the apparent relationship between them.[9][11][12]Source: asksia-bible-mast20034-bilingual.pdf
Trap. Offering a variable that affects only the outcome (that is not a confounder; a confounder must link to both exposure and outcome); claiming causation.
Q5 GRAPH CRITIQUE: PRAISE [2: 1 mark each]
Prompt (paraphrased). Given a multi-panel scatterplot, identify two good features in terms of communicating information.
Model reasoning: the because skeleton. Name two concrete features, each tied to a graphics principle: (1) a scatterplot is the standard form for two numerical variables; (2) colour and symbol double-encode the groups (accessible / redundant coding); or panels separate groups instead of overplotting; or a clear title + source gives context.
What earns the marks. Two specific features, each tied to a named principle: 1 mark each. Specificity is everything.
Trap. Generic praise ("nice colours", "easy to read") with no principle; praising a feature that isn't actually in the chart.
Q6 GRAPH CRITIQUE: FIX [3: 1 issue + 2 specific fix]
Prompt (paraphrased). Identify one feature that could be improved and suggest a specific improvement.
Model reasoning: the because skeleton.
(1) Issue: name one real fault, e.g. panels are on different y-axis scales, so the eye mis-reads the comparison.
(2) Fix (because): "set every panel to a common axis range so the heights are directly comparable". The fix must address the named issue. (Other valid pairs: no title → add a title stating data + context + source; cryptic labels → spell out the full variable names.)
What earns the marks. 1 mark for naming a real issue + 2 marks for a fix that directly resolves that exact issue. The fix must match the fault.
Trap. A fix that doesn't address the issue you named (the rubric zeroes it); naming an issue but giving no concrete fix; "fixing" something that was fine.
REVISION · SHORT-ANSWER BANK
INTERPRET-AND-CRITIQUE SPECIES
Drills 7-12: inference, errors, causation & ethics
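To see why a confounder "must link to both", a small simulation can help. This is a hedged sketch, not course material: the variable roles (`z` as confounder, `x` as exposure, `y` as outcome), the effect sizes and the function names are all assumptions chosen for illustration. By construction `x` has no effect on `y`.

```python
import random


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


def simulate(n=5000, seed=3):
    """Generate (z, x, y) rows where z drives both x and y, and x never affects y."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.random() < 0.5                    # hypothetical confounder
        x = rng.gauss(2.0 if z else 0.0, 1.0)     # exposure depends on z
        y = rng.gauss(2.0 if z else 0.0, 1.0)     # outcome depends on z only
        rows.append((z, x, y))
    return rows


def correlations(rows):
    """Return (overall r, r within z=True, r within z=False) for x versus y."""
    def r(subset):
        return pearson([x for _, x, _ in subset], [y for _, _, y in subset])
    return (r(rows),
            r([row for row in rows if row[0]]),
            r([row for row in rows if not row[0]]))


if __name__ == "__main__":
    overall, within_true, within_false = correlations(simulate())
    print(f"overall r = {overall:.2f}")
    print(f"within z=True r = {within_true:.2f}, within z=False r = {within_false:.2f}")
```

The overall correlation should come out clearly positive, while the correlation inside each level of `z` should sit near zero. That is the "associated with, never causes" point: the association exists, but it is produced entirely by the third variable.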
-
P-value
- The probability, assuming $H_0$ is true, of observing data at least as extreme as the data actually observed.[9][11][15]Source: asksia-bible-mast20034-bilingual.pdf
· Quant = "what / how many"; qual = "why". Choose by the nature of the question, not by which is "better".
· Four sources: interviews (depth) · focus groups (interaction) · observation (doing ≠ saying) · documents/artefacts (already exist). Each has its own risks.
· Coding: bottom-up = inductive, codes emerge from the data; top-down = deductive, codes come from prior theory. Themes = grouped codes (thematic analysis).
· Convergence = the qualitative stopping rule (analogous to power / sample size): stop when new data no longer add new themes.
· Rigour: credibility (≈ internal validity), transferability (≈ external validity), transparency, purposeful collection.
· Mixed methods = qual + quant; qual explains or seeds quant. Always name the ethical/practical costs (time, coding, anonymity).
· Trap: never call qualitative "unscientific"; never "just take a bigger survey" for a why question; never swap bottom-up ↔ top-down.
WEEK 7 · FRAMEWORKS FOR INFERENCE
CH 7 · ESTIMATION & SAMPLING DISTRIBUTIONS
From sample to population: estimation & the CLT
Why one sample can speak for a whole population, and how confidently
TL;DR. Inference runs the arrow backwards: probability reasons population → sample, inference reasons sample → population. A point estimate is one number; a confidence interval is honest because it carries "how close". The whole machine rests on the sampling distribution (what the estimate would do over many samples), which the Central Limit Theorem makes approximately Normal. Everything on these three pages is about reading and explaining this, never computing it.
★ What the exam asks here
The 60% final is short-answer reasoning only: no calculator, no calculations, no multiple choice. You bring in four sides of your own notes. For inference you will be handed a CI or a P-value to interpret and asked to say what it does (and does not) mean, or to name an error / explain power in a scenario. The marking is explicit: "explaining your reasoning and choices is typically more important than any answer." Every mark is a because, so carry the definitions, the CI/P-value interpretation rules, and the Type I/II decoder, not arithmetic.
7.1 Estimation: point estimate vs confidence interval
Definitions. A point estimate is a single number computed from the sample that stands in for an unknown population parameter (the sample mean $\bar{x}$ estimates $\mu$; the sample proportion $\hat{p}$ estimates $p$; $s$ estimates $\sigma$). A confidence interval (CI) is a range built from the estimate and a margin that encodes how close the estimate is likely to be: estimate ± (distribution multiplier) × (variability). The width comes from sampling variability (and so shrinks as n grows); the multiplier comes from the confidence level you choose.
Quantity | Symbol | Lives in
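The P-value definition and the Type I error idea are linked, and a short simulation can show the link. The sketch below is an illustration under stated assumptions, not the course's method: it uses a two-sided z-test with known standard deviation, and the names `two_sided_p` and `false_positive_rate` are hypothetical. Every simulated sample is drawn with $H_0$ true, so each rejection is a Type I error.

```python
import random
from statistics import NormalDist, fmean


def two_sided_p(sample, mu0, sigma):
    """Two-sided z-test P-value: Pr(a result at least this extreme | H0)."""
    z = (fmean(sample) - mu0) / (sigma / len(sample) ** 0.5)
    return 2 * (1 - NormalDist().cdf(abs(z)))


def false_positive_rate(n=30, reps=4000, alpha=0.05, seed=2):
    """Share of tests that reject H0 even though H0 is true in every run."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(reps):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]   # H0 (mean 0) holds
        if two_sided_p(sample, mu0=0.0, sigma=1.0) < alpha:
            rejections += 1                                # a Type I error
    return rejections / reps


if __name__ == "__main__":
    print(f"Type I error rate at alpha = 0.05: {false_positive_rate():.3f}")
```

The rejection rate should land close to the chosen alpha. That is what a P-value threshold controls: how often a true $H_0$ gets rejected in the long run, not the probability that $H_0$ is true given the data.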
-
Confidence interval
- An interval constructed from the sample to express the uncertainty in estimating an unknown population parameter; the interval is random, the parameter is fixed.[9][11][15]Source: asksia-bible-mast20034-bilingual.pdf
-
Type I error
- Rejecting a null hypothesis that is actually true (a false positive). [11][15]
-
Type II error
- Failing to reject a null hypothesis that is actually false (a false negative). [11][15]
-
Power
- The probability that a test detects an effect when one really exists: $$\text{Power}=1-\beta$$, where $\beta$ is the probability of a Type II error. [11][15]
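If it helps to see why small n means low power, here is a minimal simulation sketch. It is not from the course material and the exam never asks you to compute anything like it. The assumptions are mine: a two-sided z-test of H0: mean = 0 with known standard deviation 1, a hypothetical true effect of 0.5, and the function name `rejection_rate`.

```python
import random
from statistics import NormalDist


def rejection_rate(true_mean, n, trials=2000, alpha=0.05, seed=0):
    """Share of simulated studies that reject H0: mean = 0.

    Assumes a two-sided z-test with known standard deviation 1
    (an illustrative simplification, not the course's method).
    """
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        z = (sum(sample) / n) * n ** 0.5  # sample mean divided by its standard error
        if abs(z) > z_crit:
            rejections += 1
    return rejections / trials


# H0 is true here, so every rejection is a Type I error (false positive).
type_i_rate = rejection_rate(true_mean=0.0, n=20)

# H0 is false here, so the rejection rate estimates power = 1 - beta.
power_small_n = rejection_rate(true_mean=0.5, n=10)
power_large_n = rejection_rate(true_mean=0.5, n=50)

print(f"Type I error rate (H0 true): {type_i_rate:.3f}")
print(f"Power with n=10: {power_small_n:.3f}")
print(f"Power with n=50: {power_large_n:.3f}")
```

The Type I rate should sit near the chosen alpha, while power rises sharply with n. The studies with n=10 that fail to reject are Type II errors: the effect is real, but the test misses it.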
-
Convenience sample
- A sample taken because its members are easy to reach rather than drawn by a probability mechanism, so it is prone to systematic bias. [6][13]
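The point that a bigger sample does not fix a biased method can be sketched with a small simulation. Everything here is a made-up illustration: the population of satisfaction scores, the rule that happier users are more likely to respond, and the helper names are all hypothetical.

```python
import random

rng = random.Random(1)

# Hypothetical population: satisfaction scores spread evenly between 0 and 10.
population = [rng.uniform(0, 10) for _ in range(200_000)]
true_mean = sum(population) / len(population)


def convenience_sample(n):
    """Self-selected respondents: the chance of responding grows with satisfaction."""
    picked = []
    while len(picked) < n:
        score = rng.choice(population)
        if rng.random() < score / 10:  # assumed response rule, for illustration only
            picked.append(score)
    return picked


def sample_mean(values):
    return sum(values) / len(values)


small_biased = sample_mean(convenience_sample(100))
large_biased = sample_mean(convenience_sample(50_000))
small_random = sample_mean(rng.sample(population, 100))  # simple random sample

print(f"True mean:                 {true_mean:.2f}")
print(f"Convenience, n=100:        {small_biased:.2f}")
print(f"Convenience, n=50,000:     {large_biased:.2f}")
print(f"Simple random, n=100:      {small_random:.2f}")
```

The large convenience sample is more precise but just as wrong: it settles on the biased value rather than on the true mean, while even a small random sample lands close to the truth. That is the "because" behind "size won't cure it".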
-
Volunteer sample
-
Cohort study
- Groups subjects by exposure, then follows them to observe the outcome. [5][9]
-
Case-control study
- Groups subjects by outcome, then looks back at exposure. [5][9]
-
Cross-sectional study
-
Qualitative research
- Better suited to answering "why" questions; it focuses on themes, meaning, and interpretation. [11][15] (Source: asksia-bible-mast20034-bilingual.pdf)
-
Convergence
-
VI. The traps that cost the most marks: memorise these
-
1. Naming the label without explaining the consequence
- Wrong:
- “This is convenience sampling.”
- Right:
- “This is convenience sampling, which means participants were recruited by ease of access. That matters because ...” [6][13] (Source: asksia-bible-mast20034-bilingual.pdf)
- Wrong:
-
2. Restating the definition and calling it a reason
- That is not a because and earns no marks: say what the bias does in this scenario, not just what it is called. [7][8] (Source: asksia-bible-mast20034-bilingual.pdf)
-
3. On a 2+2 question, giving one reason twice in different words
-
4. Starting to calculate
- That means you have misread the question: the answer is an interpretation, not a number. [3][8][13] (Source: asksia-bible-mast20034-bilingual.pdf) [16] (Source: asksia-cheatsheet-mast20034.pdf)
-
5. Claiming causation from an observational study
- Say only "is associated with"; do not casually say "causes". [8][11] (Source: asksia-bible-mast20034-bilingual.pdf) [16] (Source: asksia-cheatsheet-mast20034.pdf)
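Why the verb matters can be seen in a small simulation. The exam never asks for anything like this (nothing is computed there); the sketch below is only an illustration in Python under invented assumptions: a hypothetical confounder `z` drives both `exposure` and `outcome`, and the exposure has no effect on the outcome at all. Every name and number is made up for the example.

```python
import random
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 5000

# Assumption: z is a confounder linked to BOTH exposure and outcome.
z = [rng.gauss(0, 1) for _ in range(n)]
exposure = [zi + rng.gauss(0, 1) for zi in z]
outcome = [zi + rng.gauss(0, 1) for zi in z]  # exposure never enters here

# Association in the whole (observational) data set.
r_all = pearson(exposure, outcome)

# Hold the confounder roughly constant: keep only units with z near 0.
stratum = [(e, o) for zi, e, o in zip(z, exposure, outcome) if abs(zi) < 0.1]
r_stratum = pearson([e for e, _ in stratum], [o for _, o in stratum])

print(f"correlation, all data:        {r_all:.2f}")
print(f"correlation, z held near 0:   {r_stratum:.2f}")
```

The overall correlation is clearly positive even though the exposure does nothing, and it largely disappears once the confounder is held fixed. That is the reasoning behind writing "is associated with" rather than "causes" for observational data.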
-
6. Claiming that a large $P$ proves $H_0$
- Wrong. A large $P$ only means there is not enough evidence against $H_0$. [3][9][16]
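If it helps to see why a large $P$ is not proof of $H_0$, here is a minimal Python sketch. It assumes a one-sample z-test with a known standard deviation, and the 0.3 SD effect and the two sample sizes are invented for illustration. The same observed effect gives a large $P$ in a small study and a tiny $P$ in a large one, so a large $P$ may just reflect low power. You will not compute anything like this in the exam; it only backs up the "because".

```python
import math


def two_sided_p(effect_in_sd: float, n: int) -> float:
    """Two-sided P-value for a one-sample z-test.

    effect_in_sd: observed difference from the null value, in units of the
    (assumed known) standard deviation. n: sample size.
    """
    z = effect_in_sd * math.sqrt(n)
    # For a standard Normal Z, Pr(|Z| >= |z|) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


# The same observed effect (0.3 SD) in a small study and in a large study.
p_small_study = two_sided_p(0.3, n=10)
p_large_study = two_sided_p(0.3, n=1000)

print(f"n = 10:   P = {p_small_study:.3f}")   # well above 0.05
print(f"n = 1000: P = {p_large_study:.2e}")   # far below 0.05
```

In exam wording: the small study "fails to find evidence against $H_0$" because it lacks power, not because $H_0$ has been shown to be true.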
-
7. Treating "significant" as "important"
- Wrong. This trap is a particular favourite in big-data scenarios. [3][11][16]
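The "significant is not important" trap can be sketched the same way. The numbers below are assumptions made up for illustration: a true effect of 0.01 SD, one million records, and a hypothetical "smallest effect that matters" of 0.2 SD. With that much data the $P$-value is tiny, yet the whole confidence interval sits far below anything of practical importance.

```python
import math


def mean_diff_summary(effect_in_sd: float, n: int, z_crit: float = 1.96):
    """Return (P-value, CI low, CI high) for a one-sample z-test, in SD units."""
    se = 1 / math.sqrt(n)                      # standard error of the mean
    z = effect_in_sd / se
    p = math.erfc(abs(z) / math.sqrt(2))       # two-sided Normal tail area
    return p, effect_in_sd - z_crit * se, effect_in_sd + z_crit * se


# Hypothetical threshold for practical importance, in SD units.
SMALLEST_EFFECT_THAT_MATTERS = 0.2

p_value, ci_low, ci_high = mean_diff_summary(0.01, n=1_000_000)

statistically_significant = p_value < 0.05
practically_important = ci_low >= SMALLEST_EFFECT_THAT_MATTERS

print(f"P = {p_value:.1e}, 95% CI = ({ci_low:.4f}, {ci_high:.4f}) SD")
print("significant:", statistically_significant)
print("important:  ", practically_important)
```

This is why the answer should judge the effect size and the CI, not the $P$-value alone.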
-
8. Believing a large sample can fix bias
- Wrong. A large sample only shrinks sampling error; it cannot fix bias. [3][9][13][14]
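To make "a bigger sample will not fix bias" concrete, here is a small simulation sketch. Everything in it is an assumption for illustration: the population of satisfaction scores, the rule that only users scoring 6 or more ever respond, and the two sample sizes. The random sample's error shrinks as n grows, while the biased sample's error stays about the same, because the method keeps leaving out the same people.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical population: 100,000 satisfaction scores on a 0-10 scale.
population = [min(10.0, max(0.0, random.gauss(5.0, 2.0))) for _ in range(100_000)]
true_mean = statistics.fmean(population)

# Biased frame: only fairly satisfied users (score >= 6) ever respond.
responders = [score for score in population if score >= 6.0]

errors = {}  # n -> (error of the random sample, error of the biased sample)
for n in (100, 10_000):
    random_sample = random.sample(population, n)
    biased_sample = random.sample(responders, n)
    errors[n] = (
        abs(statistics.fmean(random_sample) - true_mean),
        abs(statistics.fmean(biased_sample) - true_mean),
    )
    print(f"n={n:>6}: random error={errors[n][0]:.3f}, "
          f"biased error={errors[n][1]:.3f}")
```

The exam sentence that goes with it: name who is missing (the dissatisfied users), state the direction of the bias (satisfaction is over-stated), and add that a larger sample repeats the same mistake at scale.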
-
9. Linking the confounder only to the outcome and not to the exposure
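A confounder has to be associated with both the exposure and the outcome. The sketch below uses an invented table (coffee as exposure, disease as outcome, smoking as confounder; all counts are made up) to show what that looks like. Smokers drink more coffee and have higher risk, so the crude comparison suggests coffee doubles the risk, while within each smoking group the risk ratio is 1.

```python
# (smoking status, exposure) -> (people, cases). All counts are invented.
counts = {
    ("smoker", "coffee"): (800, 160),
    ("smoker", "no coffee"): (200, 40),
    ("non-smoker", "coffee"): (200, 10),
    ("non-smoker", "no coffee"): (800, 40),
}


def risk(rows):
    """Overall risk (cases / people) across a list of (people, cases) rows."""
    people = sum(n for n, _ in rows)
    cases = sum(c for _, c in rows)
    return cases / people


def risk_ratio(stratum=None):
    """Risk ratio, coffee vs no coffee; stratum=None pools both smoking groups."""
    exposed = [v for (s, e), v in counts.items()
               if e == "coffee" and stratum in (None, s)]
    unexposed = [v for (s, e), v in counts.items()
                 if e == "no coffee" and stratum in (None, s)]
    return risk(exposed) / risk(unexposed)


crude_rr = risk_ratio()                      # ignores smoking
rr_smokers = risk_ratio("smoker")            # within smokers only
rr_non_smokers = risk_ratio("non-smoker")    # within non-smokers only

print(f"crude RR = {crude_rr:.3f}")
print(f"RR among smokers = {rr_smokers:.3f}")
print(f"RR among non-smokers = {rr_non_smokers:.3f}")
```

If smoking were linked only to the disease and not to coffee drinking, the two exposure groups would contain the same share of smokers and the crude comparison would not be distorted.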
-
10. Treating the Hill criteria as a tick-box list
-
VII. How to build your 4-side notes sheet
-
资料已经很明确建议你:不要做公式表。[1]Source: asksia-bible-mast20034-bilingual.pdfB 2 . REVISE 2 · REVISE 2 . REVISE You've done the week. Use the tables and the chapter-end recall checklists to self-test: can you list the four observational designs, name three sampling biases, give the five graphics principles, recite the Hill criteria? The checklists are written to be lifted almost verbatim onto your four-side notes sheet. 你已经上完本周。用各表格和章 末的recall checklists (回忆清 单)来自测:你能列出四种观察 性设计、点名三种 sampling bias、给出五条图表原则、背出 Hill 准则吗?这些清单写出来就 是为了几乎逐字誉到你的四面笔 记纸上。 C 3 . APPLY 3 . APPLY 3 . APPLY You're building your notes sheet or sitting the paper. Run the name-the-concept decoder (Ch 14) on every prompt: read the cue - name the design / bias / method -> write the because. With four sides of notes carried in and no calculator, your edge is reasoning discipline, not recall under pressure. 你正在做笔记纸,或正在考场 上。对每道题跑一遍name-the- concept decoder (点名概念解 码器)(第14章):读线索→点 design / bias / method -> 写下because。带着四面笔记、 不用计算器,你的优势是推理纪 律,而非压力下的回忆。 AskSia Library · MAST20034 · 双语 Bilingual ! Read this first: the assessment shape, and the bring-in rule 先读这个:评估的形态,以及可带入规则 MAST20034 is assessed by four pieces: 5 revision quizzes (5%), 4 short assignments (20%, each a tight 200- word critique with hard word penalties), a group project (15%, study design/critique + a Week 11 presentation), and the 60% final exam. The final is in-person, short-answer reasoning, three hours. You may bring in up to two A4 pages double-sided - four sides - of your own notes, and calculators are not permitted (there are no questions that need one). So your notes sheet should carry definitions, taxonomies, decision rules and checklists, never arithmetic. Always confirm the current weights, dates and exam conditions on your own LMS, as details shift between cohorts. 
MAST20034 由四个部分评估:5 次复习 quiz(5%)、4次短作业(20%,每次是一篇严格200词的批判,超字数有硬 扣分)、一个小组项目(15%,研究设计/批判+第11周展示),以及60% 的期末考。期末是线下、简答推理、三小时。 你可以带入最多两张 A4 双面纸 -- 四面 -- 自己的笔记,不允许用计算器(也没有需要计算器的题目)。所以你的笔记 纸应承载定义、分类法、决策规则与清单,绝不放算术。请始终在你自己的 LMS 上确认当前的权重、日期与考试条件, 因为细节会随届次变动。 i How this book was built - the two-layer rule 这本书是怎么搭出来的 -- 两层规则 The framework canon here is standard, widely-published statistical-literacy theory - the PPDAC investigation cycle (Wild & Pfannkuch), EDA (Tukey), the standard study-design and sampling taxonomies, validity & precision principles, NHST + confidence-interval logic, the WEIRD-bias critique, the Bradford Hill criteria, and data-feminism / data-ethics (D'Ignazio & Klein). These are non-copyrightable canon, stated plainly. The course's own case-study stems and tutorial examples are paraphrased and re-authored with our own scenarios - we never reproduce a case study's specific data. Book status quoted and honoured (four-side bring-in, no calculator, short-answer). Verify on your LMS. 这里的框架经典是标准的、被广泛发表的统计素养理论 -- PPDAC 探究循环(Wild & Pfannkuch)、EDA (Tukey)、 标准的研究设计与 sampling 分类法、validity & precision (效度与精密度)原则、NHST+置信区间逻辑、WEIRD- bias 批判、Bradford Hill 准则,以及data-feminism / data-ethics (数据女性主义/数据伦理)(D'Ignazio & Klein)。这些是不可受版权保护的经典,平实陈述。本课自身的案例研究题干与教程例子都被转述并以我们自己的情景重 写 -- 我们绝不复制任何案例研究的具体数据。书面状态如实引用并遵守(四面带入、不用计算器、简答)。请在你的 LMS 上核实。 AskSia Library · MAST20034 · 双语 Bilingual THE BLUEPRINT - THE EXAM BLUEPRINT 60% FINAL . EVERY MARK IS A 'BECAUSE' Where every mark lives 每一分都落在哪里 One 60% short-answer final - reasoning only, no calculator, four sides of your own notes 一场占 60% 的 short-answer 期末 -- 只考推理、不许用计算器、可带四页自备笔记 TL;DR. Sixty percent is a short-answer reasoning final - no calculator, no calculations, no software, with four sides of your own notes carried in. Its make-or-break skill is "name the concept, then justify the call": you are handed a graph, a study or a piece of statistical output and asked to critique it and say how to fix it. 
Master the taxonomies and decision rules in this book and you hold the keys to the whole paper.
60% FINAL EXAM (3 HR)
[3] Source: asksia-bible-mast20034-bilingual.pdf
- "Interpret this CI / P-value / output / forest plot" → Inference reading rules (Ch7, Ch8, Ch10): (1) State what it shows in context (CI excludes 0 → evidence of an effect; small P → strong evidence against H0). (2) Add the correct caveat (the interval is random, μ is fixed; large P does not prove H0). (3) Comment on strength / meaning: significant ≠ important.
SAMPLING & TRUST
- "Is this sample OK?"; "what's wrong with how they recruited?" → Sampling bias + WEIRD (Ch1, Ch9): (1) Ask who is missing: frame / selection / non-response / volunteer gap. (2) Name the method and its bias (convenience → people similar to each other). (3) State that a bigger sample will NOT fix bias, it repeats the mistake at scale; consider WEIRD over-sampling.
- "Too good to be true"; "a surprising significant result"; "just barely p<0.05" → Reproducibility + p-hacking (Ch9-10): (1) Publication bias: novel/significant results are over-published, inflating effects. (2) Watch for p-hacking / HARKing (one-sided chosen after the data, multiple looks). (3) Ask for replication, a CI / effect size, and pre-registration before trusting it.
- "Big data / an AI claim"; "with millions of records ..." → Ethics + validity at scale (Ch1, Ch11): (1) Huge n → everything is significant → judge effect size & practical importance, not P. (2) Apply the context questions (who/why/what/how) + data ethics (consent, fairness, stewardship). (3) Be sceptical of AI: check provenance and the missing data.
✓ How to use the table under pressure
Underline the verb and the noun in the stem first ("choose a design", "critique this graph", "interpret the output"). That two-word cue picks the row; the right column is your paragraph. Then convert each of the three things into a sentence that ends in a because. You are never asked to compute; resist the urge.
EXAM MORNING · THE DECODER · BUILDING THE NOTES YOU CARRY IN
Your 4 sides, the 'because' rule, and the 3 hours
TL;DR. You may bring four sides of your own notes and there is no calculator, so do not waste space on formulas. Fill the four sides with decision trees, checklists, and crisp definitions: the machinery that turns a cue into a justified answer. This page lays out what to put on each side, the one rule that wins short-answer marks, a timing plan for the three hours, and the closing concepts-to-recall list.
12.2 The 4-side notes plan
The exam is reasoning, not recall of numbers, so your sheet is a reasoning toolkit. A good layout maps one side to each job of the decoder above. Trees and checklists earn marks; a wall of formulae does not (there is nothing to calculate).
[7] Source: asksia-bible-mast20034-bilingual.pdf
- Side 1, design & cause. What goes on it: study-design tree (intervene? → experiment/observational → cohort/cross-sec/case-control/ecological); confounding triangle; the Hill 9 (star temporality + gradient); exposure/outcome synonyms. Why it pays off: covers the two biggest row-families, "choose a design" and "is it causal?", with ready-made becauses.
- Side 2, graphs & reporting. What goes on it: the 5 graphics principles as a critique checklist; the graph-chooser (by variable mix); the data-description checklist (centre/spread/trend/outliers, concise + complete); inference-report checklist (CI + level, stat + P, n). Why it pays off: turns the graph-critique and "critique this description" species into fill-in-the-blank answers.
- Side 3, inference rules. What goes on it: correct CI + P-value wordings (and the wrong ones to avoid); the Type I/II + power 2×2; the NHST 5 steps; the assumption hierarchy; the diagnostic-plot readings (funnel → non-constant variance). Why it pays off: the "interpret this output" species is pure recall of the right sentence, so have it verbatim.
- Side 4, sampling, qual & ethics. What goes on it: sampling taxonomy (4 random + 4 non-random) with each method's bias; "size won't fix bias"; WEIRD + reproducibility; qual methods (why vs what, coding, convergence); data ethics + justice; the context questions. Why it pays off: mops up the trust / sampling / qualitative / big-data rows and the W1 + W12 context frame.
! Do NOT build a formula sheet
There is no calculator and no calculation on this exam. A side crammed with CLT algebra, t-formulae or regression normal equations is wasted; you will never plug numbers in. The only notation worth a line is the definition of a P-value or a CI in words. Every other millimetre should be a tree, a checklist, or a one-line definition.
12.3 The short-answer 'because' rule
Definition. A complete short answer = a named concept + a reason that connects it to the scenario. Marks are awarded per correct, sufficiently-detailed reason, not per fact stated. A reason that merely restates the definition is not a because and scores nothing.
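The sampling row above says a bigger sample will not fix bias. As a study aid only (the exam itself involves no computation), here is a minimal simulation sketch of that claim. The population, the recruitment rule and every number in it are hypothetical, chosen purely for illustration.

```python
import random
import statistics

def biased_sample_mean(n, rng):
    """Mean weekly study hours from a convenience sample of size n.

    Hypothetical population: 30% 'library regulars' (mean 20 h) and
    70% 'others' (mean 10 h), so the true mean is 0.3*20 + 0.7*10 = 13.
    The sample is recruited inside the library, so about 90% of the
    people it reaches are regulars, whatever n is.
    """
    values = []
    for _ in range(n):
        is_regular = rng.random() < 0.9  # recruitment over-samples regulars
        centre = 20 if is_regular else 10
        values.append(rng.gauss(centre, 3))
    return statistics.mean(values)

TRUE_MEAN = 13.0
rng = random.Random(0)
for n in (50, 500, 50_000):
    estimate = biased_sample_mean(n, rng)
    print(f"n={n:>6}: estimate={estimate:5.2f}, error={estimate - TRUE_MEAN:+.2f}")
```

Growing n makes the estimate more stable, but it settles around 19 hours (0.9 × 20 + 0.1 × 10) rather than the true 13: the sample repeats the same recruitment mistake at scale, which is the "because" the decoder asks for.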
-
What should go on it:
- decision trees
- checklists
- one-line definitions
- common wrong sentences vs. the correct wording
- question-type templates [3]
-
The material suggests dividing the four sides roughly as follows: [3]
- Side 1: design & causation
  - study-design tree
  - confounding triangle
  - Hill criteria
- Side 2: graphs & reporting
  - 5 graphics principles
  - graph chooser
  - description / reporting checklist
- Side 3: inference rules
  - the correct wording for a CI
  - the correct wording for a P-value
  - Type I / II / power
  - assumptions and diagnostic plots
- Side 4: sampling, qual & ethics
  - sampling taxonomy
  - WEIRD
  - reproducibility
  - qualitative methods
  - data ethics
  - context questions
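Side 3 is where the correct CI wording lives: the interval is random and μ is fixed. A short simulation sketch (for intuition while revising, not something the exam asks you to do) can make that wording concrete. The values of μ, σ and n are hypothetical, and the critical value 2.045 is the 97.5% point of the t distribution with 29 degrees of freedom, so it only suits n = 30.

```python
import random
import statistics

def ci_covers(mu, sigma, n, rng, t_crit=2.045):
    """Draw one sample and report whether its 95% t-interval covers mu.

    t_crit=2.045 assumes n=30 (29 degrees of freedom).
    """
    sample = [rng.gauss(mu, sigma) for _ in range(n)]
    centre = statistics.mean(sample)
    half_width = t_crit * statistics.stdev(sample) / n ** 0.5
    return centre - half_width <= mu <= centre + half_width

def coverage(reps=2000, mu=50.0, sigma=10.0, n=30, seed=0):
    """Share of repeated samples whose interval captures the fixed mu."""
    rng = random.Random(seed)
    hits = sum(ci_covers(mu, sigma, n, rng) for _ in range(reps))
    return hits / reps

print(f"share of intervals covering mu: {coverage():.3f}")
```

μ never moves inside the loop; only the interval changes from sample to sample. That is why "95%" describes the long-run behaviour of the procedure, not the probability that μ lies in any one computed interval.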
-
8. Final sprint before the exam: the most effective revision order
-
Step 1: memorise the "answering moves" first
- Practise on every question:
  - name the concept
  - next sentence: define it
  - then one sentence applying it to the scenario
  - then one sentence giving the because (the consequence)
[4] Source: asksia-bible-mast20034-bilingual.pdf
[Figure: the investigation cycle. Recoverable labels: 3 Data (collect / clean / store); 4 Analysis (explore + model + test); iterative: conclusions raise new problems → cycle repeats.]
The course's engine in one picture: Problem → Plan → Data → Analysis → Conclusion, the investigation cycle that frames almost every critique prompt. Most exam answers are really a question about one node: was the Plan a sound design? Were the Data well sampled? Is the Conclusion licensed by the design? Learn to walk it from memory.
Examinable scope = the 12-week reasoning spine: objectivity & data · good graphics · study design · observational studies & confounding · reporting & critiquing claims · qualitative methods · frameworks for inference · analysis & modelling · sampling & WEIRD bias · accumulating research (meta-analysis & Hill) · big data · context. Research prompts touch only the whole-class case studies and never demand recalled details.
What the exam is really testing (the cue you get → the move it rewards)
- A graph / figure → Critique it: name two good features + one specific fix (the graphics principles)
- A described study → Name the design, then say what conclusion is legal (causation vs association)
- An association → Find the confounder / lurking variable and explain how it could fake the link
- Statistical output / a CI / P → Interpret it in context, without the classic misreads
- A sampling scenario → Name the method & the bias (incl. WEIRD) and why size won't cure it
✓ The one habit that wins this exam
For every prompt, name the concept first, then write the because. "Two variables move together" → confounding / correlation ≠ causation; "who got picked" → a sampling / selection bias; "is this graph any good" → the five graphics principles; "does X cause Y across studies" → Bradford Hill; "what does this P-value mean" → the interpretation rules. The decoder in Ch 14 lists every cue.
★ The single highest-value habit
You may write in dot-points or sentences, and there are no marks for grammar or spelling, so spend every word on the reasoning. Practise answering in the shape the markers reward: (1) name the concept, (2) define it in a line, (3) apply it to the scenario, (4) state the consequence or fix. Four sentences, full marks. "Explaining your reasoning and choices is typically more important than any answer."
[8] Source: asksia-bible-mast20034-bilingual.pdf
3. Reason in the context given. Tie the concept to this scenario, not a generic textbook one.
4. Close each point with a because. State the consequence (the bias it induces, the assumption it breaks, the conclusion it licenses); this is the mark-bearing clause.
5. Count your reasons against the marks. If it says [4: 2+2], deliver a definition and two separate consequence-level reasons.
✓ The universal sentence frame
"This is [named concept], which means [definition]. Here it matters because [consequence #1], and also because [consequence #2]." Drop any scenario into that frame and you have structured an answer that the rubric can find marks in.
! The four ways students throw away marks
- Restating, not reasoning: "it is biased because it is not random" only renames the definition; say what the bias does.
- One reason wearing two coats: a "2+2" item needs two different reasons, not one reason phrased twice.
- Trying to compute: there is no calculator and no calc question; if you start arithmetic you have misread the task.
- Over-claiming causation: for observational data the only legal verb is "is associated with", never "causes".
REVISION · SHORT-ANSWER BANK · STUDY-PRODUCTION SPECIES
Drills 1-6: design, sampling, confounding & graphs
TL;DR. These six rehearse the "how the data were produced" family: choose & justify a design, pick a sampling method, name the confounder, identify exposure/outcome, and critique a graph (two good features + one specific fix). Reason the choice against its alternative; that contrast is where the marks live.
Q1
[11] Source: asksia-bible-mast20034-bilingual.pdf
(b) Big-data critique (because): a huge n cannot fix bias; the data cover only the people already using the app (selection bias). At massive n everything looks "significant", so effect size and provenance matter more than P. Add a data-ethics flag: the privacy / consent of the logged users.
What earns the marks. A justified qual choice (the "why" logic + convergence) + a big-data critique naming that size ≠ representativeness, with an ethics flag.
Trap. Dismissing qualitative as "unscientific"; equating large n with representative; forgetting consent/provenance for found data.
★ Recall checklist: the decision rules for the bank
- Every answer: name → define → reason in context → because (consequence). Count your reasons against the marks.
- Design/sampling: justify the choice against its alternative; non-probability methods are biased, and size won't cure it.
- Confounder: must link to both exposure and outcome; observational → "associated with", never "causes".
- Graph critique: two good features, each tied to a principle; one issue + a fix that matches it.
- CI: the interval is random, the parameter is fixed. P-value: Pr(data | H0), not Pr(H0 | data); significant ≠ important.
- Type I/II: false + / false −; power = 1 − β; small n → low power; rare condition → base-rate false positives.
- Forest plot: diamond against the null line + heterogeneity + a publication-bias caveat. Hill: weight of evidence, temporality first.
- Diagnostic plots: funnel shape → non-constant variance; curved QQ → non-normal → transform / use a method with fewer assumptions.
- Qual vs quant: why vs what; convergence is the qual stopping rule. Big data: size ≠ unbiased; effect size & ethics over P.
[16] Source: asksia-cheatsheet-mast20034.pdf
... qualitative · convergence
17. The "Because" Rule: how to bank marks
Every answer = a chain: NAME the concept/framework → APPLY it to the context → BECAUSE ... (the reason wins the mark)
- Marks are per correct, sufficiently-detailed reason; one detailed reason often = full marks; restating the definition earns nothing
- Dot points are fine · no grammar/spelling marks · 3 hours
SIA: No question needs a calculator, and none asks you to recall case-study details. If you're computing, you've misread it; they want the reasoning. Spend the words on the because.
18. Top Traps to Avoid: the marks lost most
- "P = Pr(H0 true)" · "large P proves H0" · choosing one-sided after the data
- Calling a natural experiment an experiment · "association proves cause"
- Cohort ↔ case-control mix-up · stratified ↔ quota · cluster ↔ stratified
- Hill as a tick-box · ignoring temporality
- "Significant" = "important" · trusting a result because n is huge
The core of your 4-side bring-in notes · confirm on the MAST20034 exam-info page · name it. apply it. because ... · asksia.ai/cheatsheet/unimelb-mast20034 · side 2/2 · AskSia Cheat Sheet Series · restricted bring-in · no calculator
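The recall checklist's "rare condition → base-rate false positives" rule is easier to turn into a because once you have seen the arithmetic behind it. The sketch below is only a study aid (the exam needs no calculation). It applies Bayes' rule, and the prevalence, sensitivity and specificity values are made up for illustration.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Pr(condition present | positive test), by Bayes' rule."""
    true_pos = prevalence * sensitivity               # has it, and tests positive
    false_pos = (1 - prevalence) * (1 - specificity)  # lacks it, but tests positive
    return true_pos / (true_pos + false_pos)

for prevalence in (0.5, 0.1, 0.01):
    ppv = positive_predictive_value(prevalence, sensitivity=0.95, specificity=0.95)
    print(f"prevalence={prevalence:>4}: share of positives that are real = {ppv:.2f}")
```

With the same test, the rarer the condition, the larger the share of positives that are false. In an answer, the because is that false positives drawn from the very large unaffected group outnumber true positives drawn from the small affected group.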
-
Step 2: memorise the highest-frequency concepts
- PPDAC
- design taxonomy
- confounding
- Hill
- sampling bias
- WEIRD
- graphics principles
- CI / P-value
- Type I / II / power
- qualitative / convergence
- ethics / big data [1]
[9] Source: asksia-bible-mast20034-bilingual.pdf
The PPDAC cycle: the spine of the whole unit, and a one-glance map of how an exam scenario hangs together. Every question lives somewhere on Problem → Plan → Data → Analysis → Conclusion. Locating the stage tells you which concept the marker wants.
★ Concepts to recall: the whole-book checklist
- Context first (Ch1): data are value-laden; ask who/why/what/how/when; critique ≠ criticism (always offer a constructive fix).
- Graphics (Ch2): the 5 principles; match graph to variable types; two good features + one specific improvement.
- Design (Ch3): validity = randomise/compare/control (kills bias); precision = replicate/stratify/balance (kills variability); they are independent axes.
- Observational (Ch4): cohort = group-by-exposure, case-control = group-by-outcome; confounder links to both; correlation ≠ causation.
- Reporting (Ch5): centre/spread/trend/outliers; report the CI + level, and the statistic + P, not P alone.
- Qualitative (Ch6): "why" not "what"; bottom-up vs top-down coding; convergence as the stopping rule.
- Inference (Ch7): the interval is random, μ is fixed; P = Pr(data or more extreme | H0); a large P does not prove H0; Type I/II and power.
- Modelling (Ch8): signal + noise; "all models wrong, some useful"; parsimony; read residual/QQ plots: interpret, never fit.
- Sampling (Ch9): frame vs sample; a big sample won't fix bias; 4 random + 4 non-random methods; WEIRD; reproducibility crisis.
- Accumulating research (Ch10): forest plot (null line + diamond); Hill criteria (temporality + gradient); publication bias.
- Big data (Ch11): significance ≠ importance at scale; provenance, ethics, scepticism toward AI findings.
And always, the iron rule: name the concept, then give the because. Good luck.
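The big-data item, significance ≠ importance at scale, can be seen directly in a small sketch (again a revision aid, not exam content). It assumes a two-sample z-test with known standard deviation 1 and equal group sizes, and a hypothetical effect of 0.01 standard deviations. None of these choices come from the course material.

```python
import math

def two_sided_p(diff_in_sd, n_per_group):
    """Two-sided P-value of a z-test for a difference between two group means.

    diff_in_sd is the observed difference in standard-deviation units;
    both groups are assumed to have n_per_group observations and SD 1.
    """
    z = diff_in_sd / math.sqrt(2 / n_per_group)
    return math.erfc(abs(z) / math.sqrt(2))  # equals 2 * (1 - Phi(|z|))

tiny_effect = 0.01  # one hundredth of a standard deviation
for n in (100, 10_000, 1_000_000):
    print(f"n per group={n:>9}: P = {two_sided_p(tiny_effect, n):.3g}")
```

The effect is identical in every row and only n changes, yet P goes from unremarkable to vanishingly small. That is the because behind "judge effect size and practical importance, not P" when a prompt mentions millions of records.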
-
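Step 3: drill by question type
- Design questions: why this design and not another
- Graph questions: 2 strengths + 1 problem + 1 fix
- Confounder questions: the variable links to both ends (exposure and outcome)
- Sampling questions: who is missing + who is over-represented + why a big sample does not help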
- Inference questions: interpret the CI / P-value correctly, and head off the classic misreads
[5] Source: asksia-bible-mast20034-bilingual.pdf
FINAL · 60% · SHORT-ANSWER REASONING
The exam-morning decoder
If the question says X, reach for Y, and say these three things
TL;DR. The final hands you a scenario, a graph, or a piece of statistical output and asks you to name the concept and justify it. There is no calculator and no arithmetic; every mark is a because. This page is the lookup table: read the cue words in the stem, reach for the matching concept, then deliver the three reasons that bank the marks. Memorise the column on the right; that is the answer.
★ What the exam asks here
The 60% final is 3 hours, short-answer only (no MCQ, no essay), no calculator, and you bring in 4 sides of your own notes. The marking criteria are explicit: "explaining your reasoning and choices is typically more important than any answer." Dot-points are fine; no marks for grammar/spelling. So this whole chapter trains the one move the exam pays for: name the concept → give the because.
12.1 The cue → concept → because table
Each row is a question species you have already met in this book. The left column is what the stem sounds like; the middle is the framework to invoke (with the chapter); the right is the 3-part skeleton. Say all three and you have earned the reasoning marks.
DESIGN & CAUSATION
- "Choose / justify a study design"; "how would you investigate ..." → Study-design tree + validity (Ch3-4): (1) Can you intervene? → experiment (RCT) vs observational. (2) Pick the type with a because: rare outcome → case-control, many outcomes → cohort, snapshot → cross-sectional, populations → ecological. (3) Name the design tools that protect validity (randomise/compare/control).
- "Is it causal?"; "does X cause Y?"; "can we conclude ..." → Confounding + Bradford Hill (Ch4, Ch10): (1) Correlation ≠ causation: observational data give association only. (2) Name a plausible confounder (linked to both exposure and outcome). (3) Argue Hill, esp. temporality (cause first) + dose-response gradient; an RCT would strengthen it by removing confounders.
GRAPHS & OUTPUT
- "Critique this graph"; "two good features & one improvement" → 5 graphics principles (Ch2): (1) Two good features, each tied to a principle (standard form / common scale / clear encoding / shows data / simple). (2) One real issue (no title, abbreviated labels, panels on different scales). (3) A specific fix that addresses that issue; vague fixes score zero.
[6] Source: asksia-bible-mast20034-bilingual.pdf
[Glossary table fragment, partly lost in extraction:] ... every unit (and every set of n) equally likely: the baseline. Cluster sampling: randomly pick groups, survey all within. ... stratum; guarantees subgroup coverage. ... prone.
✓ How to spend a glossary term in the exam
Never just name it. Define → apply → because. E.g. "This is convenience sampling (define); here it over-represents metro students (apply); so a bigger sample won't fix the bias (because)." That three-move sentence is what the rubric pays for.
REVISION · SHORT-ANSWER BANK · ALL CHAPTERS · EXAM REHEARSAL
Practice bank: every mark is a because
Twelve short-answer species, each reasoned out the way the marker wants
TL;DR. The final is short-answer reasoning only: you name a concept, then reason it out, then say because ... The rubric is explicit: "explaining your reasoning and choices is typically more important than any answer." So marks are awarded per correct, sufficiently-detailed reason, not per fact recalled. Each card below gives you the skeleton (define → reason → because), the marking note (what actually scores), and the trap that zeroes a vague answer.
★ What the exam asks here: the format you are rehearsing
The 60% final is short-answer, no calculator, no calculations, no software operation. You may bring 4 sides of your own notes. Two question species recur: (1) "explain a concept / apply critical thinking to a context" and (2) "use critical thinking on a whole-class example" (anchored to a shared case, but you are never asked to recall its data). They may hand you statistical output or a graph to interpret; you read and explain it, you never compute it. Every card here is one rep of that move.
P.1 How to read each card: the marking model
Markers do not reward the verb "explain"; they reward the linkage. A "4-mark, 2+2" item almost always means 2 marks for a precise definition and 2 marks for two distinct, consequence-level reasons. A reason that merely restates the definition earns nothing. The skeleton below is the spine of every answer.
1. Name the concept / framework. Markers reward defined terms: say convenience sampling, confounder, Type I error by name before you reason.
2. Define it precisely. One sentence that would let a stranger identify it; vagueness ("just picking people") loses the definition marks.
[12] Source: asksia-bible-mast20034-bilingual.pdf
Trap. Offering a variable that affects only the outcome (that's not a confounder; a confounder must link to both); claiming causation.
Q5 GRAPH CRITIQUE: PRAISE [2: 1 mark each]
Prompt (paraphrased). Given a multi-panel scatterplot, identify two good features in terms of communicating information.
Model reasoning: the because skeleton. Name two concrete features, each tied to a graphics principle: (1) a scatterplot is the standard form for two numerical variables; (2) colour and symbol double-encode the groups (accessible / redundant coding); or panels separate groups instead of overplotting; or a clear title + source gives context.
What earns the marks. Two specific features, each named to a principle: 1 mark each. Specificity is everything.
Trap. Generic praise ("nice colours", "easy to read") with no principle; praising a feature that isn't actually in the chart.
Q6 GRAPH CRITIQUE: FIX [3: 1 issue + 2 specific fix]
Prompt (paraphrased). Identify one feature that could be improved and suggest a specific improvement.
Model reasoning: the because skeleton. (1) Issue: name one real fault, e.g.
panels are on different y-axis scales, so the eye mis-reads the comparison. (2) Fix (because): "set every panel to a common axis range so the heights are directly comparable" - the fix must address the named issue. (Other valid pairs: no title - add a title stating data + context + source; cryptic labels - spell out the full variable names. ) What earns the marks. 1 mark to name a real issue + 2 marks for a fix that directly resolves that exact issue. The fix must match the fault. 什么能得分。1分给点名一个真实问题+2 分给一个直接解决那个确切问题的修正。修正必须对得上故障。 Trap. a fix that doesn't address the issue you named (rubric zeroes it); naming an issue but giving no concrete fix; "fixing" something that was fine. 陷阱。一个不针对你所点名问题的修正(评分会归零);只点名问题却不给具体修正;“修”了本来没问题的东西。 AskSia Library . MAST20034 . XXia Bilingual REVISION . SHORT - ANSWER BANK REVISION . SHORT - ANSWER BANK INTERPRET -AND - CRITIQUE SPECIES Drills 7-12: inference, errors, causation & ethics 演练 7-12: inference、误差、causation 与伦理[13]Source: asksia-bible-mast20034-bilingual.pdfEX 12. 1 Turning a fact into a because (worked short-answer) name > consequence + because Stem (AskSia-invented): "A wellbeing app is evaluated by surveying users who clicked an in-app pop-up. Comment on the sample. " 题干(AskSia 自拟):“某福祉 app通过调查那些点击了应用内弹窗的用户来评估。评论这个样本。” Weak (no marks): "It is a convenience sample. " - a label, no reasoning. 弱(无分):“这是一个 convenience sample。” -- 只是标签,没有推理。 Strong (banks the marks): "This is a convenience / volunteer sample, because only users already engaged enough to click respond - so it suffers self-selection bias and likely over-states satisfaction (because dissatisfied users have churned and are missing). A larger pop-up sample would not fix this, because it repeats the same biased method at scale. 
" 强(存下分数):“这是一个 convenience/ volunteer(便利/自愿)样本,因为只有已经足够投入到会去点击的用户才 会作答 -- 所以它存在 self-selection bias(自我选择偏倚),很可能高估满意度(因为不满意的用户已经流失,处于 缺失(missing)状态)。更大的弹窗样本并不能解决这个问题,因为它只是把同一种有偏的方法放大重复。” - Read-out: three clauses, three becauses: name the method - name the consequence in context - pre- empt the 'bigger sample' trap. (Scenario AskSia-invented; no figures to compute. ) 读出结构:三个分句,三个because:点名方法→在情境中点名后果→预先化解“样本更大”的陷阱。(情景由 AskSia 自拟;没有要计算的数字。) - 12. 4 The 3-hour timing plan 12. 43 小时计时计划 Three hours for short-answer reasoning is generous - the risk is over-writing early questions, not running out of ideas. Budget by marks, leave a critique-polish pass at the end. 三小时做简答推理是宽裕的 -- 风险在于早段题目写过头,而非想不出点子。按分值分配时间,末尾留一遍批判-润色的检 查。 AskSia Library · MAST20034 · 双语 Bilingual 1 First 10 min - survey & map. Read every question; pencil the decoder row next to each (design? critique? interpret? sample?). Spot the high-mark items. 头10分钟 -- 通览与定位。把每道题读一遍;在每题旁用铅笔标出解码器的行(设计?批判?解读?抽样?)。挑出高分 题。 2 Bulk - answer by mark weight. Roughly a minute per mark; a 4-mark design question wants two detailed becauses, a 2-mark "two good features" wants exactly two. Do not pad. 主体 -- 按分值作答。大约每分一分钟;一道 4分的设计题想要两个详尽的 because,一道 2分的“两个好特征”就恰好两 个。别注水。 3 Discipline - never compute. If you feel an arithmetic urge, you have misread - the answer is an interpretation, not a number. 纪律 -- 绝不计算。若你感到一股算术冲动,那你读错题了 -- 答案是一个解读,而非一个数字。 4 Last 20 min - the because audit. Re-read each answer and check every claim ends in a reason tied to the scenario; add the missing because, the missing caveat (significance # importance), the missing specific fix. 最后 20 分钟 -- because 审计。重读每个答案,检查每一条主张是否都以一个系到情景的理由收尾;补上缺失的 because、缺失的注意点(显著≠重要)、缺失的具体修复。 FIG 12. 
1 1 Problem define the question 5 Conclusion answer in context 2 Plan design how to get data PPDAC cycle 4 Analysis explore + model + test 3 Data collect / clean / store iterative: conclusions raise new problems -> cycle repeats[15]Source: asksia-bible-mast20034-bilingual.pdf混合方法=定性+定量;定性解释或为定量播种。永远点名其伦理/实务成本(时间、编码、匿名性)。 · Trap: never call qualitative "unscientific"; never "just take a bigger survey" for a why question; never swap bottom-up + top-down. I 陷阱:绝不把定性称为“不科学”;对一个为何的问题绝不“就去做更大的调查”;绝不把自下而上←自上而下对调。 ● Quant=“什么/有多少”;qual=“为什么” -- 按问题的性质来选,而非按哪个“更好”。 ● 四种来源: interviews (深度) · focus groups (互动)· observation (做≠说)· documents/artefacts (已经存在) -- 各有其风险。 ● 编码:bottom-up =归纳,codes 从数据中涌现;top-down =演绎,codes 来自先验理论。Themes =归组后的 codes (thematic analysis) . · Convergence (收敛)=定性的停止规则(类比于 power/样本量):当新数据不再增添新主题时停止。 · Rigour (严谨): credibility (~ 内部效度)、transferability (~外部效度)、transparency、有目的的采集。 ● Mixed methods (混合方法) = qual + quant; qual 解释或孕育 quant。始终点名伦理/实务代价(时间、编码、 匿名)。 ● 陷阱:绝不把定性叫“不科学”;对一个为什么问题绝不“干脆做更大的调查”;绝不把 bottom-up ←> top-down 互 换。 AskSia Library · MAST20034 · 双语 Bilingual WEEK 7 . FRAMEWORKS FOR INFERENCE - WEEK 7 . FRAMEWORKS FOR INFERENCE CH 7 . ESTIMATION & SAMPLING DISTRIBUTIONS From sample to population: estimation & the CLT 从样本到总体:估计与 CLT Why one sample can speak for a whole population - and how confidently 为何一个样本能为整个总体发声 -- 以及有多大把握 TL;DR. Inference runs the arrow backwards: probability reasons population - sample, inference reasons sample - population. A point estimate is one number; a confidence interval is honest because it carries "how close". The whole machine rests on the sampling distribution - what the estimate would do over many samples - which the Central Limit Theorem makes Normal. Everything on these three pages is about reading and explaining this, never computing it. TL;DR. 
推断把箭头倒过来跑:概率从总体→样本推理,推断从样本→ 总体推理。一个 point estimate(点估计)是一个数 字;一个 confidence interval(置信区间)之所以诚实,是因为它带着“有多接近”。整套机器都立在 sampling distribution (抽样分布)之上 -- 即这个估计在许多样本上会有的表现 -- 而 Central Limit Theorem (CLT,中心极限定理)让它呈正 态。这三页里的一切都关于读懂并解释它,从不计算它。 ★ What the exam asks here 考试在这里问什么 The 60% final is short-answer reasoning only - no calculator, no calculations, no multiple choice. You bring in four sides of your own notes. For inference you will be handed a CI or a P-value to interpret and asked to say what it does (and does not) mean, or to name an error / explain power in a scenario. The marking is explicit: "explaining your reasoning and choices is typically more important than any answer. " Every mark is a because - so carry the definitions, the CI/P-value interpretation rules, and the Type I/II decoder, not arithmetic. 60% 期末只考简答推理 -- 无计算器、无计算、无多选。你带入四面自己的笔记。对于推断,你会被递给一个要解读的 CI 或 P-value,要你说它意味着什么(以及不意味着什么),或在某情景中点名一种 error/解释 power。评分明确: “阐释你的推理与选择,通常比任何答案本身更重要。”每一分都是一个because -- 所以带上定义、CI/P-value 解读规 则,以及 Type l/II 解码器,而非算术。 7. 1 Estimation: point estimate vs confidence interval 7. 1fait: point estimate vs confidence interval Definitions. A point estimate is a single number computed from the sample that stands in for an unknown population parameter (the sample mean x estimates u; the sample proportion p estimates p; s estimates o). A confidence interval (CI) is a range - the estimate plus a margin that encodes how close it is likely to be: estimate ± (distribution multiplier) x (variability). The width comes from sampling variability (and so shrinks as n grows); the multiplier comes from the confidence level you choose. - 定义。一个 point estimate 是从样本算出的单个数字,用来替代一个未知的总体 parameter(参数)(样本均值 x估计 μ; 样本比例 p^估计 p; s 估计 o)。一个 confidence interval (CI,置信区间)是一个范围 -- 估计值加上一个编码了它有多大 可能接近的余量:估计值+(分布乘数)× (变异性)。宽度来自抽样变异性(所以随n增大而收缩);乘数来自你选择的置 信水平。 Quantity Symbol Lives in
-
Step 4: Right before the exam, review only the "wrong-sentence alarms"
- “association proves cause”
- “large P proves $H_0$”
- “significant = important”
- “huge n = representative”
- “Ticking off the Hill criteria proves causation”
- “Qualitative research is unscientific” [11][15][16]
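The exam never asks you to compute anything, but a quick simulation can make the alarm “large P proves $H_0$” easier to remember. The sketch below is illustrative only: the effect size (0.3 standard deviations), the sample sizes, the 0.05 cut-off, and the helper names `two_sided_p_from_z` and `share_of_large_p` are all assumptions made for this example, and the test uses a Normal approximation rather than any method prescribed by the subject.

```python
import math
import random
import statistics


def two_sided_p_from_z(z):
    """Two-sided P-value for a z statistic (Normal approximation)."""
    return math.erfc(abs(z) / math.sqrt(2))


def share_of_large_p(true_mean, n, trials=2000, seed=1):
    """Fraction of simulated studies with P > 0.05 when testing H0: mean = 0."""
    rng = random.Random(seed)
    large = 0
    for _ in range(trials):
        # Each simulated study draws n values from a population whose
        # true mean is NOT zero, so H0 is false by construction.
        sample = [rng.gauss(true_mean, 1.0) for _ in range(n)]
        standard_error = statistics.stdev(sample) / math.sqrt(n)
        z = statistics.fmean(sample) / standard_error
        if two_sided_p_from_z(z) > 0.05:
            large += 1
    return large / trials


# Same real effect, two study sizes (both hypothetical).
small_study = share_of_large_p(true_mean=0.3, n=10)
large_study = share_of_large_p(true_mean=0.3, n=200)
print(f"share of studies with P > 0.05: n=10 -> {small_study:.2f}, n=200 -> {large_study:.2f}")
```

Because $H_0$ is false in every simulated study, each large P here is a Type II error. Small studies produce them often, which is the "because" behind "a large P is weak evidence, not proof of $H_0$; low power is an alternative explanation".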
-
Part 9: If you are short on time, lock in just these 12 core points
-
1. PPDAC
-
2. experiment vs observational
-
3. cohort / case-control / cross-sectional
-
4. validity = randomise / compare / control
-
5. precision = replicate / stratify / balance
-
6. correlation $\ne$ causation
-
7. A confounder must be linked to both the exposure and the outcome
-
8. Hill: focus on temporality + dose-response
-
9. bigger sample won’t fix bias
-
10. Graph critique: 2 good features + 1 specific fix
-
11. Correct interpretation of P-values and CIs
-
12. significant $\ne$ important;big data 也会有 bias[3]Source: asksia-bible-mast20034-bilingual.pdf"Interpret this CI / P- value / output / forest plot" Inference reading rules . Ch7, Ch8, Ch10 (1) State what it shows in context (CI excludes O - evidence of an effect; small P - strong evidence against H. ). (2) Add the correct caveat (the interval is random, u is fixed; large P does not prove H. ). (3) Comment on strength / meaning - significant # important. SAMPLING & TRUST "Is this sample OK?"; "what's wrong with how they recruited?" Sampling bias + WEIRD · Ch1, Ch9 (1) Ask who is missing - frame / selection / non-response / volunteer gap. (2) Name the method and its bias (convenience - people similar to each other). (3) State that a bigger sample will NOT fix bias - it repeats the mistake at scale; consider WEIRD over-sampling. "Too good to be true"; "a surprising significant result"; "just barely p‹0. 05" Reproducibility + p-hacking · Ch9-10 (1) Publication bias - novel/significant results over-published, inflating effects. (2) Watch for p-hacking / HARKing (one-sided chosen after the data, multiple looks). (3) Ask for replication, a CI / effect size, and pre- registration before trusting it. "Big data / an Al claim"; "with millions of records . . . " Ethics + validity at scale · Ch1, Ch11 (1) Huge n - everything is significant - judge effect size & practical importance, not P. (2) Apply the context questions (who/why/what/how) + data ethics (consent, fairness, stewardship). (3) Be sceptical of Al - check provenance and the missing data. ✓ How to use the table under pressure 在压力下如何使用这张表 Underline the verb and the noun in the stem first ("choose a design", "critique this graph", "interpret the output"). That two-word cue picks the row; the right column is your paragraph. Then convert each of the three things into a sentence that ends in a because. You are never asked to compute - resist the urge. 
先在题干里给动词和名词划线(“选择一个设计”、“批判这张图”、“解读这段输出”)。那个两字提示挑出对应的行;右侧 那一栏就是你要写的段落。然后把这三样东西各转成一个以 because 收尾的句子。题目从不要求你计算 -- 忍住冲动。 AskSia Library . MAST20034 . XXia Bilingual EXAM MORNING . THE DECODER - EXAM MORNING . THE DECODER BUILDING THE NOTES YOU CARRY IN Your 4 sides, the 'because' rule, and the 3 hours 你的4 页笔记、‘because’规则,以及那3个小时 TL;DR. You may bring four sides of your own notes and there is no calculator - so do not waste space on formulas. Fill the four sides with decision trees, checklists, and crisp definitions: the machinery that turns a cue into a justified answer. This page lays out what to put on each side, the one rule that wins short- answer marks, a timing plan for the three hours, and the closing concepts-to-recall list. TL;DR. 你可以带四面自己的笔记,而且没有计算器 -- 所以不要把空间浪费在公式上。把这四面填满decision trees (决策 树)、checklists (清单)和精炼的定义:那些把线索变成有论证答案的机器。本页摆出每一面该放什么、赢得简答分的那一 条规则、三小时的时间规划,以及收官的待回忆概念清单。 - 12. 2 The 4-side notes plan 12. 2四页笔记计划 The exam is reasoning, not recall of numbers, so your sheet is a reasoning toolkit. A good layout maps one side to each job of the decoder above. Trees and checklists earn marks; a wall of formulae does not (there is nothing to calculate). 考的是推理,而非对数字的回忆,所以你的小抄是一个推理工具箱。好的布局把每一面对应到上面解码器的一项工作。树与清 单能得分;一墙公式不能(没有任何东西要算)。 Side What goes on it[4]Source: asksia-bible-mast20034-bilingual.pdf4 Analysis explore + model + test 3 Data collect / clean / store iterative: conclusions raise new problems -> cycle repeats The course's engine in one picture: Problem - Plan + Data - Analysis - Conclusion - the investigation cycle that frames almost every critique prompt. Most exam answers are really a question about one node: was the Plan a sound design? Were the Data well sampled? Is the Conclusion licensed by the design? Learn to walk it from memory. 一幅图道尽本课程的引擎:Problem → Plan → Data → Analysis → Conclusion -- 这个研究循环 框定了几乎每一个批判题。多数考试答案其实是关于 某一个节点的问题:Plan 是不是一个可靠的设计? Data 抽样得当吗?Conclusion 是设计所许可的吗? 
要学会凭记忆把它走一遍。 Examinable scope = the 12-week reasoning spine: objectivity & data · good graphics · study design . observational studies & confounding · reporting & critiquing claims . qualitative methods . frameworks for inference . analysis & modelling . sampling & AskSia Library . MAST20034 . XXia Bilingual WEIRD bias . accumulating research (meta-analysis & Hill) . big data · context. Research prompts touch only the whole-class case studies and never demand recalled details. 可考范围 = 12 周的推理主线:objectivity 与数据 · 好图表 · 研究设计 · observational study 与 confounding · 报告 与批判主张 · qualitative methods · 推断框架 · 分析与建 模 · 抽样与 WEIRD bias · 积累研究 (meta-analysis 与 Hill) · big data · 情境。研究类题目只触及全班共学的案 例研究,从不要求背诵细节。 What the exam is really testing 这场考试真正在考什么 The cue you get The move it rewards A graph / figure Critique it: name two good features + one specific fix (the graphics principles) A described study Name the design - say what conclusion is legal (causation vs association) An association Find the confounder / lurking variable and explain how it could fake the link Statistical output / a CI / P Interpret it in context - without the classic misreads A sampling scenario Name the method & the bias (incl. WEIRD) and why size won't cure it ✓ The one habit that wins this exam 赢下这场考试的那一个习惯 For every prompt, name the concept first, then write the because. "Two variables move together" - confounding / correlation#causation; "who got picked" - a sampling / selection bias; "is this graph any good" - the five graphics principles; "does X cause Y across studies" - Bradford Hill; "what does this P-value mean" - the interpretation rules. The decoder in Ch 14 lists every cue. 
对每道题,先点名概念,再写because。“两个变量一 起变动”→ confounding / correlation≠causation;“谁被选中”→ 某种 sampling/ selection bias;“这张图好不好”→五条 图表原则;“跨多项研究X是否导致 Y”→ Bradford Hill;“这个 P-value 是什么意思”→解读规则。第 14 章的解码器列出了每一个线索。 ★ The single highest-value habit 价值最高的单一习惯 You may write in dot-points or sentences, and there are no marks for grammar or spelling - so spend every word on the reasoning. Practise answering in the shape the markers reward: (1) name the concept, (2) define it in a line, (3) apply it to the scenario, (4) state the consequence or fix. Four sentences, full marks. "Explaining your reasoning and choices is typically more important than any answer. " 你可以用要点或句子书写,且语法或拼写不计分 所以把每一个词都花在推理上。按评分者奖励的形态 练习作答:(1)点名概念,(2)一行内定义它,(3)把它 应用到情景,(4)陈述后果或修正。四句话,满分。 “阐释你的推理与选择,通常比任何答案本身更重要。”[5]Source: asksia-bible-mast20034-bilingual.pdfFINAL . 60% . SHORT-ANSWER REASONING The exam-morning decoder 考试当天解码器 If the question says X, reach for Y, and say these three things 若题目说 X,就伸手去取 Y,并说出这三件事 TL;DR. The final hands you a scenario, a graph, or a piece of statistical output and asks you to name the concept and justify it. There is no calculator and no arithmetic - every mark is a because. This page is the lookup table: read the cue words in the stem, reach for the matching concept, then deliver the three reasons that bank the marks. Memorise the column on the right; that is the answer. TL;DR. 期末递给你一个scenario (情景)、一张图,或一段统计输出,要你点名概念并加以论证。没有计算器,也没有算术 -- 每一分都是一个because。本页就是查找表:读题干里的cue words (线索词),伸手抓对应的概念,再交出能存下分数 的三条理由。把右侧那一列背下来;那就是答案。 ★ What the exam asks here 考试在这里问什么 The 60% final is 3 hours, short-answer only (no MCQ, no essay), no calculator, and you bring in 4 sides of your own notes. The marking criteria are explicit: "explaining your reasoning and choices is typically more important than any answer. " Dot-points are fine; no marks for grammar/spelling. So this whole chapter trains the one move the exam pays for - name the concept - give the because. 
60% 的期末为时 3 小时,仅简答(无MCQ、无论文),不可用计算器,且你带入自备4面笔记。评分标准写得很明 确:“解释你的推理与选择,通常比任何答案本身更重要。”用要点列举即可;语法/拼写不计分。所以整章都在训练考试 买单的那一招 -- 点名概念 →给出 because。 12. 1 The cue - concept - because table 12. 1cue -> concept -> because xJAR Each row is a question species you have already met in this book. The left column is what the stem sounds like; the middle is the framework to invoke (with the chapter); the right is the 3-part skeleton - say all three and you have earned the reasoning marks. 每一行都是你在本书中已经见过的一类问题species(题种)。左列是题干听起来像什么;中列是要调用的框架(附章节);右 列是三段式骨架 -- 三段都说出来,你就挣到了推理分。 If the question says . . . Reach for this concept Say these 3 things (the because) DESIGN & CAUSATION "Choose / justify a study design"; "how would you investigate . . . " Study-design tree + validity · Ch3-4 (1) Can you intervene? - experiment (RCT) vs observational. (2) Pick the type with a because - rare outcome-case-control, many outcomes-cohort, snapshot-cross-sectional, populations-ecological. (3) Name the design tools that protect validity (randomise/compare/control). "Is it causal?"; "does X cause Y?"; "can we conclude . . . " Confounding + Bradford Hill . Ch4, Ch10 (1) Correlation # causation - observational data give association only. (2) Name a plausible confounder (linked to both exposure and outcome). (3) Argue Hill - esp. temporality (cause first) + dose-response gradient; an RCT would strengthen it by removing confounders. AskSia Library . MAST20034 . XXia Bilingual If the question says . . . Reach for this concept Say these 3 things (the because) GRAPHS & OUTPUT "Critique this graph"; "two good features & one improvement" 5 graphics principles . Ch2 (1) Two good features, each tied to a principle (standard form / common scale / clear encoding / shows data / simple). (2) One real issue (no title, abbreviated labels, panels on different scales). 
(3) A specific fix that addresses that issue - vague fixes score zero.[9]Source: asksia-bible-mast20034-bilingual.pdfThe PPDAC cycle - the spine of the whole unit, and a one-glance map of how an exam scenario hangs together: every question lives somewhere on Problem - Plan - Data - Analysis - Conclusion. Locating the stage tells you which concept the marker wants. PPDAC 循环 -- 整个单元的脊柱,也是一张让你一眼看清考试情景如何拼接的地图:每道题都栖身于 Problem → Plan → Data → Analysis → Conclusion 中的某处。定位到阶段,就知道评分者想要哪个概念。 AskSia Library . MAST20034 . XXia Bilingual ★ Concepts to recall - the whole-book checklist 要回忆的概念 -- 全书清单 · Context first (Ch1): data are value-laden; ask who/why/what/how/when; critique # criticism (always offer a constructive fix). 情境优先(第1章):数据带有价值色彩;问 谁/为何/什么/如何/何时;critique ≠ criticism (永远附上一个建设性 修复)。 · Graphics (Ch2): the 5 principles; match graph to variable types; two good features + one specific improvement. I 图表(第2章):5条原则;图与变量类型匹配;两个好特征+一个具体改进。 · Design (Ch3): validity = randomise/compare/control (kills bias); precision = replicate/stratify/balance (kills variability); they are independent axes. 设计(第3章): validity = 随机化/比较/控制(杀 bias); precision = 重复/分层/平衡(杀 variability);二者是独 立的轴。 · Observational (Ch4): cohort=group-by-exposure, case-control=group-by-outcome; confounder links to both; correlation # causation. 观察性(第4章):cohort=按暴露分组,case-control=按结局分组;confounder 同时关联两者;相关 ≠ 因果。 I 报告(第5章):中心/离散/趋势/离群点;报告 Cl+水平、以及统计量 +P,而非只报P。 · Qualitative (Ch6): "why" not "what"; bottom-up vs top-down coding; convergence as the stopping rule. 定性(第6章):“为何”而非“是什么”;自下而上 vs 自上而下编码;convergence 作为停止规则。 推断(第7章):随机的是区间,μ 是固定的;P= Pr(data or more extreme | Ho); 大P 不证明 Ho; Type l/ll 与 power. · Modelling (Ch8): signal+noise; "all models wrong, some useful"; parsimony; read residual/QQ plots - interpret, never fit. 建模(第8章):信号+噪声;“所有模型都是错的,有些有用”;简约性;读残差/QQ图 -- 解读,绝不拟合。 · Sampling (Ch9): frame vs sample; a big sample won't fix bias; 4 random + 4 non-random methods; WEIRD; reproducibility crisis. 
- Context first (Ch 1): data are value-laden; ask who / why / what / how / when; critique ≠ criticism (always attach a constructive fix).
- Graphs (Ch 2): five principles; match the graph to the variable type; two good features plus one specific improvement.
- Design (Ch 3): validity = randomisation / comparison / control (kills bias); precision = replication / stratification / balance (kills variability); the two are independent axes.
- Observational (Ch 4): cohort = grouped by exposure, case-control = grouped by outcome; a confounder is associated with both; correlation ≠ causation.
- Reporting (Ch 5): centre / spread / trend / outliers; report the CI with its level, and the statistic with its P, not P alone.
- Qualitative (Ch 6): "why", not "what"; bottom-up vs top-down coding; convergence as the stopping rule.
- Inference (Ch 7): the interval is random, μ is fixed; P = Pr(data or more extreme | $H_0$); a large P does not prove $H_0$; Type I/II errors and power.
- Sampling (Ch 9): sampling frame vs sample; a large sample cannot fix bias; 4 random + 4 non-random methods; WEIRD; the reproducibility crisis.
- Accumulation (Ch 10): forest plot (null line + diamond); Hill criteria (temporality + gradient); publication bias.
- Big data (Ch 11): significance ≠ importance at scale; provenance, ethics, scepticism toward AI findings.
- And always, the iron rule: name the concept, then give the because. Good luck.

[11] Source: asksia-bible-mast20034-bilingual.pdf
(b) Big-data critique (because): a huge n cannot fix bias; the data cover only the people already using the app (selection bias). At massive n everything looks "significant", so effect size and provenance matter more than P. Add a data-ethics flag: the privacy and consent of the recorded users.
What earns the marks: a justified qual choice (the "why" logic + convergence) plus a big-data critique naming that size ≠ representativeness, with an ethics flag.
Trap: dismissing qualitative as "unscientific"; equating large n with representative; forgetting consent/provenance for found data.

Recall checklist: the decision rules for the bank
- Every answer: name → define → reason in context → because (the consequence). Count your reasons against the marks.
- Design/sampling: justify the choice against its alternative; non-probability methods are biased, and size won't cure it.
- Confounder: must link to both exposure and outcome; observational data license "associated with", never "causes".
- Graph critique: two good features, each tied to a principle; one issue plus a fix that matches it.
- CI: the interval is random, the parameter is fixed. P-value: Pr(data | $H_0$), not Pr($H_0$ | data); significant ≠ important.
- Type I/II: false positive / false negative; power = 1 − β; small n → low power; rare condition → base-rate false positives.
- Forest plot: the diamond against the null line, plus heterogeneity, plus a caution about publication bias. Hill: weight of evidence, temporality first.
- Diagnostic plots: funnel shape → non-constant variance; curved QQ plot → non-normal → transform, or use a method with fewer assumptions.
- Qual vs quant: why vs what; convergence is the qual stopping rule. Big data: size ≠ unbiased; effect size and ethics over P.

[16] Source: asksia-cheatsheet-mast20034.pdf
17. The "Because" Rule: how to bank marks
- Every answer is a chain: NAME the concept/framework → APPLY it to the context → BECAUSE ... (the reason wins the mark).
- Marks are per correct, sufficiently detailed reason; one detailed reason often earns full marks; restating the definition earns nothing.
- Dot points are fine; no marks for grammar or spelling; 3 hours.
- No question needs a calculator and none recall case-study details. If you're computing, you've misread it; they want the reasoning. Spend the words on the because.
18. Top traps to avoid: where marks are lost most
- "P = Pr($H_0$ true)"; "large P proves $H_0$"; choosing a one-sided test after seeing the data.
- Calling a natural experiment an experiment; "association proves cause".
- Cohort ↔ case-control mix-up; stratified ↔ quota; cluster ↔ stratified.
- Hill as a tick-box; ignoring temporality.
- "Significant" = "important"; trusting a result because n is huge.
The core of your 4-side bring-in notes; confirm on the MAST20034 exam-info page. Name it, apply it, because ... (asksia.ai/cheatsheet/unimelb-mast20034)
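The checklist item "rare condition → base-rate false positives" is easier to argue once you have seen why it holds. The sketch below is only an illustration: the function name and the prevalence, sensitivity and specificity values are assumptions chosen for the example, not figures from the course, and the exam itself asks for no arithmetic.

```python
def positive_predictive_value(prevalence, sensitivity, specificity):
    """Share of positive test results that are true positives."""
    true_positives = prevalence * sensitivity
    false_positives = (1 - prevalence) * (1 - specificity)
    return true_positives / (true_positives + false_positives)


# Hypothetical test that is right 95% of the time in both directions.
for prevalence in (0.001, 0.01, 0.2):
    ppv = positive_predictive_value(prevalence, sensitivity=0.95, specificity=0.95)
    print(f"prevalence={prevalence:<6} share of positives that are real={ppv:.3f}")
```

When the condition is rare, most positives come from the large unaffected group, so the because to write is "false positives outnumber true positives because almost everyone tested does not have the condition".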
-
10. A universal exam-room answer template
-
You can memorise these English skeletons directly; they work well in the exam:
-
Design questions
- “This should be a [design], which means [definition]. It is appropriate here because [reason 1], and because [reason 2]. A stronger causal conclusion would require [RCT / better control / randomisation].”
-
Sampling questions
- “This is [sampling method], which means [definition]. It is biased here because [who is missing], so [direction of bias]. A larger sample would not fix this, because the recruitment method itself is biased.”
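To see why "a larger sample would not fix this" is a sound because, here is a minimal simulation sketch. Everything in it is an assumption made for illustration: the function name, the uniform 0 to 10 satisfaction scores, and the rule that happier users are more likely to respond. It is not part of the course material, and the exam requires no computation.

```python
import random


def simulate_popup_survey(n_users, seed=0):
    """Return (true_mean, sample_mean, n_respondents) for a hypothetical app."""
    rng = random.Random(seed)
    # Hypothetical population: satisfaction scores spread evenly over 0-10.
    population = [rng.uniform(0, 10) for _ in range(n_users)]
    # Assumed response mechanism: the happier the user, the likelier the click.
    respondents = [s for s in population if rng.random() < s / 10]
    true_mean = sum(population) / len(population)
    sample_mean = sum(respondents) / len(respondents)
    return true_mean, sample_mean, len(respondents)


for n in (1_000, 100_000):
    true_mean, sample_mean, k = simulate_popup_survey(n)
    print(f"users={n:>7} respondents={k:>6} "
          f"true mean={true_mean:.2f} sample mean={sample_mean:.2f}")
```

Under these assumptions the sample mean sits well above the true mean at both sizes: more respondents make the estimate more precise, but precise around the wrong value, because the recruitment rule is unchanged.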
-
Confounding questions
- “A plausible confounder is [X], because it is related to both the exposure and the outcome. That means the observed association could be partly or wholly due to [X], so the data do not justify a causal claim.”
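The confounding skeleton can be checked with a small simulation, sketched here under stated assumptions: the variable roles, the probabilities and the function names are all invented for illustration. The outcome is generated from the confounder only, so any association between exposure and outcome is produced entirely by the confounder.

```python
import random


def simulate_confounded_data(n=20_000, seed=1):
    """Rows of (confounder, exposure, outcome); exposure has no effect on outcome."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        z = rng.random() < 0.5                   # hypothetical confounder
        x = rng.random() < (0.7 if z else 0.2)   # exposure depends on z only
        y = rng.random() < (0.6 if z else 0.1)   # outcome depends on z only
        rows.append((z, x, y))
    return rows


def outcome_rate(rows, exposed):
    group = [y for (_, x, y) in rows if x == exposed]
    return sum(group) / len(group)


def crude_gap(rows):
    """Outcome rate in the exposed minus the unexposed, ignoring the confounder."""
    return outcome_rate(rows, True) - outcome_rate(rows, False)


def stratified_gaps(rows):
    """The same gap computed separately within each level of the confounder."""
    return [crude_gap([r for r in rows if r[0] == level]) for level in (False, True)]


rows = simulate_confounded_data()
print(f"crude gap: {crude_gap(rows):.3f}")
print("gaps within confounder levels:", [f"{g:.3f}" for g in stratified_gaps(rows)])
```

The crude gap is clearly positive while the gaps inside each confounder level are close to zero, which is the point of the skeleton: the observed association could be wholly due to [X].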
-
Graph questions
- “One good feature is [feature], because it follows the principle of [principle]. Another good feature is [feature], because [principle]. One issue is [fault], and it should be fixed by [specific fix], because [why that fix works].”
-
CI / P-value questions
- “This result suggests [effect/no strong evidence] in context. However, the P-value is not the probability that $H_0$ is true, and statistical significance is not the same as practical importance.”
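The caveat "statistical significance is not the same as practical importance" can be illustrated with a short sketch. It assumes a two-sample z-test with a known common standard deviation; the function names and the numbers (a difference of 0.02 standard deviations) are hypothetical and chosen only to show the pattern.

```python
import math


def two_sided_p_from_z(z):
    """Two-sided P-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


def z_for_mean_difference(diff, sd, n_per_group):
    """z statistic for a difference in two group means with known common sd."""
    standard_error = sd * math.sqrt(2 / n_per_group)
    return diff / standard_error


# A practically negligible difference: 0.02 standard deviations.
diff, sd = 0.02, 1.0
for n in (100, 10_000, 1_000_000):
    p = two_sided_p_from_z(z_for_mean_difference(diff, sd, n))
    print(f"n per group={n:>9} P={p:.3g}")
```

The effect is identical in every row; only the sample size changes. The P-value moves from large to vanishingly small as n grows, so at scale you should judge the effect size and its CI, not P.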
-
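11. Finally, the single most practical exam strategy
-
This course is not about memorising lots of points and piling them up at random. It is about:
- Spotting the cue words in the question stem
- Matching them to a concept immediately
- Writing the because, following the skeleton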
- Counting your reasons against the marks

[3] Source: asksia-bible-mast20034-bilingual.pdf
The cue → concept → because table (continued)
- Cue: "Interpret this CI / P-value / output / forest plot"
  Concept: Inference reading rules (Ch 7, Ch 8, Ch 10)
  Say: (1) State what it shows in context (CI excludes 0 → evidence of an effect; small P → strong evidence against $H_0$). (2) Add the correct caveat (the interval is random, μ is fixed; a large P does not prove $H_0$). (3) Comment on strength / meaning: significant ≠ important.
SAMPLING & TRUST
- Cue: "Is this sample OK?"; "what's wrong with how they recruited?"
  Concept: Sampling bias + WEIRD (Ch 1, Ch 9)
  Say: (1) Ask who is missing: frame / selection / non-response / volunteer gap. (2) Name the method and its bias (convenience → people similar to each other). (3) State that a bigger sample will NOT fix bias; it repeats the mistake at scale. Consider WEIRD over-sampling.
- Cue: "Too good to be true"; "a surprising significant result"; "just barely p < 0.05"
  Concept: Reproducibility + p-hacking (Ch 9-10)
  Say: (1) Publication bias: novel/significant results are over-published, inflating effects. (2) Watch for p-hacking / HARKing (a one-sided test chosen after the data, multiple looks). (3) Ask for replication, a CI / effect size, and pre-registration before trusting it.
- Cue: "Big data / an AI claim"; "with millions of records ..."
  Concept: Ethics + validity at scale (Ch 1, Ch 11)
  Say: (1) Huge n → everything is significant → judge effect size and practical importance, not P. (2) Apply the context questions (who / why / what / how) plus data ethics (consent, fairness, stewardship). (3) Be sceptical of AI: check provenance and the missing data.
How to use the table under pressure: underline the verb and the noun in the stem first ("choose a design", "critique this graph", "interpret the output"). That two-word cue picks the row; the right column is your paragraph. Then convert each of the three things into a sentence that ends in a because. You are never asked to compute; resist the urge.

BUILDING THE NOTES YOU CARRY IN: your 4 sides, the "because" rule, and the 3 hours
TL;DR. You may bring four sides of your own notes and there is no calculator, so do not waste space on formulas. Fill the four sides with decision trees, checklists, and crisp definitions: the machinery that turns a cue into a justified answer. This page lays out what to put on each side, the one rule that wins short-answer marks, a timing plan for the three hours, and the closing concepts-to-recall list.
12.2 The 4-side notes plan
The exam is reasoning, not recall of numbers, so your sheet is a reasoning toolkit. A good layout maps one side to each job of the decoder above. Trees and checklists earn marks; a wall of formulae does not (there is nothing to calculate).
Table columns: Side; What goes on it.

[5] Source: asksia-bible-mast20034-bilingual.pdf
FINAL, 60%, SHORT-ANSWER REASONING. The exam-morning decoder: if the question says X, reach for Y, and say these three things
TL;DR. The final hands you a scenario, a graph, or a piece of statistical output and asks you to name the concept and justify it. There is no calculator and no arithmetic; every mark is a because. This page is the lookup table: read the cue words in the stem, reach for the matching concept, then deliver the three reasons that bank the marks. Memorise the column on the right; that is the answer.
What the exam asks here: the 60% final is 3 hours, short-answer only (no MCQ, no essay), no calculator, and you bring in 4 sides of your own notes. The marking criteria are explicit: "explaining your reasoning and choices is typically more important than any answer." Dot points are fine; no marks for grammar or spelling. So this whole chapter trains the one move the exam pays for: name the concept → give the because.
12.1 The cue → concept → because table
Each row is a question species you have already met in this book. The left column is what the stem sounds like; the middle is the framework to invoke (with the chapter); the right is the 3-part skeleton. Say all three and you have earned the reasoning marks.
DESIGN & CAUSATION
- Cue: "Choose / justify a study design"; "how would you investigate ..."
  Concept: Study-design tree + validity (Ch 3-4)
  Say: (1) Can you intervene? → experiment (RCT) vs observational. (2) Pick the type with a because: rare outcome → case-control, many outcomes → cohort, snapshot → cross-sectional, populations → ecological. (3) Name the design tools that protect validity (randomise / compare / control).
- Cue: "Is it causal?"; "does X cause Y?"; "can we conclude ..."
  Concept: Confounding + Bradford Hill (Ch 4, Ch 10)
  Say: (1) Correlation ≠ causation; observational data give association only. (2) Name a plausible confounder (linked to both exposure and outcome). (3) Argue Hill, especially temporality (cause first) and a dose-response gradient; an RCT would strengthen it by removing confounders.
GRAPHS & OUTPUT
- Cue: "Critique this graph"; "two good features & one improvement"
  Concept: 5 graphics principles (Ch 2)
  Say: (1) Two good features, each tied to a principle (standard form / common scale / clear encoding / shows data / simple). (2) One real issue (no title, abbreviated labels, panels on different scales). (3) A specific fix that addresses that issue; vague fixes score zero.

[8] Source: asksia-bible-mast20034-bilingual.pdf
3. Reason in the context given. Tie the concept to this scenario, not a generic textbook one.
4. Close each point with a because. State the consequence (the bias it induces, the assumption it breaks, the conclusion it licenses); this is the mark-bearing clause.
5. Count your reasons against the marks. If it says [4: 2+2], deliver a definition and two separate consequence-level reasons.
The universal sentence frame: "This is [named concept], which means [definition]. Here it matters because [consequence #1], and also because [consequence #2]." Drop any scenario into that frame and you have structured an answer that the rubric can find marks in.
The four ways students throw away marks
- Restating, not reasoning: "it is biased because it is not random" only renames the definition; say what the bias does.
- One reason in two coats: a "2+2" question needs two distinct reasons, not one reason phrased twice.
- Trying to compute: there is no calculator and no calc question; if you start arithmetic you have misread the task.
- Over-claiming causation: for observational data the only legitimate verb is "is associated with", never "causes".

REVISION: SHORT-ANSWER BANK. Study-production species. Drills 1-6: design, sampling, confounding & graphs
TL;DR. These six rehearse the "how the data were produced" family: choose and justify a design, pick a sampling method, name the confounder, identify exposure/outcome, and critique a graph (two good features + one specific fix). Reason the choice against its alternative; that contrast is where the marks live.

[13] Source: asksia-bible-mast20034-bilingual.pdf
EX 12.1 Turning a fact into a because (worked short-answer): name → consequence → because
Stem (AskSia-invented): "A wellbeing app is evaluated by surveying users who clicked an in-app pop-up. Comment on the sample."
Weak (no marks): "It is a convenience sample." A label, no reasoning.
Strong (banks the marks): "This is a convenience / volunteer sample, because only users already engaged enough to click respond, so it suffers self-selection bias and likely over-states satisfaction (because dissatisfied users have churned and are missing). A larger pop-up sample would not fix this, because it repeats the same biased method at scale."
Read-out: three clauses, three becauses: name the method → name the consequence in context → pre-empt the "bigger sample" trap. (Scenario AskSia-invented; no figures to compute.)
12.4 The 3-hour timing plan
Three hours for short-answer reasoning is generous; the risk is over-writing early questions, not running out of ideas. Budget by marks, and leave a critique-polish pass at the end.
1. First 10 min: survey and map. Read every question; pencil the decoder row next to each (design? critique? interpret? sample?). Spot the high-mark items.
2. Bulk: answer by mark weight. Roughly a minute per mark; a 4-mark design question wants two detailed becauses, a 2-mark "two good features" wants exactly two. Do not pad.
3. Discipline: never compute. If you feel an arithmetic urge, you have misread; the answer is an interpretation, not a number.
4. Last 20 min: the because audit. Re-read each answer and check that every claim ends in a reason tied to the scenario; add the missing because, the missing caveat (significance ≠ importance), the missing specific fix.
FIG 12.1 PPDAC cycle: (1) Problem: define the question → (2) Plan: design how to get data → (3) Data: collect / clean / store → (4) Analysis: explore + model + test → (5) Conclusion: answer in context. Iterative: conclusions raise new problems, so the cycle repeats.
-
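If you like, in my next message I can go straight on to any one of these three:
- A. Put together a final version of your "4-side notes sheet"
- B. Build a quick-memorisation version of "MAST20034 high-frequency definitions + universal answer sentences"
- C. Write 10 mock questions in the style of the final and walk you through full-mark answers
-
Just reply with one letter: A / B / C.
-
Also, if you like, I can remember these revision priorities for this course, so that next time we carry on from this framework.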