Midterm & Final Reference · Ultra-Dense A4
Generated by AskSia.ai — graphs, formulas, traps
PROBABILITY FUNDAMENTALS
Foundations

Probability: P(A) ∈ [0,1]. The probabilities of all mutually exclusive, exhaustive outcomes sum to 1.

P(A or B) = P(A) + P(B) − P(A and B)
P(A and B) = P(A) · P(B|A)
P(A|B) = P(A and B) / P(B)
Concept          Formula                  Cue
Independent      P(A∩B) = P(A)·P(B)       knowing one tells nothing about other
Mutually excl.   P(A∩B) = 0               can't both happen
Conditional      P(A|B) = P(A∩B)/P(B)     'given that B'
Bayes' theorem
P(A|B) = P(B|A)·P(A) / P(B). Reverses conditional. Used heavily in disease testing, spam filters.
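A minimal sketch of the disease-testing use of Bayes' theorem. The prevalence, sensitivity, and specificity below are made-up illustration values, not figures from this sheet, and the function name `posterior` is hypothetical. P(B) is expanded with the law of total probability.

```python
def posterior(prior, sensitivity, specificity):
    """P(disease | positive test) via Bayes' theorem."""
    # P(positive) = true positives + false positives (total probability)
    p_pos = sensitivity * prior + (1 - specificity) * (1 - prior)
    return sensitivity * prior / p_pos

# Hypothetical numbers: 1% prevalence, 95% sensitivity, 90% specificity
p = posterior(prior=0.01, sensitivity=0.95, specificity=0.90)
print(round(p, 4))
```

With these assumed inputs the posterior is under 9%, even though the test looks accurate: the low prior dominates.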
Counting
Permutations nPr = n!/(n−r)!. Combinations nCr = n!/[r!(n−r)!]. Order matters? P. Doesn't? C.
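As a quick check of the two counting formulas, the sketch below uses Python's `math.perm` and `math.comb` (available from Python 3.8) on an arbitrary example, n = 5 and r = 3.

```python
import math

n, r = 5, 3

# Order matters: arrangements of r items drawn from n, n!/(n-r)!
n_perm = math.perm(n, r)

# Order ignored: each unordered group counted once, n!/[r!(n-r)!]
n_comb = math.comb(n, r)

print(n_perm, n_comb)

# Each combination can be ordered r! ways, so nPr = nCr * r!
assert n_perm == n_comb * math.factorial(r)
```

The final assertion shows why nPr is always at least as large as nCr.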
⚡ EXAM TRAP — INDEPENDENT vs MUTUALLY EXCLUSIVE

These are opposite ideas, not the same one. Mutually exclusive events are maximally dependent: if one happens, the other definitely doesn't. Two events can't be both independent AND mutually exclusive (unless at least one has P=0).

CHI-SQUARE & ANOVA
Chi-square (χ²)

Tests if categorical data matches expected distribution.

χ² = Σ (Observed − Expected)² / Expected
Test              Use when                                   df
Goodness of fit   1 categorical var matches a distribution   k−1
Independence      2 categorical vars unrelated               (r−1)(c−1)
Homogeneity       Same distribution across groups            (r−1)(c−1)

Big χ² → reject H₀ → data doesn't match expected.
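A small worked goodness-of-fit example, assuming hypothetical counts from 60 rolls of a die tested against a fair-die expectation. The critical value 11.070 is the standard table entry for α = 0.05 with df = 5.

```python
observed = [8, 12, 9, 11, 6, 14]  # hypothetical counts from 60 rolls
k = len(observed)
expected = [sum(observed) / k] * k  # fair die: 10 per face

# chi-square = sum of (Observed - Expected)^2 / Expected
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = k - 1  # goodness of fit: k - 1

CRITICAL_05_DF5 = 11.070  # table value, alpha = 0.05, df = 5
reject_h0 = chi2 > CRITICAL_05_DF5
print(chi2, df, reject_h0)
```

Here χ² is about 4.2, well below the critical value, so these made-up counts give no evidence against a fair die.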

ANOVA (F-test)

Tests if 3+ group means are equal (vs at least one different).

F = MS_between / MS_within
F large → between-group variance dominates → reject H₀
One-way ANOVA
One factor, multiple levels. e.g. 'do 4 teaching methods give different test scores?'. df: between = k−1, within = N−k.
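The F ratio and its degrees of freedom can be computed by hand. This is a sketch on made-up scores for three groups; the helper name `one_way_f` is hypothetical.

```python
from statistics import mean

def one_way_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand = mean(x for g in groups for x in g)
    # Between-group sum of squares: group means vs. grand mean
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups)
    # Within-group sum of squares: observations vs. their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within, k - 1, n_total - k

# Hypothetical scores for three groups
f_stat, df_b, df_w = one_way_f([[1, 2, 3], [2, 3, 4], [6, 7, 8]])
print(f_stat, df_b, df_w)
```

The third group sits far from the other two, so MS_between is large relative to MS_within and F is large.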
Post-hoc
ANOVA tells you SOMETHING differs but not WHAT. Use Tukey's HSD or Bonferroni after rejection to pinpoint.
⚡ EXAM TRAP — RUNNING MULTIPLE t-TESTS

Don't run 6 pairwise t-tests on 4 groups: that inflates the family-wise error rate. With α=0.05 per test, the chance of at least one false positive across all six tests is far above 0.05. Use ANOVA first, then post-hoc tests.
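To put a number on the inflation, the sketch below computes the family-wise error rate for all pairwise tests on 4 groups. It assumes the tests are independent, which pairwise t-tests on shared groups are not, so treat the result as a rough guide rather than an exact rate.

```python
import math

alpha, n_groups = 0.05, 4
m = math.comb(n_groups, 2)  # number of pairwise comparisons

# P(at least one false positive) if the m tests were independent
fwer = 1 - (1 - alpha) ** m

# Bonferroni: test each pair at alpha / m to keep the family rate near alpha
bonferroni_alpha = alpha / m

print(m, round(fwer, 3), round(bonferroni_alpha, 4))
```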

RANDOM VARIABLES & DISTRIBUTIONS
Discrete vs continuous
Discrete (PMF)        Continuous (PDF)
P(X=k) ≥ 0            f(x) ≥ 0
Σ P(X=k) = 1          ∫ f(x)dx = 1
E[X] = Σ k·P(X=k)     E[X] = ∫ x·f(x)dx
Var(X) = E[X²] − (E[X])²
SD(X) = √Var(X)
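A short worked example of the expectation and variance formulas, using the PMF of a Binomial(2, 0.5) variable as an illustrative choice.

```python
import math

pmf = {0: 0.25, 1: 0.50, 2: 0.25}  # P(X=k) for Binomial(2, 0.5)
assert math.isclose(sum(pmf.values()), 1.0)  # probabilities must sum to 1

ex = sum(k * p for k, p in pmf.items())        # E[X]
ex2 = sum(k ** 2 * p for k, p in pmf.items())  # E[X^2]
var = ex2 - ex ** 2                            # Var(X) = E[X^2] - (E[X])^2
sd = math.sqrt(var)

print(ex, var, round(sd, 4))
```

The results agree with the Binomial mean np = 1 and variance np(1−p) = 0.5.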
Common distributions
Distribution      Mean       Variance     Use
Bernoulli(p)      p          p(1−p)       1 trial, success/fail
Binomial(n,p)     np         np(1−p)      n indep. trials
Poisson(λ)        λ          λ            rare events / time
Normal(μ,σ²)      μ          σ²           continuous, bell shape
Uniform(a,b)      (a+b)/2    (b−a)²/12    equal density on [a,b]
Exponential(λ)    1/λ        1/λ²         waiting time
Z-score
z = (x − μ)/σ. Number of SDs from mean. Used to standardize and look up tail probabilities.
68-95-99.7 rule
For Normal: ~68% within 1σ, 95% within 2σ, 99.7% within 3σ. Memorize — exam shortcuts depend on it.
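The z-score and the 68-95-99.7 figures can be checked with the standard library's `statistics.NormalDist`. The helper names and the example values (x = 130, μ = 100, σ = 15) are illustrative assumptions.

```python
from statistics import NormalDist

def z_score(x, mu, sigma):
    """Number of standard deviations x lies from the mean."""
    return (x - mu) / sigma

def within_k_sd(k):
    """P(-k < Z < k) for a standard Normal Z."""
    std = NormalDist()  # mean 0, sd 1
    return std.cdf(k) - std.cdf(-k)

z = z_score(130, mu=100, sigma=15)
upper_tail = 1 - NormalDist().cdf(z)  # P(X > 130)

print(z, round(upper_tail, 4))
print([round(within_k_sd(k), 4) for k in (1, 2, 3)])
```

The exact areas are close to 68.3%, 95.4%, and 99.7%, so the rule is a rounding of these values.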
⚡ EXAM TRAP — P(X = x) FOR CONTINUOUS

For continuous random variables, P(X = x) = 0 always. Probability lives in intervals, not points. P(X < 5) and P(X ≤ 5) are equal — the boundary contributes nothing.

HYPOTHESIS TESTING
The 5-step recipe
▼ HYPOTHESIS TEST PROCEDURE

1. State H₀ (null) and H_a (alt)

2. Choose α (usually 0.05)

3. Compute test statistic (z, t, χ², F)

4. Find p-value or compare to critical value

5. p < α → reject H₀; p ≥ α → fail to reject

Test                Use when                     Statistic
z-test              σ known, large n             (x̄−μ₀)/(σ/√n)
t-test (1-sample)   σ unknown, use s             (x̄−μ₀)/(s/√n)
2-sample t          compare 2 means              (x̄₁−x̄₂)/SE
Paired t            before/after on same units   d̄/(s_d/√n)
Proportion z        comparing p̂ to p₀            (p̂−p₀)/√(p₀(1−p₀)/n)
Type I error (α)
Reject H₀ when it's true. False alarm. We control this with α.
Type II error (β)
Fail to reject H₀ when it is false. A miss. Power = 1−β. Power increases with larger n, a bigger true effect, or a larger α.

p-value: probability of seeing data this extreme or more, IF H₀ were true. Small p = data unlikely under H₀ = evidence against H₀.
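The recipe and the p-value definition can be traced through a two-sided z-test. This is a sketch with hypothetical inputs (μ₀ = 100, known σ = 15, n = 36, x̄ = 105); the function name `z_test` is an assumption for illustration.

```python
from math import sqrt
from statistics import NormalDist

def z_test(xbar, mu0, sigma, n):
    """Two-sided z-test for a mean with known sigma. Returns (z, p)."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    # p-value: probability of a statistic at least this extreme under H0
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p

alpha = 0.05
z, p = z_test(xbar=105, mu0=100, sigma=15, n=36)
decision = "reject H0" if p < alpha else "fail to reject H0"
print(z, round(p, 4), decision)
```

With these assumed values z = 2 and p is just under 0.05, so the decision is to reject H₀ at α = 0.05 but not at α = 0.01.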

⚡ EXAM TRAP — p-VALUE MISINTERPRETATION

p = 0.03 does NOT mean 'H₀ has 3% probability of being true.' It means: IF H₀ were true, we'd see data this extreme 3% of the time. p-value is about data, not hypotheses.

REGRESSION
Simple linear regression
ŷ = b₀ + b₁·x
b₁ = r·(s_y/s_x)
b₀ = ȳ − b₁·x̄

Slope b₁: predicted change in y per 1-unit change in x. r is correlation coefficient ∈ [−1, 1].
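A worked example of the slope and intercept formulas on a tiny made-up dataset. The correlation r is computed directly from its definition so the sketch does not depend on newer `statistics` functions.

```python
from statistics import mean, stdev

# Hypothetical data
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]
n = len(x)

xbar, ybar = mean(x), mean(y)
sx, sy = stdev(x), stdev(y)  # sample standard deviations

# Correlation: sum of cross-deviations scaled by (n-1)*sx*sy
r = sum((a - xbar) * (b - ybar) for a, b in zip(x, y)) / ((n - 1) * sx * sy)

b1 = r * sy / sx        # slope
b0 = ybar - b1 * xbar   # intercept

print(round(r, 4), round(b1, 4), round(b0, 4))
```

For this data the fitted line is approximately ŷ = 2.2 + 0.6·x, and r² is about 0.6.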

Quantity    Meaning                                                    Range
r           linear correlation                                         [−1, 1]
R² = r²     proportion of variance explained (simple regression only)  [0, 1]
residual    y − ŷ (vertical gap)                                       any real
SE(b₁)      uncertainty in slope                                       ≥ 0
Inference on slope
t = b₁ / SE(b₁), df = n − 2

H₀: β₁ = 0 (no linear relation). Reject if |t| > critical.

R² interpretation
R²=0.7 means 70% of variation in y is explained by x. Higher = better fit. Doesn't imply causation.
Residual plots
Random scatter around 0 = good. A pattern (curve, fan) = the model is misspecified; consider transforming variables.
⚡ EXAM TRAP — CORRELATION ≠ CAUSATION

r = 0.9 between ice cream sales and drownings doesn't mean ice cream causes drowning. Hot weather causes both. Always consider lurking variables and direction.

SAMPLING & CENTRAL LIMIT THEOREM
Sampling distribution

The distribution of a statistic (like x̄) computed from many samples of size n.

For sample mean x̄:
E[x̄] = μ
SD(x̄) = σ/√n (Standard Error)
Central Limit Theorem (CLT)

For any distribution with finite μ, σ:

x̄ → Normal(μ, σ²/n) as n → ∞

Rule of thumb: with n ≥ 30 the Normal approximation for x̄ is usually adequate even for skewed parents. For symmetric parents, much smaller n works.

Why √n matters
To halve the SE, you need 4× the sample size. Precision scales with √n, not n. Expensive.
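A small simulation sketch of both points above: sample means from a skewed parent have spread close to σ/√n, and quadrupling n roughly halves that spread. The Exponential(1) parent (μ = 1, σ = 1), the replication count, and the seed are arbitrary choices for illustration.

```python
import random
from statistics import fmean, stdev

def sd_of_sample_means(n, reps=4000, seed=0):
    """Empirical SD of x-bar over many samples of size n."""
    rng = random.Random(seed)
    # Exponential(rate=1) parent: strongly right-skewed, mu = 1, sigma = 1
    means = [fmean(rng.expovariate(1.0) for _ in range(n))
             for _ in range(reps)]
    return stdev(means)

se30 = sd_of_sample_means(30)    # theory: 1/sqrt(30)  ~ 0.183
se120 = sd_of_sample_means(120)  # theory: 1/sqrt(120) ~ 0.091

print(round(se30, 3), round(se120, 3), round(se30 / se120, 2))
```

The printed ratio should land near 2, matching the 4× sample size needed to halve the standard error.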
Population vs sample
μ, σ = population (parameters). x̄, s = sample (statistics). Greek letters denote the unknown population values; Latin letters denote what you actually computed from data.

Proportions: p̂ ~ Normal(p, p(1−p)/n) when np ≥ 10 AND n(1−p) ≥ 10.

⚡ EXAM TRAP — CLT SAYS x̄ NORMAL, NOT X

CLT is about the sample mean distribution, not the original variable. If X is heavily skewed, X stays skewed forever. Only x̄ becomes Normal as n grows.

CONFIDENCE INTERVALS
The CI formula
CI = point estimate ± (critical value) × (standard error)
Parameter          CI Form                 When
μ (σ known)        x̄ ± z*·(σ/√n)           rare, only if σ given
μ (σ unknown)      x̄ ± t*·(s/√n)           standard, df = n−1
p (proportion)     p̂ ± z*·√(p̂(1−p̂)/n)      np̂ ≥ 10, n(1−p̂) ≥ 10
μ₁−μ₂ (2-sample)   (x̄₁−x̄₂) ± t*·SE         compare 2 means
Common critical values
90% CI → z* = 1.645; 95% CI → z* = 1.96; 99% CI → z* = 2.576
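The critical values can be recovered from the inverse Normal CDF, and the general "estimate ± critical value × SE" form applied to a proportion. The helper names and the 60-out-of-100 example are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_star(confidence):
    """Two-sided critical value, e.g. 0.95 -> about 1.96."""
    return NormalDist().inv_cdf(1 - (1 - confidence) / 2)

def proportion_ci(successes, n, confidence=0.95):
    """CI for a proportion: p-hat +/- z* * sqrt(p-hat(1-p-hat)/n)."""
    p_hat = successes / n
    margin = z_star(confidence) * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

print([round(z_star(c), 3) for c in (0.90, 0.95, 0.99)])

lo, hi = proportion_ci(60, 100)  # hypothetical: 60 successes in 100 trials
print(round(lo, 3), round(hi, 3))
```

In this example np̂ = 60 and n(1−p̂) = 40, so the sample-size condition for the Normal approximation holds.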
What 95% CI means
If we repeated this sampling process many times, ~95% of the resulting intervals would contain μ. Not a probability about THIS interval — it either contains μ or doesn't.
Sample size for given margin
n = (z*·σ / E)² for given margin of error E. Want half the margin? Need 4× the sample.
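A sketch of the sample-size formula, rounding up because n must be a whole number. The values σ = 15 and E = 3 are assumed for illustration, and `sample_size` is a hypothetical helper name.

```python
import math
from statistics import NormalDist

def sample_size(sigma, margin, confidence=0.95):
    """Smallest n with z* * sigma / sqrt(n) <= margin."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / margin) ** 2)  # always round up

n_full = sample_size(sigma=15, margin=3)
n_half = sample_size(sigma=15, margin=1.5)  # half the margin
print(n_full, n_half)
```

Halving the margin takes n from 97 to 385, which is the 4× rule up to rounding.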
⚡ EXAM TRAP — 'PROBABILITY MU IS HERE' INTERPRETATION

'There's a 95% chance μ is in (3.2, 4.8)' is wrong. μ is fixed; the interval is random. Better: 'we used a procedure that captures μ 95% of the time.'

DECISION BOX — pick the test
Read the question. Find the trigger.
If you see…                                 Use §
'P(A and B)', 'and / or'                    §1 probability rules
'P(A | B)', 'given that'                    §1 conditional / Bayes
'mean of', 'expected value'                 §2 distributions
'68/95/99.7', z-score                       §2 Normal
'sample mean', large n                      §3 CLT
'estimate μ with margin'                    §4 confidence interval
'is the mean equal to'                      §5 t-test or z-test
2 group comparison                          §5 2-sample t-test
before/after same subjects                  §5 paired t-test
'is proportion equal to'                    §5 proportion z-test
3+ group means                              §6 ANOVA
categorical data, 'fits a distribution'     §6 chi-square goodness
2 categorical vars, 'related'               §6 chi-square independence
'predict y from x'                          §7 regression
▼ LAST-MINUTE PROCEDURE

Match the data type: categorical → χ² or proportion. Numerical → t/z/ANOVA.

Match the question: 'estimate' → CI. 'is it equal' → hypothesis test. 'predict' → regression.

Match the # groups: 1 group → 1-sample. 2 groups → 2-sample. 3+ groups → ANOVA.

Always check assumptions: Normal? Independent? Sample size adequate?

⚡ EXAM TRAP — IGNORING ASSUMPTIONS

Every test has assumptions (normality, independence, equal variance, etc.). Apply a t-test to grossly skewed data with n=8 and you're producing nonsense, not statistics. Always state and check.

⚡ FINAL EXAM TRAP — STATISTICAL vs PRACTICAL SIGNIFICANCE

With huge n, even tiny effects become 'statistically significant' (p < 0.05). Always also report effect size — does the difference matter in real life?

STATS 101 · Comprehensive Cram Sheet · Ultra-Dense A4
For exam prep only · Check your professor's formula sheet rules · asksia.ai/library