AskSia - STAT7055-Introductory Statistics for Business and Finance Study Notes & Practice | The Australian National University

I've received your latest file upload, but there was no message or question included.

How can I assist you with your uploaded file or is there something specific you would like to know or analyze?
Feel free to provide instructions or ask any questions related to this document or any other topic!

Tip:
If you want me to remember specific formulas, summaries, or exam points for the future, just let me know—I can save them for your upcoming study sessions!

summarize_document

Summary of Statistical Inference Concepts

This document outlines key concepts in statistical inference, focusing on hypothesis testing and estimation for population parameters.

1. Hypothesis Testing

Hypothesis testing involves making a claim about a population parameter and using sample data to determine if the claim is supported.

Null Hypothesis (H₀): A default claim about a population parameter, typically including an equality sign (=). It is assumed to be true initially.
Alternative Hypothesis (H₁): A claim about the population parameter that the researcher is trying to prove. It contradicts the null hypothesis.
Null Distribution: The sampling distribution of the test statistic under the assumption that H₀ is true.
Rejection Region: The set of values for the test statistic that are considered extreme enough to provide evidence against H₀. If the observed test statistic falls within this region, H₀ is rejected.
Critical Values: Boundaries that define the rejection region.
Decision Rule:
- If the observed test statistic falls in the rejection region, reject H₀ (concluding H₁ is true).
- If the observed test statistic does not fall in the rejection region, fail to reject H₀ (concluding H₀ is likely true).

Errors in Hypothesis Testing

Type I Error: Rejecting H₀ when it is actually true. The probability of this error is denoted by α (alpha), the significance level. A smaller α leads to a more stringent test.
Type II Error: Failing to reject H₀ when it is actually false. The probability of this error is denoted by β (beta).
Power of a Test: The probability of correctly rejecting a false H₀, calculated as 1 - β.

Hypothesis Tests for Population Mean (μ)

When σ² is Known:
- Test Statistic: Z-statistic, which follows a standard normal distribution (N(0, 1)) under H₀.
- Rejection Regions:
 - Two-tailed test (H₁: μ ≠ μ₀): Z > Zα/2 or Z < -Zα/2
 - One-tailed test (H₁: μ > μ₀): Z > Zα
 - One-tailed test (H₁: μ < μ₀): Z < -Zα
When σ² is Unknown:
- Test Statistic: t-statistic, which follows a t-distribution with n-1 degrees of freedom under H₀.
- Rejection Regions: Similar structure to Z-tests but using critical t-values (tα/2, n-1 or tα, n-1).
p-value: The probability of observing a test statistic as extreme as, or more extreme than, the one calculated from the sample, assuming H₀ is true. If p-value ≤ α, reject H₀.

Hypothesis Test for Population Proportion (p)

Test Statistic: Z-statistic, calculated using the hypothesized proportion p₀.
- Z = (p̂ - p₀) / √[p₀(1 - p₀) / n]
Rejection Regions: Similar to Z-tests for the mean.

Testing 2 or More Population Means (ANOVA)

One-Way ANOVA: Compares means of a continuous response variable across levels of one categorical factor.
- H₀: μ₁ = μ₂ = ... = μk
- H₁: Not all population means are equal.
- Sources of Variation:
 - SST (Sum of Squares for Treatment): Variation between sample means.
 - SSE (Sum of Squares for Error): Variation within samples.
 - SS(total): Total variation.
- Test Statistic: F-statistic (F = MST / MSE), which follows an F-distribution.
- Rejection Region: F > Fα, k-1, n-k.
Two-Way ANOVA: Analyzes the effect of two categorical factors and their interaction on a continuous response variable.
- Interaction: The effect of one factor depends on the level of the other factor.
- Hypotheses: Tests for interaction effect, main effect of Factor A, and main effect of Factor B.
- Test Statistics: F-statistics for interaction and main effects.

Inference for Categorical Data

Chi-Square Goodness-of-Fit Test: Tests if population proportions for a single categorical variable match hypothesized values.
- H₀: P₁ = P₂ = ... = Pk = 1/k (or specified proportions)
- H₁: Not all population proportions are equal.
- Test Statistic: χ² statistic, calculated using observed (fᵢ) and expected (eᵢ) counts.
- χ² = Σ [(fᵢ - eᵢ)² / eᵢ]
- Degrees of freedom: k - 1.
- Rejection Region: χ² > χ²α, k-1.
Chi-Square Test for Independence: Tests if two categorical variables are independent using a contingency table.
- H₀: The two variables are independent.
- H₁: The two variables are dependent.
- Test Statistic: χ² statistic, calculated using observed (fᵢⱼ) and expected (eᵢⱼ) counts in the contingency table.
- χ² = ΣΣ [(fᵢⱼ - eᵢⱼ)² / eᵢⱼ]
- Degrees of freedom: (r - 1)(c - 1), where r and c are the number of categories for each variable.
- Rejection Region: χ² > χ²α, (r-1)(c-1).

2. Simple Linear Regression

Models the linear relationship between one continuous independent variable (X) and one continuous dependent variable (Y).

Model: Yᵢ = β₀ + β₁Xᵢ + εᵢ
- β₀: True y-intercept.
- β₁: True slope.
- εᵢ: Random error component, assumed to be normally distributed with mean 0 and constant variance σ².
Estimated Regression Line: Ŷᵢ = b₀ + b₁Xᵢ
- b₀ and b₁ are estimates of β₀ and β₁.
- Method of Least Squares: Minimizes the sum of squared residuals (eᵢ = Yᵢ - Ŷᵢ).
Assumptions of Errors (εᵢ):
- Normality: εᵢ ~ N(0, σ²)
- Constant Variance (Homoscedasticity): σ² is constant for all X.
- Independence: Errors are independent of each other.
Model Significance:
- Overall Significance Test (F-test): Tests if β₁ = 0.
- Coefficient Significance Test (t-test): Tests if β₁ ≠ 0.
Coefficient of Determination (R²): The proportion of the total variation in Y that is explained by the model. R² = SSR / SS(total) = 1 - (SSE / SS(total)).
Confidence and Prediction Intervals: Used to estimate the expected value of Y or predict a specific value of Y for a given X. Prediction intervals are wider than confidence intervals due to greater uncertainty.

3. Multiple Linear Regression

Extends simple linear regression to model the relationship between a dependent variable (Y) and two or more independent variables (X₁, X₂, ..., Xk).

Model: Yᵢ = β₀ + β₁X₁ᵢ + β₂X₂ᵢ + ... + βkXkᵢ + εᵢ
- βⱼ represents the expected change in Y for a one-unit increase in Xⱼ, holding all other independent variables constant.
Assumptions: Similar to simple linear regression, including normality, constant variance, and independence of errors.
Model Selection: Crucial to avoid overfitting and multicollinearity.
Overall Significance Test (F-test): Tests H₀: β₁ = β₂ = ... = βk = 0.
Coefficient Significance Test (t-test): Tests H₀: βⱼ = 0 for individual coefficients.
R² and Adjusted R²: Adjusted R² penalizes the addition of unnecessary variables.
Multicollinearity: High correlation between independent variables, leading to unstable coefficient estimates and large standard errors.
Categorical Independent Variables: Coded using indicator (dummy) variables. This allows for varying intercepts and/or slopes based on the categories.

4. Descriptive Statistics

Summarizes data using measures and visualizations.

Categorical Data: Nominal (no order) and Ordinal (ordered). Visualized with Bar Charts and Pie Charts.
Numerical Data: Discrete (countable) and Continuous (uncountable). Visualized with Histograms and Boxplots.
Measures:
- Central Tendency: Mean, Median (middle observation), Mode (most frequent).
- Variability: Variance (s²), Standard Deviation (s), Coefficient of Variation, Interquartile Range (IQR).
- Relationship: Covariance, Correlation Coefficient (r).

5. Probability Concepts

Sample Space (S): Set of all possible outcomes.
Event: A subset of the sample space.
Probability: A measure of the likelihood of an event occurring, ranging from 0 to 1.
- Approaches: Classical, Relative Frequency, Subjective.
Rules: Addition Rule, Multiplication Rule, Conditional Probability, Law of Total Probability, Complement Rule.
Independence: Two events A and B are independent if P(A ∩ B) = P(A) * P(B).

6. Random Variables and Distributions

Random Variable: A function assigning a numerical value to each outcome in a sample space.
- Discrete: Countable values (e.g., number of successes).
- Continuous: Uncountable values (e.g., height, temperature).
Probability Distribution: Lists possible values and their probabilities.
- Discrete: Probability Mass Function (PMF).
- Continuous: Probability Density Function (PDF). Probabilities are calculated as areas under the PDF curve. P(X=x) = 0 for continuous variables.
Expected Value (E[X] or μ): The mean of the distribution.
Variance (V[X] or σ²): Measures the spread of the distribution.
Special Distributions:
- Binomial: Number of successes in a fixed number of independent Bernoulli trials.
- Uniform: Equal probability over a given range.
- Normal: Bell-shaped, symmetric distribution (μ, σ²). Key for the Central Limit Theorem.
- Chi-Square: Used in hypothesis testing for categorical data.
- Exponential: Describes waiting times.
Bivariate Distribution: Describes the joint probability of two random variables.

7. Sampling Distributions

The probability distribution of a sample statistic (e.g., sample mean, sample proportion).
Central Limit Theorem (CLT): States that the sampling distribution of the sample mean (X̄) approaches a normal distribution with mean μ and variance σ²/n, regardless of the population distribution, as the sample size (n) becomes large.
Sampling Distribution of Sample Proportion (p̂): Also approaches a normal distribution (mean p, variance p(1-p)/n) for large sample sizes (np > 5 and n(1-p) > 5).

8. Statistical Estimation

Point Estimator: A single value calculated from sample data to estimate an unknown population parameter (e.g., sample mean X̄ estimates population mean μ).
- Properties: Unbiasedness (E[θ̂] = θ), Variance (V[θ̂]), Mean Squared Error (MSE = V[θ̂] + Bias²), Consistency (MSE → 0 as n → ∞).
- Relative Efficiency: Compares the variances of unbiased estimators.
Interval Estimator (Confidence Interval): A range of values likely to contain the unknown population parameter.
- Confidence Level (1-α): The probability that the interval contains the true parameter.
- Construction: Uses the sampling distribution (often via CLT) and critical values (e.g., Zα/2).
- Factors Affecting Width: Population variance, sample size, confidence level. Wider intervals indicate less precision.

This document provides a comprehensive overview of statistical concepts and methods, covering descriptive statistics, probability, random variables, distributions, sampling, estimation, hypothesis testing, and regression analysis.

Week 1: Introduction and Descriptive Statistics

Types of Data:
- Discrete: Measured in fixed increments.
- Continuous: Measured in infinitely small increments.
Population vs. Sample:
- Population: Characterized by parameters (unknown values).
- Sample: Characterized by statistics.
Measures of Relative Standing:
- Quantiles: Values that divide a dataset into ordered parts (e.g., quartiles).
Measures of Variability:
- Range: Largest Value - Smallest Value.
- Interquartile Range (IQR): Q3 - Q1.
- Coefficient of Variation: Measures relative variability.
Covariance: Measures the joint variability of two random variables.
- Population Covariance: $\frac{\sum_{i=1}^{N} (X_i - \mu_X)(Y_i - \mu_Y)}{N}$
Outlier Detection (using IQR):
- Lines extend from Q1 to $\min{\text{max}, Q3 + 1.5 \times \text{IQR}}$ and from Q3 to $\max{\text{min}, Q1 - 1.5 \times \text{IQR}}$.
Probability Fundamentals:
- Random Experiment: A process with uncertain outcomes.
- Outcomes (O): Possible results of an experiment.
- Sample Space (S): The set of all possible outcomes ${O_1, O_2, O_3, ...}$.
- Probabilities of Outcomes (P(Oᵢ)):
  - $0 \le P(O_i) \le 1$ for all $i$.
  - Sum of probabilities of all outcomes equals 1.
- Mutually Exclusive & Exhaustive: Events that cannot occur simultaneously and cover all possibilities.
- Complement (Aᶜ): The event that A does not occur.
- Marginal Probability (P(A)): The probability of event A.
- Law of Total Probability: Used to calculate the probability of an event using a partition of the sample space.
- Conditional Probability (P(A|B)): The probability of A given B has occurred.
  - $P(A \cap B) = P(A|B) \times P(B) = P(B|A) \times P(A)$
- Addition Rule: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
- Correlation Coefficient: Measures the linear relationship between two variables.
- Joint Probability (P(A ∩ B)): The probability of both A and B occurring.
Events: A collection of one or more simple events.
- $P(A) = \frac{\text{Number of simple events in A}}{\text{Number of simple events in S}}$
Types of Data Scales:
- Nominal: Categories with no inherent order or relationship.
- Ordinal: Categories with a distinct ordering.

Weeks 3 & 4: Random Variables and Discrete/Continuous Probability Distributions

Probability Distribution Function (PDF):
- For Discrete Variables: $P(X=x) = p(x)$
  - $0 \le p(x) \le 1$ for all $x$.
  - $\sum_{all\ x} p(x) = 1$.
- For Continuous Variables: $f(x)$ (Probability Density Function)
  - $f(x) \ge 0$ for all $x$.
  - $\int_{-\infty}^{\infty} f(x) dx = 1$.
Expected Value (Mean):
- Discrete: $\mu = E(X) = \sum_{all\ x} x \cdot p(x)$
- Continuous: $\mu = E(X) = \int_{-\infty}^{\infty} x \cdot f(x) dx$
- Expected Value of a function of X: $E(g(X)) = \sum_{all\ x} g(x) \cdot p(x)$ (discrete) or $\int_{-\infty}^{\infty} g(x) \cdot f(x) dx$ (continuous).
Variance:
- Discrete: $\sigma^2 = V(X) = E((X-\mu)^2) = \sum_{all\ x} (x-\mu)^2 \cdot p(x)$
- Continuous: $\sigma^2 = V(X) = E((X-\mu)^2) = \int_{-\infty}^{\infty} (x-\mu)^2 \cdot f(x) dx$
Bivariate Distributions: Involve two random variables (X, Y).
- Joint Probability Mass Function (Discrete): $P(X=x, Y=y) = p(x, y)$
- Joint Probability Density Function (Continuous): $f(x, y)$
- Marginal Probability (Discrete): $p(x) = P(X=x) = \sum_{all\ y} p(x, y)$
- Marginal Probability Density Function (Continuous): $f_X(x) = \int_{-\infty}^{\infty} f(x, y) dy$
Covariance (Bivariate):
- $\sigma_{XY} = \text{Cov}(X,Y) = E((X-\mu_X)(Y-\mu_Y)) = \sum_{all\ x} \sum_{all\ y} (x-\mu_X)(y-\mu_Y) \cdot p(x, y)$ (discrete)
- Also, $\text{Cov}(X,Y) = E(XY) - E(X)E(Y)$.
Independence: Two random variables X and Y are independent if $p(x, y) = p_X(x) \cdot p_Y(y)$ (discrete) or $f(x, y) = f_X(x) \cdot f_Y(y)$ (continuous).
Laws of Expectation and Variance:
- $E(XY) = E(X)E(Y)$ if X and Y are independent.
- $V(c) = 0$, $V(cX) = c^2V(X)$, $V(X+c) = V(X)$.
- $V(X+Y) = V(X) + V(Y)$ if X and Y are independent.
- $V(aX + bY) = a^2V(X) + b^2V(Y) + 2ab\text{Cov}(X,Y)$.
Common Distributions:
- Binomial Distribution ($X \sim \text{Bin}(n, p)$): For a fixed number of independent trials with two outcomes.
  - $P(X=k) = \binom{n}{k} p^k (1-p)^{n-k}$
  - $E(X) = np$, $V(X) = np(1-p)$.
- Normal Distribution ($X \sim N(\mu, \sigma^2)$): Bell-shaped, symmetric distribution.

Week 5: Sampling Distributions & Week 6: Estimation

Sampling Distribution: The probability distribution of a statistic (e.g., sample mean).
Central Limit Theorem (CLT): For a sufficiently large sample size ($n$), the sampling distribution of the sample mean ($\bar{X}$) is approximately normal, regardless of the population distribution.
- Conditions for CLT approximation:
  1. If population is normal, $\bar{X} \sim N(\mu, \sigma^2/n)$ for any $n$.
  2. If population is not normal but $n \ge 20$, $\bar{X}$ is close to normal.
  3. If population is far from normal, $n > 50$ is recommended.
De Moivre-Laplace Theorem: Approximates the Binomial distribution with a Normal distribution when $np \ge 5$ and $n(1-p) \ge 5$.
Properties of Estimators:
- Unbiasedness: $B(\hat{\theta}) = E(\hat{\theta}) - \theta = 0 \implies E(\hat{\theta}) = \theta$.
- Consistency: Mean Squared Error (MSE) approaches 0 as sample size $n$ increases. $MSE(\hat{\theta}) = E((\hat{\theta} - \theta)^2) = V(\hat{\theta}) + (B(\hat{\theta}))^2$.
- Relative Efficiency: Compares the variances of two unbiased estimators: $\text{eff}(\hat{\theta}_1, \hat{\theta}_2) = V(\hat{\theta}_2) / V(\hat{\theta}_1)$.
Interval Estimators (when $\sigma$ is known):
- Confidence Interval for $\mu$: $\bar{X} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
- Confidence Levels: 90%, 95%, 99% correspond to specific $z_{\alpha/2}$ values.
Uniform Distribution ($X \sim U(a, b)$):
- $f(x) = \frac{1}{b-a}$ for $a \le x \le b$, and 0 otherwise.
- $P(a \le X \le b) = \int_a^b f(x) dx = 1$.

Week 7: Hypothesis Testing

Testing Hypotheses about $\mu$:
- When $\sigma$ is known: Use Z-test.
 - $H_0: \mu = \mu_0$
 - Test Statistic: $Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$
 - Rejection Region: $Z > z_{\alpha/2}$ or $Z < -z_{\alpha/2}$ (two-tailed).
- When $\sigma$ is unknown: Use t-test.
 - $H_0: \mu = \mu_0$
 - Test Statistic: $t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}}$ with $n-1$ degrees of freedom.
 - Rejection Region: $t > t_{\alpha/2, n-1}$ or $t < -t_{\alpha/2, n-1}$ (two-tailed).
Errors in Hypothesis Testing:
- Type I Error: Rejecting $H_0$ when it is true ($\alpha$).
- Type II Error: Failing to reject $H_0$ when it is false ($\beta$).
Significance Level ($\alpha$): Probability of Type I error.
Power of the Test: $1 - \beta$.
Calculating Probability of a Type II Error ($\beta$):
1. Determine the rejection region based on $\alpha$ and $H_0$.
2. Work backward to find the rejection region in terms of the unstandardized test statistic.
3. Calculate the probability of not rejecting $H_0$ under the alternative hypothesis ($H_1$) by re-standardizing using the true parameter value from $H_1$.

Week 8: Comparing Two Populations

Inferences about:
- Difference in population means ($\mu_1 - \mu_2$).
- Difference in population proportions ($p_1 - p_2$).
Comparing Two Means:
- Independent Samples:
  - $\sigma_1, \sigma_2$ known: Z-statistic.
  - $\sigma_1, \sigma_2$ unknown, $\sigma_1^2 = \sigma_2^2$: Pooled t-statistic.
  - $\sigma_1, \sigma_2$ unknown, $\sigma_1^2 \ne \sigma_2^2$: Welch's t-statistic (often referred to as "penta kill" in the notes, likely a colloquial term).
- Paired Samples: Treat differences as one sample (t-statistic).
Testing $\mu_1 - \mu_2$ (known $\sigma_1, \sigma_2$):
- $H_0: \mu_1 - \mu_2 = D_0$
- Test Statistic: $Z = \frac{(\bar{X}_1 - \bar{X}_2) - D_0}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$
- Rejection Region depends on $H_1$ (two-tailed, upper-tailed, lower-tailed).
Testing $\mu_1 - \mu_2$ with Paired Samples:
- Let $D_i = X_{1i} - X_{2i}$. Test $H_0: \mu_D = D_0$.
- Test Statistic: $t = \frac{\bar{D} - D_0}{s_D/\sqrt{n}}$ with $n-1$ degrees of freedom.
Comparing Two Proportions ($p_1 - p_2$):
- Independent Samples:
  - Test Statistic (for $H_0: p_1 - p_2 = D_0$): $Z = \frac{(\hat{p}_1 - \hat{p}_2) - D_0}{\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}}$
  - If $D_0 = 0$, use pooled proportion $\hat{p} = \frac{X_1+X_2}{n_1+n_2}$.
Testing $\mu_1 - \mu_2$ (unknown $\sigma_1, \sigma_2$):
- $H_0: \mu_1 - \mu_2 = D_0$
- Test Statistic (pooled variance): $t = \frac{(\bar{X}_1 - \bar{X}_2) - D_0}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$ where $s_p^2$ is the pooled sample variance.
- Rejection Region: $t > t_{\alpha/2, n_1+n_2-2}$ or $t < -t_{\alpha/2, n_1+n_2-2}$ (two-tailed).

Week 9: Analysis of Variance (ANOVA)

Purpose: To compare means of three or more groups.
Hypotheses:
- $H_0$: All population means are equal ($\mu_1 = \mu_2 = ... = \mu_k$).
- $H_1$: At least two population means differ.
Test Statistic: F-statistic (from ANOVA table).
ANOVA Table Components:
- SS (Total): Sum of squares for the total variation.
- SST: Sum of squares between groups (treatment).
- SSE: Sum of squares within groups (error).
- $SS(\text{Total}) = SST + SSE$.
- For two-way ANOVA: SSA, SSB, SSAB (interaction), SSE.
Rejection Region: $F > F_{\alpha, k-1, n-k}$ (for one-way ANOVA).
Interaction Effects: Tested first in multi-way ANOVA.
- $H_0$: No interaction between factors.
- $H_1$: Interaction exists.
- Rejection Region: $F_{AB} > F_{\alpha, (a-1)(b-1), n-ab}$.

Week 10: Chi-Squared Tests

Chi-Squared Goodness-of-Fit Test:
- Used for one categorical variable with $k$ categories.
- Hypotheses:
  - $H_0: p_1 = c_1, p_2 = c_2, ..., p_k = c_k$ (population proportions match specified values).
  - $H_1$: The population proportions do not match.
- Test Statistic: $\chi^2 = \sum_{i=1}^k \frac{(f_i - e_i)^2}{e_i}$, where $f_i$ is observed count and $e_i$ is expected count.
Chi-Squared Test of a Contingency Table:
- Used to determine if two categorical variables (with $r$ rows and $c$ columns) are independent.
- Hypotheses:
  - $H_0$: The variables are independent.
  - $H_1$: The variables are not independent.
- Test Statistic: $\chi^2 = \sum_{i=1}^r \sum_{j=1}^c \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$, where $O_{ij}$ is observed count and $E_{ij} = \frac{(\text{ith row total}) \times (\text{jth column total})}{(\text{grand total})}$.
Hypothesis Testing for Proportion (p):
- $H_0: p = p_0$
- $H_1: p \ne p_0$ (or $p > p_0$, $p < p_0$)
- Test Statistic: $Z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}}$
- Rejection Region: Based on Z-values ($z_{\alpha/2}$, $z_{\alpha}$).

Weeks 11 & 12: Simple and Multiple Linear Regression

Simple Linear Regression Model: $Y = \beta_0 + \beta_1 X + \epsilon$, where $\epsilon \sim N(0, \sigma^2)$.
- $\beta_0$: Intercept.
- $\beta_1$: Slope.
- $\epsilon$: Error term.
Estimating the Model (Least Squares):
- $\hat{\beta}1 = \frac{S{XY}}{S_{XX}}$, where $S_{XY} = \sum (X_i - \bar{X})(Y_i - \bar{Y})$ and $S_{XX} = \sum (X_i - \bar{X})^2$.
- $\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$.
- $E(\hat{\beta}_1) = \beta_1$ (unbiased).
Measures of Fit:
- Coefficient of Determination ($R^2$): Proportion of variance in Y explained by X. $R^2 = \frac{SSR}{SS(\text{Total})} = 1 - \frac{SSE}{SS(\text{Total})}$.
- $SS(\text{Total}) = \sum (Y_i - \bar{Y})^2 = (n-1)s_Y^2$.
- $SSE = \sum (Y_i - \hat{Y}_i)^2$.
Inference for Regression Coefficients:
- Test Statistic for $\beta_1$: $t = \frac{\hat{\beta}_1 - \beta_1}{SE(\hat{\beta}_1)}$ with $n-2$ degrees of freedom.
- Rejection Region: $t > t_{\alpha/2, n-2}$ or $t < -t_{\alpha/2, n-2}$ (two-tailed).
Prediction:
- Point Estimate for a particular Y: $\hat{Y}_g = \hat{\beta}_0 + \hat{\beta}_1 X_g$.
- Confidence Interval for $E(Y | X=X_g)$: $\hat{Y}g \pm t{\alpha/2, n-2} \times SE(\hat{Y}_g)$.
Multiple Linear Regression Model: $Y = \beta_0 + \beta_1 X_1 + ... + \beta_k X_k + \epsilon$.
- Assumptions: $\epsilon_i \sim N(0, \sigma^2)$ i.i.d.
- $E(Y) = \beta_0 + \beta_1 X_1 + ... + \beta_k X_k$.
- $V(Y) = \sigma^2$.
Assessing the Model:
- Overall Test (ANOVA F-test):
 - $H_0: \beta_1 = \beta_2 = ... = \beta_k = 0$.
 - $H_1$: Not all coefficients are zero.
 - Test Statistic: F-statistic from ANOVA table.
 - Rejection Region: $F > F_{\alpha, k, n-k-1}$.
- Individual Coefficient Tests (t-tests):
 - $H_0: \beta_j = 0$.
 - $H_1: \beta_j \ne 0$.
 - Test Statistic: $t = \frac{\hat{\beta}_j}{SE(\hat{\beta}_j)}$ with $n-k-1$ degrees of freedom.
 - Rejection Region: $t > t_{\alpha/2, n-k-1}$ or $t < -t_{\alpha/2, n-k-1}$.
Adjusted $R^2$: $1 - \frac{SSE/(n-k-1)}{SS(\text{Total})/(n-1)}$.
Categorical Independent Variables (Dummy Variables): Used to include qualitative predictors in the model.
Multicollinearity: Correlation between independent variables can make parameter estimates unstable with large variance, affecting individual coefficient tests but not the overall F-test. Conclusions from individual coefficient tests are conditional on other variables being in the model.

中文讲解

好的！以下是对Notes & Review - STAT7055 Introductory Statistics for Business and Finance.pdf 以及非常重要！公式总结.pdf的中文总结讲解，涵盖主要统计推断知识点，并配合常考公式与方法：

一、描述性统计与概率基础

数据类型：

数值型（离散型、连续型）
分类型（类别型、顺序型）

中心趋势/离散程度

均值（样本均值 $ \bar{x} $，总体均值 $ \mu $）
方差/标准差（$ s^2, \sigma^2 $），变异系数
协方差与相关系数：衡量两个变量的线性关系

概率基础

事件 Event、样本空间 Sample Space
概率性质：$ 0 \leq P(A) \leq 1 $，且所有互斥事件概率和为1
条件概率：$ P(A|B) = \dfrac{P(A \cap B)}{P(B)} $
全概率公式与贝叶斯公式
$$ P(A) = \sum_i P(A|B_i)P(B_i) $$ $$ P(A|B) = \frac{P(B|A),P(A)}{P(B)} $$
两事件独立性：$ P(A \cap B) = P(A)P(B) $

二、随机变量与分布

离散型分布

二项分布 $ X \sim Bin(n, p) $: $ P(X=k) = \binom{n}{k}p^k(1-p)^{n-k} $

连续型分布

正态分布 $ X \sim N(\mu, \sigma^2) $
$ Z $ 标准化公式：$ Z = \frac{X-\mu}{\sigma} $
其它分布：t分布、F分布、$\chi^2$分布、指数分布等

某些性质

$ E(X+Y) = E(X)+E(Y) $
$ Var(aX+bY) = a^2Var(X) + b^2Var(Y) + 2ab,Cov(X, Y) $

三、抽样分布与估计

抽样分布

样本均值 $ \bar{X} $ 的分布：
$$ \bar{X} \sim N(\mu, \frac{\sigma^2}{n}) \text{（大样本时由中心极限定理保证）} $$

点估计与区间估计

点估计：用样本统计量（如 $ \bar{X} $）估计总体参数 $ \mu $
区间估计（置信区间）：

四、假设检验

设定假设
- 原假设 $ H_0 $：带等号
- 备择假设 $ H_1 $：带“≠”、“>”、“<”
计算检验统计量
- 均值检验 $ \sigma $已知用Z，未知用t $$ Z = \frac{\bar{X}-\mu_0}{\sigma/\sqrt{n}} $$ $$ t = \frac{\bar{X}-\mu_0}{s/\sqrt{n}} $$
- 比例检验：$ Z = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}} $[40]Source: Notes & Review - STAT7055 Introductory Statistics for Business and Finance.pdf· T-distribution The null distribution of T-statistic with n-1 degrees of freedom - Symmetric = 0, mound shaped - 1 Degree of freedom (n-1) > approaches to normal distribution - Greater variance than standard normal distribution · Rejection Region T-table: (n-1) degree of freedom & significant level a T < - ta ___ or T > tan En - 1 T <- ta,n-1 Η1: μ ζ μο One-Tailed Test Η1: μ = μο Two-Tailed Test Η1: μ > μο One-Tailed Test T > ta,n-1 Hypothesis Test for p Hypothesis Ho: p = po H1: p (> # <) po Test Statistic (assume Ho is true) Z = PO (1 - PO ) p-Po n (p == ) X n p(1-p) n ,p + Za p(1-p) Confidence interval (p - Za n Decision rule Under H0, Z has N(0, 1) distribution Z > Za or Z < - Za 2 2 HI: p # po Two-Tailed Test Z < - Za Hi: p < po Z > Za HI: p > po
- 方差、相关等其它检验
拒绝域/显著性水平
- 判断统计量是否落入拒绝域（由 $\alpha$ 决定），比如双尾检验：$ |Z| > z_{\alpha/2} $[@ref_20 @ref_21]
- I型错误（$\alpha$）：拒绝了真的$H_0$
- II型错误（$\beta$）：没有拒绝假的$H_0$
结论
- 计算并对比 p-value，决定是否拒绝 $H_0$

五、方差分析（ANOVA）

单因素方差分析（One-way ANOVA）
- $H_0$：全部总体均值相等
- $H_1$：至少有两个均值不同
- 检验统计量F $$ F = \frac{MSB}{MSW} $$
- 拒绝域：$ F > F_{\alpha, k-1, n-k} $[@ref_16 @ref_28]
双因素ANOVA（包含交互作用）

六、回归分析

简单回归

模型：$ Y = \beta_0 + \beta_1 X + \epsilon $
最小二乘法估计参数
$ R^2 $：决定系数，衡量模型对$Y$的解释能力 $$ R^2 = \frac{SSR}{SST} $$

多元回归

多个自变量，需注意多重共线性（自变量间高度相关）
增加变量会令$R^2$上升，但有时这是虚假的，需要用调整$R^2$

虚拟变量（分类变量）

加入哑变量以处理类别型自变量

七、卡方与定类资料分析

卡方拟合优度检验：检验观测分布是否与理论分布一致 $$ \chi^2 = \sum_{i=1}^k \frac{(f_i - e_i)^2}{e_i} $$
卡方独立性检验：判断两个分类变量是否独立[@ref_10 @ref_11]

八、两总体比较

两均值的区间估计与假设检验（独立样本/配对样本，各自用Z或t）[@ref_35 @ref_36]
两比例的区间估计与假设检验

总结提示

各章节均有典型题型和公式，进行期末复习时建议优先掌握核心概念、基本判别方法与公式推导
注意分类变量/连续变量、检验条件（样本量、方差已知未知）、以及统计推断流程
专业名词最好中英对照，便于考场中快速反应

如需要更具体某章/某类题目的讲解和例题，可以随时补充说明！

温馨提示：如果你需要我记忆某些公式、易错点或题型，可随时告知，我可以长期保留便于复习。

按照week的形式

当然可以！以下是结合STAT7055 Introductory Statistics for Business and Finance与非常重要！公式总结内容，按照每周（week）知识点分主题进行中文总结讲解，并配合重要公式，便于系统复习：

Week 1：介绍与描述性统计 Descriptive Statistics

数据类型
- 定类（名义型nominal、顺序型ordinal）
- 数值型（离散型discrete、连续型continuous）
总体与样本
- 总体特征参数（参数，未知）
- 样本特征统计量（可计算）
常用统计量
- 均值（$\mu$，$\bar{x}$）、方差/标准差（$\sigma^2$，$s^2$）
- 变异系数 $CV = \frac{\sigma}{\mu}$
- 四分位数、IQR（$Q_3 - Q_1$）
协方差与相关系数

Week 2：概率基础 Probability Fundamentals

Week 3 & Week 4: 随机变量与概率分布 Random Variables & Distributions

Week 5 & Week 6：抽样分布与估计 Sampling Distributions & Estimation

Week 7：假设检验 Hypothesis Testing

流程：

Week 8：两总体比较 Two-population inference

两均值比较

Week 9：方差分析 ANOVA

Week 10：卡方检验与分类数据分析 Chi-Squared Test

Week 11 & 12：回归分析 Regression

温馨提示：课程各章节的重点在于理解不同问题下应选用何种统计推断方法、掌握常见分布及其参数、熟练运用公式解决实际问题。
如需某周详细例题解析或难点讲解，欢迎随时提出！

如需记忆公式、重点难点，或下次快速查阅，请告知，我可以为你长期保存！

STAT7055-Introductory Statistics for Business and Finance Study Notes & Practice | The Australian National University | AskSia