STAT7055-Introductory Statistics for Business and Finance Study Notes & Practice | The Australian National University | AskSia
Mar 13, 2026
Summary of Statistical Inference Concepts
This document outlines key concepts in statistical inference, focusing on hypothesis testing and estimation for population parameters.
1. Hypothesis Testing
Hypothesis testing involves making a claim about a population parameter and using sample data to determine if the claim is supported.
- Null Hypothesis (H₀): A default claim about a population parameter, typically including an equality sign (=). It is assumed to be true initially.
- Alternative Hypothesis (H₁): A claim about the population parameter that the researcher is trying to prove. It contradicts the null hypothesis.
- Null Distribution: The sampling distribution of the test statistic under the assumption that H₀ is true.
- Rejection Region: The set of values for the test statistic that are considered extreme enough to provide evidence against H₀. If the observed test statistic falls within this region, H₀ is rejected.
- Critical Values: Boundaries that define the rejection region.
- Decision Rule:
- If the observed test statistic falls in the rejection region, reject H₀ (concluding H₁ is true).
- If the observed test statistic does not fall in the rejection region, fail to reject H₀ (concluding there is insufficient evidence for H₁, not that H₀ is proven true).
Errors in Hypothesis Testing
- Type I Error: Rejecting H₀ when it is actually true. The probability of this error is denoted by α (alpha), the significance level. A smaller α leads to a more stringent test.
- Type II Error: Failing to reject H₀ when it is actually false. The probability of this error is denoted by β (beta).
- Power of a Test: The probability of correctly rejecting a false H₀, calculated as 1 - β.
Hypothesis Tests for Population Mean (μ)
- When σ² is Known:
- Test Statistic: Z-statistic, which follows a standard normal distribution (N(0, 1)) under H₀.
- Rejection Regions:
- Two-tailed test (H₁: μ ≠ μ₀): Z > Z<sub>α/2</sub> or Z < -Z<sub>α/2</sub>
- One-tailed test (H₁: μ > μ₀): Z > Z<sub>α</sub>
- One-tailed test (H₁: μ < μ₀): Z < -Z<sub>α</sub>
- When σ² is Unknown:
- Test Statistic: t-statistic, which follows a t-distribution with n - 1 degrees of freedom under H₀.
- Rejection Regions: Similar structure to Z-tests but using critical t-values (t<sub>α/2, n-1</sub> or t<sub>α, n-1</sub>).
- p-value: The probability of observing a test statistic as extreme as, or more extreme than, the one calculated from the sample, assuming H₀ is true. If p-value ≤ α, reject H₀.
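As a sketch of the Z-test decision rule and p-value above (the sample numbers are hypothetical, not from the course materials), the standard normal CDF can be built from `math.erf`:

```python
import math

def z_test_mean(xbar, mu0, sigma, n, alpha=0.05, tail="two"):
    """One-sample Z-test for a population mean when sigma is known.
    Returns (z statistic, p-value, reject H0?)."""
    phi = lambda x: 0.5 * (1 + math.erf(x / math.sqrt(2)))  # standard normal CDF
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    if tail == "two":          # H1: mu != mu0
        p = 2 * (1 - phi(abs(z)))
    elif tail == "upper":      # H1: mu > mu0
        p = 1 - phi(z)
    else:                      # H1: mu < mu0
        p = phi(z)
    return z, p, p <= alpha

# Hypothetical sample: xbar = 52, mu0 = 50, sigma = 8, n = 36
z, p, reject = z_test_mean(52, 50, 8, 36)   # z = 1.5, p ≈ 0.134, do not reject
```

The same function structure applies to the unknown-σ case by swapping in `s` and the t-distribution's CDF.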
Hypothesis Test for Population Proportion (p)
- Test Statistic: Z-statistic, calculated using the hypothesized proportion p₀.
- Z = (p̂ - p₀) / √[p₀(1 - p₀) / n]
- Rejection Regions: Similar to Z-tests for the mean.
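A minimal numeric sketch of the proportion test above, with hypothetical counts (124 successes in 200 trials, testing H₀: p = 0.5):

```python
import math

# Hypothetical sample: 124 successes in n = 200 trials, testing H0: p = 0.5
n, x, p0 = 200, 124, 0.5
p_hat = x / n                                        # sample proportion 0.62
z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)      # ≈ 3.39
# |z| > 1.96 = z_{0.025}, so H0 is rejected at the 5% level (two-tailed)
```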
Testing 2 or More Population Means (ANOVA)
- One-Way ANOVA: Compares means of a continuous response variable across levels of one categorical factor.
- H₀: μ₁ = μ₂ = ... = μ<sub>k</sub>
- H₁: Not all population means are equal.
- Sources of Variation:
- SST (Sum of Squares for Treatment): Variation between sample means.
- SSE (Sum of Squares for Error): Variation within samples.
- SS(total): Total variation.
- Test Statistic: F-statistic (F = MST / MSE), which follows an F-distribution.
- Rejection Region: F > F<sub>α, k-1, n-k</sub>.
- Two-Way ANOVA: Analyzes the effect of two categorical factors and their interaction on a continuous response variable.
- Interaction: The effect of one factor depends on the level of the other factor.
- Hypotheses: Tests for interaction effect, main effect of Factor A, and main effect of Factor B.
- Test Statistics: F-statistics for interaction and main effects.
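The one-way F-statistic (F = MST / MSE) can be computed by hand from the sum-of-squares decomposition; the three small treatment samples below are hypothetical:

```python
# One-way ANOVA from the decomposition SS(total) = SST + SSE,
# using hypothetical samples for k = 3 treatments
groups = [[23, 25, 27], [30, 31, 29], [22, 24, 23]]
k = len(groups)
n = sum(len(g) for g in groups)
grand_mean = sum(sum(g) for g in groups) / n
sst = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)  # between
sse = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)         # within
mst, mse = sst / (k - 1), sse / (n - k)
f_stat = mst / mse   # compare with the critical value F(alpha, k-1, n-k)
```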
Inference for Categorical Data
- Chi-Square Goodness-of-Fit Test: Tests if population proportions for a single categorical variable match hypothesized values.
- H₀: p₁ = p₂ = ... = p<sub>k</sub> = 1/k (or other specified proportions)
- H₁: At least one population proportion differs from its hypothesized value.
- Test Statistic: χ² statistic, calculated using observed (fᵢ) and expected (eᵢ) counts.
- χ² = Σ [(fᵢ - eᵢ)² / eᵢ]
- Degrees of freedom: k - 1.
- Rejection Region: χ² > χ²<sub>α, k-1</sub>.
- Chi-Square Test for Independence: Tests if two categorical variables are independent using a contingency table.
- H₀: The two variables are independent.
- H₁: The two variables are dependent.
- Test Statistic: χ² statistic, calculated using observed (fᵢⱼ) and expected (eᵢⱼ) counts in the contingency table.
- χ² = ΣΣ [(fᵢⱼ - eᵢⱼ)² / eᵢⱼ]
- Degrees of freedom: (r - 1)(c - 1), where r and c are the number of categories for each variable.
- Rejection Region: χ² > χ²<sub>α, (r-1)(c-1)</sub>.
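The χ² statistic for independence is straightforward to compute once the expected counts eᵢⱼ = (row total × column total) / grand total are formed; the 2×2 table below is hypothetical:

```python
# Chi-squared test of independence on a hypothetical 2x2 contingency table
obs = [[30, 20], [20, 30]]
row_tot = [sum(row) for row in obs]
col_tot = [sum(col) for col in zip(*obs)]
grand = sum(row_tot)
chi2 = sum((obs[i][j] - row_tot[i] * col_tot[j] / grand) ** 2
           / (row_tot[i] * col_tot[j] / grand)
           for i in range(len(obs)) for j in range(len(obs[0])))
df = (len(obs) - 1) * (len(obs[0]) - 1)   # (r-1)(c-1) = 1
# chi2 = 4.0 > 3.841 = chi2_{0.05, 1}, so independence is rejected at the 5% level
```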
2. Simple Linear Regression
Models the linear relationship between one continuous independent variable (X) and one continuous dependent variable (Y).
- Model: Yᵢ = β₀ + β₁Xᵢ + εᵢ
- β₀: True y-intercept.
- β₁: True slope.
- εᵢ: Random error component, assumed to be normally distributed with mean 0 and constant variance σ².
- Estimated Regression Line: Ŷᵢ = b₀ + b₁Xᵢ
- b₀ and b₁ are estimates of β₀ and β₁.
- Method of Least Squares: Minimizes the sum of squared residuals (eᵢ = Yᵢ - Ŷᵢ).
- Assumptions of Errors (εᵢ):
- Normality: εᵢ ~ N(0, σ²)
- Constant Variance (Homoscedasticity): σ² is constant for all X.
- Independence: Errors are independent of each other.
- Model Significance:
- Overall Significance Test (F-test): Tests H₀: β₁ = 0; in simple regression this is equivalent to the t-test on the slope.
- Coefficient Significance Test (t-test): Tests H₀: β₁ = 0 against H₁: β₁ ≠ 0.
- Coefficient of Determination (R²): The proportion of the total variation in Y that is explained by the model. R² = SSR / SS(total) = 1 - (SSE / SS(total)).
- Confidence and Prediction Intervals: Used to estimate the expected value of Y or predict a specific value of Y for a given X. Prediction intervals are wider than confidence intervals due to greater uncertainty.
3. Multiple Linear Regression
Extends simple linear regression to model the relationship between a dependent variable (Y) and two or more independent variables (X₁, X₂, ..., X<sub>k</sub>).
- Model: Yᵢ = β₀ + β₁X₁ᵢ + β₂X₂ᵢ + ... + β<sub>k</sub>X<sub>kᵢ</sub> + εᵢ
- βⱼ represents the expected change in Y for a one-unit increase in Xⱼ, holding all other independent variables constant.
- Assumptions: Similar to simple linear regression, including normality, constant variance, and independence of errors.
- Model Selection: Crucial to avoid overfitting and multicollinearity.
- Overall Significance Test (F-test): Tests H₀: β₁ = β₂ = ... = β<sub>k</sub> = 0.
- Coefficient Significance Test (t-test): Tests H₀: βⱼ = 0 for individual coefficients.
- R² and Adjusted R²: Adjusted R² penalizes the addition of unnecessary variables.
- Multicollinearity: High correlation between independent variables, leading to unstable coefficient estimates and large standard errors.
- Categorical Independent Variables: Coded using indicator (dummy) variables. This allows for varying intercepts and/or slopes based on the categories.
4. Descriptive Statistics
Summarizes data using measures and visualizations.
- Categorical Data: Nominal (no order) and Ordinal (ordered). Visualized with Bar Charts and Pie Charts.
- Numerical Data: Discrete (countable) and Continuous (uncountable). Visualized with Histograms and Boxplots.
- Measures:
- Central Tendency: Mean, Median (middle observation), Mode (most frequent).
- Variability: Variance (s²), Standard Deviation (s), Coefficient of Variation, Interquartile Range (IQR).
- Relationship: Covariance, Correlation Coefficient (r).
5. Probability Concepts
- Sample Space (S): Set of all possible outcomes.
- Event: A subset of the sample space.
- Probability: A measure of the likelihood of an event occurring, ranging from 0 to 1.
- Approaches: Classical, Relative Frequency, Subjective.
- Rules: Addition Rule, Multiplication Rule, Conditional Probability, Law of Total Probability, Complement Rule.
- Independence: Two events A and B are independent if P(A ∩ B) = P(A) * P(B).
6. Random Variables and Distributions
- Random Variable: A function assigning a numerical value to each outcome in a sample space.
- Discrete: Countable values (e.g., number of successes).
- Continuous: Uncountable values (e.g., height, temperature).
- Probability Distribution: Lists possible values and their probabilities.
- Discrete: Probability Mass Function (PMF).
- Continuous: Probability Density Function (PDF). Probabilities are calculated as areas under the PDF curve. P(X=x) = 0 for continuous variables.
- Expected Value (E[X] or μ): The mean of the distribution.
- Variance (V[X] or σ²): Measures the spread of the distribution.
- Special Distributions:
- Binomial: Number of successes in a fixed number of independent Bernoulli trials.
- Uniform: Equal probability over a given range.
- Normal: Bell-shaped, symmetric distribution (μ, σ²). Key for the Central Limit Theorem.
- Chi-Square: Used in hypothesis testing for categorical data.
- Exponential: Describes waiting times.
- Bivariate Distribution: Describes the joint probability of two random variables.
7. Sampling Distributions
- The probability distribution of a sample statistic (e.g., sample mean, sample proportion).
- Central Limit Theorem (CLT): States that the sampling distribution of the sample mean (X̄) approaches a normal distribution with mean μ and variance σ²/n, regardless of the population distribution, as the sample size (n) becomes large.
- Sampling Distribution of Sample Proportion (p̂): Also approaches a normal distribution (mean p, variance p(1-p)/n) for large sample sizes (np > 5 and n(1-p) > 5).
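The CLT claim can be checked by simulation: draw many samples from a skewed population and look at the mean and spread of the sample means. A small sketch (the population, sample size, and repetition count are arbitrary choices):

```python
import random

random.seed(7)
# Sampling distribution of the mean from a skewed population: Exponential(1),
# which has mu = 1 and sigma^2 = 1. By the CLT the simulated sample means
# should centre on mu = 1 with variance close to sigma^2 / n = 1/40 = 0.025.
n, reps = 40, 2000
means = [sum(random.expovariate(1.0) for _ in range(n)) / n for _ in range(reps)]
centre = sum(means) / reps                              # ≈ 1.0
spread = sum((m - centre) ** 2 for m in means) / (reps - 1)   # ≈ 0.025
```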
8. Statistical Estimation
- Point Estimator: A single value calculated from sample data to estimate an unknown population parameter (e.g., sample mean X̄ estimates population mean μ).
- Properties: Unbiasedness (E[θ̂] = θ), Variance (V[θ̂]), Mean Squared Error (MSE = V[θ̂] + Bias²), Consistency (MSE → 0 as n → ∞).
- Relative Efficiency: Compares the variances of unbiased estimators.
- Interval Estimator (Confidence Interval): A range of values likely to contain the unknown population parameter.
- Confidence Level (1-α): The probability that the interval contains the true parameter.
- Construction: Uses the sampling distribution (often via CLT) and critical values (e.g., Z<sub>α/2</sub>).
- Factors Affecting Width: Population variance, sample size, confidence level. Wider intervals indicate less precision.
This document provides a comprehensive overview of statistical concepts and methods, covering descriptive statistics, probability, random variables, distributions, sampling, estimation, hypothesis testing, and regression analysis.
Week 1: Introduction and Descriptive Statistics
- Types of Data:
- Discrete: Measured in fixed increments.
- Continuous: Measured in infinitely small increments.
- Population vs. Sample:
- Population: Characterized by parameters (unknown values).
- Sample: Characterized by statistics.
- Measures of Relative Standing:
- Quantiles: Values that divide a dataset into ordered parts (e.g., quartiles).
- Measures of Variability:
- Range: Largest Value - Smallest Value.
- Interquartile Range (IQR): Q3 - Q1.
- Coefficient of Variation: Measures relative variability.
- Covariance: Measures the joint variability of two random variables.
- Population Covariance: $\frac{\sum_{i=1}^{N} (X_i - \mu_X)(Y_i - \mu_Y)}{N}$
- Outlier Detection (using IQR):
- Whiskers extend from Q1 down to $\max\{\text{min}, Q1 - 1.5 \times \text{IQR}\}$ and from Q3 up to $\min\{\text{max}, Q3 + 1.5 \times \text{IQR}\}$; observations beyond these limits are flagged as outliers.
- Probability Fundamentals:
- Random Experiment: A process with uncertain outcomes.
- Outcomes (O): Possible results of an experiment.
- Sample Space (S): The set of all possible outcomes $\{O_1, O_2, O_3, ...\}$.
- Probabilities of Outcomes (P(Oᵢ)):
- $0 \le P(O_i) \le 1$ for all $i$.
- Sum of probabilities of all outcomes equals 1.
- Mutually Exclusive & Exhaustive: Events that cannot occur simultaneously and cover all possibilities.
- Complement (Aᶜ): The event that A does not occur.
- Marginal Probability (P(A)): The probability of event A.
- Law of Total Probability: Used to calculate the probability of an event using a partition of the sample space.
- Conditional Probability (P(A|B)): The probability of A given B has occurred.
- $P(A \cap B) = P(A|B) \times P(B) = P(B|A) \times P(A)$
- Addition Rule: $P(A \cup B) = P(A) + P(B) - P(A \cap B)$
- Correlation Coefficient: Measures the linear relationship between two variables.
- Joint Probability (P(A ∩ B)): The probability of both A and B occurring.
- Events: A collection of one or more simple events.
- $P(A) = \frac{\text{Number of simple events in A}}{\text{Number of simple events in S}}$
- Types of Data Scales:
- Nominal: Categories with no inherent order or relationship.
- Ordinal: Categories with a distinct ordering.
Weeks 3 & 4: Random Variables and Discrete/Continuous Probability Distributions
- Probability Distribution Function (PDF):
- For Discrete Variables: $P(X=x) = p(x)$
- $0 \le p(x) \le 1$ for all $x$.
- $\sum_{all\ x} p(x) = 1$.
- For Continuous Variables: $f(x)$ (Probability Density Function)
- $f(x) \ge 0$ for all $x$.
- $\int_{-\infty}^{\infty} f(x) dx = 1$.
- Expected Value (Mean):
- Discrete: $\mu = E(X) = \sum_{all\ x} x \cdot p(x)$
- Continuous: $\mu = E(X) = \int_{-\infty}^{\infty} x \cdot f(x) dx$
- Expected Value of a function of X: $E(g(X)) = \sum_{all\ x} g(x) \cdot p(x)$ (discrete) or $\int_{-\infty}^{\infty} g(x) \cdot f(x) dx$ (continuous).
- Variance:
- Discrete: $\sigma^2 = V(X) = E((X-\mu)^2) = \sum_{all\ x} (x-\mu)^2 \cdot p(x)$
- Continuous: $\sigma^2 = V(X) = E((X-\mu)^2) = \int_{-\infty}^{\infty} (x-\mu)^2 \cdot f(x) dx$
- Bivariate Distributions: Involve two random variables (X, Y).
- Joint Probability Mass Function (Discrete): $P(X=x, Y=y) = p(x, y)$
- Joint Probability Density Function (Continuous): $f(x, y)$
- Marginal Probability (Discrete): $p(x) = P(X=x) = \sum_{all\ y} p(x, y)$
- Marginal Probability Density Function (Continuous): $f_X(x) = \int_{-\infty}^{\infty} f(x, y) dy$
- Covariance (Bivariate):
- $\sigma_{XY} = \text{Cov}(X,Y) = E((X-\mu_X)(Y-\mu_Y)) = \sum_{all\ x} \sum_{all\ y} (x-\mu_X)(y-\mu_Y) \cdot p(x, y)$ (discrete)
- Also, $\text{Cov}(X,Y) = E(XY) - E(X)E(Y)$.
- Independence: Two random variables X and Y are independent if $p(x, y) = p_X(x) \cdot p_Y(y)$ (discrete) or $f(x, y) = f_X(x) \cdot f_Y(y)$ (continuous).
- Laws of Expectation and Variance:
- $E(XY) = E(X)E(Y)$ if X and Y are independent.
- $V(c) = 0$, $V(cX) = c^2V(X)$, $V(X+c) = V(X)$.
- $V(X+Y) = V(X) + V(Y)$ if X and Y are independent.
- $V(aX + bY) = a^2V(X) + b^2V(Y) + 2ab\text{Cov}(X,Y)$.
- Common Distributions:
- Binomial Distribution ($X \sim \text{Bin}(n, p)$): For a fixed number of independent trials with two outcomes.
- $P(X=k) = \binom{n}{k} p^k (1-p)^{n-k}$
- $E(X) = np$, $V(X) = np(1-p)$.
- Normal Distribution ($X \sim N(\mu, \sigma^2)$): Bell-shaped, symmetric distribution.
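A quick numeric check of the Binomial formulas above (the parameters n = 10, p = 0.3 are arbitrary):

```python
from math import comb

# X ~ Bin(n = 10, p = 0.3): pmf, mean, and variance from the formulas
n, p = 10, 0.3
pmf = lambda k: comb(n, k) * p ** k * (1 - p) ** (n - k)
mean = n * p                                # E(X) = np = 3.0
var = n * p * (1 - p)                       # V(X) = np(1-p) = 2.1
total = sum(pmf(k) for k in range(n + 1))   # pmf sums to 1 over 0..n
```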
Week 5: Sampling Distributions & Week 6: Estimation
- Sampling Distribution: The probability distribution of a statistic (e.g., sample mean).
- Central Limit Theorem (CLT): For a sufficiently large sample size ($n$), the sampling distribution of the sample mean ($\bar{X}$) is approximately normal, regardless of the population distribution.
- Conditions for CLT approximation:
- If population is normal, $\bar{X} \sim N(\mu, \sigma^2/n)$ for any $n$.
- If population is not normal but $n \ge 20$, $\bar{X}$ is close to normal.
- If population is far from normal, $n > 50$ is recommended.
- De Moivre-Laplace Theorem: Approximates the Binomial distribution with a Normal distribution when $np \ge 5$ and $n(1-p) \ge 5$.
- Properties of Estimators:
- Unbiasedness: $B(\hat{\theta}) = E(\hat{\theta}) - \theta = 0 \implies E(\hat{\theta}) = \theta$.
- Consistency: Mean Squared Error (MSE) approaches 0 as sample size $n$ increases. $MSE(\hat{\theta}) = E((\hat{\theta} - \theta)^2) = V(\hat{\theta}) + (B(\hat{\theta}))^2$.
- Relative Efficiency: Compares the variances of two unbiased estimators: $\text{eff}(\hat{\theta}_1, \hat{\theta}_2) = V(\hat{\theta}_2) / V(\hat{\theta}_1)$.
- Interval Estimators (when $\sigma$ is known):
- Confidence Interval for $\mu$: $\bar{X} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
- Confidence Levels: 90%, 95%, 99% correspond to specific $z_{\alpha/2}$ values.
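A sketch of the 95% interval computation (the sample summary values are hypothetical):

```python
import math

# 95% confidence interval for mu with sigma known, using z_{0.025} ≈ 1.960
xbar, sigma, n = 50.0, 8.0, 64
z = 1.960
margin = z * sigma / math.sqrt(n)        # 1.96 * 8 / 8 = 1.96
ci = (xbar - margin, xbar + margin)      # (48.04, 51.96)
```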
- Uniform Distribution ($X \sim U(a, b)$):
- $f(x) = \frac{1}{b-a}$ for $a \le x \le b$, and 0 otherwise.
- $P(a \le X \le b) = \int_a^b f(x) dx = 1$.
Week 7: Hypothesis Testing
- Testing Hypotheses about $\mu$:
- When $\sigma$ is known: Use Z-test.
- $H_0: \mu = \mu_0$
- Test Statistic: $Z = \frac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$
- Rejection Region: $Z > z_{\alpha/2}$ or $Z < -z_{\alpha/2}$ (two-tailed).
- When $\sigma$ is unknown: Use t-test.
- $H_0: \mu = \mu_0$
- Test Statistic: $t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}}$ with $n-1$ degrees of freedom.
- Rejection Region: $t > t_{\alpha/2, n-1}$ or $t < -t_{\alpha/2, n-1}$ (two-tailed).
- When $\sigma$ is known: Use Z-test.
- Errors in Hypothesis Testing:
- Type I Error: Rejecting $H_0$ when it is true ($\alpha$).
- Type II Error: Failing to reject $H_0$ when it is false ($\beta$).
- Significance Level ($\alpha$): Probability of Type I error.
- Power of the Test: $1 - \beta$.
- Calculating Probability of a Type II Error ($\beta$):
- Determine the rejection region based on $\alpha$ and $H_0$.
- Work backward to find the rejection region in terms of the unstandardized test statistic.
- Calculate the probability of not rejecting $H_0$ under the alternative hypothesis ($H_1$) by re-standardizing using the true parameter value from $H_1$.
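The three steps above can be carried out numerically; the true mean μ₁ = 52 and the other inputs below are hypothetical:

```python
import math

phi = lambda x: 0.5 * (1 + math.erf(x / math.sqrt(2)))  # standard normal CDF

# H0: mu = 50 vs H1: mu > 50, sigma = 8, n = 64, alpha = 0.05;
# beta is evaluated at a hypothetical true mean mu1 = 52
mu0, sigma, n, z_alpha, mu1 = 50, 8, 64, 1.645, 52
se = sigma / math.sqrt(n)                  # 1.0
cutoff = mu0 + z_alpha * se                # step 2: reject H0 when xbar > 51.645
beta = phi((cutoff - mu1) / se)            # step 3: P(fail to reject | mu = mu1)
power = 1 - beta                           # ≈ 0.639
```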
Week 8: Comparing Two Populations
- Inferences about:
- Difference in population means ($\mu_1 - \mu_2$).
- Difference in population proportions ($p_1 - p_2$).
- Comparing Two Means:
- Independent Samples:
- $\sigma_1, \sigma_2$ known: Z-statistic.
- $\sigma_1, \sigma_2$ unknown, $\sigma_1^2 = \sigma_2^2$: Pooled t-statistic.
- $\sigma_1, \sigma_2$ unknown, $\sigma_1^2 \ne \sigma_2^2$: Unequal-variances (Welch's) t-statistic.
- Paired Samples: Treat differences as one sample (t-statistic).
- Testing $\mu_1 - \mu_2$ (known $\sigma_1, \sigma_2$):
- $H_0: \mu_1 - \mu_2 = D_0$
- Test Statistic: $Z = \frac{(\bar{X}_1 - \bar{X}_2) - D_0}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$
- Rejection Region depends on $H_1$ (two-tailed, upper-tailed, lower-tailed).
- Testing $\mu_1 - \mu_2$ with Paired Samples:
- Let $D_i = X_{1i} - X_{2i}$. Test $H_0: \mu_D = D_0$.
- Test Statistic: $t = \frac{\bar{D} - D_0}{s_D/\sqrt{n}}$ with $n-1$ degrees of freedom.
- Comparing Two Proportions ($p_1 - p_2$):
- Independent Samples:
- Test Statistic (for $H_0: p_1 - p_2 = D_0$): $Z = \frac{(\hat{p}_1 - \hat{p}_2) - D_0}{\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}}$
- If $D_0 = 0$, use pooled proportion $\hat{p} = \frac{X_1+X_2}{n_1+n_2}$.
- Testing $\mu_1 - \mu_2$ (unknown $\sigma_1, \sigma_2$):
- $H_0: \mu_1 - \mu_2 = D_0$
- Test Statistic (pooled variance): $t = \frac{(\bar{X}_1 - \bar{X}_2) - D_0}{s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$ where $s_p^2$ is the pooled sample variance.
- Rejection Region: $t > t_{\alpha/2, n_1+n_2-2}$ or $t < -t_{\alpha/2, n_1+n_2-2}$ (two-tailed).
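The pooled-variance t statistic can be sketched as follows, with two small hypothetical samples and $D_0 = 0$:

```python
import math

# Pooled-variance t statistic for two independent samples (hypothetical data),
# under the equal-variances assumption sigma1^2 = sigma2^2
x1 = [12, 14, 15, 13, 16]
x2 = [10, 11, 13, 12, 9]
n1, n2 = len(x1), len(x2)
m1, m2 = sum(x1) / n1, sum(x2) / n2
s1sq = sum((x - m1) ** 2 for x in x1) / (n1 - 1)
s2sq = sum((x - m2) ** 2 for x in x2) / (n2 - 1)
sp2 = ((n1 - 1) * s1sq + (n2 - 1) * s2sq) / (n1 + n2 - 2)   # pooled variance
t = (m1 - m2) / math.sqrt(sp2 * (1 / n1 + 1 / n2))          # D0 = 0
df = n1 + n2 - 2                                            # compare with t table
```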
Week 9: Analysis of Variance (ANOVA)
- Purpose: To compare means of three or more groups.
- Hypotheses:
- $H_0$: All population means are equal ($\mu_1 = \mu_2 = ... = \mu_k$).
- $H_1$: At least two population means differ.
- Test Statistic: F-statistic (from ANOVA table).
- ANOVA Table Components:
- SS (Total): Sum of squares for the total variation.
- SST: Sum of squares between groups (treatment).
- SSE: Sum of squares within groups (error).
- $SS(\text{Total}) = SST + SSE$.
- For two-way ANOVA: SSA, SSB, SSAB (interaction), SSE.
- Rejection Region: $F > F_{\alpha, k-1, n-k}$ (for one-way ANOVA).
- Interaction Effects: Tested first in multi-way ANOVA.
- $H_0$: No interaction between factors.
- $H_1$: Interaction exists.
- Rejection Region: $F_{AB} > F_{\alpha, (a-1)(b-1), n-ab}$.
Week 10: Chi-Squared Tests
- Chi-Squared Goodness-of-Fit Test:
- Used for one categorical variable with $k$ categories.
- Hypotheses:
- $H_0: p_1 = c_1, p_2 = c_2, ..., p_k = c_k$ (population proportions match specified values).
- $H_1$: The population proportions do not match.
- Test Statistic: $\chi^2 = \sum_{i=1}^k \frac{(f_i - e_i)^2}{e_i}$, where $f_i$ is observed count and $e_i$ is expected count.
- Chi-Squared Test of a Contingency Table:
- Used to determine if two categorical variables (with $r$ rows and $c$ columns) are independent.
- Hypotheses:
- $H_0$: The variables are independent.
- $H_1$: The variables are not independent.
- Test Statistic: $\chi^2 = \sum_{i=1}^r \sum_{j=1}^c \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$, where $O_{ij}$ is observed count and $E_{ij} = \frac{(\text{ith row total}) \times (\text{jth column total})}{(\text{grand total})}$.
- Hypothesis Testing for Proportion (p):
- $H_0: p = p_0$
- $H_1: p \ne p_0$ (or $p > p_0$, $p < p_0$)
- Test Statistic: $Z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1-p_0)}{n}}}$
- Rejection Region: Based on Z-values ($z_{\alpha/2}$, $z_{\alpha}$).
Weeks 11 & 12: Simple and Multiple Linear Regression
- Simple Linear Regression Model: $Y = \beta_0 + \beta_1 X + \epsilon$, where $\epsilon \sim N(0, \sigma^2)$.
- $\beta_0$: Intercept.
- $\beta_1$: Slope.
- $\epsilon$: Error term.
- Estimating the Model (Least Squares):
- $\hat{\beta}_1 = \frac{S_{XY}}{S_{XX}}$, where $S_{XY} = \sum (X_i - \bar{X})(Y_i - \bar{Y})$ and $S_{XX} = \sum (X_i - \bar{X})^2$.
- $\hat{\beta}_0 = \bar{Y} - \hat{\beta}_1 \bar{X}$.
- $E(\hat{\beta}_1) = \beta_1$ (unbiased).
- Measures of Fit:
- Coefficient of Determination ($R^2$): Proportion of variance in Y explained by X. $R^2 = \frac{SSR}{SS(\text{Total})} = 1 - \frac{SSE}{SS(\text{Total})}$.
- $SS(\text{Total}) = \sum (Y_i - \bar{Y})^2 = (n-1)s_Y^2$.
- $SSE = \sum (Y_i - \hat{Y}_i)^2$.
- Inference for Regression Coefficients:
- Test Statistic for $\beta_1$: $t = \frac{\hat{\beta}_1 - \beta_1}{SE(\hat{\beta}_1)}$ with $n-2$ degrees of freedom.
- Rejection Region: $t > t_{\alpha/2, n-2}$ or $t < -t_{\alpha/2, n-2}$ (two-tailed).
- Prediction:
- Point Estimate for a particular Y: $\hat{Y}_g = \hat{\beta}_0 + \hat{\beta}_1 X_g$.
- Confidence Interval for $E(Y | X=X_g)$: $\hat{Y}_g \pm t_{\alpha/2, n-2} \times SE(\hat{Y}_g)$.
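The least-squares and $R^2$ formulas above can be traced through on a tiny hypothetical data set:

```python
# Least-squares fit of Y = b0 + b1*X on hypothetical data
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
n = len(xs)
xbar, ybar = sum(xs) / n, sum(ys) / n
sxy = sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))   # S_XY
sxx = sum((x - xbar) ** 2 for x in xs)                        # S_XX
b1 = sxy / sxx                 # slope estimate: 0.6
b0 = ybar - b1 * xbar          # intercept estimate: 2.2
sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
sstot = sum((y - ybar) ** 2 for y in ys)
r2 = 1 - sse / sstot           # R^2 = 0.6
y_at_4 = b0 + b1 * 4           # point estimate at X_g = 4
```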
- Multiple Linear Regression Model: $Y = \beta_0 + \beta_1 X_1 + ... + \beta_k X_k + \epsilon$.
- Assumptions: $\epsilon_i \sim N(0, \sigma^2)$ i.i.d.
- $E(Y) = \beta_0 + \beta_1 X_1 + ... + \beta_k X_k$.
- $V(Y) = \sigma^2$.
- Assessing the Model:
- Overall Test (ANOVA F-test):
- $H_0: \beta_1 = \beta_2 = ... = \beta_k = 0$.
- $H_1$: Not all coefficients are zero.
- Test Statistic: F-statistic from ANOVA table.
- Rejection Region: $F > F_{\alpha, k, n-k-1}$.
- Individual Coefficient Tests (t-tests):
- $H_0: \beta_j = 0$.
- $H_1: \beta_j \ne 0$.
- Test Statistic: $t = \frac{\hat{\beta}_j}{SE(\hat{\beta}_j)}$ with $n-k-1$ degrees of freedom.
- Rejection Region: $t > t_{\alpha/2, n-k-1}$ or $t < -t_{\alpha/2, n-k-1}$.
- Overall Test (ANOVA F-test):
- Adjusted $R^2$: $1 - \frac{SSE/(n-k-1)}{SS(\text{Total})/(n-1)}$.
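The adjusted-$R^2$ penalty is easy to see numerically; the summary figures below (n = 30 observations, k = 3 predictors, SSE = 40, SS(total) = 100) are hypothetical:

```python
# Adjusted R^2 from hypothetical fit summary numbers
n, k = 30, 3
sse, sstot = 40.0, 100.0
r2 = 1 - sse / sstot                                     # 0.60
adj_r2 = 1 - (sse / (n - k - 1)) / (sstot / (n - 1))     # ≈ 0.554, penalized for k
```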
- Categorical Independent Variables (Dummy Variables): Used to include qualitative predictors in the model.
- Multicollinearity: Correlation between independent variables can make parameter estimates unstable with large variance, affecting individual coefficient tests but not the overall F-test. Conclusions from individual coefficient tests are conditional on other variables being in the model.
The following summarizes Notes & Review - STAT7055 Introductory Statistics for Business and Finance.pdf and 非常重要!公式总结.pdf, covering the main statistical inference topics together with commonly examined formulas and methods:
1. Descriptive Statistics and Probability Basics
Data types:
- Numerical (discrete, continuous)
- Categorical (nominal, ordinal)
Central tendency and dispersion
- Mean (sample mean $ \bar{x} $, population mean $ \mu $)
- Variance / standard deviation ($ s^2, \sigma^2 $), coefficient of variation
- Covariance and correlation coefficient: measure the linear relationship between two variables
Probability basics
- Events and the sample space
- Probability axioms: $ 0 \leq P(A) \leq 1 $, and the probabilities of all mutually exclusive, exhaustive outcomes sum to 1
- Conditional probability: $ P(A|B) = \dfrac{P(A \cap B)}{P(B)} $
- Law of total probability and Bayes' theorem
$$ P(A) = \sum_i P(A|B_i)P(B_i) $$ $$ P(A|B) = \frac{P(B|A)\,P(A)}{P(B)} $$
- Independence of two events: $ P(A \cap B) = P(A)P(B) $
2. Random Variables and Distributions
Discrete distributions
- Binomial distribution $ X \sim Bin(n, p) $: $ P(X=k) = \binom{n}{k}p^k(1-p)^{n-k} $
Continuous distributions
- Normal distribution $ X \sim N(\mu, \sigma^2) $
- Standardization: $ Z = \frac{X-\mu}{\sigma} $
- Other distributions: t, F, $\chi^2$, exponential, etc.
Useful properties
- $ E(X+Y) = E(X)+E(Y) $
- $ Var(aX+bY) = a^2Var(X) + b^2Var(Y) + 2ab\,Cov(X, Y) $
3. Sampling Distributions and Estimation
Sampling distributions
- Distribution of the sample mean $ \bar{X} $:
$$ \bar{X} \sim N(\mu, \frac{\sigma^2}{n}) \text{ (guaranteed by the CLT for large samples)} $$
Point and interval estimation
- Point estimation: use a sample statistic (e.g., $ \bar{X} $) to estimate the population parameter $ \mu $
- Interval estimation (confidence intervals):
- $ \sigma $ known: $$ \bar{X} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}} $$
- $ \sigma $ unknown, use the t-distribution: $$ \bar{X} \pm t_{\alpha/2, n-1} \frac{s}{\sqrt{n}} $$
- A wider interval is less precise but corresponds to a higher confidence level. Common critical values: 90% → $z_{0.05} = 1.645$, 95% → $z_{0.025} = 1.960$, 99% → $z_{0.005} = 2.575$.
- Interpretation: the interval is random and changes from sample to sample, while the population mean is fixed; in repeated sampling, 100(1-α)% of such intervals contain the true mean and 100α% miss it.
- Factors affecting the width: a larger population variance or a higher confidence level widens the interval, while a larger sample size narrows it.
4. Hypothesis Testing
Set up the hypotheses
- Null hypothesis $ H_0 $: contains the equality
- Alternative hypothesis $ H_1 $: uses "≠", ">", or "<"
Compute the test statistic
- Mean: use Z when $ \sigma $ is known, t when it is unknown $$ Z = \frac{\bar{X}-\mu_0}{\sigma/\sqrt{n}} $$ $$ t = \frac{\bar{X}-\mu_0}{s/\sqrt{n}} $$
- Proportion: $ Z = \frac{\hat{p} - p_0}{\sqrt{p_0(1-p_0)/n}} $. The t null distribution has $n-1$ degrees of freedom, is symmetric and mound-shaped around 0, has greater variance than the standard normal, and approaches the normal as the degrees of freedom grow.
- Other tests (variance, correlation, etc.)
Rejection region / significance level
- Check whether the statistic falls in the rejection region, which is determined by $\alpha$; e.g., for a two-tailed test: $ |Z| > z_{\alpha/2} $
- Type I error ($\alpha$): rejecting a true $H_0$
- Type II error ($\beta$): failing to reject a false $H_0$
Conclusion
- Compute the p-value and compare it with $\alpha$ to decide whether to reject $H_0$
5. Analysis of Variance (ANOVA)
One-way ANOVA
- $H_0$: all population means are equal
- $H_1$: at least two means differ
- Test statistic: $$ F = \frac{MSB}{MSW} $$ (between-group over within-group mean square)
- Rejection region: $ F > F_{\alpha, k-1, n-k} $
Two-way ANOVA (including interaction effects)
6. Regression Analysis
Simple regression
- Model: $ Y = \beta_0 + \beta_1 X + \epsilon $
- Parameters estimated by least squares
- $ R^2 $: coefficient of determination, measuring how much of the variation in $Y$ the model explains $$ R^2 = \frac{SSR}{SST} $$
Multiple regression
- Several independent variables; beware of multicollinearity (highly correlated predictors)
- Adding variables always raises $R^2$, sometimes spuriously, so use adjusted $R^2$
Dummy variables (categorical predictors)
- Add indicator (dummy) variables to handle categorical independent variables
7. Chi-Square Tests for Categorical Data
- Goodness-of-fit test: checks whether the observed distribution matches a hypothesized one $$ \chi^2 = \sum_{i=1}^k \frac{(f_i - e_i)^2}{e_i} $$
- Test of independence: determines whether two categorical variables are independent
8. Comparing Two Populations
- Interval estimation and hypothesis tests for the difference of two means (independent or paired samples, using Z or t as appropriate)
- Interval estimation and hypothesis tests for the difference of two proportions
Review Tips
- Each topic has typical question types and formulas; for the final exam, prioritize the core concepts, the basic decision rules, and the formula derivations.
- Pay attention to categorical vs. continuous variables, the test conditions (sample size, known vs. unknown variance), and the overall inference workflow.
- Knowing the key terms in both Chinese and English helps with quick recall in the exam.
Below, the content of STAT7055 Introductory Statistics for Business and Finance and the formula summary (非常重要!公式总结) is organized week by week, with the key formulas for each topic, for systematic revision:
Week 1: Introduction and Descriptive Statistics
- Data types
- Categorical (nominal, ordinal)
- Numerical (discrete, continuous)
- Population vs. sample
- Population characteristics are parameters (unknown)
- Sample characteristics are statistics (calculated)
- Common statistics
- Mean ($\mu$, $\bar{x}$), variance / standard deviation ($\sigma^2$, $s^2$)
- Coefficient of variation $CV = \frac{\sigma}{\mu}$
- Quartiles and IQR ($Q_3 - Q_1$)
- Covariance and correlation
- Covariance: $\text{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)]$
- Correlation coefficient: $\rho_{XY} = \frac{\text{Cov}(X, Y)}{\sigma_X \sigma_Y}$, which lies in $[-1, 1]$
- Measures the strength of the linear relationship between the two variables
Week 2: Probability Fundamentals
- Basic concepts
  - Random experiments, the sample space $S$, events $A$, etc.
  - Properties of probability: $0 \le P(A) \le 1$, $\sum_i P(O_i) = 1$
- Relationships and operations
  - Union, intersection, and complement of events
  - Conditional probability: $P(A|B) = \frac{P(A \cap B)}{P(B)}$
  - Addition rule, multiplication rule, law of total probability, Bayes' theorem
  - Independence: $P(A \cap B) = P(A)P(B)$
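To make the conditional-probability relationships concrete, here is a minimal numeric sketch of the law of total probability and Bayes' theorem. All event probabilities below are made-up illustration values, not figures from the course:

```python
# Hypothetical example: A = "test positive", B = "has condition".
# All numbers are illustrative, chosen only to exercise the formulas.
p_B = 0.10              # prior P(B)
p_A_given_B = 0.90      # P(A|B)
p_A_given_notB = 0.05   # P(A|B^c)

# Law of total probability: P(A) = P(A|B)P(B) + P(A|B^c)P(B^c)
p_A = p_A_given_B * p_B + p_A_given_notB * (1 - p_B)

# Bayes' theorem: P(B|A) = P(A|B)P(B) / P(A)
p_B_given_A = p_A_given_B * p_B / p_A

print(round(p_A, 4))          # 0.135
print(round(p_B_given_A, 4))  # 0.6667
```

Note how the denominator of Bayes' theorem is exactly the total-probability expansion of $P(A)$ over the partition $\{B, B^c\}$.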
Week 3 & Week 4: Random Variables & Probability Distributions
- Discrete random variables
  - Probability mass function (PMF): $P(X = x) = p(x)$
  - $E(X)=\sum_x x\,p(x)$, $\text{Var}(X)=\sum_x (x-\mu)^2 p(x)$
- Continuous random variables
  - Probability density function (PDF): $f(x)$, with $\int_{-\infty}^{\infty} f(x)\,dx=1$
  - $P(a<X<b)=\int_a^b f(x)\,dx$
  - $E(X)=\int x f(x)\,dx$, $\text{Var}(X)=\int (x-\mu)^2 f(x)\,dx$
- Common distributions
  - Binomial distribution $X\sim Bin(n,p)$: $P(X=k)=C_n^k p^k(1-p)^{n-k}$, $E(X)=np$, $Var(X)=np(1-p)$
  - Uniform distribution $X\sim U(a, b)$: $f(x)=\frac{1}{b-a}$ for $a < x < b$, $E(X)=\frac{a+b}{2}$, $Var(X)=\frac{(b-a)^2}{12}$
  - Normal distribution $X\sim N(\mu, \sigma^2)$
- Joint distributions and covariance
  - $Cov(X,Y) = E(XY) - E(X)E(Y)$
  - $X$ and $Y$ are independent $\Longleftrightarrow p(x, y) = p_X(x)p_Y(y)$
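The binomial formulas can be verified directly from the PMF: the probabilities sum to 1, and the mean and variance computed from the definition match $np$ and $np(1-p)$. A short Python check with illustrative parameters $n=10$, $p=0.3$:

```python
from math import comb

# Binomial PMF: P(X = k) = C(n, k) * p^k * (1 - p)^(n - k)
def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 10, 0.3  # illustrative parameters
pmf = [binom_pmf(k, n, p) for k in range(n + 1)]

# PMF sums to 1; mean and variance from the definitions of E(X) and Var(X)
mean = sum(k * pmf[k] for k in range(n + 1))
var = sum((k - mean) ** 2 * pmf[k] for k in range(n + 1))
print(round(sum(pmf), 6), round(mean, 6), round(var, 6))  # 1.0 3.0 2.1
```

Here $np = 10 \times 0.3 = 3$ and $np(1-p) = 10 \times 0.3 \times 0.7 = 2.1$, matching the sums over the PMF.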
Week 5 & Week 6: Sampling Distributions & Estimation
- Sampling distributions
  - Sampling distribution of the sample mean
  - Central Limit Theorem (CLT): for large $n$, $\bar{X} \sim N(\mu, \frac{\sigma^2}{n})$ approximately
- Parameter estimation
  - Point estimation: unbiasedness, efficiency, consistency
  - Interval estimation:
    - $\sigma$ known: $\bar{X} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$
    - $\sigma$ unknown: $\bar{X} \pm t_{\alpha/2,\,n-1}\frac{s}{\sqrt{n}}$
Week 7: Hypothesis Testing
- Procedure:
  - State $H_0$ and $H_1$
  - Compute the test statistic:
    - Test for a mean: use $Z$ when $\sigma$ is known, $t$ when $\sigma$ is unknown
    - Test for a proportion
  - Form the rejection region from $\alpha$ (the significance level) and decide whether to reject $H_0$
- Error types: Type I ($\alpha$), Type II ($\beta$); power of the test $= 1-\beta$
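The procedure above can be walked through end-to-end for a two-tailed $Z$ test of a mean with $\sigma$ known. The summary statistics are illustrative, not course data:

```python
from math import sqrt

# Two-tailed z test of H0: mu = mu0 vs H1: mu != mu0, sigma known.
# Illustrative numbers: test H0: mu = 70 at alpha = 0.05.
xbar, mu0, sigma, n = 74.0, 70.0, 10.0, 25
z_crit = 1.960  # z_{alpha/2} for alpha = 0.05

# Test statistic: Z = (Xbar - mu0) / (sigma / sqrt(n))
z = (xbar - mu0) / (sigma / sqrt(n))

# Decision rule: reject H0 if Z falls in the rejection region
reject = z > z_crit or z < -z_crit
print(round(z, 2), reject)  # 2.0 True
```

Since $Z = 2.0 > 1.960$, the observed statistic lies in the rejection region, so $H_0$ is rejected at the 5% level.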
Week 8: Comparing Two Populations
- Comparing two means
  - Independent samples:
    - $\sigma_1, \sigma_2$ known: $Z$ test
    - Unknown but assumed equal: pooled-variance $t$ test
    - Unknown and unequal: Welch's $t$ test
  - Paired samples: $t$ test on the differences
- Be able to apply the matching confidence-interval and hypothesis-test formulas
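As a sketch of the pooled-variance case (variances unknown but assumed equal), using illustrative summary statistics:

```python
from math import sqrt

# Pooled-variance t statistic for two independent samples.
# Illustrative summary statistics (means, sample variances, sizes).
x1bar, s1_sq, n1 = 20.0, 4.0, 10
x2bar, s2_sq, n2 = 18.0, 5.0, 12

# Pooled variance: s_p^2 = [(n1-1)s1^2 + (n2-1)s2^2] / (n1 + n2 - 2)
sp_sq = ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)

# T = (X1bar - X2bar - D0) / sqrt(s_p^2 (1/n1 + 1/n2)), here D0 = 0
t = (x1bar - x2bar) / sqrt(sp_sq * (1 / n1 + 1 / n2))
df = n1 + n2 - 2
print(round(sp_sq, 2), round(t, 2), df)  # 4.55 2.19 20
```

The statistic is compared against the $t$ distribution with $n_1 + n_2 - 2$ degrees of freedom; pooling is only appropriate when the equal-variance assumption is reasonable, otherwise Welch's $t$ test applies.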
Week 9: Analysis of Variance (ANOVA)
- One-way ANOVA
  - $H_0$: the $k$ population means are all equal; $H_1$: at least two differ
  - Test statistic $F$; rejection region $F>F_{\alpha,\,k-1,\,n-k}$
- Two-way ANOVA and interaction effects
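The one-way ANOVA table entries can be computed directly from raw data. A small sketch with made-up samples for $k = 3$ groups:

```python
# One-way ANOVA from raw data (illustrative samples, k = 3 groups of 3).
groups = [
    [23.0, 25.0, 27.0],
    [30.0, 32.0, 34.0],
    [28.0, 26.0, 30.0],
]
k = len(groups)
n = sum(len(g) for g in groups)
grand_mean = sum(sum(g) for g in groups) / n

# SST (treatments): sum over groups of n_j * (Ybar_j - Ybar)^2
sst = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
# SSE (error): within-group squared deviations from each group mean
sse = sum((y - sum(g) / len(g)) ** 2 for g in groups for y in g)

mst = sst / (k - 1)   # mean square for treatments
mse = sse / (n - k)   # mean square for error
F = mst / mse
print(round(sst, 2), round(sse, 2), round(F, 3))  # 74.0 24.0 9.25
```

The observed $F$ is then compared with $F_{\alpha,\,k-1,\,n-k}$, here with $(2, 6)$ degrees of freedom; note SS(total) $=$ SST $+$ SSE.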
Week 10: Chi-Squared Tests for Categorical Data
- Goodness-of-fit test (one categorical variable)
  - $H_0$: the population proportions equal the hypothesised values; $H_1$: at least one proportion differs
  - $\chi^2 = \sum \frac{(f_i - e_i)^2}{e_i}$, degrees of freedom $k-1$
- Test of independence for a contingency table (two categorical variables)
  - $H_0$: the variables are independent
  - $H_1$: the variables are related
  - $\chi^2 = \sum \frac{(f_{ij} - e_{ij})^2}{e_{ij}}$
  - Degrees of freedom: $(r-1)(c-1)$
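A minimal goodness-of-fit computation, using illustrative counts and equal hypothesised proportions $p_i = 1/4$:

```python
# Chi-squared goodness-of-fit statistic (illustrative observed counts).
observed = [18, 22, 30, 30]       # f_i, observed counts
n = sum(observed)                 # total sample size = 100
expected = [n / 4] * 4            # e_i = p_i * n with hypothesised p_i = 1/4

chi2 = sum((f - e) ** 2 / e for f, e in zip(observed, expected))
df = len(observed) - 1
print(round(chi2, 2), df)  # 4.32 3
```

This is an upper-tailed test: $H_0$ is rejected only when $\chi^2 > \chi^2_{\alpha,\,k-1}$, i.e. when the observed counts are far from the expected ones.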
Week 11 & 12:回归分析 Regression
- 简单线性回归
- $Y = \beta_0 + \beta_1 X + \epsilon$
- 最小二乘法估计$\beta_1, \beta_0$
- $R^2$判定系数衡量解释力
- 回归系数$t$检验和区间估计[11]Source: 非常重要!公式总结.pdfF 2 S2 1 s SST MST MST = F = k-1 MSE SSE SSA MS MSB SS AB MS AB SSE Jn Main effect j=1 i=1 14-H2 > Testing 14-142(o & o2, unknown ) Hypotheses: H : k-1 Week 11 Simple Linear Regression & Week 12 Multiple Linear Regression Simple Linear Regression Simple Linear Regression Model Y = Bo + BX+& - Y = Bo + BX, +8; 8; ~ N (0, 02) Estimating the model: Least Squared Y=0+BX, +e;+ Î; = Bo + BX, ₿1 = 5xx Sx E(B)=B SB =1 1[37]Source: Notes & Review - STAT7055 Introductory Statistics for Business and Finance.pdf· Assumptions of Error - Independent - E(E) = 0 - mean - 02 - constant variance (regardless the value of X) - Ei N(0,02) - normal distribution (iid ---- independently and identically distributed) Parameter Estimation · Estimated / Fitted Regression Line Ŷ = ₿0 + Ê1X 1 Îi = 80 + B1Xi ---- -- fitted value 1 Yi = Bo + B1Xi + ei ----- actual value O O (Xi, Yi) O e. Y = Bo + B1X O o O Ŷ = ₿0 + 1X · ei = Yi - Yi --- residual of the ith observation Make Yi as close as Ý¡ possible > ei as small as possible Minimize - - - --- ΣΕ12 = ΣΙ_1(Υ-Υ;) = ΣΕ1 [Υ; - (βο+ βιχ;)]] Unbiased estimator B1 = 3 E(B1) = E (X) = 0X = B1 Bo = Y - B1X E(82) = E(Y - B1X) = HY - B1Hx = B2 dur <- c (2, 4, 6, 8, 9, 10, 12, 12, 12, 17, 18) att <- c (2. 5, 3, 3. 4, 5, 9, 6, 6, 9, 10, 10, 11) city. dat <- data. frame ( "duration" = dur, "attitude" = att) city. 1m <- 1m (attitude ~duration, data = city. dat) summary (city. 1m) Residuals : Min 1Q Median 3Q -1. 9262 -0. 7640 -0. 4579 0. 6165 2. 7494[34]Source: Notes & Review - STAT7055 Introductory Statistics for Business and Finance.pdfCoefficients: Estimate Std. Error t value Pr(>|t|) (Intercept) 1. 2237 1. 0531 1. 162 0. 275114 two-tailed test duration 0. 5585 0. 0952 5. 867 0. 000239 Residual standard error: 1. 493 on 9 degrees of freedom Multiple R-squared: 0. 7927, Adjusted R-squared: 0. 7697 F-statistic: 34. 42 on 1 and 9 DF, p-value: 0. 
0002387 Accessing the Model True Ei is unknown ---- -- represented by ei - Normally distribution bell-shaped and symmetric histogram liner normal probability plot - e = 0 & o2 A random scatter of point about 0 No obvious pattern or trend - Independent O 原创 Whether or not a model exists Whether a liner relationship between variable X and Y Whether B1 = 0 Ho: B1=0 H1: B1 ± 0 SB1 Ži=1 et 1 n-2' (n-1)sx T = ₿1-0 SB1 IT B1 SB1 rejection region: T <- tan-2 or T > tan-2 Whether p is equal to 0 Ho: p = 0[27]Source: Notes & Review - STAT7055 Introductory Statistics for Business and Finance.pdfH1: p = 0 T = rxvn- 2 11-r2 rejection region: T < - tan_2 or T > ta, En - 2 Both test statistics are compared to the same sampling distribution B 1 _rx Vn - 2 SB 1 V1 - r2 General test for B1 Ho: B1 = c H1:B1 (>, ¢, <) c SB 1 n-2 (n-1)sx NON T = B1-c SB1 O 原创 General test for Bo Ho: Bo = c H1: B0 (>, ¢, <) c SB (n-1)sk 1 n T = B-c ₿1-C SB 1 freedom = n - 2 · Estimate oz Small o Ei close to 0 - the regression model fits the data ei > unbiased estimator of og s2 - Line2 __ (x1-9)2 n-2 n-2 ranto standard error of estimate Comparing 2 different models, the model with the smaller standard error of estimate is considered better · Calculate R2 Strength of liner relationship R2 = Sxv- - coefficient of determination how much variation is explained by the model Total sum of squares SS(total) = >[-1(Yi - Y)2 = SSR + SSE[15]Source: 非常重要!公式总结.pdfMS AB SSE Jn Main effect j=1 i=1 14-H2 > Testing 14-142(o & o2, unknown ) Hypotheses: H : k-1 Week 11 Simple Linear Regression & Week 12 Multiple Linear Regression Simple Linear Regression Simple Linear Regression Model Y = Bo + BX+& - Y = Bo + BX, +8; 8; ~ N (0, 02) Estimating the model: Least Squared Y=0+BX, +e;+ Î; = Bo + BX, ₿1 = 5xx Sx E(B)=B SB =1 1 n-2 i=1 (n-1)sx V(n-1)sx X i i=1 Assessing the model Hypotheses: HO: B=0 HI: B±0 Ho: p=0 H: p=0 Test statistic: T = s ˆ
- 多元回归
- 模型$Y = \beta_0+\beta_1X_1+...+\beta_kX_k+\epsilon$
- 判别多重共线性、虚拟变量以及整体显著性F检验[12]Source: Notes & Review - STAT7055 Introductory Statistics for Business and Finance.pdfMultiple R2 R2 = 5 (total ) = 1 - SS(total) SSE - the proportion of total variation of Y explained by the model SS(total) R2 will increase as we add more independent variables to the model, even if they are not related to Y Adjusted R2 adjusted R2 = 1 -- n-k-1 SS(total)) n-1 SSE Adjusted R2 will decrease as we add more independent variables, even if they are not related to Y §g =ß0 +§1×1g + ₿2X2g + ·+ ßkXkg ng 原味 Multicollinearity - Independent variables are correlated with each other - Occur to extent in all multiple linear regression model - Cause the parameter estimates of the correlated independent variables to become unstable and have large standard errors Y = Bo + B1X1 + B2X2 + € X1 and X2 are perfectly correlated with each other Assume X1 = X2 (a very extreme situation) Y=2+ 100X1+ 2X2 same Ý = 1 + 102X1 resulted from 2 very different pairs of parameter estimates Y=2+2X1+ 100X2 Huge variability of both B1 and B2 -> SB, and Sp, are very large -> affect Tj = Bj SBj > affect the decision of whether there is linear relationship between Y and a particular Xi > does NOT affect the F-statistic for testing the overall significance of the model Categorical Independent Variables w={ existence of a condition absence of the condition Y= Bo + B1X1+ B2W + € W = 1 ---- Y = (Bo +B2) + B1X1 + € Intercept vary depend on the value of categorical variables W = 0 ---- Y = Bo + B1X1+€ Slop remain the same Fit the following model Y= Bo+B1X1+ B2W + B3(Xx W) + € W = 1 ---- Y = (B0 +B2) + (B1 + B3)X1 + € ------ both intercept and slop change W = 0 ---- Y = B0 +B1X1+€ Ho: B2= 0 H1: B2 # 0 Ho: B3 = 0 H1: B3 # 0 wintong 原色 Introductory Statistics for Business and Finance[31]Source: Notes & Review - STAT7055 Introductory Statistics for Business and Finance.pdf原色 · Method of Least Squares ΣΠ1e2=ΣΠΙ(Υ-Ρ)=ΣΠΥ-(Βο+β+β2Χ2i+ . . . 
R code

```r
# Data vectors are left empty in the notes; fill in the sample values.
x1 <- c()  # duration
x2 <- c()  # weather
y  <- c()  # attitude
sample.data <- data.frame("duration" = x1, "weather" = x2, "attitude" = y)
sample.lm <- lm(formula = attitude ~ duration + weather, data = sample.data)
summary(sample.lm)
```

```
Residuals:
     Min       1Q   Median       3Q      Max
-1.56859 -0.79732  0.03449  0.47779  1.82480

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  0.45755    0.94094   0.486 0.639817
duration     0.46751    0.08907   5.249 0.000775
weather      0.26344    0.11784   2.236 0.055810

Residual standard error: 1.243 on 8 degrees of freedom
Multiple R-squared:  0.8724, Adjusted R-squared:  0.8405
F-statistic: 27.35 on 2 and 8 DF,  p-value: 0.0002649
```

Assessing the Model
- Normality: histograms of the residuals should be bell-shaped and symmetric, and normal probability plots should be linear
- $E(\epsilon) = 0$ and $\sigma_\epsilon^2$ is constant: scatterplots of residuals should be random around 0 without patterns
- Independence: plots of residuals against collection order should show no trends or patterns

Test the overall significance of the model
- $H_0: \beta_1 = \beta_2 = \dots = \beta_k = 0$
- $H_1$: not all the coefficient parameters are equal to 0
- $SSR = \sum_{i=1}^{n}(\hat{Y}_i - \bar{Y})^2$, and $SS(\text{total}) = \sum_{i=1}^{n}(Y_i - \bar{Y})^2 = SSR + SSE$

| Source | df | SS | MS | F value | Pr(>F) |
| --- | --- | --- | --- | --- | --- |
| Regression | 2 | 84.4586 | 42.2293 | 27.3538 | 0.0002649 |
| Residuals (SSE) | 8 | 12.3505 | 1.5438 | | |
| Total | 10 | 96.8091 | | | |

Which coefficients are not equal to 0
- $H_0: \beta_j = 0$, $H_1: \beta_j \neq 0$
- $T = \hat{\beta}_j / s_{\hat{\beta}_j}$; rejection region: $T < -t_{\alpha/2,\,n-k-1}$ or $T > t_{\alpha/2,\,n-k-1}$
- Reject $H_0$: once all other variables have been considered, $X_j$ has a significant linear relationship with $Y$
- Fail to reject $H_0$: once all other variables have been considered, $X_j$ does not have a significant linear relationship with $Y$
- If we fit a simple linear regression with only $X_j$, we might not necessarily reach the same conclusion

General test for $\beta_j$
- $H_0: \beta_j = c$, $H_1: \beta_j\ (>, \neq, <)\ c$
- $T = \frac{\hat{\beta}_j - c}{s_{\hat{\beta}_j}}$ with $n - k - 1$ degrees of freedom

Estimate $\sigma_\epsilon^2$
- $s_\epsilon = \sqrt{\frac{\sum_{i=1}^{n}(Y_i - \hat{Y}_i)^2}{n-k-1}} = \sqrt{\frac{SSE}{n-k-1}}$
- From the regression output: read off the "Residual standard error"
- From the ANOVA table: $s_\epsilon = \sqrt{SSE/(n-k-1)}$

Calculate $R^2$
- From the ANOVA table: $R^2 = SSR/SS(\text{total}) = 84.4586/96.8091 \approx 0.8724$, matching the "Multiple R-squared" in the regression output
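As a sanity check on the formulas above, every summary statistic in the R output can be reproduced by hand from the ANOVA-table and coefficient-table values. A minimal Python sketch (all numbers are taken from the output above; tiny rounding differences are expected):

```python
import math

# Values from the ANOVA table and coefficient table in the notes:
sse, ssr, ss_total = 12.3505, 84.4586, 96.8091  # SSE, SSR, SS(total)
n, k = 11, 2                                    # 11 observations, 2 predictors

# Residual standard error: s_e = sqrt(SSE / (n - k - 1))
s_e = math.sqrt(sse / (n - k - 1))
print(round(s_e, 3))         # 1.243, the "Residual standard error"

# R^2 and adjusted R^2
r2 = ssr / ss_total
adj_r2 = 1 - (sse / (n - k - 1)) / (ss_total / (n - 1))
print(round(r2, 4))          # 0.8724
print(round(adj_r2, 4))      # 0.8405

# t-statistic for `duration` under H0: beta_j = 0
t_duration = 0.46751 / 0.08907
print(round(t_duration, 3))  # 5.249, matching the "t value" column

# F-statistic for overall significance: MSR / MSE
f_stat = (ssr / k) / (sse / (n - k - 1))
print(round(f_stat, 2))      # 27.35
```

This kind of cross-check is a useful exam habit: any one of $s_\epsilon$, $R^2$, or $F$ can be recovered from the others via $SSE$, $SSR$, and the degrees of freedom.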
Tip: The key for each topic in this course is understanding which statistical inference method to choose for a given problem, knowing the common distributions and their parameters, and applying the formulas fluently to practical questions.
If you would like detailed worked examples or explanations of difficult points for any particular week, feel free to ask!
If you want me to save formulas or key points for quick reference next time, just let me know; I can keep them for your future study sessions!