Learn & Review: Learn STATISTICS in Half an Hour

Jan 23, 2026

Teach me STATISTICS in half an hour! Seriously.

Introduction to Statistics: A No-Maths, Under-30-Minute Overview

This summary provides an intuitive introduction to statistics, focusing on core concepts without complex mathematical calculations. The examples are themed around the NBA to make the material engaging.

1. Types of Data

Statistics deals with different types of data, which can be broadly categorized into two main classes:

  • Categorical Data: Represents categories or labels.

    • Nominal Data: Categories with no inherent order.
      • Example: The team a player plays for (e.g., Golden State Warriors, Atlanta Hawks). The order of listing these teams doesn't matter.
    • Ordinal Data: Categories with a meaningful order or ranking.
      • Example: Player positions (e.g., Guard, Forward, Center). Positions can be ordered roughly by typical player height, so they can be treated as ordinal, although the ordering is a loose one.
  • Numerical Data: Represents quantities or measurements.

    • Discrete Data: Can only take specific, distinct values, often whole numbers.
      • Example: The number of free throws missed by a player. A player can miss 0, 1, 2, but not 1.5 or 2.3 free throws.
    • Continuous Data: Can take any value within a given range, allowing for infinite subdivisions.
      • Example: A player's height. While often rounded to whole centimeters or inches, height can theoretically be measured to any level of precision (e.g., 191.3217 cm).
  • Proportions/Percentages: A special type of data that aggregates nominal data (like made or missed shots) into a numerical summary.

    • Example: A player's three-point percentage. While built from individual made/missed shots (nominal), the percentage itself is a numerical value. The question of whether a proportion is discrete or continuous is left for discussion.

2. Distributions

Distributions describe how data is spread across its possible values.

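  • Probability Density Function (PDF): A function that describes how a population is distributed over the possible values of a characteristic (like height). For continuous data, the area under the curve over a range of values gives the probability that an individual selected at random falls within that range.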

    • Example: The distribution of heights of NBA players, ranging from the shortest to the tallest.
  • Common Distribution Shapes:

    • Normal Distribution (Bell Curve): A very common shape, in which the bulk of the data clusters around the middle and values become rarer towards the extremes. It is symmetrical.
    • Uniform Distribution: Every value has an equal probability of occurring. This is unlikely for NBA player heights.
    • Bimodal Distribution: Has two distinct peaks, indicating two common values or groups.
    • Skewed Distribution: Asymmetrical, with a tail extending in one direction.
      • Left Skew: The tail points to the left. Most values sit at the higher end, with a few unusually low values stretching the tail.
      • Right Skew: The tail points to the right. Most values sit at the lower end, with a few unusually high values stretching the tail.
  • Sampling Distributions: Describe the distribution of a statistic (like the average) calculated from multiple samples.

    • If you repeatedly take samples of a fixed size (e.g., 10 players) and calculate each sample's average height, the distribution of these averages will have the same mean as the original distribution but will be skinnier (less variance).
    • This is because averaging dampens individual extremes: an extreme sample mean requires most of the sample to be extreme at once, and the larger the sample, the less likely that becomes. This concept is crucial because most studies rely on samples to make inferences about a population.
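
The narrowing of the sampling distribution can be checked with a small simulation. The sketch below is only an illustration: the population of 5,000 heights is made up (normally distributed around 200 cm), and the helper name sample_means is hypothetical.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical population: 5,000 made-up "player heights" in cm.
population = [random.gauss(200, 9) for _ in range(5000)]

def sample_means(pop, sample_size, n_samples=2000):
    """Draw n_samples random samples and return the mean of each one."""
    return [statistics.mean(random.sample(pop, sample_size))
            for _ in range(n_samples)]

means_of_10 = sample_means(population, 10)

pop_mean = statistics.mean(population)
pop_sd = statistics.pstdev(population)
means_mean = statistics.mean(means_of_10)
means_sd = statistics.stdev(means_of_10)

print(f"population:     mean={pop_mean:.1f}  sd={pop_sd:.1f}")
print(f"means of n=10:  mean={means_mean:.1f}  sd={means_sd:.1f}")
```

The two means should come out almost identical, while the spread of the sample means should be roughly the population standard deviation divided by the square root of the sample size.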

3. Sampling and Estimation

Statistics uses samples to estimate unknown population values.

  • Population Parameter (θ): The true, unknown value for a characteristic of an entire population.
    • Example: Steph Curry's true long-term three-point shooting percentage (θ). This is a "godlike" value that we can never know precisely.
  • Sample Statistic: A value calculated from a sample, used to estimate the population parameter.
    • Example: Steph Curry's three-point percentage this season (0.4766 based on 128 attempts). This is a sample statistic (often denoted by 'p' for proportion).
  • Estimation: Using sample statistics to make an educated guess about population parameters. The sample statistic is often the best point estimate.
  • Uncertainty and Confidence Intervals: Statistics quantifies the uncertainty around estimates.
    • Confidence Intervals: A range of values, calculated from the sample, built so that it captures the true population parameter a stated proportion of the time (e.g., 95%) across repeated samples.
    • Example: A 95% confidence interval for Steph Curry's true three-point percentage might be [0.40, 0.55].
    • Sample Size Matters: A larger sample size leads to a narrower confidence interval (more certainty), while a smaller sample size leads to a wider interval (less certainty). This is why Meyers Leonard's percentage (0.60 from 15 attempts) has a much wider confidence interval than Steph Curry's.
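
To see the effect of sample size on interval width, here is a minimal sketch using the normal-approximation ("Wald") interval for a proportion. This is one common method and not necessarily the one behind the interval quoted above, so the numbers need not match exactly. The counts of 61 makes (0.4766 × 128) and 9 makes (0.60 × 15) are inferred from the percentages in the text, and the function name wald_interval is hypothetical. With only 15 attempts the normal approximation is rough, so treat the second interval as indicative only.

```python
import math

def wald_interval(makes, attempts, z=1.96):
    """Approximate 95% confidence interval for a proportion (z = 1.96)."""
    p = makes / attempts                    # sample proportion
    se = math.sqrt(p * (1 - p) / attempts)  # estimated standard error
    return p - z * se, p + z * se

curry = wald_interval(61, 128)   # inferred: 61 makes from 128 attempts
leonard = wald_interval(9, 15)   # inferred: 9 makes from 15 attempts

for name, (low, high) in [("Curry", curry), ("Leonard", leonard)]:
    print(f"{name}: [{low:.3f}, {high:.3f}]  width={high - low:.3f}")
```

The interval based on 15 attempts comes out far wider than the one based on 128, which is the point made above.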

4. Common Statistical Parameters and Statistics

Greek letters often represent population parameters, while Roman letters represent sample statistics.

  • Parameters (Population):
    • μ (mu): Mean
    • σ (sigma): Standard Deviation (measure of variation/uncertainty)
    • π (pi): Proportion (sometimes θ is used more generally)
    • ρ (rho): Correlation
    • β (beta): Gradient (in regression)
  • Statistics (Sample):
    • x̄ (x-bar): Sample Mean
    • s: Sample Standard Deviation
    • p: Sample Proportion
    • r: Sample Correlation
    • b: Sample Gradient
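
As an illustration of how the sample statistics listed above are calculated, the sketch below uses small made-up data sets (the heights, weights, and shot results are hypothetical). The gradient b is the slope of a simple least-squares line of weight on height.

```python
import statistics

# Hypothetical sample data, for illustration only.
heights_cm = [188, 191, 198, 201, 206, 211]
weights_kg = [84, 88, 95, 100, 108, 115]
shots = [1, 0, 1, 1, 0, 1, 0, 1]  # 1 = made, 0 = missed

n = len(heights_cm)
x_bar = statistics.mean(heights_cm)   # sample mean
s = statistics.stdev(heights_cm)      # sample standard deviation
y_bar = statistics.mean(weights_kg)
s_y = statistics.stdev(weights_kg)

# Sample covariance between height and weight (n - 1 in the denominator).
cov = sum((x - x_bar) * (y - y_bar)
          for x, y in zip(heights_cm, weights_kg)) / (n - 1)

r = cov / (s * s_y)            # sample correlation
b = cov / s ** 2               # sample gradient (slope of weight on height)
p = sum(shots) / len(shots)    # sample proportion of made shots

print(f"x_bar={x_bar:.2f}  s={s:.2f}  r={r:.3f}  b={b:.3f}  p={p:.3f}")
```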

5. Hypothesis Testing

Hypothesis testing is a formal method to test a claim or assumption about a population parameter using sample data.

  • Null Hypothesis (H₀): A statement of no effect or no difference, which we assume to be true initially. It's a conservative starting point.
    • Example: Meyers Leonard's long-term three-point percentage (θ) is less than or equal to 50% (θ ≤ 0.5).
  • Alternative Hypothesis (H₁ or Hₐ): The claim we are seeking evidence for.
    • Example: Meyers Leonard's long-term three-point percentage (θ) is greater than 50% (θ > 0.5).
  • The Process:
    1. Set up the null and alternative hypotheses.
    2. Collect sample data and calculate a test statistic.
    3. Determine if the sample data is extreme enough to reject the null hypothesis.
  • Rejection Region: A range of values for the test statistic that are considered too extreme to be likely if the null hypothesis were true.
  • Level of Significance (α): The probability of rejecting the null hypothesis when it is actually true (a Type I error), commonly set at 5% (0.05). In the one-sided example above, we reject H₀ if the sample statistic falls in the most extreme 5% of the upper tail of its distribution under the null hypothesis.
  • Key Principles:
    • Never Prove: Hypothesis tests do not prove anything; they provide evidence for or against a hypothesis.
    • Do Not Accept: We either "reject the null hypothesis" or "fail to reject the null hypothesis." We never "accept" the null hypothesis, as we might simply lack sufficient evidence. This is analogous to a court finding a defendant "not guilty" (lack of evidence for guilt) rather than "innocent."
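
The rejection region for the one-sided example above can be sketched with an exact binomial calculation. This assumes each shot is an independent attempt with success probability 0.5 under the null hypothesis, which is a modelling assumption rather than something stated in the source. The function names upper_tail and critical_value are hypothetical.

```python
from math import comb

def upper_tail(k, n, theta0=0.5):
    """P(X >= k) when X ~ Binomial(n, theta0)."""
    return sum(comb(n, i) * theta0 ** i * (1 - theta0) ** (n - i)
               for i in range(k, n + 1))

def critical_value(n, alpha=0.05, theta0=0.5):
    """Smallest number of makes whose upper-tail probability is <= alpha."""
    for k in range(n + 1):
        if upper_tail(k, n, theta0) <= alpha:
            return k
    return None  # no outcome is extreme enough at this alpha

n_attempts = 15
k_crit = critical_value(n_attempts)
print(f"Reject H0 if makes >= {k_crit} out of {n_attempts}")
```

With 15 attempts and α = 0.05, the rejection region starts at 12 makes. A 0.60 success rate from 15 attempts corresponds to 9 makes, which falls outside that region, so H₀ would not be rejected.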

6. P-values

The p-value measures how extreme the sample data is relative to the null hypothesis.

  • Definition: The probability of observing a sample statistic as extreme as, or more extreme than, the one obtained, assuming the null hypothesis is true.
  • Interpretation:
    • Small p-value (e.g., < 0.05): Indicates that the observed sample data is very unlikely under the null hypothesis. This provides strong evidence against the null hypothesis and for the alternative hypothesis. We reject H₀.
    • Large p-value (e.g., > 0.05): Indicates that the observed sample data is reasonably likely under the null hypothesis. This provides weak evidence against the null hypothesis. We fail to reject H₀.
  • Relationship to Rejection Region: If the p-value is less than the chosen level of significance (α), then the test statistic falls within the rejection region, and we reject the null hypothesis.
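
As a worked illustration, the p-value for the Meyers Leonard example (0.60 from 15 attempts, taken here as 9 makes) can be computed exactly from the binomial distribution. This is a sketch under the assumption that shots are independent with success probability 0.5 at the boundary of the null hypothesis. The function name binomial_p_value is hypothetical.

```python
from math import comb

def binomial_p_value(makes, attempts, theta0=0.5):
    """One-sided p-value: P(X >= makes) if the true rate were theta0."""
    return sum(comb(attempts, i) * theta0 ** i * (1 - theta0) ** (attempts - i)
               for i in range(makes, attempts + 1))

alpha = 0.05
p_value = binomial_p_value(9, 15)  # 9 makes inferred from 0.60 x 15
reject = p_value < alpha

print(f"p-value = {p_value:.4f}")
print("reject H0" if reject else "fail to reject H0")
```

The p-value comes out at about 0.30, well above 0.05, so the sample does not provide enough evidence that the long-term percentage exceeds 50%.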

7. P-hacking (Misuse of P-values)

P-hacking is a problematic practice where researchers repeatedly test hypotheses on the same data until a statistically significant result (p < 0.05) is found.

  • The Problem: When you test many hypotheses, the probability of finding a significant result purely by chance increases. If you test 20 independent hypotheses, you'd expect about one to show a significant result (p < 0.05) just due to random variation, even if no true effect exists.
  • Good Research: Formulate a hypothesis before collecting data, then test that specific hypothesis.
  • Bad Research (P-hacking): Collect data first, then explore many potential relationships, looking for a significant p-value. This inflated significance is misleading and can invalidate research findings.
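
The arithmetic behind the 20-hypothesis example can be sketched as follows. The simulation assumes that, when no true effect exists, each test's p-value is uniformly distributed between 0 and 1, which holds for tests with a continuous test statistic. All variable names are illustrative.

```python
import random

alpha = 0.05
n_tests = 20

# Expected number of false positives when every null hypothesis is true.
expected_false_positives = n_tests * alpha

# Probability that at least one of the independent tests is "significant".
p_at_least_one = 1 - (1 - alpha) ** n_tests

# Simulation: draw a uniform p-value for each test in each mock study.
random.seed(1)
n_studies = 20000
hits = sum(
    any(random.random() < alpha for _ in range(n_tests))
    for _ in range(n_studies)
)
simulated = hits / n_studies

print(f"expected false positives: {expected_false_positives:.2f}")
print(f"P(at least one), formula:   {p_at_least_one:.3f}")
print(f"P(at least one), simulated: {simulated:.3f}")
```

So although only one false positive is expected on average, the chance of finding at least one "significant" result among 20 tests of true null hypotheses is about 64%.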
