MODULE 0910 QUESTIONS

Normal Distribution


The Most Important Distribution in Statistics

The normal distribution is a continuous, bell-shaped distribution that appears everywhere in nature and statistics. Heights, test scores, measurement errors — many real-world phenomena follow an approximately normal distribution.

The distribution is completely described by two parameters:

  • mu — the mean, which determines where the bell is centered
  • sigma — the standard deviation, which determines how wide the bell is

Notation: X ~ N(mu, sigma^2) — note sigma^2 is the variance, so sigma is the standard deviation.

Normal vs other distributions: Unlike the binomial (which counts discrete successes), the normal distribution is continuous — it applies to measurements that can take any value on a number line.

The 68-95-99.7 Rule (Empirical Rule)

This rule lets you do quick probability calculations in your head:

R
# For any normal distribution N(mu, sigma^2):
# P(mu - sigma < X < mu + sigma) approximately 0.68 (68%)
# P(mu - 2*sigma < X < mu + 2*sigma) approximately 0.95 (95%)
# P(mu - 3*sigma < X < mu + 3*sigma) approximately 0.997 (99.7%)

Example: SAT scores ~ N(1060, 195^2). About 68% of students score between 865 and 1255. About 95% score between 670 and 1450. A score above 1645 (3 SDs above mean) is in the top 0.15%.
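The empirical-rule figures in this example can be checked against exact pnorm() areas (68/95/99.7 are rounded approximations; pnorm gives the exact probabilities):

```r
# Exact probabilities for SAT scores ~ N(1060, 195^2)
pnorm(1255, mean = 1060, sd = 195) - pnorm(865, mean = 1060, sd = 195)  # within 1 SD, ~0.683
pnorm(1450, mean = 1060, sd = 195) - pnorm(670, mean = 1060, sd = 195)  # within 2 SD, ~0.954
pnorm(1645, mean = 1060, sd = 195, lower.tail = FALSE)                  # above 3 SD, ~0.0013
```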

Z-Scores: Standardization

A z-score measures how many standard deviations a value is from the mean:

z = (x - mu) / sigma

R
# Score of 85, mean = 75, sd = 10
z <- (85 - 75) / 10
z

A z-score of +1 means the value is 1 standard deviation above the mean. Standardizing converts any normal distribution to the standard normal Z ~ N(0, 1).
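One way to see this equivalence: the probability below the raw score on the original scale equals the probability below its z-score on the standard normal.

```r
z <- (85 - 75) / 10
pnorm(85, mean = 75, sd = 10)  # P(X <= 85) on the original scale
pnorm(z)                       # P(Z <= 1) on the standard normal: same value, ~0.841
```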

Why z-scores matter: They let you compare values from different scales. A z-score of 2.0 on a math test and a z-score of 2.0 on a physics test are equally impressive, even if the raw scores were very different.

R Functions for Normal Probabilities

R
# P(X <= 85) for X ~ N(75, 100)
pnorm(85, mean = 75, sd = 10)

# P(X > 80)
pnorm(80, mean = 75, sd = 10, lower.tail = FALSE)
# or: 1 - pnorm(80, mean = 75, sd = 10)

# P(65 <= X <= 85)
pnorm(85, mean = 75, sd = 10) - pnorm(65, mean = 75, sd = 10)

# Find the 90th percentile
qnorm(0.90, mean = 75, sd = 10)

pnorm vs qnorm: pnorm goes from a value to a probability (area to the left). qnorm goes from a probability to a value — it's the inverse. Use qnorm when questions ask "what score corresponds to the top 10%?"
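Because qnorm is the inverse of pnorm, feeding one into the other returns the starting point; a quick round-trip check:

```r
p <- pnorm(85, mean = 75, sd = 10)  # value -> probability (area to the left)
qnorm(p, mean = 75, sd = 10)        # probability -> value: recovers 85
```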

The Central Limit Theorem (CLT)

The CLT is arguably the most important theorem in statistics. It says:

If you take random samples of size n from any population with mean mu and standard deviation sigma, the distribution of sample means will be approximately normal with mean mu and standard error sigma/sqrt(n), as long as n is large enough (usually n >= 30).

R
# Even if the population is skewed, sample means are normal
# SE = population_sd / sqrt(n)
SE <- 15 / sqrt(100) # pop sd = 15, n = 100
SE

Why the CLT is so powerful: It lets us use normal distribution math even when we don't know the shape of the population distribution. This is the foundation for confidence intervals and hypothesis tests in the next two modules.
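A quick simulation makes this concrete (a sketch: the exponential population, sample size, and seed are illustrative choices, not from the text). The exponential distribution is strongly right-skewed, yet its sample means behave as the CLT predicts:

```r
set.seed(1)  # illustrative seed for reproducibility
# Exponential(rate = 1) population: skewed, with mu = 1 and sigma = 1
xbar <- replicate(10000, mean(rexp(50, rate = 1)))
mean(xbar)  # close to mu = 1
sd(xbar)    # close to sigma / sqrt(n) = 1 / sqrt(50), about 0.141
```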

dnorm(), pnorm(), qnorm()

dnorm(x, mean, sd) — height of the bell curve at x (density, not probability):

R
dnorm(0, mean = 0, sd = 1) # height at z = 0
dnorm(2, mean = 0, sd = 1) # height at z = 2

Important: dnorm gives height, not area. Probability is always zero for any single point in a continuous distribution.
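To see the height/area distinction directly: integrating the density over an interval recovers the probability that pnorm reports, while dnorm at a single point is only the curve's height.

```r
dnorm(0, mean = 0, sd = 1)       # height at z = 0, about 0.399 (not a probability)
integrate(dnorm, -Inf, 0)$value  # area to the left of 0, equals pnorm(0) = 0.5
```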

pnorm(q, lower.tail = FALSE) — find P(X > q):

R
pnorm(2, mean = 0, sd = 1, lower.tail = FALSE) # P(Z > 2)

Normal Approximation to the Binomial

When the sample is large enough, Binomial(n, p) is approximately N(np, np(1-p)).

Conditions: np(1-p) >= 10

R
# Check if approximation is valid
n <- 100
p <- 0.5
np_1_minus_p <- n * p * (1 - p)
np_1_minus_p >= 10

With n = 100, p = 0.5: np(1-p) = 25 (valid).

With n = 100, p = 0.01: np(1-p) = 0.99 (invalid): the binomial is too skewed for the normal approximation.
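In the valid case (n = 100, p = 0.5), comparing the exact binomial probability with its normal approximation shows how close the two are:

```r
n <- 100
p <- 0.5
pbinom(60, size = n, prob = p)                       # exact P(X <= 60)
pnorm(60, mean = n * p, sd = sqrt(n * p * (1 - p)))  # normal approximation: close to the exact value
```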

Standardization for Comparison

Z-scores let you compare values across different distributions:

R
# Alice scores 85 on a test with mean 75, sd 10
# Bob scores 38 on a test with mean 30, sd 8
alice_z <- (85 - 75) / 10
bob_z <- (38 - 30) / 8
alice_z
bob_z

Both have z = 1, so they performed equally well relative to their peers.

Four Normal Probability Cases

Every normal probability question is one of four forms:

1. P(X < a): pnorm(a, mean, sd) — left tail

2. P(X > a): pnorm(a, mean, sd, lower.tail=FALSE) — right tail

3. P(a < X < b): pnorm(b, mean, sd) - pnorm(a, mean, sd) — between

4. Find value: qnorm(p, mean, sd) — given probability, find x

Example: X ~ N(72, 8^2) — heights in inches

R
pnorm(80, 72, 8) # P(X < 80)
pnorm(60, 72, 8, lower.tail = FALSE) # P(X > 60)
pnorm(80, 72, 8) - pnorm(64, 72, 8) # P(64 < X < 80)
qnorm(0.95, 72, 8) # 95th percentile

Note: P(64 < X < 80) = 0.6827 because 64 = mu-sigma and 80 = mu+sigma, so this is the ±1 sigma range (68-95-99.7 rule).

CLT and Sampling Distribution of x-bar

When you take repeated samples of size n from a population with mean mu and sd sigma:

x-bar ~ N(mu, sigma^2/n), where SE = sigma/sqrt(n) is the standard error (the second parameter is a variance, matching the N(mu, sigma^2) notation above)

The sample mean is normally distributed regardless of the population distribution (if n is large enough).

R
# Heights ~ N(68, 3^2), taking samples of n=36
SE <- 3 / sqrt(36)
SE
pnorm(69, mean = 68, sd = SE, lower.tail = FALSE)

Key insight: Doubling n reduces SE by a factor of sqrt(2) (about 1.41), and quadrupling n cuts SE in half. Larger samples give more precise estimates of mu.
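This scaling can be checked directly, assuming a population sd of 15 (the value used in the earlier CLT example):

```r
sigma <- 15
sigma / sqrt(25)   # n = 25  -> SE = 3
sigma / sqrt(50)   # n = 50  -> SE about 2.12, i.e. 3 / sqrt(2)
sigma / sqrt(100)  # n = 100 -> SE = 1.5, half of 3: quadrupling n halves SE
```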