The normal distribution is a continuous, bell-shaped distribution that appears everywhere in nature and statistics. Heights, test scores, measurement errors — many real-world phenomena follow an approximately normal distribution.
The distribution is completely described by two parameters: the mean mu (its center) and the standard deviation sigma (its spread).
Notation: X ~ N(mu, sigma^2) — note sigma^2 is the variance, so sigma is the standard deviation.
Normal vs other distributions: Unlike the binomial (which counts discrete successes), the normal distribution is continuous — it applies to measurements that can take any value on a number line.
The 68-95-99.7 rule (about 68% of values fall within 1 SD of the mean, 95% within 2 SDs, and 99.7% within 3 SDs) lets you do quick probability calculations in your head:
Example: SAT scores ~ N(1060, 195^2). About 68% of students score between 865 and 1255. About 95% score between 670 and 1450. A score above 1645 (3 SDs above mean) is in the top 0.15%.
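These empirical-rule figures can be checked directly with pnorm, using the mean 1060 and SD 195 from the example:

```r
# SAT scores: X ~ N(1060, 195^2)
mu <- 1060
sigma <- 195

# P(865 < X < 1255): within 1 SD of the mean
pnorm(1255, mu, sigma) - pnorm(865, mu, sigma)   # ~0.6827

# P(670 < X < 1450): within 2 SDs
pnorm(1450, mu, sigma) - pnorm(670, mu, sigma)   # ~0.9545

# P(X > 1645): more than 3 SDs above the mean
pnorm(1645, mu, sigma, lower.tail = FALSE)       # ~0.00135
```

The exact values (0.6827, 0.9545, 0.0013) are slightly more precise than the rounded 68/95/0.15 figures in the rule.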
A z-score measures how many standard deviations a value is from the mean:
z = (x - mu) / sigma
A z-score of +1 means the value is 1 standard deviation above the mean. Standardizing converts any normal distribution to the standard normal Z ~ N(0, 1).
Why z-scores matter: They let you compare values from different scales. A z-score of 2.0 on a math test and a z-score of 2.0 on a physics test are equally impressive, even if the raw scores were very different.
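A minimal sketch of standardization in R (the z_score helper name is ours, not a base R function):

```r
# Standardize: how many SDs is x from the mean?
z_score <- function(x, mu, sigma) (x - mu) / sigma

# SAT example from earlier: a score of 1255 with mu = 1060, sigma = 195
z_score(1255, 1060, 195)   # 1, i.e. one SD above the mean
```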
pnorm vs qnorm: pnorm goes from a value to a probability (area to the left). qnorm goes from a probability to a value — it's the inverse. Use qnorm when questions ask "what score corresponds to the top 10%?"
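For instance, with the SAT distribution N(1060, 195^2) from earlier, the score cutting off the top 10% comes from qnorm:

```r
# "What score corresponds to the top 10%?" = the 90th percentile
qnorm(0.90, mean = 1060, sd = 195)

# Equivalently, ask for the upper tail directly
qnorm(0.10, mean = 1060, sd = 195, lower.tail = FALSE)

# Round trip: pnorm undoes qnorm
pnorm(qnorm(0.90, 1060, 195), 1060, 195)   # 0.9
```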
The CLT is arguably the most important theorem in statistics. It says:
If you take random samples of size n from any population with mean mu and standard deviation sigma, the distribution of sample means will be approximately normal with mean mu and standard error sigma/sqrt(n), as long as n is large enough (usually n >= 30).
Why the CLT is so powerful: It lets us use normal distribution math even when we don't know the shape of the population distribution. This is the foundation for confidence intervals and hypothesis tests in the next two modules.
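A quick simulation illustrates the point: sample means drawn from a heavily right-skewed exponential population still come out approximately normal (the sample size 30 and the 10,000 replicates below are arbitrary choices for the demo):

```r
set.seed(42)

# Population: exponential with rate 1 (mean 1, sd 1), strongly right-skewed
n <- 30
sample_means <- replicate(10000, mean(rexp(n, rate = 1)))

mean(sample_means)   # close to mu = 1
sd(sample_means)     # close to sigma / sqrt(n) = 1 / sqrt(30), about 0.183
hist(sample_means)   # roughly bell-shaped despite the skewed population
```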
dnorm(x, mean, sd) — height of the bell curve at x (density, not probability):
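For example, on the standard normal:

```r
# Height of the standard normal curve at its peak
dnorm(0, mean = 0, sd = 1)   # 1/sqrt(2*pi), about 0.3989

# Density one SD out: lower, but still a height, not a probability
dnorm(1, mean = 0, sd = 1)   # about 0.2420
```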
Important: dnorm gives height, not area. Probability is always zero for any single point in a continuous distribution.
pnorm(q, lower.tail = FALSE) — find P(X > q):
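For example:

```r
# P(Z > 1.645) for the standard normal
pnorm(1.645, lower.tail = FALSE)   # about 0.05

# Same thing via the complement of the left tail
1 - pnorm(1.645)                   # about 0.05

# SAT example: P(X > 1450), i.e. more than 2 SDs above the mean
pnorm(1450, mean = 1060, sd = 195, lower.tail = FALSE)   # about 0.0228
```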
When the sample is large enough, Binomial(n, p) approximately N(np, np(1-p)).
Conditions: np(1-p) >= 10
With n = 100, p = 0.5: np(1-p) = 25 (valid).
With n = 100, p = 0.01: np(1-p) = 0.99 (invalid) — the binomial is too skewed for the normal approximation.
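To see the approximation in action for the valid case n = 100, p = 0.5, compare the exact binomial tail with its normal approximation (the 0.5 added below is a continuity correction, a standard refinement when approximating a discrete distribution with a continuous one):

```r
n <- 100
p <- 0.5
mu <- n * p                      # 50
sigma <- sqrt(n * p * (1 - p))   # 5

# Exact: P(X <= 55) under Binomial(100, 0.5)
pbinom(55, size = n, prob = p)

# Normal approximation with continuity correction
pnorm(55.5, mean = mu, sd = sigma)
```

The two numbers agree to about three decimal places.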
Z-scores let you compare values across different distributions: if a math score and a physics score each sit one standard deviation above their respective class means, both have z = 1, so the two students performed equally well relative to their peers.
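As a sketch with made-up numbers (the scores and class statistics below are hypothetical):

```r
# Math: scored 88 in a class with mean 80, sd 8
z_math <- (88 - 80) / 8       # 1

# Physics: scored 65 in a class with mean 60, sd 5
z_physics <- (65 - 60) / 5    # 1

z_math == z_physics           # TRUE: equally impressive performances
```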
Every normal probability question is one of four forms:
1. P(X < a): pnorm(a, mean, sd) — left tail
2. P(X > a): pnorm(a, mean, sd, lower.tail=FALSE) — right tail
3. P(a < X < b): pnorm(b, mean, sd) - pnorm(a, mean, sd) — between
4. Find value: qnorm(p, mean, sd) — given probability, find x
Example: X ~ N(72, 8^2) — heights in inches
Note: P(64 < X < 80) = 0.6827 because 64 = mu-sigma and 80 = mu+sigma, so this is the ±1 sigma range (68-95-99.7 rule).
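All four forms applied to X ~ N(72, 8^2):

```r
mu <- 72
sigma <- 8

pnorm(64, mu, sigma)                          # 1. P(X < 64), about 0.1587
pnorm(80, mu, sigma, lower.tail = FALSE)      # 2. P(X > 80), about 0.1587
pnorm(80, mu, sigma) - pnorm(64, mu, sigma)   # 3. P(64 < X < 80), about 0.6827
qnorm(0.90, mu, sigma)                        # 4. 90th percentile, about 82.25
```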
When you take repeated samples of size n from a population with mean mu and sd sigma:
x-bar ~ N(mu, sigma^2/n) — that is, the standard deviation of x-bar is the standard error SE = sigma/sqrt(n).
The sample mean is normally distributed regardless of the population distribution (if n is large enough).
Key insight: Doubling n reduces SE by a factor of sqrt(2), approximately 1.41. Quadrupling n cuts SE in half. Larger samples give more precise estimates of mu.
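The scaling of the standard error with n can be verified directly (sigma = 10 is an arbitrary choice, and the se helper name is ours):

```r
sigma <- 10
se <- function(n) sigma / sqrt(n)

se(25)    # 2
se(50)    # about 1.414: doubling n shrinks SE by sqrt(2)
se(100)   # 1: quadrupling n halves SE
```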