Hover any formula for a plain-English explanation and guidance on when to use it.
Descriptive Statistics
4 formulas
Ch.3 Sample Mean
$$\bar{Y} = \frac{\sum Y_i}{n}$$
The arithmetic average — add all observations and divide by the count. Best estimate of the population mean μ.
When
Always the starting point. Pair with the standard error to quantify precision.
Ch.3 Standard Deviation (conceptual)
$$s = \sqrt{\frac{\sum (Y_i - \bar{Y})^2}{n - 1}}$$
How spread out data are around the mean. Dividing by n−1 corrects for bias (Bessel's correction).
When
Quantifying variability. Used in t-tests, CIs, and ANOVA.
Ch.3 Standard Deviation (computational)
$$s = \sqrt{\frac{\sum Y_i^2 - n\bar{Y}^2}{n - 1}}$$
Algebraically equivalent but avoids rounding errors when computing by hand. ΣYᵢ² = square each value then sum.
When
Preferred for hand calculations from raw data.
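A minimal Python sketch of the mean and both standard deviation formulas, to show that the conceptual and computational forms agree. The function names and the data values are hypothetical, chosen only for illustration.

```python
import math

def sample_mean(ys):
    # Add all observations and divide by the count.
    return sum(ys) / len(ys)

def sd_conceptual(ys):
    # Sum of squared deviations from the mean, divided by n - 1.
    ybar = sample_mean(ys)
    return math.sqrt(sum((y - ybar) ** 2 for y in ys) / (len(ys) - 1))

def sd_computational(ys):
    # Sum of squares minus n * mean^2, divided by n - 1.
    n = len(ys)
    ybar = sample_mean(ys)
    return math.sqrt((sum(y * y for y in ys) - n * ybar ** 2) / (n - 1))

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical sample
ybar = sample_mean(data)
s1 = sd_conceptual(data)
s2 = sd_computational(data)
print(ybar, s1, s2)
```

In floating-point code the conceptual form is usually the safer choice, because the computational form subtracts two large, nearly equal sums and can lose digits.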
Ch.4 Standard Error of the Mean
$$SE_{\bar{Y}} = \frac{s}{\sqrt{n}}$$
How much Ȳ varies from sample to sample. Larger n → smaller SE. Because SE shrinks with √n, halving the SE (doubling precision) requires quadrupling n.
When
Building confidence intervals for μ, or as the t-test denominator.
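A small sketch of the standard error, assuming the standard deviation s is already known. The helper name and the numbers are hypothetical; the example only illustrates the square-root relationship between n and SE.

```python
import math

def standard_error(s, n):
    # SE of the mean: sample standard deviation divided by sqrt(n).
    return s / math.sqrt(n)

se_n25 = standard_error(2.0, 25)    # hypothetical s = 2.0, n = 25
se_n100 = standard_error(2.0, 100)  # same s, four times the sample size
print(se_n25, se_n100)
```

With s held fixed, quadrupling n from 25 to 100 cuts the SE in half.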
Probability Distributions
4 formulas
Ch.7 Binomial Distribution
$$\Pr[X = x] = \binom{n}{x} p^x (1 - p)^{n - x}$$
Probability of exactly x successes in n independent trials, each with probability p. Mean = np, Variance = np(1−p).
When
Counting successes in a fixed number of binary trials. E.g. # of mutant offspring.
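A short sketch of the binomial formula using only the Python standard library. The function name and the values of n and p are hypothetical.

```python
import math

def binomial_pmf(x, n, p):
    # Probability of exactly x successes in n independent trials.
    return math.comb(n, x) * p ** x * (1 - p) ** (n - x)

n, p = 4, 0.5  # hypothetical: 4 trials, success probability 0.5
probs = [binomial_pmf(x, n, p) for x in range(n + 1)]
mean = sum(x * pr for x, pr in enumerate(probs))
variance = sum((x - mean) ** 2 * pr for x, pr in enumerate(probs))
print(probs, mean, variance)
```

Summing over all x recovers the stated mean np and variance np(1−p).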
Ch.8 Poisson Distribution
$$\Pr[X = x] = \frac{e^{-\mu}\,\mu^x}{x!}$$
Probability of x events when the average rate is μ. Key property: mean = variance = μ.
When
Counting rare, random events per unit time or space. Test fit with χ² GOF.
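A sketch of the Poisson formula that also checks the mean = variance property numerically. The function name and the rate μ = 2 are hypothetical, and the sums are truncated at x = 59, where the remaining probability is negligible for this rate.

```python
import math

def poisson_pmf(x, mu):
    # Probability of exactly x events when the average rate is mu.
    return mu ** x * math.exp(-mu) / math.factorial(x)

mu = 2.0  # hypothetical average rate
xs = range(60)  # truncated support; the tail beyond this is negligible here
mean = sum(x * poisson_pmf(x, mu) for x in xs)
variance = sum((x - mean) ** 2 * poisson_pmf(x, mu) for x in xs)
print(mean, variance)
```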
Ch.10 Normal Distribution
$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x - \mu)^2}{2\sigma^2}}$$
The bell curve, described by mean μ and variance σ². About 68% of values fall within 1σ of the mean, about 95% within 2σ, and about 99.7% within 3σ.
When
Describing symmetric continuous measurements. Many parametric tests assume normally distributed errors/residuals, or normality within groups, especially for small samples.
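A sketch of the normal density and the 68/95/99.7 rule, using `statistics.NormalDist` from the Python standard library (3.8+) for the cumulative probabilities. The helper names `normal_pdf` and `coverage` are hypothetical.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma2):
    # Density of a normal distribution with mean mu and variance sigma2.
    return math.exp(-((x - mu) ** 2) / (2 * sigma2)) / math.sqrt(2 * math.pi * sigma2)

def coverage(k):
    # Probability that a normal value lies within k standard deviations of its mean.
    z = NormalDist()  # standard normal: mean 0, standard deviation 1
    return z.cdf(k) - z.cdf(-k)

print(normal_pdf(0.0, 0.0, 1.0))
print(coverage(1), coverage(2), coverage(3))
```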
Ch.5 Bayes' Theorem
$$\Pr[A \mid B] = \frac{\Pr[B \mid A] \cdot \Pr[A]}{\Pr[B]}$$
Updates probability of A given evidence B. Pr[A] = prior, Pr[B|A] = likelihood, Pr[A|B] = posterior.
When
Reversing conditional probabilities. E.g. given a positive test, actual probability of disease?
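A sketch of the disease-testing example. All three input probabilities are hypothetical, and Pr[B] is expanded with the law of total probability, which is standard but is an addition to the formula as written here.

```python
import math

def posterior(prior, likelihood, false_positive_rate):
    # Pr[B]: positive test among the diseased plus positive test among the healthy.
    pr_b = likelihood * prior + false_positive_rate * (1 - prior)
    # Bayes' theorem: Pr[A|B] = Pr[B|A] * Pr[A] / Pr[B].
    return likelihood * prior / pr_b

# Hypothetical values: 1% prevalence, 90% sensitivity, 5% false-positive rate.
pr_disease_given_positive = posterior(0.01, 0.90, 0.05)
print(pr_disease_given_positive)
```

With these made-up numbers the posterior is about 0.15, far below the 90% sensitivity, because the disease is rare.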
Confidence Intervals
4 formulas
Ch.4 CI for the Mean
$$\bar{Y} \pm SE_{\bar{Y}} \cdot t_{\alpha(2),\,df}$$
95% CI (α = 0.05): if the study were repeated many times, about 95% of the intervals built this way would contain the true μ. df = n−1.
When
After estimating a mean. Report as: Ȳ = X (95% CI: lower, upper).
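A sketch of the interval calculation. The Python standard library has no t distribution, so the critical value is passed in as read from a t table; the data, the function name, and the constant name are hypothetical.

```python
import math
import statistics

def ci_mean(ys, t_crit):
    # Ybar +/- SE * t, where t_crit is t_{alpha(2), df} with df = n - 1.
    n = len(ys)
    ybar = statistics.mean(ys)
    se = statistics.stdev(ys) / math.sqrt(n)  # stdev divides by n - 1
    return ybar - t_crit * se, ybar + t_crit * se

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical sample, n = 8
T_CRIT_DF7 = 2.365  # two-sided 5% critical value for df = 7, from a t table
lower, upper = ci_mean(data, T_CRIT_DF7)
print(lower, upper)
```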
Ch.12 CI for Difference in Means
$$(\bar{Y}_1 - \bar{Y}_2) \pm SE_{\bar{Y}_1 - \bar{Y}_2} \cdot t_{\alpha(2),\,df}$$
Interval for the true difference μ₁ − μ₂. If it excludes 0, the difference is significant at level α.
When
After a two-sample t-test to report the plausible range of the difference.
Ch.7 Agresti-Coull (Proportion CI)
$$\tilde{p} = \frac{X + 2}{n + 4}, \qquad \tilde{p} \pm 1.96\sqrt{\frac{\tilde{p}(1 - \tilde{p})}{n + 4}}$$
Better than the Wald interval. Adding 2 phantom successes and 2 failures stabilizes the interval near 0 or 1.
When
Estimating a population proportion p. Always preferred over Wald CI in BIOL 300.
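A sketch of the Agresti-Coull calculation for a 95% interval. The function name and the counts (8 successes in 20 trials) are hypothetical.

```python
import math

def agresti_coull_95(x, n):
    # Add 2 phantom successes and 2 phantom failures, then build a Wald-style interval.
    p_tilde = (x + 2) / (n + 4)
    half_width = 1.96 * math.sqrt(p_tilde * (1 - p_tilde) / (n + 4))
    return p_tilde, p_tilde - half_width, p_tilde + half_width

p_tilde, lower, upper = agresti_coull_95(8, 20)  # hypothetical: 8 successes in 20 trials
print(p_tilde, lower, upper)
```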
Ch.11 CI for the Variance
$$\frac{df \cdot s^2}{\chi^2_{\alpha/2,\,df}} \le \sigma^2 \le \frac{df \cdot s^2}{\chi^2_{1-\alpha/2,\,df}}$$
Uses two χ² critical values (asymmetric interval because χ² is skewed). df = n−1.
When
Estimating population variance σ² directly from a single sample.
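A sketch of the variance interval with the two χ² critical values supplied from a table. It assumes the subscript gives the area to the right of the critical value, so χ²(α/2) is the larger value and produces the lower bound. The sample variance, the function name, and the parameter names are hypothetical.

```python
import math

def ci_variance(s2, df, chi2_upper_tail, chi2_lower_tail):
    # chi2_upper_tail: critical value with alpha/2 area to its right (the larger value).
    # chi2_lower_tail: critical value with 1 - alpha/2 area to its right (the smaller value).
    return df * s2 / chi2_upper_tail, df * s2 / chi2_lower_tail

# Hypothetical sample: s^2 = 4.0 with n = 10, so df = 9.
# Table values for df = 9: 19.02 (area 0.025 to the right), 2.70 (area 0.975 to the right).
lower, upper = ci_variance(4.0, 9, 19.02, 2.70)
print(lower, upper)
```

The interval is not centred on s², which reflects the skew of the χ² distribution.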
Chi-Square & Proportions
3 formulas
Ch.8 Chi-Square Statistic
$$\chi^2 = \sum \frac{(O_i - E_i)^2}{E_i}$$
Larger χ² = more departure from H₀. For goodness-of-fit, df = (categories − 1) − (number of parameters estimated from the data); for a contingency table, df = (rows − 1)(columns − 1). Assumptions: expected count > 1 in all cells; no more than 20% of cells have expected count < 5.
When
Testing fit to a theoretical distribution (GOF) or independence in a contingency table.
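A sketch of the χ² statistic and the goodness-of-fit degrees of freedom. The function names and the observed and expected counts are hypothetical.

```python
import math

def chi_square_statistic(observed, expected):
    # Sum of (O - E)^2 / E over all categories.
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def gof_df(n_categories, n_params_estimated=0):
    # Goodness-of-fit df: (categories - 1) minus parameters estimated from the data.
    return n_categories - 1 - n_params_estimated

observed = [30, 50, 20]  # hypothetical counts
expected = [25, 50, 25]  # hypothetical expected counts under H0
chi2 = chi_square_statistic(observed, expected)
df = gof_df(len(observed))
print(chi2, df)
```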
Ch.9 Odds Ratio
$$OR = \frac{a \cdot d}{b \cdot c}$$
From a 2×2 table (a = top-left, b = top-right, c = bottom-left, d = bottom-right). OR > 1: the odds of the event are higher in group 1. OR = 1: no association.
When
Measuring association between two binary variables in a 2×2 contingency table.
Ch.9 CI for Odds Ratio
$$\ln(OR) \pm Z_{\alpha} \cdot SE[\ln(OR)]$$
The CI is built on the log scale (where ln(OR) is approximately normal), then exponentiated. If CI excludes 1, association is significant.
When
Reporting uncertainty around an estimated odds ratio. If CI includes 1, no significant association.
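A sketch of the odds ratio and its 95% CI. The standard error used here, √(1/a + 1/b + 1/c + 1/d), is the usual large-sample formula; it does not appear on this sheet and is an assumption, as is Z = 1.96 for 95%. The function names and cell counts are hypothetical.

```python
import math

def odds_ratio(a, b, c, d):
    # 2x2 table: a = top-left, b = top-right, c = bottom-left, d = bottom-right.
    return (a * d) / (b * c)

def odds_ratio_ci(a, b, c, d, z=1.96):
    # Build the interval on the log scale, then exponentiate back.
    log_or = math.log(odds_ratio(a, b, c, d))
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # assumed SE formula
    return math.exp(log_or - z * se_log_or), math.exp(log_or + z * se_log_or)

a, b, c, d = 20, 10, 10, 20  # hypothetical cell counts
or_hat = odds_ratio(a, b, c, d)
lower, upper = odds_ratio_ci(a, b, c, d)
print(or_hat, lower, upper)
```

Because the interval is symmetric on the log scale, it is asymmetric around the OR after exponentiation.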
t-Tests
7 formulas
Ch.11 One-Sample t
$$t = \frac{\bar{Y} - \mu_0}{s / \sqrt{n}}$$
Tests H₀: μ = μ₀. Large |t| means Ȳ is far from μ₀ in SE units. df = n−1.
When
Comparing a sample mean to a specific hypothesized value. One group, one mean.
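A sketch of the one-sample t statistic. The data, the hypothesized mean μ₀ = 3, and the function name are hypothetical; the statistic still has to be compared against a t table with df = n−1.

```python
import math
import statistics

def one_sample_t(ys, mu0):
    # t = (Ybar - mu0) / (s / sqrt(n)), with df = n - 1.
    n = len(ys)
    se = statistics.stdev(ys) / math.sqrt(n)
    return (statistics.mean(ys) - mu0) / se, n - 1

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # hypothetical sample
t_stat, df = one_sample_t(data, 3.0)
print(t_stat, df)
```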
Ch.12 Pooled Sample Variance
$$s_p^2 = \frac{df_1 s_1^2 + df_2 s_2^2}{df_1 + df_2}$$
Weighted average of both groups' variances. Valid only when assuming σ₁² = σ₂².
When
First step in the pooled two-sample t-test.
Ch.12 SE for Pooled Two-Sample t
$$SE_{\bar{Y}_1 - \bar{Y}_2} = \sqrt{s_p^2 \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}$$
Standard error of the difference in means, assuming equal variances.
When
Part of the pooled two-sample t-test, after computing $s_p^2$.
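A sketch of the two pooled steps in sequence: the pooled variance first, then the standard error of the difference. The function names, group variances, and sample sizes are hypothetical.

```python
import math

def pooled_variance(s1_sq, n1, s2_sq, n2):
    # Average of the two sample variances, weighted by degrees of freedom.
    df1, df2 = n1 - 1, n2 - 1
    return (df1 * s1_sq + df2 * s2_sq) / (df1 + df2)

def se_difference_pooled(sp_sq, n1, n2):
    # SE of (Ybar1 - Ybar2), assuming equal population variances.
    return math.sqrt(sp_sq * (1 / n1 + 1 / n2))

# Hypothetical groups: s1^2 = 4.0 with n1 = 10, s2^2 = 6.0 with n2 = 12.
sp_sq = pooled_variance(4.0, 10, 6.0, 12)
se_diff = se_difference_pooled(sp_sq, 10, 12)
print(sp_sq, se_diff)
```

The pooled variance always falls between the two group variances, closer to the group with more degrees of freedom.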