Statistical Power Calculator
Determine the statistical power of a hypothesis test given effect size, sample size, and alpha level. Use it when designing experiments to confirm you have enough participants to detect a real effect.
About this calculator
Statistical power (1 − β) is the probability that a test correctly rejects a false null hypothesis. It depends on three inputs: the significance level α, the effect size (Cohen's d), and the sample size per group (n). For a two-sample z-test with equal group sizes, first find the critical z-value from α (e.g., z_critical = 1.96 for α = 0.05, two-tailed), then compute the z-score that bounds the Type II error region: β_z = z_critical − d × √(n/2). Power is 1 − Φ(β_z), where Φ is the standard normal cumulative distribution function. This form ignores the negligible chance of rejecting in the opposite tail. Larger effect sizes and larger samples both increase power. The conventional target is 0.80, meaning an 80% chance of detecting a true effect of the assumed size. Running a power analysis before data collection helps avoid underpowered studies that waste resources and produce inconclusive results.
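As a rough illustration, the formula above can be sketched in Python using only the standard library. The function name z_test_power and its parameter names are hypothetical, chosen for this example; the sketch assumes equal group sizes and a two-tailed test, and it is not necessarily how this calculator is implemented.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(d: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-tailed, two-sample z-test (equal groups)."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    # Critical value for a two-tailed test, about 1.96 when alpha = 0.05.
    z_critical = std_normal.inv_cdf(1 - alpha / 2)
    # beta_z = z_critical - d * sqrt(n / 2), as in the text.
    beta_z = z_critical - d * sqrt(n_per_group / 2)
    # Power is the area above beta_z; the far-tail rejection region is ignored.
    return 1 - std_normal.cdf(beta_z)


print(round(z_test_power(d=0.5, n_per_group=64), 3))
```

With d = 0.5 and 64 participants per group this prints a value of about 0.807.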
How to use
Suppose you expect a medium effect size of Cohen's d = 0.5, plan n = 64 participants per group, and use α = 0.05 (two-tailed). Step 1: critical z = 1.96. Step 2: β_z = 1.96 − 0.5 × √(64/2) = 1.96 − 0.5 × 5.657 = 1.96 − 2.828 = −0.868. Step 3: Power = 1 − Φ(−0.868) ≈ 1 − 0.193 = 0.807, or about 80.7%. This meets the conventional 0.80 threshold, confirming the sample size is adequate.
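The three steps can be checked numerically with a short Python sketch. The variable names are illustrative, and the critical value is hard-coded to the rounded 1.96 used in the example rather than derived from α.

```python
from math import sqrt
from statistics import NormalDist

d = 0.5            # expected effect size (Cohen's d)
n_per_group = 64   # planned participants per group

# Step 1: critical z for alpha = 0.05, two-tailed (rounded, as in the text).
z_critical = 1.96

# Step 2: subtract the effect-size term d * sqrt(n / 2).
shift = d * sqrt(n_per_group / 2)
beta_z = z_critical - shift

# Step 3: power is 1 minus the standard normal CDF at beta_z.
power = 1 - NormalDist().cdf(beta_z)

print(f"shift  = {shift:.3f}")
print(f"beta_z = {beta_z:.3f}")
print(f"power  = {power:.3f}")
```

The printed values should match the hand calculation to three decimal places.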
Frequently asked questions
What is a good statistical power level for a research study?
The widely accepted minimum is 0.80 (80%), meaning the study has an 80% chance of detecting a true effect if one exists. Funding agencies and reviewers commonly expect a study design to meet this threshold. High-stakes medical or safety research often targets 0.90 or higher to further reduce the risk of a false negative. A study with power below 0.80 is conventionally considered underpowered: it has more than a 20% chance of missing a real difference of the assumed size.
How does effect size affect statistical power in hypothesis testing?
Effect size (Cohen's d) expresses the true difference between group means in units of the standard deviation. Larger effect sizes make real effects easier to detect, directly increasing power. By Cohen's conventional benchmarks, a d of 0.2 is considered small, 0.5 medium, and 0.8 large. If you expect a small effect, you need a substantially larger sample to reach the same power as a study with a large expected effect.
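To make the trade-off concrete, the z-test power formula can be rearranged to give the sample size needed for a target power: n ≥ 2 × ((z_critical + z_power) / d)², where z_power is the standard normal quantile of the target power. The Python sketch below applies this under the same two-sample z-test approximation; the function name n_per_group_for_power is hypothetical, and a t-test based calculation would give slightly larger numbers.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_for_power(d: float, power: float = 0.80, alpha: float = 0.05) -> int:
    """Smallest n per group reaching the target power (z-test approximation)."""
    std_normal = NormalDist()
    z_critical = std_normal.inv_cdf(1 - alpha / 2)  # two-tailed critical value
    z_power = std_normal.inv_cdf(power)             # quantile of the target power
    # Solve z_critical - d * sqrt(n / 2) <= -z_power for n, then round up.
    return ceil(2 * ((z_critical + z_power) / d) ** 2)


for label, d in (("small", 0.2), ("medium", 0.5), ("large", 0.8)):
    print(f"{label:>6} effect (d = {d}): n = {n_per_group_for_power(d)} per group")
```

Under these assumptions, a small effect needs 393 participants per group, a medium effect 63, and a large effect 25 to reach 0.80 power.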
Why does increasing sample size increase statistical power?
A larger sample size reduces the standard error of the difference between group means, making the sampling distribution narrower and allowing smaller true effects to stand out from random noise. In the power formula, sample size appears under a square root in the term d × √(n/2), so it is this term, not power itself, that grows with the square root of n. Power is bounded by 1, so doubling the sample size does not double power, but it always pushes power closer to 1. This is why power analysis is conducted before data collection: to determine the minimum n needed.
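A short Python sketch shows the diminishing returns from doubling the sample size. It reuses the two-sample z-test formula with d = 0.5 and α = 0.05; the helper name z_test_power is hypothetical and the sample sizes are chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist


def z_test_power(d: float, n_per_group: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-tailed, two-sample z-test (equal groups)."""
    std_normal = NormalDist()
    z_critical = std_normal.inv_cdf(1 - alpha / 2)
    beta_z = z_critical - d * sqrt(n_per_group / 2)
    return 1 - std_normal.cdf(beta_z)


# Double the per-group sample size twice and compare the resulting power.
powers = {n: z_test_power(0.5, n) for n in (32, 64, 128)}
for n, p in powers.items():
    print(f"n = {n:>3} per group -> power = {p:.3f}")
```

Power rises from roughly 0.52 at n = 32 to about 0.81 at n = 64 and about 0.98 at n = 128: each doubling helps, but by a shrinking amount as power approaches 1.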