Hardy-Weinberg Equilibrium: How to Calculate Genotype Frequencies in a Population
If a gene comes in two versions, what proportion of a population should carry one copy, two copies, or none at all? In 1908, mathematician G. H. Hardy and physician Wilhelm Weinberg independently answered this with one of the most elegant equations in biology. The Hardy-Weinberg principle states that, absent any evolutionary pressure, allele and genotype frequencies in a population stay constant from one generation to the next — and it gives you a formula to predict exactly what those frequencies should be. This guide shows you how to calculate them, why the equation always sums to 1, and how biologists use deviations from it to detect evolution in action.
What Hardy-Weinberg Equilibrium Is and Why It Matters
Consider a single gene with two alleles: a dominant allele with frequency p and a recessive allele with frequency q. Every individual carries two copies of the gene, and under random mating those two copies are effectively drawn at random from the population's allele pool, so the genotypes are distributed according to p² + 2pq + q² = 1. Here p² is the fraction of homozygous dominant individuals, 2pq is the fraction of heterozygotes, and q² is the fraction of homozygous recessive individuals.
The principle matters because it is the null hypothesis of population genetics: the baseline of "nothing is happening." It holds only when five conditions are met: no mutation, no migration, no natural selection, random mating, and an infinitely large population (so no genetic drift). Real populations never perfectly satisfy all five, which is exactly the point. When observed frequencies deviate from what Hardy-Weinberg predicts, the deviation signals that at least one of the five conditions is being violated. The equation gives biologists a precise expectation to measure reality against.
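The two claims made so far, that the genotype frequencies sum to 1 and that allele frequencies stay constant across generations, both follow from a short piece of algebra. The sketch below assumes all five conditions hold; the symbol p' (the dominant allele frequency in the next generation) is notation introduced here for illustration and does not appear elsewhere in this guide.

```latex
\begin{align*}
(p + q)^2 &= p^2 + 2pq + q^2 = 1^2 = 1 \\
p' &= p^2 + \tfrac{1}{2}(2pq) = p(p + q) = p
\end{align*}
```

The first line is the square of p + q = 1, which is why the three genotype terms always total 1. The second line counts the dominant alleles passed on: homozygous dominant parents contribute only dominant alleles, heterozygotes contribute them half the time, and the result is the same p the population started with.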
How to Calculate Hardy-Weinberg Frequencies
The two governing equations are:
Allele frequencies: p + q = 1
Genotype frequencies: p² + 2pq + q² = 1
The first says the two allele frequencies must account for the whole gene pool. The second distributes those alleles across the three possible genotypes. The most common task is working backward from the visible recessive phenotype. Only the homozygous recessive genotype shows the recessive trait, so if you assume the population is in equilibrium, the observed fraction of recessive individuals equals q². Take the square root of that fraction to find q, then derive everything else.
Worked example. Suppose a flower species has red flowers (dominant) and white flowers (recessive), and you count 16% of the plants in a field as white.
1. The white plants are homozygous recessive, so q² = 0.16
2. Solve for q: q = √0.16 = 0.4
3. Find p: p = 1 − q = 1 − 0.4 = 0.6
4. Homozygous dominant: p² = 0.6² = 0.36 (36%)
5. Heterozygous carriers: 2pq = 2 × 0.6 × 0.4 = 0.48 (48%)
6. Verify the identity: 0.36 + 0.48 + 0.16 = 1.00 ✓
So 36% of the plants are homozygous red, 48% are red-flowered heterozygous carriers, and 16% are white. The fact that the three genotype frequencies sum to exactly 1 is the built-in check. You can verify any pair of allele frequencies with the Hardy-Weinberg Equilibrium calculator: enter p and q and confirm that the genotype frequencies total 1.
The striking insight here: even though only 16% of plants show the white trait, 48% of the population, nearly half, carries the white allele unseen as heterozygotes. This is why recessive alleles persist even when the trait itself is visibly rare.
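The six steps of the worked example can be reproduced in a few lines of Python. This is a minimal sketch, assuming the population is in Hardy-Weinberg equilibrium; the function name `genotype_frequencies_from_recessive` and the variable names are hypothetical, chosen here for illustration.

```python
import math


def genotype_frequencies_from_recessive(recessive_fraction):
    """Return (p, q, p^2, 2pq, q^2), assuming Hardy-Weinberg proportions.

    recessive_fraction is the observed share of individuals showing the
    recessive phenotype, which is taken to equal q^2.
    """
    if not 0.0 <= recessive_fraction <= 1.0:
        raise ValueError("recessive_fraction must be between 0 and 1")
    q = math.sqrt(recessive_fraction)  # step 2: q = sqrt(q^2)
    p = 1.0 - q                        # step 3: p = 1 - q
    return p, q, p * p, 2.0 * p * q, q * q


# The flower example: 16% of plants are white.
p, q, hom_dom, het, hom_rec = genotype_frequencies_from_recessive(0.16)
print(f"p = {p:.2f}, q = {q:.2f}")
print(f"p^2 = {hom_dom:.2f}, 2pq = {het:.2f}, q^2 = {hom_rec:.2f}")
print(f"sum = {hom_dom + het + hom_rec:.2f}")
```

The printed values should match steps 2 through 6 above, up to floating-point rounding.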
Using Hardy-Weinberg in Real Research
The principle's everyday value is in estimating carrier frequencies for recessive conditions. In human genetics, if a recessive disorder affects 1 in 10,000 births (q² = 0.0001), then q = 0.01 and the carrier frequency 2pq is roughly 0.0198 — about 1 in 50 people. Predicting carrier rates from disease incidence is foundational to genetic counseling.
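The carrier estimate above is the same calculation run from disease incidence instead of flower color. A minimal sketch, assuming the disorder is fully recessive and the population is in equilibrium; `carrier_frequency` is a hypothetical helper name.

```python
import math


def carrier_frequency(incidence):
    """Estimate the heterozygous carrier frequency 2pq from incidence q^2."""
    if not 0.0 <= incidence <= 1.0:
        raise ValueError("incidence must be between 0 and 1")
    q = math.sqrt(incidence)  # recessive allele frequency
    p = 1.0 - q               # dominant allele frequency
    return 2.0 * p * q


# A recessive disorder affecting 1 in 10,000 births.
freq = carrier_frequency(1 / 10_000)
print(f"carrier frequency = {freq:.4f}, or about 1 in {1 / freq:.1f}")
```

When q is small, p is close to 1, so 2pq is close to 2q. That shortcut is why carriers of a rare recessive condition vastly outnumber affected individuals.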
Population geneticists also use the equation as a quality-control test. When genotyping a large sample, a strong departure from Hardy-Weinberg proportions at a given marker often flags a technical error — mistyped genotypes or a flawed assay — rather than real biology. Many genome-wide studies routinely discard markers that violate equilibrium.
Finally, deviations reveal evolution. An excess of homozygotes can indicate inbreeding or population structure; a shortage of one genotype can point to selection against it. The genotype frequency calculator lets you generate expected proportions quickly so you can compare them against your observed counts.
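Comparing observed counts against expected proportions is usually done with a chi-square goodness-of-fit test. The sketch below is one common way to do it, not the only one, and uses only the standard library. The function name `hwe_chi_square` and the genotype counts are hypothetical. It estimates p directly from the genotype counts, so the test has one degree of freedom, and for one degree of freedom the tail probability can be computed with `math.erfc`.

```python
import math


def hwe_chi_square(n_hom_dom, n_het, n_hom_rec):
    """Chi-square test of genotype counts against Hardy-Weinberg proportions.

    Returns (chi2, p_value) with 1 degree of freedom.
    """
    n = n_hom_dom + n_het + n_hom_rec
    if n == 0:
        raise ValueError("need at least one genotyped individual")
    # Count alleles directly: each homozygote carries two copies of its
    # allele, each heterozygote carries one copy of each.
    p = (2 * n_hom_dom + n_het) / (2 * n)
    q = 1.0 - p
    observed = (n_hom_dom, n_het, n_hom_rec)
    expected = (n * p * p, n * 2.0 * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # 3 genotype classes - 1 - 1 estimated allele frequency = 1 df.
    # For 1 df the upper-tail probability is erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts: one sample that fits, one with excess homozygotes.
chi2_fit, pval_fit = hwe_chi_square(36, 48, 16)
chi2_excess, pval_excess = hwe_chi_square(50, 20, 30)
print(f"fits HWE:          chi2 = {chi2_fit:.3f}, p-value = {pval_fit:.3f}")
print(f"excess homozygotes: chi2 = {chi2_excess:.3f}, p-value = {pval_excess:.2e}")
```

Both samples have the same allele frequencies (p = 0.6), but the second has far fewer heterozygotes than expected, which is the pattern associated above with inbreeding or population structure. For small samples or rare alleles an exact test is generally preferred over the chi-square approximation.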
Common Mistakes and How to Avoid Them
Confusing alleles with genotypes. p and q are allele frequencies; p², 2pq, and q² are genotype frequencies. Mixing the two is the most frequent beginner error.
Forgetting the heterozygote factor of 2. The middle term is 2pq, not pq. There are two ways to inherit one dominant and one recessive allele (dominant from one parent and recessive from the other, or the reverse), which is why the term is doubled.
Taking the square root of the wrong value. Only the homozygous recessive proportion (q²) maps directly to the visible recessive phenotype. Do not square-root the dominant phenotype frequency, which lumps together both p² and 2pq.
Assuming equilibrium proves nothing is evolving. A population can sit close to Hardy-Weinberg proportions even while forces partly offset one another. Matching the prediction is consistent with equilibrium, not definitive proof of it.
Applying it to tiny populations. The model assumes a large population so that random drift is negligible. In small populations, frequencies wander by chance and the prediction breaks down.
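The effect of population size can be seen in a small simulation. The sketch below uses the Wright-Fisher model, a standard textbook idealization that is not described elsewhere in this guide: each generation, 2N allele copies are drawn at random from the previous generation's allele frequency. The function name `simulate_drift`, the population sizes, and the seed are all illustrative assumptions.

```python
import random


def simulate_drift(p0, population_size, generations, seed=0):
    """Track one allele's frequency under pure drift (Wright-Fisher model).

    No mutation, migration, or selection is applied; only random sampling
    of 2 * population_size allele copies per generation.
    """
    rng = random.Random(seed)  # seeded so runs are repeatable
    n_alleles = 2 * population_size
    p = p0
    trajectory = [p]
    for _ in range(generations):
        # Each allele copy in the next generation is the tracked allele
        # with probability p.
        copies = sum(1 for _ in range(n_alleles) if rng.random() < p)
        p = copies / n_alleles
        trajectory.append(p)
    return trajectory


small = simulate_drift(0.6, population_size=10, generations=50)
large = simulate_drift(0.6, population_size=5000, generations=50)
print("N = 10:   final p =", round(small[-1], 3))
print("N = 5000: final p =", round(large[-1], 3))
```

In runs like this the small population's frequency typically wanders far from 0.6 and may hit 0 or 1, while the large population stays close to its starting value. Try several seeds to see the spread.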
Conclusion
The Hardy-Weinberg equilibrium packs a deep biological idea into a one-line equation: in the absence of evolutionary forces, the genetic makeup of a population is perfectly predictable and stable. By splitting allele frequencies p and q into the genotype proportions p² + 2pq + q², you can estimate hidden carriers, validate genotyping data, and — most powerfully — detect evolution by spotting where reality diverges from the prediction. It is the still backdrop against which all the motion of population genetics becomes visible.
Key Takeaways
• Know both equations: Allele frequencies satisfy p + q = 1, and genotype frequencies satisfy p² + 2pq + q² = 1, which always sums to exactly one
• Work from the recessive phenotype: Only homozygous recessives (q²) show the recessive trait, so take its square root to find q, then derive p and the rest
• Don't forget the 2: Heterozygotes are 2pq, not pq, because there are two ways to inherit one of each allele
• Use deviations as a signal: Compare observed counts against the Hardy-Weinberg Equilibrium calculator output — departures hint at selection, drift, or genotyping errors