Genetic Distance Calculator
Compute the genetic map distance between two linked loci in centimorgans (cM) from the recombination frequency observed in offspring. Returns the percentage of recombinant offspring, valid as a distance estimate for small frequencies.
Last updated: May 2026
About this calculator
The formula is Genetic Distance (cM) = (recombinantOffspring / totalOffspring) × 100. By definition, one centimorgan (cM) corresponds to a 1% recombination frequency between two loci, so recombination frequency in percent equals distance in cM when the distance is small. Inputs: recombinantOffspring is the count of offspring carrying non-parental (recombinant) combinations of marker alleles; totalOffspring is the total number of offspring scored. Edge cases: the relationship breaks down for distances above roughly 20–25 cM because double crossovers (two recombinations on the same chromosome between the two markers) restore the parental arrangement and are scored as non-recombinant, causing the raw recombination frequency to under-estimate the true map distance. For larger distances, mapping functions like Haldane (d = −0.5 × ln(1 − 2r)) or Kosambi (d = 0.25 × ln((1+2r)/(1−2r))) correct for double crossovers; here r is the recombination fraction and d is in Morgans, so multiply by 100 for cM. Recombination frequency caps at 50% regardless of true distance: at or near 50%, two loci behave as if they are on different chromosomes and the formula no longer measures distance. Small offspring numbers (<100) give noisy estimates: binomial sampling on a small brood produces a standard error of √(r(1−r)/n), so for r = 0.10 and n = 50 the SE is about 4 cM, meaning a single-brood estimate of 10 cM has a 95% CI of roughly 2–18 cM. Combine multiple crosses for better estimates.
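The formula and the standard-error rule above can be sketched in Python. This is a minimal illustration, not the calculator's actual implementation: the function names `genetic_distance_cm` and `recombination_ci_cm` are hypothetical, and the 95% interval assumes a normal approximation (z = 1.96), which is itself rough for small broods.

```python
import math
from typing import Tuple


def genetic_distance_cm(recombinant_offspring: int, total_offspring: int) -> float:
    """Raw map distance in cM: recombinant fraction times 100."""
    if total_offspring <= 0:
        raise ValueError("total_offspring must be positive")
    if not 0 <= recombinant_offspring <= total_offspring:
        raise ValueError("recombinant_offspring must be between 0 and total_offspring")
    return 100.0 * recombinant_offspring / total_offspring


def recombination_ci_cm(r: float, n: int, z: float = 1.96) -> Tuple[float, float, float]:
    """Return (SE, lower, upper) in cM using SE = sqrt(r(1-r)/n).

    Assumption: normal approximation to the sampling distribution.
    Bounds are clipped to 0-50 cM, the range a recombination
    frequency can take.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    se = math.sqrt(r * (1.0 - r) / n) * 100.0
    centre = r * 100.0
    lower = max(0.0, centre - z * se)
    upper = min(50.0, centre + z * se)
    return se, lower, upper


if __name__ == "__main__":
    print(genetic_distance_cm(12, 100))  # 12.0
    se, lo, hi = recombination_ci_cm(0.10, 50)
    # Prints: SE = 4.2 cM, 95% CI = 1.7 to 18.3 cM
    print(f"SE = {se:.1f} cM, 95% CI = {lo:.1f} to {hi:.1f} cM")
```

The second call reproduces the r = 0.10, n = 50 case from the text.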
How to use
Example 1: closely linked loci. 12 recombinant offspring out of 100 total. Step 1: ratio = 12/100 = 0.12. Step 2: × 100 = 12 cM. Verify: 12% recombination corresponds directly to 12 cM for small distances, because the loci are close enough together on the chromosome that double crossovers are rare and the formula is accurate ✓. Two markers at 12 cM are useful for fine-mapping a trait between them.
Example 2: moderately distant loci needing correction. 22 recombinant offspring out of 100 total. Step 1: 22/100 = 0.22. Step 2: × 100 = 22 cM (raw). Verify: this raw distance under-estimates the true map distance because double crossovers reduce observed recombinants. Applying Haldane's correction: d = −0.5 × ln(1 − 2(0.22)) = −0.5 × ln(0.56) = −0.5 × (−0.5798) ≈ 0.29 Morgans = 29 cM. Kosambi gives a smaller correction: d = 0.25 × ln(1.44/0.56) = 0.25 × 0.9445 ≈ 0.236 Morgans, or about 24 cM. Use the raw 22 cM if you want a quick estimate; use Haldane/Kosambi for accurate map-building when r > 0.15 ✓.
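The corrections in Example 2 can be checked with a short Python sketch. The function names `haldane_cm` and `kosambi_cm` are hypothetical; both take the recombination fraction r and multiply the result by 100, since the mapping functions return Morgans.

```python
import math


def haldane_cm(r: float) -> float:
    """Haldane map distance in cM (assumes no crossover interference)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("r must be in [0, 0.5)")
    return -0.5 * math.log(1.0 - 2.0 * r) * 100.0


def kosambi_cm(r: float) -> float:
    """Kosambi map distance in cM (allows for crossover interference)."""
    if not 0.0 <= r < 0.5:
        raise ValueError("r must be in [0, 0.5)")
    return 0.25 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r)) * 100.0


if __name__ == "__main__":
    r = 22 / 100  # Example 2: 22 recombinants out of 100 offspring
    print(f"raw     = {r * 100:.1f} cM")        # raw     = 22.0 cM
    print(f"Haldane = {haldane_cm(r):.1f} cM")  # Haldane = 29.0 cM
    print(f"Kosambi = {kosambi_cm(r):.1f} cM")  # Kosambi = 23.6 cM
```

Both functions reject r ≥ 0.5 because the logarithm is undefined there, which matches the 50% ceiling on recombination frequency.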
Frequently asked questions
What is a centimorgan, and how does it relate to physical distance?
A centimorgan (cM) is a unit of genetic map distance defined as the distance over which the expected number of recombination events per generation is 0.01. For short distances this is equivalent to saying that two loci 1 cM apart show a 1% recombination frequency in a single generation. Centimorgans measure genetic recombination, not physical distance in base pairs, and the relationship between cM and Mb (megabases) varies widely across the genome and between species: in humans the average is ~1 cM/Mb, but it ranges from <0.1 cM/Mb in centromeric regions (recombination suppressed) to >5 cM/Mb in hotspots near telomeres. Sex differences are large too: the human female genetic map is about 1.6–1.7× the length of the male map, so cM/Mb is roughly 60–70% higher in females on average. The unit is named after Thomas Hunt Morgan, whose Drosophila work established that genes are arranged linearly on chromosomes and that recombination frequency reflects the distance between them, foundational work that earned him the 1933 Nobel Prize.
When do I need to use Haldane or Kosambi mapping functions instead of raw recombination frequency?
Use a mapping function when recombination frequency exceeds about 15%, which is where double crossovers (two recombinations between the same two markers) become numerically significant. Double crossovers restore the parental allele combination and appear as non-recombinant, so the raw frequency under-estimates true distance. Haldane's function assumes no interference (crossover events are independent): d = −0.5 × ln(1 − 2r). Kosambi's function accounts for interference (one crossover suppresses nearby ones), which is observed in most organisms: d = 0.25 × ln((1+2r)/(1−2r)). For r < 0.10, all three (raw, Haldane, Kosambi) agree within about 1 cM. For r = 0.30, raw gives 30 cM, Kosambi about 35 cM, and Haldane about 46 cM, a substantial divergence. Beyond r = 0.40 the corrections are sensitive to assumptions and unreliable; the best approach is multi-marker mapping, where intermediate markers give independent short-distance estimates that you sum.
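The multi-marker approach can be sketched as follows: convert each adjacent-interval recombination fraction with a mapping function, then sum the interval distances. This is an illustrative sketch only; the function names and the three interval values are hypothetical, and Kosambi is chosen here as an assumption about the interference model.

```python
import math
from typing import Sequence


def kosambi_cm(r: float) -> float:
    """Kosambi map distance in cM for recombination fraction r."""
    if not 0.0 <= r < 0.5:
        raise ValueError("r must be in [0, 0.5)")
    return 0.25 * math.log((1.0 + 2.0 * r) / (1.0 - 2.0 * r)) * 100.0


def summed_map_distance_cm(interval_fractions: Sequence[float]) -> float:
    """Total map length across adjacent marker intervals.

    Each interval is short, so its correction is small and depends
    little on the choice of mapping function; the total comes from
    summing the corrected interval distances.
    """
    return sum(kosambi_cm(r) for r in interval_fractions)


if __name__ == "__main__":
    # Hypothetical recombination fractions for intervals A-B, B-C, C-D.
    intervals = [0.10, 0.12, 0.08]
    for r in intervals:
        print(f"r = {r:.2f} -> {kosambi_cm(r):.2f} cM")
    print(f"A-D total = {summed_map_distance_cm(intervals):.1f} cM")
```

Because every interval stays below r = 0.15, each corrected distance is close to its raw percentage, and the end-to-end distance does not rest on a single large, assumption-sensitive correction.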
What if my recombination frequency is near 50%?
A recombination frequency near 50% means the two markers are unlinked — either physically located on different chromosomes, or so far apart on the same chromosome that they assort independently in meiosis. The 50% maximum is a hard ceiling: even loci on opposite ends of a long chromosome (>50 cM apart) show ~50% recombination because the probability of an odd number of crossovers between them approaches 50% for distant pairs. Linkage analysis cannot distinguish 'on different chromosomes' from 'far apart on the same chromosome' purely from recombination data; you need physical markers or LOD-score analysis with intermediate markers to confirm linkage. For practical mapping, statistical tests like LOD ≥ 3 are used to declare linkage; recombination frequency alone isn't enough. If you observe ~50% recombination in your data, conclude only that the markers are unlinked, not that you've measured a true distance.
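The ceiling can be illustrated by inverting Haldane's function, r = 0.5 × (1 − e^(−2d)) with d in Morgans. This is a sketch under Haldane's no-interference assumption; the function name `expected_recombination_fraction` is hypothetical.

```python
import math


def expected_recombination_fraction(distance_cm: float) -> float:
    """Inverse Haldane function: expected r for a true map distance in cM."""
    if distance_cm < 0:
        raise ValueError("distance_cm must be non-negative")
    d = distance_cm / 100.0  # convert cM to Morgans
    return 0.5 * (1.0 - math.exp(-2.0 * d))


if __name__ == "__main__":
    for distance_cm in (10, 50, 100, 200, 400):
        r = expected_recombination_fraction(distance_cm)
        print(f"{distance_cm:>4} cM -> r = {r:.3f}")
```

As the true distance grows, r climbs toward 0.5 but never exceeds it, so very different long distances produce nearly identical observed frequencies.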
What are the common mistakes when measuring genetic distance?
The biggest mistake is using raw recombination frequency for moderately distant loci (r > 0.15) without applying Haldane or Kosambi correction, which under-estimates true map distance. The second is small sample sizes: broods of <50 individuals give noisy r estimates with confidence intervals of ±5–10 cM, so single-brood mapping is unreliable for precision. The third is misclassifying offspring phenotypes: ambiguous markers, incomplete penetrance, or epistasis can make some recombinants look parental or vice versa, biasing the estimate in either direction. People also forget that recombination frequency varies by sex (often 1.5–2× higher in human females than males) and by chromosome region (hotspots vs cold spots), so a single distance estimate is an average over the markers and crosses observed. Pooling crosses from different parental backgrounds is a related trap: recombination rates can differ between backgrounds (different alleles can carry different recombination modifiers), so the pooled figure is a blend that may not describe any single cross. Finally, genotyping errors are often ignored: even a 1% error rate can add 1–2 cM to apparent distances, which matters for fine-mapping.
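To see how many offspring a given precision requires, the binomial standard error SE = √(r(1−r)/n) can be rearranged to n = r(1−r)/SE². The sketch below assumes that formula and a rough prior guess at r; the function name `offspring_needed` is hypothetical.

```python
import math


def offspring_needed(r: float, target_se_cm: float) -> int:
    """Offspring required so that the SE of the raw estimate is at most target_se_cm.

    Rearranges SE = sqrt(r(1-r)/n) to n = r(1-r)/SE^2, with SE converted
    from cM to a fraction. Requires a prior guess at r.
    """
    if not 0.0 < r < 1.0:
        raise ValueError("r must be strictly between 0 and 1")
    if target_se_cm <= 0:
        raise ValueError("target_se_cm must be positive")
    target_se = target_se_cm / 100.0
    n = r * (1.0 - r) / target_se ** 2
    # Round away floating-point noise before taking the ceiling.
    return math.ceil(round(n, 6))


if __name__ == "__main__":
    for se_cm in (4.0, 2.0, 1.0):
        print(f"SE <= {se_cm} cM at r = 0.10 needs n >= {offspring_needed(0.10, se_cm)}")
```

Halving the standard error takes four times as many offspring, which is why pooling several crosses helps more than scoring one slightly larger brood.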
When should I not use this calculator?
Do not use it for recombination frequencies above ~15% without applying Haldane or Kosambi correction: the raw cM number under-estimates true map distance because of double crossovers. It is not appropriate for sex-linked or organelle-inherited loci, which have different inheritance patterns and require their own analysis (X-linked loci do not recombine in XY males, who carry only one X, so only maternal meioses are informative). Do not use it for human pedigree linkage analysis without LOD-score methods, family structure consideration, and software like MERLIN, PLINK, or LINKAGE; single-cross human pedigrees rarely have enough informative meioses for a meaningful frequency-based estimate. It is not suitable for QTL mapping (quantitative trait loci), where the trait is continuous and a different statistical framework is needed. For unlinked markers (r ≈ 50%), the formula returns 50 cM, but that number is not a distance estimate; it only indicates that no linkage was detected. Finally, for high-throughput genome mapping with thousands of markers, use specialised software (R/qtl, JoinMap, MapMaker) that handles multiple markers simultaneously, applies mapping functions consistently, and computes confidence intervals.