Question 1

What is a good lexical diversity score for written text?

Accepted Answer

There is no universal benchmark, because the type-token ratio (TTR, the number of unique words divided by the total number of words) is sensitive to text length: shorter texts almost always score higher. As a rough rule of thumb for texts of similar length, a TTR above 70% is considered high diversity, while a score below 30% suggests heavy repetition. For longer texts, researchers often use length-corrected measures such as MATTR (moving-average type-token ratio) or MTLD (measure of textual lexical diversity). For practical comparison, use raw TTR only when the texts have roughly equal word counts.
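To make the length effect concrete, here is a minimal Python sketch of a TTR calculation. The `tokenize` helper and its regular expression are illustrative assumptions (lowercasing, and treating runs of letters and apostrophes as words); they are not necessarily how this calculator splits text, and a different tokenizer will give somewhat different scores.

```python
import re


def tokenize(text):
    # Assumption: lowercase the text and treat runs of letters/apostrophes as words.
    return re.findall(r"[a-z']+", text.lower())


def ttr(text):
    # Type-token ratio: unique words (types) divided by total words (tokens).
    tokens = tokenize(text)
    if not tokens:
        return 0.0
    return len(set(tokens)) / len(tokens)


short_text = "The cat sat on the mat."
long_text = short_text + " The cat sat on the mat again and the cat slept."

print(f"short: {ttr(short_text):.2f}")  # 5 types / 6 tokens
print(f"long:  {ttr(long_text):.2f}")   # 8 types / 17 tokens
```

The longer passage scores lower even though it is written in the same style, which is why the two numbers should not be compared directly.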

Question 2

How does lexical diversity differ from readability?

Accepted Answer

Lexical diversity measures vocabulary variety, while readability measures how easy a text is to understand. A highly diverse text (high TTR) can actually be harder to read because it uses many rare or unfamiliar words. Readability formulas like Flesch-Kincaid focus on sentence length and syllable count, not word uniqueness. Both metrics together give a fuller picture of text quality and complexity.
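As an illustration of the difference, the Flesch-Kincaid grade level can be computed from three counts alone. The sketch below uses the standard published coefficients; the function name and the choice to pass in pre-computed counts (rather than counting syllables, which needs a heuristic or a dictionary) are assumptions made to keep the example short.

```python
def flesch_kincaid_grade(words, sentences, syllables):
    # Flesch-Kincaid grade level from total word, sentence, and syllable counts.
    # Note that the number of *unique* words never enters the formula.
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Hypothetical counts: 100 words, 5 sentences, 150 syllables.
grade = flesch_kincaid_grade(words=100, sentences=5, syllables=150)
print(f"grade level: {grade:.2f}")
```

Two texts with identical counts get the same grade regardless of how varied their vocabulary is, so a readability score and a diversity score can move independently.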

Question 3

Why does type-token ratio decrease as text gets longer?

Accepted Answer

As a text grows, function words like 'the', 'a', and 'is' recur constantly, so the count of total words rises faster than the count of unique words and the ratio falls. Even content words start repeating as a topic is discussed in depth. Because of this length effect, TTR scores are not directly comparable across texts of different lengths. Linguists compensate by computing TTR over windows of a fixed word count or by using alternative diversity indices.
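The decline, and the fixed-window remedy, can be sketched in a few lines of Python. The `running_ttr` and `mattr` helpers below are illustrative names, the sample sentence is made up, and whitespace splitting stands in for a real tokenizer; the windowed average follows the general idea of MATTR rather than any particular tool's implementation.

```python
def running_ttr(tokens, step):
    # TTR of the first n tokens, for n = step, 2*step, ...
    return [(n, len(set(tokens[:n])) / n)
            for n in range(step, len(tokens) + 1, step)]


def mattr(tokens, window):
    # Average TTR over every window of `window` consecutive tokens.
    if window <= 0:
        raise ValueError("window must be positive")
    if len(tokens) < window:
        # Text shorter than one window: fall back to plain TTR.
        return len(set(tokens)) / len(tokens) if tokens else 0.0
    scores = [len(set(tokens[i:i + window])) / window
              for i in range(len(tokens) - window + 1)]
    return sum(scores) / len(scores)


# A 13-word sentence repeated five times (65 tokens, 8 distinct words).
tokens = ("the cat sat on the mat and the dog sat on the rug " * 5).split()

for n, score in running_ttr(tokens, step=13):
    print(f"first {n:2d} tokens: TTR = {score:.2f}")

print(f"windowed score, 26 tokens: {mattr(tokens[:26], window=13):.2f}")
print(f"windowed score, 65 tokens: {mattr(tokens, window=13):.2f}")
```

Plain TTR keeps falling as more of the text is included, while the windowed average stays the same for the 26-token and 65-token versions because every window has the same fixed length.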

Lexical Diversity Index Calculator
