Spellchecker status overview
This page provides an overview of spellcheckers for different languages. These tools are built from the language models in the lang-* repositories.
The spellers are grouped according to maturity. Private repositories are not listed.
The maturity levels are production, beta, alpha and experimental. Some beta spellers are used in practical applications.
A speller in the Production group has been tested and is considered stable enough for production use.
Automatic classification: Spellers are classified by version number and lexicon size (lemma count):
- Production: version ≥ 1.0.0
- Beta: version < 1.0.0 and lemma count ≥ 10,000
- Alpha: version < 1.0.0 and lemma count 1,000–9,999
- Experimental: version < 1.0.0 and lemma count < 1,000
- Undefined: missing version or lemma count data
This objective classification ensures transparency and gives language teams clear upgrade criteria.
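The classification rules above can be sketched as a small function. This is only an illustration, not the actual build script: the names `parse_version` and `classify` are hypothetical, and the sketch assumes plain dotted numeric versions such as `0.4.2` (no pre-release suffixes) and that a lemma count of exactly 10,000 counts as Beta.

```python
from typing import Optional, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Turn '0.4.2' into (0, 4, 2), padding short versions like '1.0' to three parts."""
    parts = [int(part) for part in version.split(".")]
    while len(parts) < 3:
        parts.append(0)
    return tuple(parts)


def classify(version: Optional[str], lemma_count: Optional[int]) -> str:
    """Return the maturity level according to the rules listed above."""
    # Missing data is checked first, so it always yields "Undefined".
    if version is None or lemma_count is None:
        return "Undefined"
    if parse_version(version) >= (1, 0, 0):
        return "Production"
    # From here on the version is below 1.0.0; only lexicon size matters.
    if lemma_count >= 10_000:
        return "Beta"
    if lemma_count >= 1_000:
        return "Alpha"
    return "Experimental"
```

For example, `classify("0.9.0", 12_000)` gives `"Beta"`, while the same lexicon with version `"1.0.0"` gives `"Production"`.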
Suggestion Quality (S): The tables below include a “Suggestion Quality” column showing how often each spellchecker offers the correct word among its suggestions. The test data is taken from tools/spellcheckers/test/typos.tsv in each repository. The badge displays three values: First% | Top5% | Tests
- First%: Percentage of typos where the correct word is the first suggestion
- Top5%: Percentage of typos where the correct word is in the top 5 suggestions
- Tests: Number of typo test cases evaluated, shown with a “k” suffix for thousands. Only true positives in the file mentioned above are counted; other entries are ignored in the calculation.
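As a rough illustration of how the three badge values relate to each other, the sketch below computes them from test cases that are already loaded in memory. It makes several assumptions: the function name `suggestion_quality` is hypothetical, each case is a pair of the expected word and the ordered suggestion list returned by the speller, and the input contains only the true-positive entries. It does not read typos.tsv, whose column layout is not described here.

```python
from typing import List, Sequence, Tuple


def suggestion_quality(
    cases: Sequence[Tuple[str, List[str]]],
) -> Tuple[float, float, int]:
    """Return (First%, Top5%, Tests) for (expected_word, suggestions) pairs."""
    total = len(cases)
    if total == 0:
        # Avoid division by zero when no test cases are available.
        return (0.0, 0.0, 0)
    # The correct word is the very first suggestion.
    first_hits = sum(1 for expected, sugg in cases if sugg[:1] == [expected])
    # The correct word appears anywhere in the first five suggestions.
    top5_hits = sum(1 for expected, sugg in cases if expected in sugg[:5])
    return (100.0 * first_hits / total, 100.0 * top5_hits / total, total)


# Hypothetical sample data, for illustration only.
sample = [
    ("house", ["house", "horse"]),                           # first suggestion
    ("tree", ["free", "three", "tree"]),                     # within the top 5
    ("cat", []),                                             # no suggestions at all
    ("dog", ["dig", "dug", "dot", "dock", "doge", "dog"]),   # sixth: too late
]
print(suggestion_quality(sample))  # (25.0, 50.0, 4)
```

A first-suggestion hit also counts as a top-5 hit, so Top5% can never be lower than First%.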
Badge colors indicate overall quality based on these thresholds:
- 🟢 Green (Good, production-ready): First ≥ 80% AND Top5 ≥ 90% AND Tests ≥ 1000
- 🟡 Yellow (Beta): First ≥ 60% AND Top5 ≥ 70% AND Tests ≥ 500
- 🔴 Red (Alpha): First ≥ 40% AND Top5 ≥ 50% AND Tests ≥ 100
- ⚫ Black (Experimental): Does not meet the red thresholds
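The thresholds above translate directly into a cascade of checks, sketched below. The function name `badge_color` and the lowercase color strings are hypothetical; the sketch assumes the tiers are tested from strictest to loosest, so a speller with strong percentages but too few test cases falls to the next tier it fully satisfies.

```python
def badge_color(first_pct: float, top5_pct: float, tests: int) -> str:
    """Map First%, Top5% and the number of tests to a badge color."""
    # All three conditions of a tier must hold (logical AND).
    if first_pct >= 80 and top5_pct >= 90 and tests >= 1000:
        return "green"
    if first_pct >= 60 and top5_pct >= 70 and tests >= 500:
        return "yellow"
    if first_pct >= 40 and top5_pct >= 50 and tests >= 100:
        return "red"
    # Any combination that misses the red thresholds.
    return "black"
```

For instance, `badge_color(85, 92, 800)` returns `"yellow"` rather than `"green"`, because 800 test cases is below the 1000 required for green.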