GiellaLT provides an infrastructure for rule-based language technology aimed at minority and indigenous languages, and streamlines building anything from keyboards to speech technology.

Ranking suggestions in divvunspell

Background

GiellaLT facilitates the use of two speller engines.

We have recently (2024) moved to using divvunspell for the GiellaLT spellers.

The Speller Error Model page documents how to rank correction suggestions based on letter substitutions.

Speller testing with divvunspell

There’s a prototype-level testing tool in the divvunspell directory. To use it, clone divvunspell (see the README file for details). Note that you will need Rust to use divvunspell.

Use divvunspell like this (here with sma as an example):

accuracy -o support/accuracy-viewer/public/report.json typos.txt sma.zhfst

cd support/accuracy-viewer

npm i && npm run dev

View the report at http://localhost:5000 (the actual port number is given in the feedback from the command).

For more information, run accuracy --help.
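The accuracy command above takes typos.txt as its input. The following is a minimal sketch of preparing and checking such a file. It assumes one test case per line, with the misspelling and the expected correction separated by a tab. That layout is an assumption here, so check the divvunspell README or accuracy --help before relying on it. The helper names and the word pairs are hypothetical placeholders, not real sma test data.

```python
import tempfile
from pathlib import Path


def write_typos(path, pairs):
    """Write (typo, expected) pairs as tab-separated lines (assumed format)."""
    lines = [f"{typo}\t{expected}" for typo, expected in pairs]
    Path(path).write_text("\n".join(lines) + "\n", encoding="utf-8")


def read_typos(path):
    """Read the file back, keeping only lines with exactly two non-empty fields."""
    pairs = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        fields = line.split("\t")
        if len(fields) == 2 and all(fields):
            pairs.append((fields[0], fields[1]))
    return pairs


# Placeholder pairs for illustration only; replace with real test data.
example_pairs = [("mispelled", "misspelled"), ("recieve", "receive")]

with tempfile.TemporaryDirectory() as tmp:
    typos_file = Path(tmp) / "typos.txt"
    write_typos(typos_file, example_pairs)
    loaded = read_typos(typos_file)
    print(f"{len(loaded)} test cases written and read back")
```

Reading the file back before running the test is a cheap way to catch lines where the tab was lost or a field was left empty.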

Using the results

The penalty points are explained on the Speller Error Model page. The goal is to get the values for corrections as high as possible; this can be done by tweaking the penalty points.
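To make the idea of ranking quality concrete, here is a minimal sketch of how the position of the expected correction in a suggestion list can be summarized. It does not read report.json and does not reproduce the accuracy tool's own calculation. The data structure, field names, bucket names and example words are all hypothetical and only serve to illustrate the idea.

```python
from collections import Counter


def rank_of(expected, suggestions):
    """Return the 1-based position of the expected word, or None if absent."""
    try:
        return suggestions.index(expected) + 1
    except ValueError:
        return None


def summarize(cases):
    """Bucket each test case by where the expected correction was ranked."""
    buckets = Counter()
    for case in cases:
        rank = rank_of(case["expected"], case["suggestions"])
        if rank is None:
            buckets["missing"] += 1
        elif rank == 1:
            buckets["first"] += 1
        elif rank <= 5:
            buckets["top_five"] += 1
        else:
            buckets["lower"] += 1
    return buckets


# Hypothetical test cases (not output from divvunspell).
cases = [
    {"typo": "teh", "expected": "the", "suggestions": ["the", "ten", "tea"]},
    {"typo": "wrod", "expected": "word", "suggestions": ["rod", "wood", "word"]},
    {"typo": "xyzzy", "expected": "fizzy", "suggestions": []},
]

summary = summarize(cases)
for name in ("first", "top_five", "lower", "missing"):
    print(f"{name}: {summary[name]}/{len(cases)}")
```

Under this view, a change to the penalty points is an improvement when it moves test cases from the lower buckets towards the first position without pushing other cases down.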
