Testing in GiellaLT infra

Most of the repositories in GiellaLT come with a lot of tests. Tests are used to verify that the language models work correctly within the specified parameters: every update to language data triggers an automatic build followed by automatic checks. If the checks are successful, new versions of the resulting tools may be distributed to users, e.g. via the nightly channel. The main command to run automated tests is make check -j. This runs the tests in parallel until at least one fails. If one or more tests fail, the language model is not good enough to be used in tools or research. Similarly, when we update something for all language repos at once, it is make check that shows whether or not the update was compatible with all of the languages.

make check is good for automated testing: it runs fast, in parallel and in the background. It is not ideal for developers who are working on fixing things. For development use, make devtest can be more useful: it runs all the tests one by one regardless of their results, and shows more output, in order. For some of the tests there are helpful scripts in the devtools/ folder; these can provide more output and, for example, open relevant files in a graphical editor.
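The two workflows can be wrapped in a small shell helper. This is only a minimal sketch: the function name run_tests and the MAKE override are illustrative assumptions and are not part of the GiellaLT infrastructure. Only the make check -j and make devtest commands come from the text above.

```shell
#!/bin/sh
# Hypothetical helper, not shipped with GiellaLT.
# MAKE can be overridden (e.g. MAKE=gmake); it defaults to "make".
run_tests() {
    mode=${1:-dev}
    case "$mode" in
        ci)
            # Automated run: all tests, in parallel.
            ${MAKE:-make} check -j
            ;;
        dev)
            # Development run: tests one by one, with more output, in order.
            ${MAKE:-make} devtest
            ;;
        *)
            echo "usage: run_tests ci|dev" >&2
            return 2
            ;;
    esac
}
```

Called as run_tests dev from the top of a configured language repository, this would start the development run; run_tests ci would start the parallel run used for automated checks.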

Langs tests

The following is a list of the tests enabled for all languages and what they do. These tests are run by the make check command and by the automated testing in the build infra.

src/cg3/test

No common tests yet.

src/fst/morphology/test

`generate-{adjective,noun,propernoun,verb}-lemmas.sh`

gets lemmas from src/fst/morphology/stems/{adjectives,nouns,propernouns,verbs}.lexc files respectively and ensures that all of them can be generated by the normative generator using specific analysis tags. Use devtools/generate-{adj,noun,prop,verb}-lemmas.sh to get more information about lemmas that are not generated.
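When checking by hand which lemmas fail to generate, one option is to filter the output of hfst-lookup. It prints three tab-separated columns (input, output, weight) and marks an input it cannot process by appending +? to the output column. The sketch below assumes that output format. The function name failed_lemmas, the generator path and the tag string in the usage comment are assumptions, and the actual test scripts may work differently.

```shell
#!/bin/sh
# Print the inputs that hfst-lookup could not generate.
# Reads hfst-lookup output on stdin: input<TAB>output<TAB>weight,
# where a failed lookup has an output column ending in "+?".
failed_lemmas() {
    awk -F '\t' '$2 ~ /\+\?$/ { print $1 }'
}

# Possible usage (generator path and tags are assumptions):
#   echo 'lemma+N+Sg+Nom' | hfst-lookup src/fst/generator-gt-norm.hfstol | failed_lemmas
```

An empty result means every input line was generated. Any line printed is an analysis string the generator rejected.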

`generate-{adjective,noun,verb}-paradigm.sh`

gets lemmas from src/fst/morphology/stems/{adjectives,nouns,verbs}.lexc files respectively and generates paradigms defined by test{adj,noun,verb}paradigm.txt respectively, with the descriptive generator. Use devtools/generate-{adj,noun,prop,verb}-wordforms.sh to get more information about paradigms that are not generated.

`src/fst/morphology/phonology/pair-test-hfst.sh`

runs pair tests against twol rules.

`src/fst/orthography/test/run-initcaps-genyaml-testcases.sh`

runs YAML tests from the same directory, if any, for generating word-forms with initial caps or titlecasing.

`src/fst/phonetics/tests/`

no common tests.

`src/fst/test/run-*-yaml-testcases.sh`

runs YAML tests from subdirectories like: dict-gt-yamls, gt-desc-yamls, gt-norm-yamls etc. if any are found.

`src/fst/test/run-lexc-testcases.sh`

runs YAML testcases from lexc doccomments if any are found.

`tools/analysers/test/`

no common tests

`tools/grammarcheckers/tests/`

runs YAML checks in the directory against the grammar checker, if any. Grammar checker testing is documented separately.

`tools/hyphenators/test/`

no common tests

`tools/mt/apertium/test/run-mt-gt-desc-anayaml-tests.sh`

runs YAML checks in the same folder, if any, for the machine translation analyser.

`tools/spellcheckers/test/accept-all-lemmas.sh`

gets lemmas from src/fst/morphology/stems/*.lexc and ensures that they are accepted by the spell-checker.

`tools/spellcheckers/test/run-*-yaml-testcases.sh`

runs YAML checks in the desktopspeller-gt-norm-yamls subdirectory if any.

`tools/spellcheckers/test/suggestion-quality.sh`

runs the spelling corrector against typos.tsv and ensures that the number of correct suggestions is within the specified parameters.
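The idea of such a threshold check can be illustrated with a small sketch. Everything in it is an assumption made for illustration: the three-column input (typo, expected correction, comma-separated suggestions), the function name check_suggestion_quality and the percentage threshold. The real suggestion-quality.sh and the layout of typos.tsv may differ.

```shell
#!/bin/sh
# Hypothetical threshold check, not the actual suggestion-quality.sh.
# stdin: typo<TAB>expected<TAB>suggestions (comma-separated, best first)
# $1:    minimum percentage of lines whose first suggestion is the expected word
check_suggestion_quality() {
    awk -F '\t' -v min="$1" '
        NF >= 3 {
            total++
            split($3, sugg, ",")
            if (sugg[1] == $2) first++
        }
        END {
            if (total == 0) { print "no test data"; exit 1 }
            pct = 100 * first / total
            printf "first-position hits: %d/%d (%.1f%%)\n", first, total, pct
            # Succeed only when the hit rate reaches the threshold.
            exit (pct >= min ? 0 : 1)
        }'
}
```

The function exits with status 0 when the share of first-position hits reaches the threshold and with status 1 otherwise, so it could be used directly as a pass/fail test step.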

`tools/spellcheckers/test/test-zhfst-basic-sugg-speed.sh`

verifies that the spell-checker produces suggestions within a reasonable time.

`tools/spellcheckers/test/test-zhfst-file.sh`

checks that the spell-checker archive is usable at all.

`tools/tokenisers/tests/`

no common tests

`tools/tts/test`

no common tests

`devtools/`

contains scripts for running tests manually, seeing more detailed results and opening the result files in an editor.
