GiellaLT Documentation

GiellaLT provides rule-based language technology aimed at minority and indigenous languages

Ideas for testing

Goal: We want to test our transducers more thoroughly.

Existing tests

  1. Paradigm testing against predefined answers: YAML tests
  2. Tests written in the lexc and twolc code
  3. Testing whether we generate the lemma or not
  4. Tests using the lemma list as gold standard (do we generate the lemma)

Ideas for new tests

Elaborating the test ideas

Test for Multichar Symbols on the lower side

Now and then, Multichar Symbols slip through twolc and produce “words” like
Suome^Vn instead of the correct Suomeen.

How to test for this:

One should be able to set up this test language-independently. In case we get
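
A minimal sketch of such a check in Python, under some assumptions: the surface forms have already been generated and are available as plain strings, and the list of multichar symbols has been collected separately (for instance from the Multichar_Symbols declaration of the lexc source). The function name `find_leaked_symbols` and the sample symbol list are hypothetical and not part of the GiellaLT infrastructure.

```python
from typing import Dict, Iterable, List


def find_leaked_symbols(
    surface_forms: Iterable[str],
    multichar_symbols: Iterable[str],
) -> Dict[str, List[str]]:
    """Map each offending surface form to the multichar symbols found in it.

    A surface form counts as offending when it still contains one of the
    declared multichar symbols as a substring.
    """
    # Deduplicate and sort for a stable, reproducible report.
    symbols = sorted(set(multichar_symbols))
    leaks: Dict[str, List[str]] = {}
    for form in surface_forms:
        hits = [sym for sym in symbols if sym in form]
        if hits:
            leaks[form] = hits
    return leaks


if __name__ == "__main__":
    # Hypothetical input: one correct form and one with a leaked symbol.
    forms = ["Suomeen", "Suome^Vn"]
    symbols = ["^V", "+N", "+Sg"]  # assumed symbol list, for illustration only
    for form, hits in find_leaked_symbols(forms, symbols).items():
        print(f"{form}\t{' '.join(hits)}")
```

Because the check only compares strings against a symbol list, nothing in it depends on a particular language; only the two inputs change from one language to the next.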

Test for phonotactically illegal strings

Example, from fkv (this must be adjusted to a script):