Finite state and Constraint Grammar based analysers, proofing tools and other resources
View the project on GitHub giellalt/lang-nob
This file documents the phonology.twolc file
We declare both the a-å letters and all other possible letters.
Morpheme boundaries and escaped quotes: do not delete these in twolc. They are converted to zero or to the real character at a later stage.
These symbols trigger the twolc rules.
This section shows the twolc rules and the tests used to check whether they work
Umlaut Rule for bok : bøker etc. It shifts the vowels u, o, a, å to y, ø, e, e, respectively when Z1 is found after the stem.
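The vowel shift described above can be illustrated with a small Python sketch. It is not the twolc rule itself. It assumes the trigger is written as the literal string "Z1" directly after the stem, and that the vowel to shift is the last u, o, a or å in the stem. Both assumptions, and the function name apply_umlaut, are hypothetical.

```python
# Illustrative sketch only, not the twolc rule from phonology.twolc.
# Assumption: the trigger appears as the literal marker "Z1" after the stem.
UMLAUT = {"u": "y", "o": "ø", "a": "e", "å": "e"}


def apply_umlaut(lexical: str, trigger: str = "Z1") -> str:
    """Shift the last umlautable stem vowel when the trigger follows the stem."""
    if trigger not in lexical:
        return lexical  # no trigger, no umlaut
    stem, suffix = lexical.split(trigger, 1)
    # Walk backwards to find the last vowel that can be shifted (assumption).
    for i in range(len(stem) - 1, -1, -1):
        if stem[i] in UMLAUT:
            stem = stem[:i] + UMLAUT[stem[i]] + stem[i + 1:]
            break
    return stem + suffix


print(apply_umlaut("bokZ1er"))  # bøker
print(apply_umlaut("boker"))    # boker (no trigger, unchanged)
```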
Epenthetic Deletion Rule is actually 3 rules in one: 1) it deletes -e- in moden : modne etc., 2) it deletes the stem -e in hare + -er, and 3) it deletes the suffix -e in ærlig + est > ærligst.
Tests: (a star denotes a negative test, i.e. a test that is supposed to fail)
Delete foreign vowel Rule for deleting final a or o in words like kollega : kolleger. Trigger symbol to the right is X2.
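As a rough illustration of this deletion, here is a Python sketch. It assumes the trigger is spelled as the literal string "X2" between stem and suffix. The function name delete_foreign_vowel is hypothetical and does not come from the source file.

```python
# Illustrative sketch only, not the twolc rule from phonology.twolc.
# Assumption: the trigger appears as the literal marker "X2" after the stem.
def delete_foreign_vowel(lexical: str, trigger: str = "X2") -> str:
    """Drop a stem-final a or o when the trigger follows the stem."""
    if trigger not in lexical:
        return lexical
    stem, suffix = lexical.split(trigger, 1)
    if stem.endswith(("a", "o")):
        stem = stem[:-1]  # delete the final foreign vowel
    return stem + suffix


print(delete_foreign_vowel("kollegaX2er"))  # kolleger
```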
Tests:
Consonant shortening before deletion Rule
Tests:
Geminate deletion in front of -t and -d Rule deletes: 1) before Q3 and d or t (kaller:kalte) 2) before passive Q1 t (lykkes:lyktes) and 3) before epenthetic -e- and l, n or r (sikker:sikre)
Tests:
Delete r Rule deletes the r of plural -er before -ne, so that -er + -ne gives plural -ene.
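The effect of the Delete r Rule can be sketched in Python as follows. The function join_plural and its way of passing the suffixes as separate arguments are hypothetical. The real rule operates on symbol pairs in twolc, not on Python strings.

```python
# Illustrative sketch only, not the twolc rule from phonology.twolc.
def join_plural(stem: str, plural: str = "er", following: str = "ne") -> str:
    """Drop the r of plural -er only when the suffix -ne follows."""
    if plural == "er" and following == "ne":
        plural = "e"  # -er + -ne = -ene
    return stem + plural + following


print(join_plural("bil"))                # bilene
print(join_plural("bil", following=""))  # biler
```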
Delete m Rule for kam:kammen. Here we delete the second m when it is word-final.
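A minimal Python sketch of this behaviour, assuming the lexical stem is stored with a double m ("kamm") and that word-final simply means no suffix follows. The function name delete_final_m is hypothetical.

```python
# Illustrative sketch only, not the twolc rule from phonology.twolc.
# Assumption: the lexical stem carries a double m, e.g. "kamm".
def delete_final_m(stem: str, suffix: str = "") -> str:
    """Delete the second m of a stem-final mm when nothing follows."""
    if suffix == "" and stem.endswith("mm"):
        stem = stem[:-1]
    return stem + suffix


print(delete_final_m("kamm"))        # kam
print(delete_final_m("kamm", "en"))  # kammen
```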
um Deletion 1 Rule (um Deletion 2 is now part of the Delete m Rule)
Tests:
t weakening Rule
Tests:
Double t deletion Rule
Tests:
Insert t in passives Rule
Tests:
Clitic after s-final Rule for changing the so-called genitive -s to an apostrophe (’) for s-final stems: huss -> hus’
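The genitive alternation can be sketched in Python like this. The function name genitive is hypothetical, and the apostrophe character used on the surface is an assumption taken from the example above. The real rule may use a different apostrophe symbol.

```python
# Illustrative sketch only, not the twolc rule from phonology.twolc.
# Assumption: the surface apostrophe is the character shown in the example.
def genitive(stem: str, apostrophe: str = "’") -> str:
    """Attach genitive -s, realised as an apostrophe after s-final stems."""
    if stem.endswith("s"):
        return stem + apostrophe  # hus + s -> hus’
    return stem + "s"


print(genitive("hus"))  # hus’
print(genitive("bil"))  # bils
```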
Change -er stem to -ar in Nynorsk
This rule is for dictionary use only. The idea is to be able to click on words in a Nynorsk text and get a translation to North Sámi. Therefore, the Bokmål analyser is able to give an analysis of Nynorsk words as well. The Nynorsk-only forms are removed from all transducers other than the -dict transducer.
Test to have an error
This (part of) documentation was generated from src/fst/morphology/phonology.twolc