Norwegian Bokmål NLP Grammar

Finite state and Constraint Grammar based analysers, proofing tools and other resources

View the project on GitHub giellalt/lang-nob

Morphophonological rules for Bokmål

This file documents the phonology.twolc file

Sets and definitions

Alphabet

We declare both the a-å letters and all other possible letters.

Boundary symbols

Morpheme boundaries and escaped quotes. Do not delete these in twolc; they are converted to zero or to the real symbol at a later stage.

Morphophonological triggers

These symbols cause the twolc rules to work.

Triggers for nominal rules

Triggers for verbal rules

Triggers for common rules (both for N and V)

Nynorsk trigger

Sets

Rule section

This section shows the twolc rules and the tests used to check that they work.

Umlaut

Umlaut Rule for bok : bøker etc. It shifts the vowels u, o, a, å to y, ø, e, e, respectively when Z1 is found after the stem.
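The vowel shift described above can be illustrated with a small Python simulation. This is only a sketch and not the twolc rule itself: the trigger spelling `^Z1`, the function name `apply_umlaut`, and the choice to shift the last umlautable vowel of the stem are assumptions made for illustration.

```python
# Sketch only: "^Z1" is an assumed spelling of the Z1 trigger.
UMLAUT = {"u": "y", "o": "ø", "a": "e", "å": "e"}
TRIGGER = "^Z1"


def apply_umlaut(lexical: str) -> str:
    """Shift the last umlautable stem vowel when the stem is followed by Z1."""
    if TRIGGER not in lexical:
        return lexical
    stem, suffix = lexical.split(TRIGGER, 1)
    # Walk backwards to find the last stem vowel that has an umlaut counterpart.
    for i in range(len(stem) - 1, -1, -1):
        if stem[i] in UMLAUT:
            stem = stem[:i] + UMLAUT[stem[i]] + stem[i + 1:]
            break
    # The trigger itself never surfaces.
    return stem + suffix


print(apply_umlaut("bok^Z1er"))  # bøker
```

Forms without the trigger pass through unchanged, which mirrors the idea that the rule only fires when Z1 is present.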

Vowel deletion rules

Epenthetic Deletion Rule is actually 3 rules in one: 1) it deletes -e- in moden : modne etc., 2) it deletes the stem -e in hare + -er and 3) it deletes the suffix -e in ærlig + est > ærligst

Tests: (a star denotes a negative test, i.e. a test that is supposed to fail)
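One way to read the star convention is sketched below in Python. The tuple layout `(lexical, surface, negative)`, the `>` boundary symbol, and the `toy_rule` stand-in (which only imitates case 2, hare + -er) are assumptions for illustration; the real tests are pair tests embedded in the twolc source.

```python
def toy_rule(lexical: str) -> str:
    """Toy stand-in for a rule: drop a stem-final e before an e-initial suffix."""
    # ">" is an assumed morpheme boundary symbol; it never surfaces.
    return lexical.replace("e>e", ">e").replace(">", "")


def run_tests(rule, tests):
    """Return the (lexical, surface) pairs whose test did not behave as expected."""
    failures = []
    for lexical, surface, negative in tests:
        accepted = rule(lexical) == surface
        # A positive test passes when the pair is accepted;
        # a negative (starred) test passes when the pair is rejected.
        if accepted == negative:
            failures.append((lexical, surface))
    return failures


TESTS = [
    ("hare>er", "harer", False),   # positive test
    ("hare>er", "hareer", True),   # starred test: this pair must be rejected
]

print(run_tests(toy_rule, TESTS))  # []
```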

Delete foreign vowel Rule for deleting final a or o in words like kollega : kolleger. Trigger symbol to the right is X2.
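As a rough illustration, the deletion can be simulated on strings in Python. The trigger spelling `^X2` and the function name are assumptions; the sketch only shows the intended effect, not the actual twolc contexts.

```python
import re


def delete_foreign_vowel(lexical: str) -> str:
    """Delete a final a or o that stands directly before the X2 trigger."""
    # "^X2" is an assumed spelling of the trigger; the optional vowel and the
    # trigger are removed together, so the trigger never surfaces.
    return re.sub(r"[ao]?\^X2", "", lexical)


print(delete_foreign_vowel("kollega^X2er"))  # kolleger
```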

Tests:

Consonant deletion

Consonant shortening before deletion Rule

Tests:

Geminate deletion in front of -t and -d Rule deletes: 1) before Q3 and d or t (kaller:kalte) 2) before passive Q1 t (lykkes:lyktes) and 3) before epenthetic -e- and l, n or r (sikker:sikre)
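Contexts 1 and 2 can be sketched in Python as follows. The trigger spellings `^Q3` and `^Q1`, the consonant inventory, and the function name are assumptions for illustration. Context 3 (sikker:sikre) depends on the epenthetic -e- and is left out of this sketch.

```python
import re

# Assumed consonant inventory, for illustration only.
CONSONANTS = "bdfgklmnprstv"


def degeminate(lexical: str) -> str:
    """Shorten a double consonant before Q3 + d/t or before Q1 + t."""
    # The lookahead checks the trigger context without consuming it.
    pattern = rf"([{CONSONANTS}])\1(?=\^Q3[dt]|\^Q1t)"
    shortened = re.sub(pattern, r"\1", lexical)
    # Triggers never surface.
    return re.sub(r"\^Q[13]", "", shortened)


print(degeminate("kall^Q3te"))   # kalte
print(degeminate("lykk^Q1tes"))  # lyktes
```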

Tests:

Delete r Rule deletes the r of plural -er before -ne, so that -er + -ne surfaces as plural -ene
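A minimal Python sketch of the effect, assuming `>` as the morpheme boundary symbol and a hypothetical function name:

```python
import re


def delete_plural_r(lexical: str) -> str:
    """Drop the r of the plural suffix -er when the suffix -ne follows."""
    # ">" is an assumed boundary symbol separating stem and suffixes.
    without_r = re.sub(r">er>ne", ">e>ne", lexical)
    return without_r.replace(">", "")


print(delete_plural_r("bil>er>ne"))  # bilene
print(delete_plural_r("bil>er"))     # biler
```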

Delete m Rule for kam:kammen. Here we delete the second m when it is word-final.
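The effect can be sketched in Python under the assumption that the stem is stored with a double m (kamm) and that `>` marks the morpheme boundary; both are illustrative choices rather than details taken from the source file.

```python
def delete_final_m(lexical: str) -> str:
    """Shorten a word-final double m; keep it when a suffix follows."""
    surface = lexical.replace(">", "")  # ">" is an assumed boundary symbol
    if surface.endswith("mm"):
        surface = surface[:-1]
    return surface


print(delete_final_m("kamm"))     # kam
print(delete_final_m("kamm>en"))  # kammen
```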

um Deletion 1 Rule (um Deletion 2 is now part of the Delete m Rule)

Tests:

t weakening Rule

Tests:

Double t deletion Rule

Tests:

Insertion rules

Insert t in passives Rule

Compound rule

Tests:

Clitics

Clitic after s-final Rule for changing the so-called genitive -s to an apostrophe for s-final stems: huss -> hus’
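A small Python sketch of this behaviour, assuming `>` as the boundary before the clitic and a hypothetical function name:

```python
def genitive_clitic(lexical: str) -> str:
    """Realise the genitive -s as an apostrophe after an s-final stem."""
    # ">" is an assumed boundary symbol; the last one separates stem and clitic.
    stem, sep, clitic = lexical.rpartition(">")
    stem = stem.replace(">", "")
    if sep and clitic == "s" and stem.endswith("s"):
        return stem + "’"
    return stem + clitic


print(genitive_clitic("hus>s"))  # hus’
print(genitive_clitic("bil>s"))  # bils
```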

Nynorsk dictionary rules

Change -er stem to -ar in Nynorsk

This rule is for dictionary use only. The idea is to be able to click on words in a Nynorsk text and get a translation into North Sámi. Therefore, the Bokmål analyser is able to analyse Nynorsk words as well. The Nynorsk-only forms are removed from all transducers other than the -dict transducer.

Test to have an error


This (part of) documentation was generated from src/fst/morphology/phonology.twolc