Norwegian Bokmål NLP Grammar

Finite state and Constraint Grammar based analysers, proofing tools and other resources


Morphophonological rules for Bokmål

This page documents the phonology.twolc file.

Sets and definitions

Alphabet

We declare both the a-å letters and all other possible letters.

Boundary symbols

Morpheme boundaries and escaped quotes. Do not delete these in twolc; they are converted to zero or to the real character at a later stage.

Morphophonological triggers

These symbols cause the twolc rules to work.

Triggers for nominal rules

Triggers for verbal rules

Triggers for common rules (both for N and V)

Nynorsk trigger

Sets

Rule section

This section shows the twolc rules and the tests used to check that they work.

Umlaut

Umlaut Rule for bok : bøker etc. It shifts the vowels u, o, a, å to y, ø, e, e, respectively when Z1 is found after the stem.
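The vowel shift can be illustrated with a small Python sketch. This is not the twolc rule itself: the function name `apply_umlaut` is hypothetical, the trigger Z1 is assumed to be written as the literal string "Z1" directly after the stem, and the sketch assumes that only the last stem vowel from the set u, o, a, å is shifted.

```python
# Vowel mapping taken from the rule description: u, o, a, å -> y, ø, e, e.
UMLAUT = {"u": "y", "o": "ø", "a": "e", "å": "e"}
TRIGGER = "Z1"  # assumption: the trigger is written literally after the stem


def apply_umlaut(lexical: str) -> str:
    """Shift the last umlautable stem vowel if the trigger follows the stem."""
    stem, sep, suffix = lexical.partition(TRIGGER)
    if not sep:
        # No trigger present: the form is left unchanged.
        return lexical
    # Assumption: only the last stem vowel found in the mapping is shifted.
    for i in range(len(stem) - 1, -1, -1):
        if stem[i] in UMLAUT:
            stem = stem[:i] + UMLAUT[stem[i]] + stem[i + 1:]
            break
    # The trigger itself is dropped from the surface form.
    return stem + suffix
```

Under these assumptions, `apply_umlaut("bokZ1er")` gives the surface form bøker, while a form without the trigger is returned as it is.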

Vowel deletion rules

Epenthetic Deletion Rule is actually three rules in one: 1) it deletes -e- in moden : modne etc., 2) it deletes the stem-final -e in hare + -er, and 3) it deletes the suffix -e- in ærlig + est > ærligst.

Tests: (a star denotes a negative test, i.e. a test that is supposed to fail)

Delete foreign vowel Rule for deleting final a or o in words like kollega : kolleger. The trigger symbol to the right is X2.
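A minimal Python sketch of this deletion, for illustration only. The function name `delete_foreign_vowel` is hypothetical, and the sketch assumes that X2 is written as the literal string "X2" immediately after the stem-final vowel; the real rule is a twolc rule and may have further context conditions.

```python
import re

# Assumption: the trigger X2 stands directly to the right of the stem-final vowel.
# The optional [ao] is removed together with the trigger; a bare trigger is
# removed on its own.
FOREIGN_VOWEL = re.compile(r"[ao]?X2")


def delete_foreign_vowel(lexical: str) -> str:
    """Drop a stem-final a or o that is followed by X2, and drop X2 itself."""
    return FOREIGN_VOWEL.sub("", lexical)
```

With these assumptions, `delete_foreign_vowel("kollegaX2er")` yields kolleger, and a form without X2 is left untouched.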

Tests:

Consonant deletion

Consonant shortening before deletion Rule

Tests:

Geminate deletion in front of -t and -d Rule deletes one consonant of a geminate: 1) before Q3 and d or t (kaller:kalte), 2) before passive Q1 t (lykkes:lyktes), and 3) before epenthetic -e- and l, n or r (sikker:sikre).

Tests:

Delete r Rule deletes the r of plural -er before -ne, so that -er + -ne gives plural -ene.
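The effect of this rule can be sketched in Python. The function name `join_suffixes` and the representation of a word as a stem plus a list of suffix strings are assumptions made for this illustration; the example noun bil (biler, bilene) is chosen here and does not come from the rule file.

```python
def join_suffixes(stem: str, suffixes: list[str]) -> str:
    """Concatenate suffixes, dropping the r of plural -er when -ne follows."""
    out = stem
    for i, suffix in enumerate(suffixes):
        following = suffixes[i + 1] if i + 1 < len(suffixes) else None
        if suffix == "er" and following == "ne":
            suffix = "e"  # -er + -ne -> -ene
        out += suffix
    return out
```

For example, `join_suffixes("bil", ["er", "ne"])` gives bilene, while `join_suffixes("bil", ["er"])` keeps the r and gives biler.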

Delete m Rule for kam:kammen. Here we delete the second m when it is word-final.
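As a rough Python illustration, assuming the stem is stored with a double m (for example kamm) and that word-final position can be checked on the finished string. The function name `delete_final_m` is hypothetical and not part of the twolc source.

```python
def delete_final_m(form: str) -> str:
    """Drop the second m of a double m, but only at the end of the word."""
    if form.endswith("mm"):
        return form[:-1]
    # A double m followed by a suffix (kammen) is left as it is.
    return form
```

Under this assumption, `delete_final_m("kamm")` gives kam, and `delete_final_m("kammen")` is returned unchanged.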

um Deletion 1 Rule (um Deletion 2 is now part of the Delete m Rule)

Tests:

t weakening Rule

Tests:

Double t deletion Rule

Tests:

Insertion rules

Compound rule

Tests:

Clitics

Clitic after s-final Rule for changing the so-called genitive -s to an apostrophe for s-final stems: huss -> hus’
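The behaviour can be sketched in Python as follows. The function name `add_genitive` is hypothetical; the sketch works on plain strings and only looks at whether the stem ends in s, which is a simplification of the twolc rule. The example bil is chosen here for contrast.

```python
APOSTROPHE = "’"  # the apostrophe character used in the example hus’


def add_genitive(stem: str) -> str:
    """Attach the genitive clitic: an apostrophe after s-final stems, else -s."""
    if stem.endswith("s"):
        return stem + APOSTROPHE
    return stem + "s"
```

With this sketch, `add_genitive("hus")` gives hus’ and `add_genitive("bil")` gives bils.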

Nynorsk dictionary rules

Change -er stem to -ar in Nynorsk

This rule is for dictionary use only. The idea is to be able to click on words in a Nynorsk text and get a translation into North Sámi. Therefore, the Bokmål analyser is also able to analyse Nynorsk words. The Nynorsk-only forms are removed from all transducers other than the -dict transducer.

Test to have an error


This (part of) documentation was generated from src/fst/morphology/phonology.twolc
