Norwegian Bokmål NLP Grammar

Finite state and Constraint Grammar based analysers, proofing tools and other resources

The grammarchecker disambiguator for Norwegian Bokmål

This disambiguator is based upon the disambiguator from OBT (Oslo-Bergen-taggeren), hereafter OBT-cg. It is adjusted to the GiellaLT FST and extended with several rules. It contains the morphological rules only.

The original OBT disambiguator was written in CG-1 by Kristin Hagen and Anders Nøklestad at UiO. It was translated to CG-2 by Lars Nygård. The conversion to CG-3 and the Tromsø format was done by Trond Trosterud.

This particular file (grc-disambiguator.cg3) is a version of the above, adjusted to the needs of the grammar checker. The main change is that disambiguation rules are relaxed or even commented out.

NOTE! For reference, removed rules should be marked with the searchable tag grcremoval.

Delimiters and sets

The tagsets are a superset of the OBT and GiellaLT tags: the labels are kept from OBT-cg, and GiellaLT content is added where needed.

These sets model noun phrases (NPs). The idea is to first define whatever can occur in front of the head of the NP, and then to negate that set with the expression WORD - premodifiers.

The strict version contains items that can only be premodifiers, not parts of the predicate. It is to be used together with PRE-NP-HEAD before @>N is disambiguated.

The set NOT-NPMOD is used to find barriers between NPs. Typical usage: … (*1 N BARRIER NOT-NPMOD) …, meaning: scan to the first noun, ignoring anything that can be part of the noun phrase of that noun (i.e., “scan to the next NP head”).
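The scanning idiom can be illustrated outside CG-3. The Python sketch below is only an illustration and is not part of the grammar: the tag inventory, the contents of the sets and the function name scan_to_np_head are invented for the example, and each token is assumed to carry a single tag, whereas real cohorts may have several readings. The sketch also assumes that the target (N) is tested before the barrier.

```python
# Hypothetical tag inventory; the real sets are defined in grc-disambiguator.cg3.
WORD = {"N", "V", "A", "Det", "Adv", "Pr", "CC", "Pron"}
PREMODIFIERS = {"Det", "A", "Adv"}

# Everything that cannot stand in front of an NP head: WORD - premodifiers.
NOT_NPMOD = WORD - PREMODIFIERS


def scan_to_np_head(tags, start):
    """Mimic (*1 N BARRIER NOT-NPMOD): look rightwards from start + 1.

    Return the index of the first noun, or None if a barrier tag is
    met first or the sentence ends without a noun.
    """
    for i in range(start + 1, len(tags)):
        if tags[i] == "N":          # target found: the next NP head
            return i
        if tags[i] in NOT_NPMOD:    # barrier: we have left the noun phrase
            return None
    return None


# Preposition + determiner + adjective + noun + verb: the noun is reached.
print(scan_to_np_head(["Pr", "Det", "A", "N", "V"], 0))
# A verb intervenes before the noun, so the scan is blocked.
print(scan_to_np_head(["Pr", "V", "N"], 0))
```

In the first call the determiner and the adjective are skipped because they can belong to the noun phrase. In the second call the verb acts as a barrier and no NP head is found.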

GRADE-ADV

Rule section

Giellatekno early rules

NotAbbr removes abbreviation readings whenever alternative readings exist.

AbbrBeforePara removes CLB before CLB.

Nynorsk removes all +Nynorsk forms (they are used only by the dictionary interface, which does not use disambiguation).
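The effect of removal rules such as these can be sketched in Python. This is a simplified illustration, not the rules themselves: the function name remove_readings, the example readings and the exact tag spellings (ABBR, Nynorsk) are assumptions, and contextual conditions are left out. The sketch assumes the usual Constraint Grammar behaviour that the last remaining reading of a word is never removed.

```python
def remove_readings(cohort, tag):
    """Drop every reading that carries `tag`, but never empty the cohort."""
    kept = [reading for reading in cohort if tag not in reading]
    # If all readings carry the tag, leave the cohort untouched.
    return kept if kept else cohort


# Hypothetical analyses of one token: each reading is a tuple of tags.
ambiguous = [("N", "ABBR"), ("Pr",)]
only_abbr = [("N", "ABBR")]

print(remove_readings(ambiguous, "ABBR"))   # the abbreviation reading is dropped
print(remove_readings(only_abbr, "ABBR"))   # no alternative exists, so it is kept
```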

aa

aaIM selects the +IM reading for å.

Numerals

Compounds

Mostly OBT Rules

The bulk of the file contains rules from the original OBT file.

Giellatekno late rules

Neuter sg pl

Pronouns

Det rules

V and not N

Prepositions

Late rules, Gt

Rules with weights

minweight selects the reading with the lowest weight.
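As a rough illustration of selecting by weight, the Python sketch below keeps the reading or readings with the lowest weight. The function name select_min_weight, the example analyses and the weights are invented for the example. Keeping all readings that tie for the lowest weight is an assumption of the sketch, not a statement about how the rule behaves.

```python
def select_min_weight(readings):
    """Keep the readings whose weight equals the lowest weight present.

    `readings` is a list of (analysis, weight) pairs.
    """
    if not readings:
        return []
    lowest = min(weight for _, weight in readings)
    return [(analysis, weight) for analysis, weight in readings if weight == lowest]


# Hypothetical weighted analyses of a single token.
weighted = [("N Neu Sg", 2.5), ("V Inf", 0.5), ("A Pos", 1.0)]
print(select_min_weight(weighted))
```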


This (part of) documentation was generated from tools/grammarcheckers/grc-disambiguator.cg3