Norwegian Bokmål NLP Grammar

Finite state and Constraint Grammar based analysers, proofing tools and other resources

View the project on GitHub giellalt/lang-nob

Bokmål Norwegian Grammar Checker

This file contains two parts: definitions and rules.

Definition section

Delimiters

DELIMITERS = "<.>" "<!>" "<?>" "<…>" "<¶>" ;

Grammatical tags

Here we declare all grammatical tags

Parts of speech tags

Sets for POS sub-categories

Boundary tags

Sets for Semantic tags

Sets for Morphosyntactic properties

Syntactic tags

Initials

INITIAL = small letters, CAP-INITIAL = capital letters

Sets

Sets of tags

Word or not

Noun sets

Verb sets

Pronoun sets

Numeral sets

Adjectival sets and their complements

Adverbial sets and their complements

Introduce finite clauses.

Coordinators

Sets of elements with common syntactic behaviour

Sets for verbs

All active verbs with a TV tag, including V:

NP sets defined according to their morphosyntactic features

These sets model noun phrases (NPs). The idea is to first define whatever can occur in front of the head of the NP, and thereafter negate that with the expression WORD - premodifiers.

The strict version of items that can only be premodifiers, not parts of the predicate, to be used together with PRE-NP-HEAD before @>N is disambiguated.

The set NOT-NPMOD is used to find barriers between NPs. Typical usage: … (*1 N BARRIER NOT-NPMOD) … meaning: Scan to the first noun, ignoring anything that can be part of the noun phrase of that noun (i.e., “scan to the next NP head”)
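The relation between these sets can be sketched as follows. This is a minimal illustration only: the tag inventory, the contents of WORD and PRE-NP-HEAD, and the rule name are assumptions, and the sets in the grammar file are larger.

```cg3
# Hypothetical sketch; only the names WORD, PRE-NP-HEAD and NOT-NPMOD
# come from the documentation, everything else is assumed.
LIST N   = N ;
LIST A   = A ;
LIST Det = Det ;
LIST Num = Num ;
LIST WORD = N A Det Num V Adv Pron Pr CC CS ;

# Whatever may stand in front of the head of an NP.
SET PRE-NP-HEAD = A OR Det OR Num ;

# Every word that cannot be an NP premodifier.
SET NOT-NPMOD = WORD - PRE-NP-HEAD ;

SECTION
# Keep the determiner reading when an NP head follows: scan right to the
# first noun, and give up if a non-premodifier is met on the way.
SELECT:hyp-det-before-head Det IF (*1 N BARRIER NOT-NPMOD) ;
```

With these definitions the scan passes over adjectives, determiners and numerals, and stops at anything else that is not the noun it is looking for.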

Miscellaneous sets

Border sets and their complements

Syntactic sets

These were the set types.

Grammarchecker sets

There are about 20 different rule tags; see the rule section below.

For ADDRELATION rules (perhaps not in use)

Rule section

Speller rules

Speller suggestions rule – add &SUGGESTWF to any spelling suggestion that we actually want to suggest to the user.

Speller rule: Add typo to misspelled words. The simplest is to just add it to all spelled words.
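A rule of this kind might look like the sketch below. It assumes that the speller marks the words it has processed with a `<spelled>` tag and that the error tag is written `&typo`; the rule name is hypothetical.

```cg3
# Hypothetical sketch: <spelled> and the rule name are assumptions.
SECTION
# Mark every spelled word as a typo and let its suggestions through.
ADD:hyp-spell-it-all (&typo &SUGGESTWF) (<spelled>) ;
```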

Speller rule: Do not mark misspelled words in quotes. But perhaps you want to only suggest spellings of words that are not inside “quotes”.
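One way to express the quote restriction is a negated, linked context, sketched below as an alternative to tagging every spelled word. The set QMARK, its contents, the `<spelled>` tag and the rule name are assumptions; the set would need to be extended with any other quotation marks in use.

```cg3
# Hypothetical sketch: QMARK, <spelled> and the rule name are assumptions.
LIST QMARK = "«" "»" "“" "”" ;

SECTION
# Tag a spelled word unless it has a quotation mark immediately to its left
# and another one immediately to its right. LINK 2 counts from the left-hand
# quotation mark, so it lands on the token just after the word.
ADD:hyp-spell-conservatively (&typo &SUGGESTWF) (<spelled>)
    IF (NEGATE -1 QMARK LINK 2 QMARK) ;
```

NEGATE inverts the whole linked test, so a word with a quotation mark on only one side is still tagged.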

NP internal agreement rules

Ensure preceding adjective agrees with noun

Agreement rule: masculine adjectives should be neuter (msyn-agr-adjmsc-adjneu). Context: Et fin/fint hus.
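The general shape of such an agreement rule can be sketched as a pair: an ADD rule that flags the error and a COPY rule that builds the suggested reading. The tag names (A, N, Msc, Neu), the sets and the contexts below are assumptions for illustration, not the rule in the grammar file; keeping the error tag on the suggestion reading is likewise an assumption about what the pipeline expects.

```cg3
# Hypothetical sketch: tag names, sets and contexts are assumed.
LIST A   = A ;
LIST N   = N ;
LIST Msc = Msc ;
LIST Neu = Neu ;
SET MSC-ADJ  = A + Msc ;
SET NEU-NOUN = N + Neu ;

SECTION
# "Et fin hus": an adjective with a masculine reading and no neuter reading,
# directly before a word whose readings are all neuter nouns.
ADD:hyp-adjmsc-adjneu (&msyn-agr-adjmsc-adjneu) TARGET MSC-ADJ
    IF (NOT 0 Neu) (1C NEU-NOUN) ;

# Build the suggestion: copy the flagged reading with Neu in place of Msc.
# Readings that already carry &SUGGEST are excluded from the target.
COPY:hyp-adjmsc-adjneu-sugg (Neu &SUGGEST) EXCEPT (Msc)
    TARGET (&msyn-agr-adjmsc-adjneu) - (&SUGGEST) ;
```

The careful context `1C` keeps the rule from firing when the following word is ambiguous between a neuter noun and something else.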

Agreement rule: Singular adjectives should be plural (msyn-agr-adjsg-adjpl). Context: mange organisert/organiserte fritidsaktiviteter.

Agreement rule: Neuter adjectives should be masculine (msyn-agr-adjneu-adjmsc). Context: En fint/fin båt.

Agreement rule: Masculine definite determiners should be neuter (msyn-agr-detmsc-detneu). Context: den/det huset.

Agreement rule: Masculine indefinite determiners should be neuter (msyn-agr-detmsc-detneu). Context: en/et land.

Agreement rule: Neuter definite determiners should be feminine (msyn-agr-detneu-detfem). Context: det/den boka.

Agreement rule: Neuter indefinite determiners should be feminine (msyn-agr-detneu-detfem). Context: et/ei bok.

Agreement rule: Neuter indefinite determiners should be feminine (msyn-agr-detneu-detfem). Context: et/ei realitetens kvinne.

Agreement rule: Neuter indefinite determiners should be masculine (msyn-agr-detneu-detmsc). Context: et/en studie.

Agreement rule: Neuter indefinite determiners should be masculine, also when other words intervene before the noun (msyn-agr-detneu-detmsc). Context: et/en … båt.

Agreement rule: same rule but for Pron

Definiteness rule: Double definiteness. Context: disse grunner/grunnene.

Definiteness rule: Double definiteness. Context: de sosiale aspekter/aspektene. This rule gave too many false alarms, so it is skipped.

Definite adjectives

Quantifier phrases

Agreement rule: Indef after quantifier. (msyn-qucompl-def-indef). Context: Vi har mange bøkene/bøker.

Agreement rule: Pl instead of Sg after quantifier (msyn-qucompl-sg-pl). Context: Vi har mange ulike utfordring/utfordringer.

Comparative rule: Quantifier should be in the superlative. Context: de flere/fleste ulike kulturene.

Predicative gender agreement

Predicative: neuter adjective should be masculine (msyn-pred-adjneu-adjmsc). Context: Båten var fint/fin.
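A predicative rule has to relate the adjective to a subject on the other side of the copula. The sketch below handles only the simplest word order and uses a linked context for that; the tag names, the COPULA set and the restriction to definite subjects are assumptions. The definiteness restriction is there because a neuter predicative with a bare or indefinite subject can be correct (Fisk er sunt).

```cg3
# Hypothetical sketch: tag names, sets and contexts are assumed.
LIST A   = A ;
LIST N   = N ;
LIST Msc = Msc ;
LIST Neu = Neu ;
LIST Def = Def ;
LIST COPULA = "være" "bli" ;
SET NEU-ADJ      = A + Neu ;
SET MSC-DEF-NOUN = N + Msc + Def ;

SECTION
# "Båten var fint": a neuter adjective with no masculine reading, directly
# after a copula that directly follows an unambiguous masculine definite noun.
ADD:hyp-pred-adjneu-adjmsc (&msyn-pred-adjneu-adjmsc) TARGET NEU-ADJ
    IF (NOT 0 Msc) (-1 COPULA LINK -1C MSC-DEF-NOUN) ;
```

The relative clause variants listed further down would need a scanning context instead of the fixed positions used here.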

Predicative: masculine adjective should be neuter (msyn-pred-adjmsc-adjneu). Context: Eplet var god/godt.

Agreement rule: Context: Eplet var god/godt.

Agreement rule: Context: Eplet var god/godt.

Agreement rule: Context: Eplene var god/gode.

Agreement rule: Context: Jeg spiste et eple som var god/godt.

Agreement rule: Context: Jeg har en bil som er rødt/rød.

Agreement rule: Context: Jeg har ei hytte som er rødt/rød.

Agreement rule: Context: Jeg har biler som er fin/fine.

Agreement rule: Context: Eplet som jeg spiste var grønn/grønt.

Agreement rule: Context: Bilen som jeg kjørte var grønt/grønn.

Agreement rule: Context: Hytta som jeg eier er fint/fin.

Agreement rule: with relative clause. Context: Bilene som jeg kjørte var grønt/grønne.

Case errors

Case rules so far: Nominative pronouns should be accusative.

Agreement rule: The context is P-complement (msyn-pron-nom-acc). Context: Vi snakker om du/deg.
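The P-complement context could be approximated as in the sketch below. The tag names and contexts are assumptions; in particular, the test for a following verb is a crude way to avoid cases where the word before the pronoun introduces a clause and the pronoun is its subject (om du kommer).

```cg3
# Hypothetical sketch: tag names, sets and contexts are assumed.
LIST Pr   = Pr ;
LIST Pron = Pron ;
LIST Nom  = Nom ;
LIST Acc  = Acc ;
LIST V    = V ;
SET NOM-PRON = Pron + Nom ;

SECTION
# "Vi snakker om du": a pronoun with a nominative reading and no accusative
# reading, directly after an unambiguous preposition, with no verb to its right.
ADD:hyp-pron-nom-acc (&msyn-pron-nom-acc) TARGET NOM-PRON
    IF (NOT 0 Acc) (-1C Pr) (NOT *1 V) ;
```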

Finite verb errors

Verb rule: Infinitive and no finite form in the sentence (msyn-v-inf-pres). Context: Jeg like/liker peanøtter.
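The condition "no finite form in the sentence" can be written as two negated scans, one in each direction. In the sketch below the tag names, the VFIN and INFMARK sets and the rule names are assumptions.

```cg3
# Hypothetical sketch: tag names, sets and contexts are assumed.
LIST Inf  = Inf ;
LIST VFIN = Pres Prt Imp ;   # assumed tags for finite verb forms
LIST INFMARK = "å" ;

SECTION
# "Jeg like peanøtter": an infinitive with no finite reading of its own,
# not preceded by the infinitive marker, and with no possible finite verb
# to its left or to its right.
ADD:hyp-v-inf-pres (&msyn-v-inf-pres) TARGET Inf
    IF (NOT 0 VFIN) (NOT -1 INFMARK) (NOT *-1 VFIN) (NOT *1 VFIN) ;

# Suggest the present tense form of the same lemma.
COPY:hyp-v-inf-pres-sugg (Pres &SUGGEST) EXCEPT (Inf)
    TARGET (&msyn-v-inf-pres) - (&SUGGEST) ;
```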

Infinitive

Verb rule: Present tense should be infinitive (msyn-v-pres-inf). Context: Jeg vil skriver/skrive et brev.
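For this error type the trigger is typically a preceding modal verb. A minimal sketch, assuming the tag names shown and a hand-picked list of modal lemmas:

```cg3
# Hypothetical sketch: tag names, the MODAL list and contexts are assumed.
LIST V    = V ;
LIST Pres = Pres ;
LIST Inf  = Inf ;
LIST MODAL = "ville" "skulle" "kunne" "måtte" "burde" ;
SET PRES-VERB  = V + Pres ;
SET MODAL-VERB = MODAL + V ;

SECTION
# "Jeg vil skriver et brev": a present tense verb with no infinitive reading,
# directly after an unambiguous modal verb.
ADD:hyp-v-pres-inf (&msyn-v-pres-inf) TARGET PRES-VERB
    IF (NOT 0 Inf) (-1C MODAL-VERB) ;
```

A fuller rule would also have to allow for adverbs between the modal and the main verb (Jeg vil ikke skriver).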

Adverb errors

Word order errors

V3 -> V2 in main clause

V2 to V3 in embedded clauses

og/å errors

The og -> å rules

Realword rule: og should be å (real-og-aa). Context: Det er ikke til og/å holde ut.

Realword rule: og should be å between Ind and Inf (real-og-aa). Context: Vi prøver og/å gå.
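The second of these contexts could be detected roughly as sketched below. Tag names and the VFIN set are assumptions, and only the detection step is shown.

```cg3
# Hypothetical sketch: tag names, sets and contexts are assumed.
LIST Inf  = Inf ;
LIST VFIN = Pres Prt ;   # assumed tags for indicative verb forms

SECTION
# "Vi prøver og gå": "og" with a possible indicative verb to its left and,
# to its right, a word that can be an infinitive but not an indicative form.
ADD:hyp-og-aa (&real-og-aa) TARGET ("og")
    IF (-1 VFIN) (1 Inf) (NOT 1 VFIN) ;
```

The left-hand test is deliberately not careful, since forms like prøver are ambiguous between verb and noun unless disambiguated earlier.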

The å -> og rules

Realword rule: å should be og between nouns (real-aa-og). Context: Det var Trond å/og Kari.
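The noun context is the easiest of the å -> og cases to sketch. The example assumes that proper nouns such as Trond and Kari also carry the tag N; the rule name is hypothetical.

```cg3
# Hypothetical sketch: the N tag on proper nouns and the contexts are assumed.
LIST N = N ;

SECTION
# "Det var Trond å Kari": "å" with an unambiguous noun on each side.
ADD:hyp-aa-og (&real-aa-og) TARGET ("å")
    IF (-1C N) (1C N) ;
```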

Realword rule: å should be og between similar verbforms except 2nd V = obj (real-aa-og). Context: Vi må lese å/og skrive lyrikk.

Realword rule: å should be og between similar verbforms except 2nd V = obj (real-aa-og). Not: Det er ikke så lett som man skulle tro å skrive lyrikk.

Realword rule: å should be og between similar verbforms except 2nd V = obj (real-aa-og). Context: Vi vil hoppe å/og sprette.

Realword rule: å should be og between similar verbforms except 2nd V = obj (real-aa-og). Context: Vi hopper å/og spretter.

Punctuation rules

Simple punctuation rules showing how to change the lemma in the suggestions:

Quotation mark rule: Use correct quotation mark.

Ellipsis rule: Use the ellipsis character … instead of three periods (use-ellipsis).
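Changing the lemma in a suggestion amounts to a COPY rule that swaps the lemma tag. The sketch below assumes that three periods are tokenised as a single token with the lemma "..." and that the error tag is written `&use-ellipsis`; the rule names are hypothetical.

```cg3
# Hypothetical sketch: tokenisation, tag name and rule names are assumed.
SECTION
# Flag three periods written as separate characters.
ADD:hyp-use-ellipsis (&use-ellipsis) TARGET ("...") ;

# Build the suggestion by replacing the lemma with the ellipsis character.
COPY:hyp-use-ellipsis-sugg ("…" &SUGGEST) EXCEPT ("...")
    TARGET (&use-ellipsis) - (&SUGGEST) ;
```

The quotation mark rule can follow the same pattern, with one COPY rule per replacement character.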


This (part of) documentation was generated from tools/grammarcheckers/grammarchecker.cg3