Faroese NLP Grammar

Finite state and Constraint Grammar based analysers, proofing tools and other resources

Faroese disambiguator

Usage, in lang-fao: cat text.txt | hfst-tokenize -cg tools/tokenisers/tokeniser-disamb-gt-desc.pmhfst | vislcg3 -g src/cg3/disambiguator.cg3
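The same pipeline can be driven from a script. The following Python sketch is only an illustration: it assumes that hfst-tokenize and vislcg3 are on the PATH and that the script is run from the lang-fao directory. The function names pipeline_commands and disambiguate are hypothetical helpers, not part of the repository.

```python
import subprocess
from pathlib import Path

# Paths copied from the usage line above, relative to lang-fao.
TOKENISER = "tools/tokenisers/tokeniser-disamb-gt-desc.pmhfst"
GRAMMAR = "src/cg3/disambiguator.cg3"


def pipeline_commands(lang_dir="."):
    """Return the two commands of the pipeline as argument lists."""
    root = Path(lang_dir)
    return [
        ["hfst-tokenize", "-cg", str(root / TOKENISER)],
        ["vislcg3", "-g", str(root / GRAMMAR)],
    ]


def disambiguate(text, lang_dir="."):
    """Feed text through the tokeniser and then the disambiguator.

    Hypothetical helper: needs both tools installed and the
    tokeniser compiled, otherwise subprocess raises an error.
    """
    data = text.encode("utf-8")
    for command in pipeline_commands(lang_dir):
        data = subprocess.run(
            command, input=data, stdout=subprocess.PIPE, check=True
        ).stdout
    return data.decode("utf-8")
```

Calling disambiguate(open("text.txt", encoding="utf-8").read()) would then correspond to the shell pipeline above.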

This page documents the Faroese disambiguator file, src/cg3/disambiguator.cg3.

Delimiters, tags and sets

Delimiters

Tags and sets

Tags

Declaring all tags from the fst.

Sets

Combining tags into useful sets

Noun sets

Adjective sets

Nominal sets

Verb sets

Noun-Verb sets

Number sets

Preposition sets, taking different cases

Boundary sets

Case sets

These are sets of cases, not sets of prepositions choosing them.

NOTNOM, NOTDAT, etc.: Some case, but not…
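The idea behind such sets can be modelled as plain set difference. The Python sketch below is a toy model, not the grammar's own CG-3 definitions. It assumes that the case tags are Nom, Acc, Dat and Gen, and that a set such as NOTNOM means "any case except the one named". The helpers not_case and has_tag_from are hypothetical.

```python
# Toy model of case sets; the tag names are assumptions for illustration.
CASES = frozenset({"Nom", "Acc", "Dat", "Gen"})


def not_case(excluded):
    """Return every case tag except the excluded one (e.g. NOTNOM)."""
    if excluded not in CASES:
        raise ValueError(f"unknown case tag: {excluded}")
    return CASES - {excluded}


NOTNOM = not_case("Nom")
NOTDAT = not_case("Dat")


def has_tag_from(reading, tagset):
    """True if the reading carries at least one tag from the set."""
    return bool(set(reading) & tagset)
```

A reading tagged N Sg Acc matches NOTNOM in this model, while a reading tagged N Sg Nom does not.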

Word sets

Test: Go for minimal weight. This rule gives priority to lexicalised forms.
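What "go for minimal weight" amounts to can be sketched as follows. This Python toy model assumes that each reading carries a numeric weight from the analyser and that lexicalised analyses have lower weights than productively formed ones. The example tags and weights are invented for illustration, and select_min_weight is a hypothetical helper.

```python
def select_min_weight(readings):
    """Keep only the readings whose weight equals the cohort minimum."""
    if not readings:
        return []
    lowest = min(weight for _, weight in readings)
    return [(tags, weight) for tags, weight in readings if weight == lowest]


# Invented cohort: one low-weight and one high-weight analysis.
cohort = [
    ("N Sg Nom", 0.0),          # assumed lexicalised entry
    ("N Cmp N Sg Nom", 10.0),   # assumed productively formed compound
]
selected = select_min_weight(cohort)
```

Readings that tie for the lowest weight all survive, so later rules still have to resolve any remaining ambiguity.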

Infinitive

A or N

Disambiguate A, N due to context.

TAD, Pron or Det

Adjective disambiguation in NP

Case disambiguation

Noun disambiguation

Conjunctions

No rules so far

Subjunctions

Verbs

Passive

-st

vera

verða

kunna

koma

Plural

PP disambiguation

Preposition or not?

á

av

millum

móti

til

tíður

um

undir

við

Case within PP phrases

POS disambiguation

Adjectives

kalur

Pronouns

Pron Pers or Det

Det

Pron not N

Proper nouns

Specific lexemes, words

Adverbs

General adverb

Specific adverbs

akkurát

Lexicalised adverbs.

Adverb verbs

Idioms

Numerals and number symbols

NP internal constraints

Determiner disambiguation

Specific determiners

Postnominal determiner disambiguation

Definiteness disambiguation

Define definiteness based upon case concordance.

Case disambiguation

Noun disambiguation

Poss disambiguation

Ensuring case concordance within poss phrases

Number disambiguation

Coordination

Embedded clause V topicalisation

Elliptic AP as NP

P chains or not

Pronoun disambiguation

NP Coordination

VP disambiguation

V or A

V or Adv

V or N

Infinitive

Imperative

The best would be to make a corpus of imperative sentences, identify all the imperatives, and then just remove the rest.

Here come all the rules selecting Imp (so far only one).

Then we remove the remaining ones.
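The select-then-remove strategy can be modelled in a few lines. The Python sketch below is a toy model, not the grammar's actual rule: the sentence-initial context is a hypothetical stand-in for whatever context the real selecting rule uses, and the function names are invented. As in Constraint Grammar, the removal step never deletes the last remaining reading of a word.

```python
def is_imp(reading):
    """True if the reading carries the Imp tag."""
    return "Imp" in reading


def sentence_initial(cohorts, i):
    """Hypothetical selecting context: the word is first in the sentence."""
    return i == 0


def disambiguate_imperatives(cohorts, select_rules):
    """First select Imp where a rule fires, then remove Imp elsewhere."""
    result = []
    for i, readings in enumerate(cohorts):
        imp = [r for r in readings if is_imp(r)]
        if imp and any(rule(cohorts, i) for rule in select_rules):
            kept = imp                      # a selecting rule fired
        else:
            non_imp = [r for r in readings if not is_imp(r)]
            kept = non_imp or readings      # never drop the last reading
        result.append(kept)
    return result


# Each word is a list of readings; each reading is a set of tags.
sentence = [
    [{"V", "Imp"}, {"V", "Inf"}],
    [{"V", "Imp"}, {"V", "Prs"}],
]
resolved = disambiguate_imperatives(sentence, [sentence_initial])
```

The design relies on ordering: because the blanket removal comes last, every context that should keep Imp has to be covered by a selecting rule before it.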

Present participle

Supine

Present singular

Present plural

V + Refl

Past indicative

Perfect participle

Case disambiguation

Nominative

Predicative

Subject

Miscellaneous

Accusative

Mær dámar

Genitive

Pronoun disambiguation

seg

Verb disambiguation

A or V

Person

Number disambiguation

Postverbal subject

Gender disamb of adjectives

Gender disamb of numerals

Case disamb of numerals

Ordinals

Coordination

Adjective disambiguation outside NP

Substituting tags

Titles

CC Coordinate NPs

AFTER-section

For Apertium

MAPPING OF CC AND CS

Mostly we map both @CNP and @CVP. Then we select @CNP, and after that we remove the remaining @CNP tags, so that @CVP remains.
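The three steps (map both, select @CNP, remove what is left of @CNP) can be sketched as a toy model in Python. The test for NP coordination used here, a nominal on each side of the conjunction, is a hypothetical simplification, as are the NOMINAL tag set and the function names. The grammar's real contexts live in the CG-3 source.

```python
NOMINAL = {"N", "Prop", "Pron"}  # assumed nominal POS tags


def coordinates_nps(tokens, i):
    """Hypothetical test: a nominal on both sides of the conjunction."""
    return (
        0 < i < len(tokens) - 1
        and tokens[i - 1][1] in NOMINAL
        and tokens[i + 1][1] in NOMINAL
    )


def tag_conjunctions(tokens):
    """Return {index: set of function tags} for every CC token."""
    labels = {}
    for i, (_form, pos) in enumerate(tokens):
        if pos != "CC":
            continue
        candidates = {"@CNP", "@CVP"}      # step 1: map both tags
        if coordinates_nps(tokens, i):
            candidates = {"@CNP"}          # step 2: select @CNP
        else:
            candidates.discard("@CNP")     # step 3: remove @CNP, @CVP remains
        labels[i] = candidates
    return labels


tokens = [
    ("Jón", "Prop"), ("og", "CC"), ("Maria", "Prop"),
    ("syngja", "V"), ("og", "CC"), ("dansa", "V"),
]
labels = tag_conjunctions(tokens)
```

In this model @CVP acts as the default: it only survives where no @CNP context matched.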


This (part of) documentation was generated from src/cg3/disambiguator.cg3
