Kalaallisut NLP Grammar

Finite state and Constraint Grammar based analysers, proofing tools and other resources

View the project on GitHub giellalt/lang-kal

Greenlandic morphological analyser

File for generating the central morphological processes in our Greenlandic analyser

Multicharacter symbols

Tags for POS (primary tags)

Main Word Classes

Secondary tags

Tags for Verbs

Tags for Pronouns

Tags for Other Word Classes

Semantics

Value in playing cards

Grammar

Derivation

Dialect

Tags to mark loan word entries with a deviating orthography

That is, they need special treatment in e.g. speech synthesis.

Orthography

Usage/error

Tags for Inflection

Number

Case

Special 3rd/4th person cases with DivPron (Gram/Cong)

Mood

Verb person and number

Possessive tags: possessor marking on the possessum

Flag diacritics for Greenlandic

Flag diacritics for plurale tantum subjects

Flag diacritics for verbs that take only plural objects

Test of a Boolean variable for ad hoc blocking

Test of a Boolean variable for ad hoc blocking of Gram/Exclm. Stems are set to Off and derivation to On

The Off flag is set in verbs on transitive verbs for which a reflexive reading is unlikely. The On flag is on the tag Gram/Refl in the pass-through lexica

Off flag on verbs such as akuaa, which must not undergo metathesis with NIQ

New flag (2021-12-14) to prevent *taakkuunngitsoq and *taannaanngitsut

Off flag on nouns that MUST behave replacively, as in pilersaarusiorpoq and aqqusinniorpoq

Off flag in nouns and Off flag in der-inf when TUR and TUGAQ must not be assibilated, and On flag when they must be assibilated. Also used to prevent assibilation after HTR on nnip

Flag specifically to ensure additive p-inflection of ulloq in Trm@

Ad hoc, for testing alternative flag diacritics with prefixes. Remember also the commented-out line ‘Kingumoorutit ;’ in LEXICON Root

Test of P and D flags to prevent recursion with TIP

and is blocked by

Test (2021-05-04) of P and R flags to generate both takornariat and takornarissat+Err/Sub

Flags for loan words that must not go to N+Abs+Sg without derivation.

30.10.23: Trond removed the tags that were declared several times (probably former tag strings A=B=C) and instead made a list in which each tag occurs once (below): docs/tagstrings.md

List of the so-called Greenlandic tilhæng, i.e., derivational affixes

Boundary symbol

Symbols that need to be escaped on the lower side (towards twolc)

Our morphophonemes

Our magic symbols

Language-independent flag diacritics

We have manually optimised the structure of our lexicon, using the following flag diacritics to restrict morphological combinatorics: compounds with verbs are allowed only if the verb is further derived into a noun again:

| Flag | Explanation |
| --- | --- |

For languages that allow compounding, the following flag diacritics are needed to control position-based compounding restrictions for nominals. Their use is handled automatically if combined with +CmpN/xxx tags. If not used, they will do no harm.

| Flag | Explanation |
| --- | --- |
| !@P.CmpFrst.FALSE@ | Require that words tagged as such only appear first |
| !@D.CmpPref.TRUE@ | Block such words from entering ENDLEX |
| !@P.CmpPref.FALSE@ | Block these words from making further compounds |
| !@D.CmpLast.TRUE@ | Block such words from entering R |
| !@D.CmpNone.TRUE@ | Combines with the next tag to prohibit compounding |
| !@U.CmpNone.FALSE@ | Combines with the prev tag to prohibit compounding |
| !@P.CmpOnly.TRUE@ | Sets a flag to indicate that the word has passed R |
| !@D.CmpOnly.FALSE@ | Disallow words coming directly from root. |
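
To make the effect of the operator letters in these flag names concrete, here is a minimal Python sketch of how a sequence of flag diacritics along one path could be checked. It assumes the commonly documented Xerox/HFST semantics for the `P` (set), `D` (disallow), `R` (require), `U` (unify) and `C` (clear) operators and leaves out `N`. The function name `accepts` and the example paths are hypothetical and are not taken from root.lexc. The flag names are written without the leading `!` shown in the table. Real finite-state tools evaluate flags inside the transducer during lookup, so this is only an illustration of the logic.

```python
import re

# Matches @OP.Feature@ or @OP.Feature.Value@ for the operators handled here.
FLAG_RE = re.compile(r"@([PRDUC])\.([^.@]+)(?:\.([^.@]+))?@")


def accepts(path):
    """Return True if the flag diacritics met along `path` are consistent.

    `path` is a list of symbols; anything that is not a flag diacritic
    is treated as an ordinary symbol and ignored.
    """
    state = {}  # feature name -> value currently set
    for symbol in path:
        match = FLAG_RE.fullmatch(symbol)
        if match is None:
            continue  # ordinary symbol, no effect on the flag state
        op, feature, value = match.groups()
        current = state.get(feature)
        if op == "P":
            # Positive set: always succeeds and overwrites the value.
            state[feature] = value
        elif op == "C":
            # Clear: forget whatever value the feature had.
            state.pop(feature, None)
        elif op == "R":
            # Require: the feature must be set (to `value`, if one is given).
            if current is None or (value is not None and current != value):
                return False
        elif op == "D":
            # Disallow: fail if the feature is set (to `value`, if one is given).
            if current is not None and (value is None or current == value):
                return False
        elif op == "U":
            # Unify: set the feature if unset, otherwise the values must agree.
            if current is None:
                state[feature] = value
            elif current != value:
                return False
    return True


# CmpOnly was set to TRUE, so disallowing FALSE does not block the path.
print(accepts(["@P.CmpOnly.TRUE@", "stem", "@D.CmpOnly.FALSE@"]))  # True

# Hypothetical setter @P.CmpPref.TRUE@: the later @D.CmpPref.TRUE@ blocks it.
print(accepts(["@P.CmpPref.TRUE@", "stem", "@D.CmpPref.TRUE@"]))  # False
```

A `D` flag on a feature that has never been set lets the path through, which is consistent with the remark above that unused flags do no harm.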

Use the following flag diacritics to control downcasing of derived proper nouns (e.g. Finnish Pariisi -> pariisilainen). See e.g. North Sámi for how to use these flags. There exists a ready-made regex that will do the actual down-casing given the proper use of these flags.

| Flag | Explanation |
| --- | --- |

LEXICON Root pointing to main parts of speech


This (part of) documentation was generated from src/fst/morphology/root.lexc
