South Sámi NLP Grammar

Finite state and Constraint Grammar based analysers, proofing tools and other resources

Documenting the South Saami lexicon file

The nouns

The noun stems are stored in gt/sma/noun-sma-lex.txt, whereas the morphology is found in gt/sma/sma-lex.txt. The lexicon was made by converting the original Moshagen / Trosterud sma lexicon to the Xerox format. The lexical rules are taken from Karttunen’s alternative formulation of our file. His original is found in the archive of original files (contact Trond for a reference if needed).

The nouns are divided into three stem classes: the N_IE nouns (gåetie, etc.), the N_OE nouns (bearkoe, etc.), and the N_OTHER nouns (all others, bisyllabic and trisyllabic alike).

The case forms fall into three groups: forms unique to each of the three stem classes (listed under each sublexicon); forms with -j- in the N_OE class and -i- in the other classes (with separate i- and j- sublexica and a common suffix lexicon); and forms common to all stem classes (covered in a common continuation lexicon).
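
The choice between -j- and -i- described above can be sketched in Python. This is only an illustration, not code from the lexicon sources: the function names `guess_stem_class` and `link_element` are hypothetical, and guessing the class from the last letters of the nominative is an assumption based on the two examples given (gåetie, bearkoe), not a rule stated in the lexicon files.

```python
# Illustrative sketch only; not taken from the sma lexicon sources.

STEM_CLASSES = ("N_IE", "N_OE", "N_OTHER")


def guess_stem_class(nominative: str) -> str:
    """Guess the stem class from the ending of the nominative form.

    Assumption: -ie points to N_IE and -oe to N_OE, as in the examples
    gåetie and bearkoe; everything else is treated as N_OTHER.
    """
    if nominative.endswith("ie"):
        return "N_IE"
    if nominative.endswith("oe"):
        return "N_OE"
    return "N_OTHER"


def link_element(stem_class: str) -> str:
    """Return the element used in the shared i/j forms.

    Per the description: -j- in the N_OE class, -i- in the other classes.
    """
    if stem_class not in STEM_CLASSES:
        raise ValueError(f"unknown stem class: {stem_class}")
    return "j" if stem_class == "N_OE" else "i"


if __name__ == "__main__":
    for word in ("gåetie", "bearkoe"):
        cls = guess_stem_class(word)
        print(word, cls, link_element(cls))
```

In the lexicon itself this choice is made by routing each stem class to its own i- or j- sublexicon, both of which continue to the common suffix lexicon; the function above only mirrors that routing.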

Here is a list of the lexica (to be documented):

 N_ODD
 N_ODD_NODISIMP
 ÅABPETJH        !default N_ODD plural lexicon
 N_ODD_C         !these words have consonant-ending in nominative
 AAJEGE          !Sg+Nom=Aajege/Aajeh  Sg+Cmp=Aajeh-
 AAREGE          !Sg+Nom=Aarege/Aareh  Sg+Cmp=Aarege-/Aareh-
 BAARTEGE        !Sg+Nom=baartege/baarth  Sg+Cmp=baartege-/baarth-
 GAAJSEGE        !Sg+Nom=gaajsege/gaajsh  Sg+Cmp=gaajsh-
 LAADTEGE        !Sg+Nom=laadtege  Sg+Cmp=laadth-
 SAADTEGE        !Sg+Nom=saadtege  Sg+Cmp=saadtege-/saadth-
 LEEJJEGE        !Sg+Nom=leejjege  Sg+Cmp=leejjeh-
 DEAKEHKE        !Sg+Nom=deakehke/deakah  Sg+Cmp=deakehke-/deakah-
 ÅERUVE
 BÅERUVE
 VUANOVE
 BÅERUJE
 IJE_ODD
 DAKTERE
 N_IE
 VUELIE
 TJIDTJIE
 TJÅENIEH       !ie plural lexicon
 N_OE_UML
 N_OE
 LAAHKOE
 GAAROEH        !-oe plural
 MAANA
 AAHKA
 NIEJTE
 MAAKE
 JOVKEMES 
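
As long as the lexica remain undocumented, the `!` comments in the list are the only description available. The following Python sketch shows one way to read them into a structure that could seed documentation stubs. The function name `parse_lexicon_line` is hypothetical, and reading `Sg+Nom=a/b` as a tag followed by slash-separated alternative forms is an interpretation of the comments, not a documented format.

```python
# Illustrative sketch only; the comment layout is inferred from the list above.

def parse_lexicon_line(line: str) -> dict:
    """Split a list line into lexicon name, tagged forms, and free-text note."""
    name, _, comment = line.partition("!")
    forms = {}
    notes = []
    for token in comment.split():
        if "=" in token:
            # e.g. "Sg+Nom=Aarege/Aareh" -> {"Sg+Nom": ["Aarege", "Aareh"]}
            tag, _, value = token.partition("=")
            forms[tag] = value.split("/")
        else:
            # Anything without "=" is kept as part of a plain note.
            notes.append(token)
    return {"name": name.strip(), "forms": forms, "note": " ".join(notes)}


if __name__ == "__main__":
    print(parse_lexicon_line(" AAREGE   !Sg+Nom=Aarege/Aareh  Sg+Cmp=Aarege-/Aareh-"))
    print(parse_lexicon_line(" ÅABPETJH !default N_ODD plural lexicon"))
```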

The adjectives

The continuation lexica are built on the following convention:

attrsuffix_PREDSUFFIX_STEMTYPECOMPTYPE

A letter C at the beginning of a suffix marks a consonant. There may be more than one suffix for both attr and PRED, which is why attributive suffixes are written in lower case and predicative suffixes in capitals. Example:

faelskies+CmpN/SgN+CmpN/SgG+CmpN/PlG:faelsk ies_IES_IE_EVEN

This is an even-syllabic adjective with -ies in the attributive and -ies or -ie in the predicative. Since the lexicon name says nothing about comparative forms, the adjective has normal comparative and superlative inflection.
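
To make the naming convention concrete, here is a small Python sketch that splits the example entry and decodes its continuation lexicon name. It is a sketch under stated assumptions, not part of the sma sources: the helper names `split_entry` and `decode_continuation` are hypothetical; the stem type is assumed to be EVEN or ODD (only EVEN occurs in the example); whatever follows the stem type in the last part is taken to be the comparison type; and a leading C is assumed to be the consonant marker, so it is ignored when deciding between lower case and capitals.

```python
# Illustrative sketch only; the helper names are hypothetical.

# Assumption: the stem type is one of these labels. Only EVEN appears
# in the example entry; ODD is a guess.
STEM_TYPES = ("EVEN", "ODD")


def split_entry(line: str):
    """Split 'upper:stem continuation' into its three parts."""
    upper, _, rest = line.partition(":")
    parts = rest.replace(";", " ").split()
    return upper, parts[0], parts[1]


def decode_continuation(name: str) -> dict:
    """Decode attrsuffix_PREDSUFFIX_STEMTYPECOMPTYPE into its fields."""
    attr, pred = [], []
    stem_type = comp_type = None
    for part in name.split("_"):
        matched = next((t for t in STEM_TYPES if part.startswith(t)), None)
        if matched and stem_type is None:
            stem_type = matched
            # Anything glued to the stem type is read as the comparison type.
            comp_type = part[len(matched):] or None
            continue
        # Assumption: a leading "C" is the consonant marker, so skip it
        # when checking whether the suffix is lower case or capitals.
        body = part[1:] if part.startswith("C") and len(part) > 1 else part
        (attr if body.islower() else pred).append(part)
    return {"attr": attr, "pred": pred,
            "stem_type": stem_type, "comp_type": comp_type}


if __name__ == "__main__":
    entry = "faelskies+CmpN/SgN+CmpN/SgG+CmpN/PlG:faelsk ies_IES_IE_EVEN"
    upper, stem, continuation = split_entry(entry)
    print(upper, stem, decode_continuation(continuation))
```

For the example entry, the decoded name gives one attributive suffix (ies), two predicative suffixes (IES, IE), the stem type EVEN, and no comparison type, which matches the reading given above.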

The verbs

The auxiliary lea and the negative verbs have been added. These verbs are irregular and have therefore been added without the use of any morphophonological processes.

To be written: Documentation for verb lexica.

The adpositions and prepositions

The adpositions in Bergsland’s grammar have been listed in two groups: pure postpositions and combined pre-/postpositions (named “adpositions”). Other adpositions have been added.