Lule Sami language model documentation

All doc-comment documentation in one large file.

src-cg3-disambiguator.cg3.md

Semantic tags

Rules for removing some Props which are identical to common nouns
IfonlyVerb selects the FMAINV reading in the cohort

Go for minimal weight (requires --with-backend-format=openfst-tropical)
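As a rough illustration of what "going for minimal weight" means: in a tropical-semiring transducer, weights act as costs, so the preferred reading is the one with the lowest weight. The sketch below is not the CG3 rule itself. The function name `select_min_weight`, the placeholder analyses, and the weights are all invented for the example.

```python
def select_min_weight(readings):
    """Return the readings that share the lowest weight in a cohort.

    `readings` is a list of (analysis, weight) pairs. Lower weight means
    a cheaper, and therefore preferred, reading in the tropical semiring.
    """
    if not readings:
        return []
    best = min(weight for _, weight in readings)
    return [(analysis, weight) for analysis, weight in readings if weight == best]


# Hypothetical cohort: placeholder analyses with invented weights.
cohort = [("lemma+N+Sg+Nom", 0.0), ("lemma+V+Imprt", 2.5)]
selected = select_min_weight(cohort)
```

Readings that tie on the minimum weight are all kept, so later rules would still have to resolve them.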


Removing Err/Orth

This (part of) documentation was generated from src/cg3/disambiguator.cg3

src-fst-morphology-affixes-abbreviations.lexc.md

Continuation lexicons for abbreviations

Lexica for adding tags and periods

The sublexica

Continuation lexicons for abbreviations both with and without final period

**LEXICON ab-noun**
**LEXICON ab-adj**
**LEXICON ab-adv**
**LEXICON ab-num**

Lexicons without final period

**LEXICON ab-nodot-noun** The bulk
**LEXICON ab-nodot-adj**
**LEXICON ab-nodot-adv**
**LEXICON ab-nodot-num**

Lexicons with final period

**LEXICON ab-dot-noun** This is the lexicon for abbreviations that must have a period.
**LEXICON ab-dot-adj** This is the lexicon for abbreviations that must have a period.
**LEXICON ab-dot-adv** This is the lexicon for abbreviations that must have a period.
**LEXICON ab-dot-num** This is the lexicon for abbreviations that must have a period.
**LEXICON ab-dot-cc**
**LEXICON ab-dot-verb**
**LEXICON ab-nodot-verb**
**LEXICON ab-dot-IVprfprc**
**LEXICON nodot-attrnomaccgen-infl**
**LEXICON nodot-attr-infl**
**LEXICON nodot-nomaccgen-infl**
**LEXICON dot-attrnomaccgen-infl**
**LEXICON dot-attr**
**LEXICON dot-nomaccgen-infl**
**LEXICON DOT** - Adds the dot to dotted abbreviations.
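The three lexicon families above differ only in how they treat the final period. A minimal sketch of that distinction follows, assuming the plain `ab-*` lexicons accept both variants, `ab-nodot-*` only the bare form, and `ab-dot-*` only the dotted form. The function name, the `period` parameter, and the abbreviation `abc` are hypothetical and do not come from the lexc source.

```python
def abbreviation_forms(stem, period="optional"):
    """List the surface forms an abbreviation would accept.

    period="optional" models the ab-* lexicons (with or without period),
    period="never"    models the ab-nodot-* lexicons,
    period="required" models the ab-dot-* lexicons.
    """
    if period == "never":
        return [stem]
    if period == "required":
        return [stem + "."]
    if period == "optional":
        return [stem, stem + "."]
    raise ValueError(f"unknown period policy: {period!r}")


# "abc" is a made-up abbreviation used only for illustration.
both = abbreviation_forms("abc")
dotted = abbreviation_forms("abc", period="required")
```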

This (part of) documentation was generated from src/fst/morphology/affixes/abbreviations.lexc

src-fst-morphology-affixes-adjectives.lexc.md

Sublexica for Adjective

Even-syllable stems

LEXICON GIEVRRA Adjectives with attribute in WeG and -s. As 1a in Spiik. Sg Acc: gievrav, Attr: gievras.

gárttje # Even-syllable test examples:*
gárttje: gárttje+A+Sg+Nom
gártjev: gárttje+A+Sg+Acc
gártjes: gárttje+A+Attr
gártjep: gárttje+A+Der/Comp+A+Sg+Nom
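The test examples in this documentation all follow the pattern `surface: lemma+tag+tag...`, sometimes followed by a parenthesised comment. A small sketch of how such a line could be split into its parts is given below. The function name `parse_example` is an assumption for illustration and is not part of the test infrastructure.

```python
def parse_example(line):
    """Split 'surface: lemma+Tag+Tag' into (surface, lemma, [tags]).

    Anything after the analysis, such as a trailing '(Eng. # ...)'
    comment, is ignored.
    """
    surface, sep, rest = line.partition(": ")
    parts = rest.split()
    if not sep or not parts:
        raise ValueError(f"not a 'surface: analysis' line: {line!r}")
    lemma, *tags = parts[0].split("+")
    return surface.strip(), lemma, tags


plain = parse_example("gártjev: gárttje+A+Sg+Acc")
derived = parse_example("gártjep: gárttje+A+Der/Comp+A+Sg+Nom")
```

Note that derivation tags such as `Der/Comp` contain a slash but no plus sign, so splitting on `+` keeps them intact.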

LEXICON NUORRA Adjectives with attribute same as pred. As 1b in Spiik. Sg Acc: nuorav, Attr: nuorra.

visská # Even-syllable test examples:*
visská: visská+A+Sg+Nom
viskáv: visská+A+Sg+Acc
visská: visská+A+Attr
viskáp: visská+A+Der/Comp+A+Sg+Nom

LEXICON GALLJE Adjectives on -e, the attribute is in WeG and e > a. As 1d in Spiik. Sg Acc: galjev, Attr: galja.

uhttse # Even-syllable test examples:*
uhttse: uhttse+A+Sg+Nom
uhtsev: uhttse+A+Sg+Acc
uhtsa: uhttse+A+Attr
uhtses: uhttse+A+Attr (Eng. # gets this attr from)
uhtsep: uhttse+A+Der/Comp+A+Sg+Nom

LEXICON TJÁBBE Adjectives on -e, the attribute is in WeG and e > a. Same as GALLJE only different adv derivation. Sg Acc: tjáppev, Attr: tjáppa.

njálgge # Even-syllable test examples:*
njálgge: njálgge+A+Sg+Nom
njálgev: njálgge+A+Sg+Acc
njálga: njálgge+A+Attr
njálgep: njálgge+A+Der/Comp+A+Sg+Nom

LEXICON VILLDA Adjectives with attribute same as pred, without CG. As 1b in Spiik. Sg Acc: nuorav, Attr: nuorra.

frisska # Even-syllable test examples:*
frisska: frisska+A+Sg+Nom
frisskav: frisska+A+Sg+Acc
frisska: frisska+A+Attr
frisskap: frisska+A+Der/Comp+A+Sg+Nom

LEXICON HÁVSSKE Adjectives with attribute -s, without WeG. As 1c in Spiik. Sg Acc: hávsskev, Attr: hávsskes.

hoallá # Even-syllable test examples:*
hoallá: hoallá+A+Sg+Nom
hoalláv: hoallá+A+Sg+Acc
hoallás: hoallá+A+Attr
hoalláp: hoallá+A+Der/Comp+A+Sg+Nom

LEXICON TJUODDJE Adjectives with attribute -is, without WeG. presently only “Tjuoddje” Sg Acc: tjuoddjev, Attr: tjuoddjis.

tjuoddje # Even-syllable test examples:*
tjuoddje: tjuoddje+A+Sg+Nom
tjuoddjev: tjuoddje+A+Sg+Acc
tjuoddjis: tjuoddje+A+Attr
tjuoddjep: tjuoddje+A+Der/Comp+A+Sg+Nom

Pres.participles

LEXICON SÁVADAHTTE Causative participles. No attribute. No comparison. As 1e in Spiik. Sg Acc: sávadahttev. PrsPrc of causative verbs “expresses that the action can be done or is worth doing” (Kintel 1991).

vuojedahtte # Even-syllable test examples:*
vuojedahtte: vuojedahtte+A+Sg+Nom
vuojedahttev: vuojedahtte+A+Sg+Acc

LEXICON JUHKKE Participles with -s attributive. No comparison. As 1e in Spiik. Sg Acc: juhkkev, Attr: juhkkes. Spiik: with the attributive form in -s, the present participle means “someone who is good at, quick to, or inclined to perform the action”.

vuohttje # Even-syllable test examples:*
vuohttje: vuohttje+A+Sg+Nom
vuohttjev: vuohttje+A+Sg+Acc
vuohttjes: vuohttje+A+Attr

LEXICON BÅRRE Participles without the -s attributive. As 1e in Spiik. Sg Acc: bårrev, Attr: bårre. Spiik: with the attributive form without -s, the present participle means “those who perform the action”.

ednabårre # Even-syllable test examples:*
ednabårre: ednabårre+A+Sg+Nom
ednabårrev: ednabårre+A+Sg+Acc
ednabårre: ednabårre+A+Attr

Test data:

Loanword lexica

Correctly assimilated loanwords, derived from real noun.

LEXICON METÅVDÅLASJ LOAN! Foreign -isk adjectives adapted in the updated normative way, with the smj ending -alasj. The adjective is truly derived from a noun: mekanisk-mekanihkka-mekanihkalasj, instead of mekánalasj, which goes to MEKÁNALASJ_BADASS. Pred and attr are both -alasj. Attr same as pred. With comparatives.

LEXICON METÅVDÅLASJ_CMP_INFL

kapitalismalasj # Even-syllable test examples:*
kapitalismalasj: kapitalismalasj+A+Sg+Nom
kapitalismalattjav: kapitalismalasj+A+Sg+Acc
kapitalismalasj: kapitalismalasj+A+Attr
kapitalismalabbo: kapitalismalasj+A+Der/Comp+A+Sg+Nom

LEXICON MEKANIHKA_MEKANIJKA_LASJ LOAN! Same type of adjectives as METÅVDÅLASJ, but for adjectives that become mekanihkalasj in Norway and mekanijkalasj in Sweden, because of the differences mekanik vs mekanikk > mekanijkka vs mekanihkka. Attr same as pred. With comparatives.

LEXICON IJJALASJ Just like METÅVDÅLASJ, but for words ending in ijjalasj/iddjalasj, so that we don’t need a lot of Area and Err tags in the stems file.

LEXICON IJJALASJ_CMP_INFL

LEXICON OGIJJALASJ Just like IJJALASJ, but for words ending in ogijjalasj/ogiddjalasj, so that we don’t need a lot of Err tags in the stems files. For words like “pedagogijjalasj”, which also have “pedagåvgålasj” (not really a wrong derivation, but it doesn’t mean pedagogisk) and the Err-tagged “pedagogalasj”.

LEXICON OGIJJALASJ_CMP_INFL

LEXICON SJÅNÅLASJ_SJONAL -sjonal/sjonell and -tional/tionell loanwords. Only for words that work as nouns, so that they are REAL derivations, as nasjonal-nasjåvnnå-nasjåvnålasj. NOT for words like “rasjonell”, with no real noun. Words like “rasjonell>rasjonálla-rasjonálalasj” go to lexicon ÁLLA. The fake derivation “nasjonálalasj” is Err-tagged, and so is the strange “nasjonálla/nasjunálla”.

LEXICON SJÅNÅLASJ_SJONAL_CMP_INFL

konstitusjåvnålasj # Even-syllable test examples:*
nasjåvnålasj: nasjåvnålasj+A+Sg+Nom
nasjåvnålattjav: nasjåvnålasj+A+Sg+Acc
nasjåvnålasj: nasjåvnålasj+A+Attr

LEXICON SJÅNÅLASJ_SJONELL -sjonal/sjonell and -tional/tionell loanwords. Only for words that work as nouns, so that they are REAL derivations, as nasjonal-nasjåvnnå-nasjåvnålasj. NOT for words like “rasjonell”, with no real noun. Words like “rasjonell>rasjonálla-rasjonálalasj” go to lexicon ÁLLA. The fake derivation “nasjonálalasj” is Err-tagged, and so is the strange “nasjonálla/nasjunálla”.

LEXICON SJÅNÅLASJ_SJONELL_CMP_INFL

konstitusjåvnålasj # Even-syllable test examples:*
konstitusjåvnålasj: konstitusjåvnålasj+A+Sg+Nom
konstitusjåvnålattjav: konstitusjåvnålasj+A+Sg+Acc
konstitusjåvnålasj: konstitusjåvnålasj+A+Attr

Badly assimilated loanwords, some against norm others with no norm

LEXICON MEKÁNALASJ_BADASS LOAN! Wrongly assimilated -lasj adjectives from SE/NO -isk. They look derived but aren’t, since there is no real noun to derive them from. Like mekanisk-mekánalasj, but “mekádna” is no real noun! Like METÅVDÅLASJ, but gives the Err/Der tag, so it’s only for these wrongly derived or non-derived loan adjectives.

LEXICON ARKTALASJ_CMP_INFL Foreign -isk adjectives that are not real derivations. Same as MEKÁNALASJ_BADASS, but no +Use/-Spell tag since there is no “right” way to assimilate these. This is a question for GG. Adapted to smj by simply adding -alasj in place of -isk. These are not real derivations, but citation-borrowed loan adjectives. Only words without a noun base, like arktisk and syntetisk. Pred and attr are both -lasj. No comparatives.

syntiehtalasj # Even-syllable test examples:*
syntiehtalasj: syntiehtalasj+A+Sg+Nom
syntehtalasj: syntiehtalasj+A+Sg+Nom
syntiehtalattjav: syntiehtalasj+A+Sg+Acc
syntehtalattjav: syntiehtalasj+A+Sg+Acc
syntiehtalasj: syntiehtalasj+A+Attr
syntehtalasj: syntiehtalasj+A+Attr

LEXICON ORÁNSSJA Loan adjectives, not -isk. Used without the -lasj. Adjectives with attribute same as pred. So far only for oránssja.

LEXICON DEMONSTRATIJVA_LASJ_NO_NORM Loan adjectives from Norwegian/Swedish (not adjectives ending in -isk). Words like demonstrativ, transitiv, dupleks, informativ, analog, privat. Gives both “demonstratijvva” and “demonstratijvalasj”. Two ways of adapting these adjectives are in use. Adding -lasj isn’t okay, because that’s a false derivation, but GG hasn’t decided how these should be handled. Adapted without the -lasj ending, the word looks like a noun instead of an adjective. Attr is in weak grade; the strong grade is used as pred, even though this seems a little odd: “Værbba l transitijvva”.

LEXICON DEMONSTRATIJVA_LASJ_CMP_INFL

aktijvva # Even-syllable test examples:*
aktijva: aktijvva+A+Attr
aktijvva: aktijvva+A+Sg+Nom
aktijvav: aktijvva+A+Sg+Acc

LEXICON ÁLA_LASJ_NO_NORM Same as DEMONSTRATIJVA_LASJ_NO_NORM, but only for adjectives ending in -al. Words like digital, liberal, lokal. Gives both “eksponentiálla” and “eksponentiálalasj”. Separate lexicon for these -al adjectives because of Err/Orth tags. NB: “dialektal” is assimilated as “dialevtalasj” and goes to lexicon METÅVDÅLASJ.

LEXICON ÁLA_LASJ_INFL_CMP

LEXICON ELLA_LASJ_NO_NORM Loanwords, same as ÁLA_LASJ_NO_NORM and DEMONSTRATIJVA_LASJ_NO_NORM. For NO and SE adjectives ending in -ell: eksperimentell, ideell, parallell. The short form is nom parallælla, attr parallella. The long form is paralellalasj, attr parallellalasj. Separate lexicon for these -ell adjectives because of Err/Orth tags. NB: “individuell” is assimilated as “indivijdalasj” and goes to lexicon METÅVDÅLASJ.

LEXICON ELLA_LASJ_INFL_CMP

LEXICON ÁLLA-ÆLLA

LEXICON MEKÁNALASJ_CMP_INFL Same as METÅVDÅLASJ only without vuohta.

Inherent comparatives and superlatives lexica

LEXICON OANEP Inherent comparatives, gives comp and superl. Adjectives that are lexicalized in their comparative (and superlative) forms, like sisŋep, bárep. Some entries are likely incorrect compared forms of other adjectives, like ådåp and ruvvap (more research needed).

lagáp # Even-syllable test examples:*
lagáp: lagáp+A+Sg+Nom
lagámus: lagáp+A+Der/Superl+A+Sg+Nom

LEXICON TJAVGGÁMUS Inherent superlatives, only gives superl. Some words are lexicalized in their superlative forms, like dájvvámus. Some are likely incorrect superlative forms, like tjábbámus (more research is needed).

dájvvámus # Even-syllable test examples:*
dájvvámus: dájvvámus+A+Sg+Nom

4-syllable miscellaneous stems

LEXICON ÁRMMOGIS Adjectives on -is, attribute same as pred. Odd-syllable comparison. As 2 in Spiik. Sg Acc: ármmogisáv, Attr: ármmogis.

bahágis # Even-syllable test examples:*
bahágis: bahágis+A+Sg+Nom
bahágisáv: bahágis+A+Sg+Acc
bahágis: bahágis+A+Attr
bahágabbo: bahágis+A+Der/Comp+A+Sg+Nom

LEXICON SÆHKÁLAK Adjectives on -álak, attribute same as pred. Odd-syllable comparison. So far only for “sæhkálak”.

sæhkálak # Even-syllable test examples:*
sæhkálak: sæhkálak+A+Sg+Nom
sähkálak: sæhkálak+A+Sg+Nom
sæhkálagáv: sæhkálak+A+Sg+Acc
sähkálagáv: sæhkálak+A+Sg+Acc
sæhkálak: sæhkálak+A+Attr
sähkálak: sæhkálak+A+Attr
sæhkálabbo: sæhkálak+A+Der/Comp+A+Sg+Nom
sähkálabbo: sæhkálak+A+Der/Comp+A+Sg+Nom
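Several analyses in the examples above are paired with two surface forms (the æ and ä spellings). A short sketch of how such variant pairs could be collected from the example lines follows. The helper name `group_by_analysis` is hypothetical, and only lines quoted from this section are used as input.

```python
from collections import defaultdict


def group_by_analysis(lines):
    """Map each analysis string to the surface forms listed for it."""
    groups = defaultdict(list)
    for line in lines:
        surface, sep, analysis = line.partition(": ")
        if sep:  # skip lines that are not 'surface: analysis' pairs
            groups[analysis.strip()].append(surface.strip())
    return dict(groups)


examples = [
    "sæhkálak: sæhkálak+A+Sg+Nom",
    "sähkálak: sæhkálak+A+Sg+Nom",
    "sæhkálagáv: sæhkálak+A+Sg+Acc",
    "sähkálagáv: sæhkálak+A+Sg+Acc",
]
variants = group_by_analysis(examples)
```

Grouping by the full analysis string, rather than by lemma alone, keeps the variants of each case form separate.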

LEXICON ÅLLAGSJ_CMP_INFL Adjectives on -asj, attribute same as pred. No comparatives. 2 in Spiik. Sg Acc: ållagattjav, Attr: ållagasj.

belulasj # Even-syllable test examples:*
belulasj: belulasj+A+Sg+Nom
belulattjav: belulasj+A+Sg+Acc
belulasj: belulasj+A+Attr

LEXICON DÁRBULASJ_CMP_INFL Adjectives on -asj, attribute same as pred. Odd-syllable comparison. Sg Acc: dárbulattjav, Attr: dárbulasj. Essive -attjan, -adtjan is sub-tagged. Err/Orth also -ahttja.

dábálasj # Even-syllable test examples:*
dábálasj: dábálasj+A+Sg+Nom
dábálattjav: dábálasj+A+Sg+Acc
dábálasj: dábálasj+A+Attr
dábális: dábálasj+A+Attr
dábálabbo: dábálasj+A+Der/Comp+A+Sg+Nom

LEXICON ASIDASJ_CMP_INFL Adjectives on -asj, -is attr. Odd-syllable comparison. Sg Acc: asidattjav, Attr: asidis.

gågulasj # Even-syllable test examples:*
gågulasj: gågulasj+A+Sg+Nom
gågulattjav: gågulasj+A+Sg+Acc
gågulis: gågulasj+A+Attr
gågulabbo: gågulasj+A+Der/Comp+A+Sg+Nom

LEXICON UDNODIBME Adjectives on -dibme, attribute on -is. Odd-syllable comparison. Sg Acc: udnodimev, Attr: udnodis.

gælvodibme # Even-syllable test examples:*
gælvodibme: gælvodibme+A+Sg+Nom
gälvodibme: gælvodibme+A+Sg+Nom
gælvodimev: gælvodibme+A+Sg+Acc
gälvodimev: gælvodibme+A+Sg+Acc
gælvodis: gælvodibme+A+Attr
gälvodis: gælvodibme+A+Attr
gælvodabbo: gælvodibme+A+Der/Comp+A+Sg+Nom
gälvodabbo: gælvodibme+A+Der/Comp+A+Sg+Nom

LEXICON TJALMEDIBME Like UDNODIBME but no comparatives. Sg Acc: tjalmedimev, Attr: tjalmedis.

huvsodibme # Even-syllable test examples:*
huvsodibme: huvsodibme+A+Sg+Nom
huvsodimev: huvsodibme+A+Sg+Acc
huvsodis: huvsodibme+A+Attr

LEXICON SUOLASIEHKE -siehke. Sg Acc: suolasiegev, attr: suolasiek

hánessiehke # Even-syllable test examples:*
hánessiehke: hánessiehke+A+Sg+Nom
hánessiegev: hánessiehke+A+Sg+Acc
hánessiek: hánessiehke+A+Attr

Odd-syllable stems

With CG Sorted by attr

LEXICON TJIEGOS Attr same as pred. For adjectives with -e in second syllable e>á: divtes>diktásav in StrG. As a. in Spiik. Sg Acc: tjiehkusav, Attr: tjiegos. Consonant gradation.

hánes # Odd-syllable test examples:*
måskas: måskas+A+Sg+Nom
moasskasav: måskas+A+Sg+Acc
måskas: måskas+A+Attr
moasskasabbo: måskas+A+Der/Comp+A+Sg+Nom
bihtja: bihtja+A+Sg+Nom
bihttjagav: bihtja+A+Sg+Acc
bihtja: bihtja+A+Attr
bihttjagabbo: bihtja+A+Der/Comp+A+Sg+Nom

LEXICON LINES Attr ending in -a. Adjectives ending in -es. Does the same as TJIEGOS, but with attr -a. As g. in Spiik. lines, Sg Acc: lidnásav, attr: lidna. Consonant gradation.

lines # Odd-syllable test examples:*
lines: lines+A+Sg+Nom
lidnásav: lines+A+Sg+Acc
lidna: lines+A+Attr
lidnásabbo: lines+A+Der/Comp+A+Sg+Nom

LEXICON GALMAS Attr ending in -a or -å. Adjectives on -as, -ås and -ás. As e. in Spiik. Sg Acc: galmmasav, attr: galmma. Consonant gradation.

njuoskas # Odd-syllable test examples:*
njuoskas: njuoskas+A+Sg+Nom
njuosskasav: njuoskas+A+Sg+Acc
njuosska: njuoskas+A+Attr
njuosskasabbo: njuoskas+A+Der/Comp+A+Sg+Nom

LEXICON OAMES Attr ending on -e. Adjectives on -es with attribute -e. As g2. in Spiik. Sg Acc: oabmásav, Attr: oabme. Consonant gradation.

goastes # Odd-syllable test examples:*
goastes: goastes+A+Sg+Nom
goasstásav: goastes+A+Sg+Acc
goasste: goastes+A+Attr
goasstásabbo: goastes+A+Der/Comp+A+Sg+Nom

LEXICON SUOHKAT Attr III -is, not suohkkadis but SUOHKKIS. With CG in the attr, but not between nom and acc. Same as JALGGAT, only with this CG. Adjectives on -at and -åt, with attribute III -is. As f. in Spiik. Sg Acc: suohkadav, attr: suohkkis.

rávvat # Odd-syllable test examples:*
rávvat: rávvat+A+Sg+Nom
rávvadav: rávvat+A+Sg+Acc
rávvis: rávvat+A+Attr
rávvadabbo: rávvat+A+Der/Comp+A+Sg+Nom

LEXICON MÅJDÅS Adjectives with no attr. With CG. Sg Acc: måjddåsav. If there is an attribute that doesn’t fit any lexicon, it must be hardcoded.

rávdes # Odd-syllable test examples:*
rávdes: rávdes+A+Sg+Nom
rávddásav: rávdes+A+Sg+Acc
rávddásabbo: rávdes+A+Der/Comp+A+Sg+Nom

Without CG

LEXICON VIEKSES Attr same as pred. Without CG, but with vowel changes. Sg Acc: væksásav, Attr: viekses. Like TJIEGOS, only without the CG but with vowel changes. Maybe change this to a lexicon without attr and then hardcode attr?

LEXICON ALEK Attr same as pred. Without CG, without any vowel changes. Like TJIEGOS, only without the CG and vowel changes.

purpur # Odd-syllable test examples:*
purpur: purpur+A+Sg+Nom
purpurav: purpur+A+Sg+Acc
purpur: purpur+A+Attr
purpurabbo: purpur+A+Der/Comp+A+Sg+Nom

LEXICON BASSTEL Attr ending in -is. Without CG. Adjs on -et, -l, -r, some on -k, -sj with attr -is and no consonant gradation. As b. in Spiik. Sg Acc: basstelav, Attr: basstelis. Many of these entries might be instances of derivations, like belak, deblak, and maybe also basstel, bargán.

goavrret # Odd-syllable test examples:*
goavrret: goavrret+A+Sg+Nom
goavrredav: goavrret+A+Sg+Acc
goavrredis: goavrret+A+Attr
goavrredabbo: goavrret+A+Der/Comp+A+Sg+Nom

LEXICON MUTTÁK Two attr endings: -is and same as pred. Without CG. Adjs on -ák/-ak/-ek, two attr: -is and same as pred. As c. in Spiik. Sg Acc: muttágav, Attr: muttágis and mutták. These seem to be instances of the adjectival -k derivation. It is unclear whether such derivations have different attr forms or not, and that’s maybe why some of these derivations are found in the BASSTEL lexicon.

bárvak # Odd-syllable test examples:*
bárvak: bárvak+A+Sg+Nom
bárvagav: bárvak+A+Sg+Acc
bárvak: bárvak+A+Attr
bárvagis: bárvak+A+Attr
bárvagabbo: bárvak+A+Der/Comp+A+Sg+Nom

LEXICON JALGGAT Attr III -is, not jalggadis but JALGGIS. Without CG. Adjectives on -at, with attribute III -is. As f. in Spiik. Sg Acc: jalggadav, attr: jalggis.

russjkat # Odd-syllable test examples:*
russjkat: russjkat+A+Sg+Nom
russjkadav: russjkat+A+Sg+Acc
russjkis: russjkat+A+Attr
russjkadabbo: russjkat+A+Der/Comp+A+Sg+Nom

LEXICON TJÅRGGÅT Attr III -is, not tjårggådis but tjårggis. Without CG. Same as JALGGAT, only for adjectives ending in -åt. Adjectives on -åt, with attribute III -is. As f. in Spiik. Sg Acc: jalggadav, attr: jalggis.

russjkat # Odd-syllable test examples:*
russjkat: russjkat+A+Sg+Nom
russjkadav: russjkat+A+Sg+Acc
russjkis: russjkat+A+Attr
russjkadabbo: russjkat+A+Der/Comp+A+Sg+Nom

LEXICON RIHTSOK No attr, without CG and also without any vowel changes. The lexicon gives no attribute, either because the adjective doesn’t have an attr, because there is a stem vowel change in the attr that the lexicon can’t handle, or because there are strange attributes that don’t fit any other lexicon (these attributes are hardcoded). Sg Acc: rihtsogav.

rihtsok # Odd-syllable test examples:*
rihtsok: rihtsok+A+Sg+Nom
rihtsogav: rihtsok+A+Sg+Acc
rihtsogabbo: rihtsok+A+Der/Comp+A+Sg+Nom

Exception lexicons for odd-syllable stems

LEXICON IENNILS no comparatives, attr same as pred.

ieŋŋils # Odd-syllable test examples:*
ieŋŋils: ieŋŋils+A+Sg+Nom
ieŋŋilsav: ieŋŋils+A+Sg+Acc

LEXICON RÁDAS Presently only used for “rádas”. This word has special consonant gradation d>dd. Attr same as pred. Sg Acc: ráddasav, Attr: rádas. Consonant gradation.

rádas # Odd-syllable test examples:*
rádas: rádas+A+Sg+Nom
ráddasav: rádas+A+Sg+Acc
rádas: rádas+A+Attr
ráda: rádas+A+Attr (Eng. # from LEXATTR)
ráddasabbo: rádas+A+Der/Comp+A+Sg+Nom

LEXICON LUOBES Err/Orth lexicon! Does the same as TJIEGOS, only e>a instead of the usual e>á; must be some Err/Orth. Sg Acc: luohpasav, Attr: luobes. Consonant gradation. NO Attr, must be hardcoded.

LEXICON LÅSSÅT Two attr, two comp. As f3. in Spiik. So far the only word in this lexicon is “låssåt”, because both låssis and låsså are attr and the comparative is both låsep (hybrid?) and låssådabbo.

låssåt # Odd-syllable test examples:*
låssåt: låssåt+A+Sg+Nom
låssådav: låssåt+A+Sg+Acc
låssis: låssåt+A+Attr
låsså: låssåt+A+Attr
låssådabbo: låssåt+A+Der/Comp+A+Sg+Nom
låsep: låssåt+A+Der/Comp+A+Sg+Nom

LEXICON STUORAK Only for stuorak. It has two attributes. Has even-syllable comparison: stuoráp and stuorámus. Sg Acc: stuoragav, attr: stuor and stuorra. This might be a -k derivation of the adjective stuorre, attr stuor(ra). The comparison is thus based on the original adjective, so it is naturally an even-syllable comparison.

stuorak # Odd-syllable test examples:*
stuorak: stuorak+A+Sg+Nom
stuoragav: stuorak+A+Sg+Acc
stuorra: stuorak+A+Attr
stuor: stuorak+A+Attr
stuoráp: stuorak+A+Der/Comp+A+Sg+Nom

LEXICON ALLAK Adjs on -ak, attr. on -a. Have both gasep/gassagabbo and alep/allagabbo as comparatives. As d. in Spiik. So far only the adjectives “allak” and “gassak” go to this lexicon.

gassak # Odd-syllable test examples:*
gassak: gassak+A+Sg+Nom
gassagav: gassak+A+Sg+Acc
gassa: gassak+A+Attr
gassagabbo: gassak+A+Der/Comp+A+Sg+Nom
gasep: gassak+A+Der/Comp+A+Sg+Nom

LEXICON GÅBDDÅK Adjs on -åk, attr. on -å. Has even-syllable comparison: gåbdep and gåbdemus. So far “gåbddåk” is the only word in this lexicon. As d2. in Spiik. Sg Acc: gåbddågav, Attr: gåbddå.

gåbddåk # Odd-syllable test examples:*
gåbddåk: gåbddåk+A+Sg+Nom
gåbddågav: gåbddåk+A+Sg+Acc
gåbddå: gåbddåk+A+Attr
gåbdep: gåbddåk+A+Der/Comp+A+Sg+Nom

Inherent comparatives and superlatives

LEXICON NUORTTALABBO Inherent comparatives, gives both comp and superl. Most of the words are the compared forms of -el(a) words, like nuorttal, lullel.

guddnelabbo # Even-syllable test examples:*
guddnelabbo: guddnelabbo+A+Sg+Nom
guddnelamos: guddnelabbo+A+Der/Superl+A+Sg+Nom
guddnelap: guddnelabbo+A+Attr
guddnelup: guddnelabbo+A+Attr

LEXICON GASSKALAMOS Inherent superlatives, gives only superl. Words that are lexicalized in their superlative forms.

ájtodamos # Even-syllable test examples:*
ájtodamos: ájtodamos+A+Sg+Nom

Contracted stems

LEXICON SÁDNES Attr same as pred. Sg Acc: sáddnáv, Attr: sádnes.

hávres # Contracted test examples:*
hávres: hávres+A+Sg+Nom
hávrráv: hávres+A+Sg+Acc
hávres: hávres+A+Attr
hávrráp: hávres+A+Der/Comp+A+Sg+Nom

LEXICON GOAVSOS Attr same as pred. Sg Acc: goaksuv, Attr: goavsos.(goavsos is so far the only word in this lexicon)

goavsos # Contracted test examples:*
goavsos: goavsos+A+Sg+Nom
goaksuv: goavsos+A+Sg+Acc
goaksusav: goavsos+A+Sg+Acc (Eng. # From lexicon TJIEGOS)
goavsos: goavsos+A+Attr
goaksup: goavsos+A+Der/Comp+A+Sg+Nom
goaksusabbo: goavsos+A+Der/Comp+A+Sg+Nom (Eng. # from lexicon TJIEGOS)

LEXICON SUVRES Sg Acc: suvrráv, Attr: suvra.

suvres # Contracted test examples:*
suvres: suvres+A+Sg+Nom
suvrráv: suvres+A+Sg+Acc
suvrrásav: suvres+A+Sg+Acc (Eng. # From lexicon SJÆVNNJAT)
suvra: suvres+A+Attr
suvrráp: suvres+A+Der/Comp+A+Sg+Nom
suvrrásabbo: suvres+A+Der/Comp+A+Sg+Nom (Eng. # from LINES)

LEXICON GÅLMAKTES Attr same as pred. Without CG but with vowel changes. Sg Acc: gålmaktáv, Attr: gålmaktes. VIEKSES does the same thing for odd-syllable stems.

Comparison

LEXICON BU/MUS Comparison for even-syll adjectives. Also derives diminutives and adverbs from the comparisons.

LEXICON ABBO/AMOS Comparison for odd-syll adjectives. Also derives diminutives and adverbs from the comparisons.

LEXICON BUStem Comparative even-syll, case and attr.

LEXICON ABBO Comparative odd-syll, get case and attr. With the dialect differences “-ubbo” and “-æbbo”.

Superlative

LEXICON BUOREMUS Superlative even-syll, get attr and nom case.

LEXICON AMOS Superlative odd-syll, get case and attr. With the dialect differences “-umos” and “-æmos”.
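The odd-syllable comparison described above amounts to attaching -abbo/-amos to the stem, with the dialect variants -ubbo/-æbbo and -umos/-æmos. A minimal sketch follows, assuming the stem has already been isolated (here `rihtsog`, taken from the attested form rihtsogabbo). The function name is hypothetical. The sketch does not model any tags the lexicons may attach to the dialect variants, nor case and attribute inflection.

```python
# Suffixes as named in the ABBO and AMOS lexicon descriptions.
COMPARATIVE_SUFFIXES = ("abbo", "ubbo", "æbbo")
SUPERLATIVE_SUFFIXES = ("amos", "umos", "æmos")


def odd_syllable_comparison(stem):
    """Return comparative and superlative candidates for an odd-syllable stem."""
    return {
        "Comp": [stem + suffix for suffix in COMPARATIVE_SUFFIXES],
        "Superl": [stem + suffix for suffix in SUPERLATIVE_SUFFIXES],
    }


# Only "rihtsogabbo" is attested in the test examples. The other forms
# are mechanical outputs of this sketch, not verified word forms.
forms = odd_syllable_comparison("rihtsog")
```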

Comparative and Superlative sub-lexica

LEXICON CompSup-EVEN

LEXICON CompSup-EVENWEAKSTEM

LEXICON ATTR Sends attributes to

LEXICON ATTR_PrsPrc Attr without -vuohta derivation.

Derivation of adjectives

LEXICON DenominalAdjsV1 ! even noun stems are sent here

LEXICON DenominalAdjsV1_1 ! even noun stems without grade alternation are sent here

LEXICON DenominalAdjsV2 ! even noun stems are sent here. -asj derivation

LEXICON DenominalAdjsKINO ! unassimilated nouns are sent here

LEXICON DenominalAdjsODD ! gives derivation -ahtes

LEXICON DenominalAdjsContr

Derivations to adjectives, hardcoded in adjectives stems file

LEXICON DIEHTEMAHTES ! odd syllable. For hardcoded -ahtes words. Derived from odd-syll NomAct (bårråt>bårråm-bårråmahtes), or from odd-syll verbs, as buorránit>buorránahtes. Might want to split the lexicon in two.

LEXICON LÁGÁSJ

LEXICON BÁJNUK ! hardcoded denominal derivations, latus has changed from o>u, a>a, e>á (bájnno>bájnuk, juolgge>juolgák, giella>gielak). Attr same as pred, no comp in this lexicon.

LEXICON TSÅHPÅK ! hardcoded denominal derivations, latus has changed from o>u, a>a, e>á AND -GIS attr. Attr same as pred is Err/Orth-tagged. No comp in this lexicon.

LEXICON GIEVLEK ! hardcoded derivations, not the same as BÁJNUK since latus has an unexpected vowel. Latus hasn’t changed o>u, a>a, e>á. Goes directly to BÁJNUK, only made to sort these different kinds of derivations. Many of these may be derived from verbs or other adjectives.

LEXICON SJERVAK ! hardcoded derivations, not the same as TSÅHPÅK since latus has an unexpected vowel. Latus hasn’t changed o>u, a>a, e>á. Goes directly to TSÅHPÅK, only made to sort these different kinds of derivations. Many of these may be derived from verbs or other adjectives.

LEXICON DIBME ! even and contracted

LEXICON LIS ! Agent noun in -is?

LEXICON Ahkásasj ! lexicalized and denominal -asj derivations

LEXICON STÁVVALIS ! Must be “stávvalis” in both pred and attr, like “guovddelis”. OK& Kintel 2012: stávval, attr stávvalis; this is Err/Orth-tagged, also as second compound. No comparison.

Derivations to adjectives, continuation lexicon not for hardcoded adjectives

LEXICON AHTES ! odd syllable, only a continuation lexicon for words that are not in the adjectives stems file. Just like DIEHTEMAHTES, only with the +A tag that adjectives already get in the stems file.

LEXICON AHKES

LEXICON AGAdj ! denominal derivations go here, attr same as pred, no comp in this lexicon

This (part of) documentation was generated from src/fst/morphology/affixes/adjectives.lexc

src-fst-morphology-affixes-nouns.lexc.md

Sublexica for Noun

Even-syllable stems

2syll stems

LEXICON MUORRA Standard even stems with cg (note Q1). OBS: Nouns with invisible 3>2 cg (as busºsa) do not go to this lexicon. :%> MUORRACmp ; Compounds, Sg Nom strong grade, sg gen and sg pl weak grade

Even-syllable test examples:*
kártta: kártta+N+Sg+Nom
kártajn: kártta+N+Sg+Com

LEXICON TÁLLA Same as MUORRA, but for words with º (extra length). Not in MUORRA because of other err/orths

LEXICON ALMME Same as MUORRA, but with special -LASJ derivation. For nouns that have strong-grade -lasj: “almmelasj” instead of “almálasj”, which is Err/Orth-tagged.

LEXICON NOADE Even stem without cg. OBS: No nouns with invisible 3>2 cg (as busºsa) in this lexicon. OBS: Because of denominal nouns taking a weak grade stem, entries in grade 3 are given the gradation mark º in order to prevent alternation to weak grade. We should consider creating a separate denominal nouns lexicon for NOADE instead.

låda # Even-syllable test examples:*
låda: låda+N+Sg+Nom
lådas: låda+N+Sg+Ela

LEXICON KÁFFA For even-syll words with cg III-I: káf’fa-káfav, jáf’fo-jáfo. No vowel changes yet, need new twolc code.

káffa # Even-syllable test examples:*
káffa: káffa+N+Sg+Nom
gáffa: káffa+N+Sg+Nom
gáfas: káffa+N+Sg+Ela
káfas: káffa+N+Sg+Ela

LEXICON LINNJA Only for the loanword “linnja”. Because it’s a loanword, the “nnj” is pronounced “nn-j” and does not behave like the regular Lule Sami “nj” sound. It therefore doesn’t follow the rule that makes a:á in grade 1 with a short vowel in the first syllable (it isn’t like linnja-linjáv or birás-birrasav). This word is therefore sub-tagged. Norwegian/Swedish words with a short “i” followed by two different consonants are assimilated to Lule Sami in different ways according to the consonants in question, but the word is always in grade III (Morén-Duolljá 2014). Both the Err/Orth form and the correct form are part of this lexicon.

linja # Even-syllable test examples:*

LEXICON BOAKSA Only for word “boaksa”. Both boaksa-båvsa and Err/Orth boaksa-båksa are part of lexicon.

LEXICON SÁMEGIEL Compounds on -giella, with short -giel as middle compound (sámegielåhpadiddje)

rievsakgiella # Even-syllable test examples:*
rievsakgiella: rievsakgiella+N+Sg+Nom

LEXICON AHKA Words like tjerastahka, with short compound form

báládahka # Even-syllable test examples:*
báládahka: báládahka+N+Sg+Nom
báládak: báládahka+N+Sg+Nom
báládagáv: báládahka+N+Sg+Acc

LEXICON DARRHA Only for “darrha” or compounds that end on “darrha”.

báktedarrha # Even-syllable test examples:*
báktedarrha: báktedarrha+N+Sg+Nom
báktedarháv: báktedarrha+N+Sg+Acc

Nouns with comparatives

LEXICON GÁDDE 2 syllable stems with cg (note Q1) with comparatives, like MUORRA

boassjo # Even-syllable test examples:*
boassjo: boassjo+N+Sg+Nom
båssjujn: boassjo+N+Sg+Com
boassjojn: boassjo+N+Sg+Com
boassjop: boassjo+N+Der/Comp+A+Sg+Nom

LEXICON VUODO 2 syllable stems without cg (note Q1) with comparatives, like NOADE

låda # Even-syllable test examples:*
vuodo: vuodo+N+Sg+Nom
vuodos: vuodo+N+Sg+Ela

LEXICON SJIEVNNJET Like GAHPER but with comparatives. Odd-syllable C-final noun without cg, no vowchange, no short Ess.

stiebil # Odd-syllable test examples:*
sjievnnjet: sjievnnjet+N+Sg+Nom
sjievnnjedis: sjievnnjet+N+Sg+Ela
sjievnnjedabbo: sjievnnjet+N+Der/Comp+A+Sg+Nom
sjievnnjedamos: sjievnnjet+N+Der/Superl+A+Sg+Nom

LEXICON ÅLGGO Like MUORRA, but with comparatives. This lexicon was previously without sg ill/ine/elat, but these nouns can be inflected in the regular local cases. However, “adverbs” like ålggot (from outside), nuorttan (in the north), oarjas (to the south), etc., are more commonly used to denote location/direction (we should therefore maybe consider sub-tagging the regular local case forms).

lulle # Even-syllable test examples:*
lulle: lulle+N+Sg+Nom
lulev: lulle+N+Sg+Acc
lulep: lulle+N+Der/Comp+A+Sg+Nom

LEXICON MIEHTE Like MUORRA but no locative/elative/illative sg. Presently no words in this lexicon except for the Err-sub-tagged nuortto.

nuortto # Even-syllable test examples:*
nuortto: nuortto+N+Sg+Nom
nuortov: nuortto+N+Sg+Acc

Plural stems

LEXICON BÅVSÅ Like MUORRA, only in plural. All, except ganta, juvdá and ávta, have regular, singular stem counterparts.

båvså # Even-syllable test examples:*
båvså: båvså+N+Pl+Nom
båvsåjt: båvså+N+Pl+Acc

LEXICON LÅHTSASA Like GAHPER, only in plural. Without derivations, these should maybe be added.

Partially assimilated loanwords. The first part of the word is “citation borrowed” and keeps its norwegian/swedish orthography, only the last two syllables are adapted to sami.

LEXICON MUORRA_LOAN For loanwords that do not fit in a loanword lexicon because of a wrong short cmp, for partially assimilated loanwords without separate lexica (medállja), or for Err/Orths assimilated with cg but with other errors. This lexicon gives no short compound forms. Potential short cmps must therefore be hardcoded into the FirstComponent lexicon. The same applies to compounds with partially assimilated loanwords. Examples of problem words: sirup>siráhppa and stetoskop>stetoskoahppa.

LEXICON MUORRA_LOAN_NO_LASJ Like MUORRA_LOAN without -lasj derivation. This lexicon is made for Sem/Hum words like økonåvmmå, biolåvggå, agronåvmmå and so on. We don’t want agronåvmålasj, since it means something other than “agronomisk”; agronåvmålasj is barely used in its own meaning but gets confused with “agronomijjalasj”.

LEXICON MUORRA_LOAN_EXTRA_LENGTH Same as MUORRA_LOAN just for words with º (extra length).

LEXICON KAFIEDJA_CMP_INFL Recent loanwords on -edja. Ends in -é in Norwegian. Short and long cmp. “Kafea” and “kaféa” are sub-tagged. See comments about the -ie/-e dialtags in ALFABIEHTTA.

LEXICON ALLEGORIJJA_CMP_INFL Recent loanwords ending in -i in NOR/SWE, with long and short compound form. Standardized as -iddja (SWE) and -ijºja (NOR). Previously often assimilated as -ija (or just -ia), but both forms are ungrammatical: short vowels cannot both precede and follow a single intervocalic consonant. -ija is thus ungrammatical, as the short a would be lengthened to á, like “idja-ijá”.

partijja # Even-syllable test examples:*
akademiddja: akademijja+N+Sg+Nom
akademijja: akademijja+N+Sg+Nom
akademidjas: akademijja+N+Sg+Ela
akademijjas: akademijja+N+Sg+Ela
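The ALLEGORIJJA description distinguishes the two standardized endings from the older -ija/-ia spellings. A rough sketch of that classification by word ending follows. The function name and the returned labels are invented for illustration, the NOR ending is matched as plain -ijja (the spelling used in the test examples, without the º mark), and the input `akademija` is a hypothetical spelling built on the -ija pattern rather than an attested form.

```python
def classify_i_loan(form):
    """Classify a loanword by its ending, following the ALLEGORIJJA notes."""
    if form.endswith("iddja"):
        return "standard (SWE)"
    if form.endswith("ijja"):
        return "standard (NOR)"
    if form.endswith(("ija", "ia")):
        return "nonstandard"
    return "other"


swe = classify_i_loan("akademiddja")
nor = classify_i_loan("akademijja")
old = classify_i_loan("akademija")  # hypothetical older spelling
```

The longer endings are checked first, which keeps the order of the tests easy to follow even though none of the endings is a suffix of another.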

LEXICON TEKSTIJLLA_CMP_INFL Recent loanwords on -ijlla with long and short compound form. Frequent typos that do not follow Lule Sami rules are sub-tagged; these forms go against the standardization rule, but are found because of earlier standardization rules and dictionaries.

tekstijlla # Even-syllable test examples:*
tekstijlla: tekstijlla+N+Sg+Nom
tekstijlas: tekstijlla+N+Sg+Ela

LEXICON ASIJLLA_CMP_INFL Recent loanwords on -ijlla, from NOR and SWE words ending in -yl. With long and short compound-form. Frequent typos that do not follow Lule Sami rules are sub-tagged; these forms go against the standardization rule, but occur because of earlier standardization rules and dictionaries.

asijlla # Even-syllable test examples:*
asijlla: asijlla+N+Sg+Nom
asijlas: asijlla+N+Sg+Ela

LEXICON BENSIJNNA Recent loanwords on -ijnna with long and short compound-form

LEXICON BENSIJNNA_CMP_INFL Recent loanwords on -ijnna with long and short compound-form. Frequent typos that do not follow Lule Sami rules are sub-tagged; these forms go against the standardization rule, but occur because of earlier standardization rules and dictionaries.

tamburijnna # Even-syllable test examples:*
tamburijnna: tamburijnna+N+Sg+Nom
tamburijnas: tamburijnna+N+Sg+Ela

LEXICON MASJIJNNA_CMP_INFL Recent loanwords on -sjijnna with long and short compound-form: -SKIN

bivtasmasjijnna # Even-syllable test examples:*
bivtasmasjijnna: bivtasmasjijnna+N+Sg+Nom
bivtasmasjijnas: bivtasmasjijnna+N+Sg+Ela

LEXICON ADJEKTIJVVA_CMP_INFL Recent loanwords on -ijvva with long and short compound-form

datijvva # Even-syllable test examples:*
datijvva: datijvva+N+Sg+Nom
datijvas: datijvva+N+Sg+Ela

LEXICON PARADIJSSA_CMP_INFL Recent loanwords on -ijssa with long and short compound-form. Frequent typos that do not follow Lule Sami rules are sub-tagged; these forms go against the standardization rule, but occur because of earlier standardization rules and dictionaries.

servijssa # Even-syllable test examples:*
servijssa: servijssa+N+Sg+Nom
servijsas: servijssa+N+Sg+Ela

LEXICON TELEFÅVNNÅ_CMP_INFL Recent loanwords on -åvnnå with long and short compound-form. Frequent typos that do not follow Lule Sami rules are sub-tagged; these forms go against the standardization rule, but occur because of earlier standardization rules and dictionaries.

persåvnnå # Even-syllable test examples:*
persåvnnå: persåvnnå+N+Sg+Nom
persåvnås: persåvnnå+N+Sg+Ela

LEXICON INSTITUSJÅVNNÅ_CMP_INFL Recent loanwords on -sjåvnnå with long and short compound-form: -tion in Swedish. Frequent typos that do not follow Lule Sami rules are sub-tagged; these forms go against the standardization rule, but occur because of earlier standardization rules and dictionaries.

populasjåvnnå # Even-syllable test examples:*
populasjåvnnå: populasjåvnnå+N+Sg+Nom
populasjåvnås: populasjåvnnå+N+Sg+Ela

LEXICON MISJÅVNNÅ_CMP_INFL Recent loanwords on -sjåvnnå with long and short compound-form: -ssion in Swedish. Frequent typos that do not follow Lule Sami rules are sub-tagged; these forms go against the standardization rule, but occur because of earlier standardization rules and dictionaries.

sesjåvnnå # Even-syllable test examples:*
sesjåvnnå: sesjåvnnå+N+Sg+Nom
sesjåvnås: sesjåvnnå+N+Sg+Ela

LEXICON PENSJÅVNNÅ_CMP_INFL Recent loanwords on -sjåvnnå with long and short compound-form: -sion in Swedish. Frequent typos that do not follow Lule Sami rules are sub-tagged; these forms go against the standardization rule, but occur because of earlier standardization rules and dictionaries.

suspensjåvnnå # Even-syllable test examples:*
suspensjåvnnå: suspensjåvnnå+N+Sg+Nom
suspensjåvnås: suspensjåvnnå+N+Sg+Ela

LEXICON PARTISIHPPA_CMP_INFL Recent loanwords from SWE -cip and NOR -sipp. They become -sihppa in Norway, while both -sijppa and -sihppa are used in Sweden (particip vs partisipp). Short and long compound-form.

partisihppa # Even-syllable test examples:*
partisihppa: partisihppa+N+Sg+Nom
partisihpas: partisihppa+N+Sg+Ela
partisijppa: partisihppa+N+Sg+Nom
partisijpas: partisihppa+N+Sg+Ela

LEXICON ALKOHÅVLLÅ_CMP_INFL Recent loanwords on -åvllå with long and short compound-form. The old standardization form “alkohola” is sub-tagged. Frequent typos that do not follow Lule Sami rules are sub-tagged; these forms go against the standardization rule, but occur because of earlier standardization rules and dictionaries.

parabåvllå # Even-syllable test examples:*
parabåvllå: parabåvllå+N+Sg+Nom
parabåvlås: parabåvllå+N+Sg+Ela

LEXICON AGRONÅVMMÅ_CMP_INFL Recent loanwords on -åvmmå with long and short compound-form. -lasj derivation is error-tagged. The old standardization form -oma, which does not follow Lule Sami rules, is sub-tagged.

agronåvmmå # Even-syllable test examples:*
agronåvmmå: agronåvmmå+N+Sg+Nom
agronåvmås: agronåvmmå+N+Sg+Ela

LEXICON DEMAGÅVGGÅ_CMP_INFL Recent loanwords ending in -og with long and short compound form. Assimilated to smj as -åvggå. -lasj derivation is error-tagged. The old standardization form -oga, which does not follow Lule Sami rules, is sub-tagged.

pedagoga # Even-syllable test examples:*
pedagåvggå: pedagåvggå+N+Sg+Nom
pedagåvggå: pedagåvggå+N+Sg+Nom
pedagåvgås: pedagåvggå+N+Sg+Ela

LEXICON LAKTÅVSSÅ_CMP_INFL Recent loanwords ending in -ose in Norwegian and -os in Swedish, with long and short compound form. Assimilated to smj as -åvsså. The old standardization form -oga, which does not follow Lule Sami rules, is sub-tagged.

pedagoga # Even-syllable test examples:*
laktåvsså: laktåvsså+N+Sg+Nom
laktåvsås: laktåvsså+N+Sg+Ela

LEXICON FAKTÅVRRÅ_CMP_INFL Recent loanwords on -åvrrå with long and short compound-form.

LEXICON MIKROSKÅVPPÅ_CMP_INFL Recent loanwords on -åvppå (-op in NOB/SWE) with long and short compound-form. A long vowel followed by a short consonant is assimilated with njuoban, but many -op words have nevertheless been assimilated as -oahppa (biskop is pronounced with -opp, so that case is different; perhaps some have used “biskop” as a template), so this is Err/Orth-tagged.

oajvvekontåvrrå # Even-syllable test examples:*
oajvvekontåvrrå: oajvvekontåvrrå+N+Sg+Nom
oajvvekontåvrås: oajvvekontåvrrå+N+Sg+Ela

LEXICON KULTUVRRA_CMP_INFL Recent loanwords on -vrra with long and short compound-form. Frequent typos that do not follow Lule Sami rules are sub-tagged; these forms go against the standardization rule, but occur because of earlier standardization rules and dictionaries.

muvrra # Even-syllable test examples:*
muvrra: muvrra+N+Sg+Nom
muvrajn: muvrra+N+Sg+Com

LEXICON TERAPÆVTTA_CMP_INFL Recent loanwords on -ævtta/-ievtta with long and short compound-form. No -lasj derivation. Frequent typos that do not follow Lule Sami rules are sub-tagged; these forms go against the standardization rule, but occur because of earlier standardization rules and dictionaries.

muvrra # Even-syllable test examples:*
terapævtta: terapævtta+N+Sg+Nom
terapävtta: terapævtta+N+Sg+Nom
terapievtta: terapævtta+N+Sg+Nom
terapevtajn: terapævtta+N+Sg+Com
terapievtajn: terapævtta+N+Sg+Com

LEXICON ADVÆRBBA_CMP_INFL Recent loanwords on -ærbba with long and short compound-form

detransitijvvaværbba # Even-syllable test examples:*
detransitijvvaværbba: detransitijvvaværbba+N+Sg+Nom
detransitijvvavärbba: detransitijvvaværbba+N+Sg+Nom
detransitijvvaverbas: detransitijvvaværbba+N+Sg+Ela

LEXICON SUBSTÁNSSA_CMP_INFL Recent loanwords on -ánssa with long and short compound-form. Originally -ans in SWE and NOR. Frequent typos that do not follow Lule Sami rules are sub-tagged; these forms go against the standardization rule, but occur because of earlier standardization rules and dictionaries.

instánssa # Even-syllable test examples:*
instánssa: instánssa+N+Sg+Nom
instánsas: instánssa+N+Sg+Ela

LEXICON VALÆNSSA_CMP_INFL Recent loanwords on -ænssa with long and short compound-form. Frequent typos that do not follow Lule Sami rules are sub-tagged; these forms go against the standardization rule, but occur because of earlier standardization rules and dictionaries.

intelligænssa # Even-syllable test examples:*
intelligænssa: intelligænssa+N+Sg+Nom
intelligänssa: intelligænssa+N+Sg+Nom
intelligensav: intelligænssa+N+Sg+Acc

LEXICON ADVOKÁHTTA_CMP_INFL Recent loanwords on -áhtta with long and short compound-form. Frequent typos that do not follow Lule Sami rules are sub-tagged; these forms go against the standardization rule, but occur because of earlier standardization rules and dictionaries.

klimáhtta # Even-syllable test examples:*
klimáhtta: klimáhtta+N+Sg+Nom
klimáhtas: klimáhtta+N+Sg+Ela

LEXICON ALFABIEHTTA_CMP_INFL Recent loanwords originally on -et in both Norway and Sweden. Assimilation differences, however, create two Lule Sami forms: -iehtta in NOR and -æhtta in SWE. Long e is assimilated differently in the two countries: in Norway it becomes -ie, and in Sweden -e, as in tiedja/tedja, systiebma/systebma and so on. This is especially apparent in assimilated words with long e in grade three: e becomes æ in grade three, so we get “universitæhtta” in SWE, but this looks very strange to people on the Norwegian side of the border, who want “universitiehtta”. Both -ie and -e are dialect-tagged in the lexicons HYDROGIEDNA, APOTIEHKKA, SYSTIEBMA, KAFÉ. Previously people often wrote -ehtta in Norway, but this is incorrect, as e always becomes æ in grade three.

mobilitiehtta # Even-syllable test examples:*
mobilitähtta: mobilitiehtta+N+Sg+Nom
mobilitiehtta: mobilitiehtta+N+Sg+Nom
mobilitehtav: mobilitiehtta+N+Sg+Acc
mobilitiehtav: mobilitiehtta+N+Sg+Acc

LEXICON INTERNÆHTTA_CMP_INFL Recent loanwords on -æhtta with long and short compound-form: -et in Swedish, -ett in Norwegian. Differs from ALFABIEHTTA because -ehtta isn’t used in NOR.

intranæhtta # Even-syllable test examples:*
intranæhtta: intranæhtta+N+Sg+Nom
intranähtta: intranæhtta+N+Sg+Nom
intranehtas: intranæhtta+N+Sg+Ela

LEXICON TABLÆHTTA_CMP_INFL Recent loanwords on -æhtta with long and short compound-form. -ett in both Norwegian and Swedish.

kvartæhtta # Even-syllable test examples:*
kvartæhtta: kvartæhtta+N+Sg+Nom
kvartähtta: kvartæhtta+N+Sg+Nom
kvartehtas: kvartæhtta+N+Sg+Ela

LEXICON INSTITUHTTA_CMP_INFL Recent loanwords on -uhtta, with long and short compound-form on -utt (NOR)/-ut (SWE). The Swedish -ut also gives -uvtta, like ANTIHKKA-antijkka, but instituhtta is also used in Sweden, so there is no Area/NO tag.

minuhtta # Even-syllable test examples:*
minuhtta: minuhtta+N+Sg+Nom
minuvtta: minuhtta+N+Sg+Nom
minuhtas: minuhtta+N+Sg+Ela
minuvtas: minuhtta+N+Sg+Ela

LEXICON SATELIHTTA_CMP_INFL Recent loanwords on -ihtta, with long and short compound-form on -itt (NOR)/-it (SWE). The Swedish -it also gives -ijtta, like ANTIHKKA-antijkka, but satelihtta is also used in Sweden, so there is no Area/NO tag.

inuihtta # Even-syllable test examples:*
inuihtta: inuihtta+N+Sg+Nom
inuijtta: inuihtta+N+Sg+Nom
inuihtas: inuihtta+N+Sg+Ela
inuijtas: inuihtta+N+Sg+Ela

LEXICON APOTIEHKKA_CMP_INFL Recent loanwords on -iehkka in NOR, -æhkka in SWE. -ehkka is sub-tagged. With long and short compound-form on -k. See the comments about the -ie/-e dialect tags in ALFABIEHTTA.

The old form “apotehkka” is wrong: long e is not allowed in grade III, even though the form is found in dictionaries.

kartotiehkka # Even-syllable test examples:*
kartotähkka: kartotiehkka+N+Sg+Nom
kartotehkas: kartotiehkka+N+Sg+Ela
kartotiehkka: kartotiehkka+N+Sg+Nom
kartotiehkas: kartotiehkka+N+Sg+Ela

LEXICON ANTIHKKA_CMP_INFL Recent loanwords on -hkka in Norway; both -ijkka and -hkka are used in Sweden (antik vs antikk). With long and short compound-form on -kk/-k. The Swedish forms were earlier added to the stems for the Swedish version, but are now added here.

dialektihkka # Even-syllable test examples:*
dialektihkka: dialektihkka+N+Sg+Nom
dialektihkas: dialektihkka+N+Sg+Ela
dialektijkka: dialektihkka+N+Sg+Nom
dialektijkas: dialektihkka+N+Sg+Ela

LEXICON SEMINÁRRA_CMP_INFL Recent loanwords on -árra with long and short compound-form. Frequent typos that do not follow Lule Sami rules are sub-tagged; these forms go against the standardization rule, but occur because of earlier standardization rules and dictionaries.

hektárra # Even-syllable test examples:*
hektárra: hektárra+N+Sg+Nom
hektáras: hektárra+N+Sg+Ela

LEXICON AREÁLLA_CMP_INFL Recent loanwords on -álla with long and short compound-form. Frequent typos that do not follow Lule Sami rules are sub-tagged; these forms go against the standardization rule, but occur because of earlier standardization rules and dictionaries.

gasskavokálla # Even-syllable test examples:*
gasskavokálla: gasskavokálla+N+Sg+Nom
gasskavokálas: gasskavokálla+N+Sg+Ela

LEXICON AMBASSADERRA_CMP_INFL Recent loanwords on -ør with long and short compound-form. Standardized by Giellagálldo 05.05.14 as -erra. -ørra is sub-tagged.

observaterra # Even-syllable test examples:*
observaterra: observaterra+N+Sg+Nom
observateras: observaterra+N+Sg+Ela

LEXICON VETERINERRA_CMP_INFL Recent loanwords on -erra. Words ending in -ær in both SWE and NOR. Both long and short compound-form. The old standardization form -æra, without cg, is sub-tagged, as are -ær’ra and -ærra.

LEXICON ATMOSFERRA_CMP_INFL Recent loanwords on -erra, but with different endings in SE and NO: -ære, -ær in NOR and -är, -ära in SWE (ingefær in NO, ingefära in SE). Only long compound-form; the short form must be hard-coded in the FirstComponent lexicon. The old standardization forms -æra and -era, without cg, are sub-tagged, as are -ær’ra and -ærra.

observaterra # Even-syllable test examples:*
atmosferra: atmosferra+N+Sg+Nom
atmosferas: atmosferra+N+Sg+Ela

LEXICON KARAKTIERRA_CMP_INFL Recent loanwords on -ierra in NOR and -erra in SWE, because long e is assimilated in different ways. Words ending in -er in NOR, and -er or -är in SWE. Only long compound-form; the short form must be hard-coded in the FirstComponent lexicon.

observaterra # Even-syllable test examples:*
karaktierra: karaktierra+N+Sg+Nom
karaktieras: karaktierra+N+Sg+Ela
karakterra: karaktierra+N+Sg+Nom
karakteras: karaktierra+N+Sg+Ela

LEXICON TABÆLLA_CMP_INFL Recent loanwords on -älºla with long and short compound-form. Frequent typos that do not follow Lule Sami rules are sub-tagged; these forms go against the standardization rule, but occur because of earlier standardization rules and dictionaries.

flotælla # Even-syllable test examples:*
flotælla: flotælla+N+Sg+Nom
flotälla: flotælla+N+Sg+Nom
flotellas: flotælla+N+Sg+Ela

LEXICON TELEGRÁMMA_CMP_INFL Recent loanwords on -ámºma with long and short compound-form

grámma # Even-syllable test examples:*
grámma: grámma+N+Sg+Nom
grámmas: grámma+N+Sg+Ela

LEXICON TOPOGRÁFFA_CMP_INFL Recent loanwords on -áfºfa with long and short compound-form. No -lasj derivation, since most of these words denote humans.

telegráffa # Even-syllable test examples:*
telegráffa: telegráffa+N+Sg+Nom
telegráfas: telegráffa+N+Sg+Ela

LEXICON SYSTIEBMA_CMP_INFL Recent loanwords on -ebma/-iebma with long and short compound-form. -em in NOR and SWE. See the comments about the -ie/-e dialect tags in ALFABIEHTTA. Frequent typos that do not follow Lule Sami rules are sub-tagged; these forms go against the standardization rule, but occur because of earlier standardization rules and dictionaries.

vokalsystiebma # Even-syllable test examples:*
vokalsystebma: vokalsystiebma+N+Sg+Nom
vokalsystiebma: vokalsystiebma+N+Sg+Nom
vokalsystiemas: vokalsystiebma+N+Sg+Ela
vokalsystemas: vokalsystiebma+N+Sg+Ela

LEXICON ORGÁDNA_CMP_INFL Recent loanwords on -ádna with long and short compound-form

doarjjaorgádna # Even-syllable test examples:*
doarjjaorgádna: doarjjaorgádna+N+Sg+Nom
doarjjaorgánna: doarjjaorgádna+N+Sg+Nom
doarjjaorgánav: doarjjaorgádna+N+Sg+Acc

LEXICON KOLLÆKTA_CMP_INFL Recent loanwords on -ækta with long and short compound-form

subjækta # Even-syllable test examples:*
subjækta: subjækta+N+Sg+Nom
subjäkta: subjækta+N+Sg+Nom
subjevtas: subjækta+N+Sg+Ela

LEXICON HYDROGIEDNA_CMP_INFL Recent loanwords on -iedna in NOR and -edna in SWE. Both long and short compound-form. Norwegian/Swedish -en. The old standardization form -ena, without cg, is sub-tagged. See the comments about the -ie/-e dialect tags in ALFABIEHTTA.

LEXICON PATÆNNTA_CMP_INFL Recent loanwords on -ænnta with long and short compound-form. The -ennta form (used in “Ådå testamennta”) is sub-tagged (e always becomes æ in grade three).

patænnta # Even-syllable test examples:*
patænnta: patænnta+N+Sg+Nom
patännta: patænnta+N+Sg+Nom
patentas: patænnta+N+Sg+Ela

LEXICON VARIÁNNTA_CMP_INFL Recent loanwords on -ánnta with long and short compound-form. Frequent typos that do not follow Lule Sami rules are sub-tagged; these forms go against the standardization rule, but occur because of earlier standardization rules and dictionaries.

praktikánnta # Even-syllable test examples:*
praktikánnta: praktikánnta+N+Sg+Nom
praktikántas: praktikánnta+N+Sg+Ela

LEXICON FANATISSMA_CMP_INFL Recent loanwords on -ssma with long and short compound-form.

kabbalissma # Even-syllable test examples:*
kabbalissma: kabbalissma+N+Sg+Nom
kabbalismas: kabbalissma+N+Sg+Ela

LEXICON TURISSTA_CMP_INFL Recent loanwords on -ssta with long and short compound-form. -lasj derivation is error-tagged. Frequent typos that do not follow Lule Sami rules are sub-tagged; these forms go against the standardization rule, but occur because of earlier standardization rules and dictionaries.

journalissta # Even-syllable test examples:*
journalissta: journalissta+N+Sg+Nom
journalistas: journalissta+N+Sg+Ela

Loanwords becoming odd-syll

LEXICON PRIEMIJ_CMP_INFL Assimilated loanwords on -ie/-y, like premie and bandy. They become odd-syllable loanwords with cg, like “riebij”. Nom: premij, Gen: prebmiha. Long and short essive.

priemij # Even-syllable test examples:*
priemij: priemij+N+Sg+Nom
premij: priemij+N+Sg+Nom
priebmihis: priemij+N+Sg+Ela
prebmihis: priemij+N+Sg+Ela
priemin: priemij+N+Ess
premin: priemij+N+Ess
priebmihin: priemij+N+Ess
prebmihin: priemij+N+Ess

Loanwords becoming contracted-syll

See further down: ÅLMÅJ_LOAN

Error-lexicons, made to not get too many entries with both Err/Orth and correct

LEXICON A_CMP_INFL Sub-forms. Lexicon that gives sub-variant inflection by simply adding an -a to the Norwegian/Swedish word. No cg. Like “alkohola” and “agronoma”. These forms go against the standardization rule, but occur because of earlier standardization rules and dictionaries.

LEXICON ERR/ORTH_EVEN_WEAK_CASES Even-stem Err/Orth lexicon without nominative, illative and essive. Only for entries with the ERR/ORTH tag. Made so that we don’t get entries that are both norm and error-tagged. Entries like “ålggo” have no grade alternation; a common error is to write them as if they had: ålggo>ålgov, tálla>tálav, klimáksa>klimáksav, prefiksa>prefiksav, barggo>barggov.

LEXICON ERR/ORTH_EVEN_WEAK_CASES2 Even-stem Err/Orth lexicon without nominative, illative and essive, and also without Sg+Gen, Sg+Ine, Pl+Nom, Pl+Com and Pl+Gen (to avoid homonymies).

LEXICON ERR/ORTH_EVEN_STRONG_CASES Even stem Err/orth lexicon with only nominative, illative and essive. Only for entries with ERR/ORTH tag. Made so that we don’t get entries that are both norm and with error tag. Hydrogena is used as nom and is err/orth, hydrogena>hydrogenav is not err/orth. marináda-nom, banána-nom
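
The split between the three even-stem Err/Orth lexicons above can be pictured as set arithmetic over case slots. The Python sketch below is only an illustration under stated assumptions: the case inventory is a simplification chosen for the example, and reading “nominative” and “illative” as the singular forms is an interpretation (it is suggested by ERR/ORTH_EVEN_WEAK_CASES2 having to exclude Pl+Nom separately), not something the lexc source states in these terms.

```python
# Assumed, simplified inventory of case slots (an assumption for illustration).
NUMBERS = ("Sg", "Pl")
CASES = ("Nom", "Gen", "Acc", "Ill", "Ine", "Ela", "Com")
ALL_FORMS = {f"{n}+{c}" for n in NUMBERS for c in CASES} | {"Ess"}

# ERR/ORTH_EVEN_STRONG_CASES: only nominative, illative and essive
# (assumed here to mean Sg+Nom, Sg+Ill and the numberless Ess).
STRONG_CASES = {"Sg+Nom", "Sg+Ill", "Ess"}

# ERR/ORTH_EVEN_WEAK_CASES: everything except the strong slots.
WEAK_CASES = ALL_FORMS - STRONG_CASES

# ERR/ORTH_EVEN_WEAK_CASES2: additionally drops the slots listed in the
# description, to avoid homonymous analyses.
WEAK_CASES2 = WEAK_CASES - {"Sg+Gen", "Sg+Ine", "Pl+Nom", "Pl+Com", "Pl+Gen"}

print(sorted(WEAK_CASES2))
```

Under these assumptions the strong and weak lexicons never cover the same slot, which is what keeps an entry from being both normative and error-tagged for one form.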

LEXICON ERR/ORTH_ODD Err/Orth lexicon doing the opposite of what odd-syllable nouns do: strong grade in nominative and weak grade in all other cases. fiehpar-fiebara

Badly assimilated loanwords

LEXICON NOADE_BADASS 2-syllable stems without cg. Badly or wrongly assimilated words, i.e. assimilated in a way that isn’t Lule Sami. (Same as NOADE.) Most of the words are Err/Orth-tagged with a standardized lemma. Some are Err/Lex-tagged. 5.9.2019, EJP/SNM: removed +Use/-Spell. Even if we do not like these words, we want to make sure they are written correctly according to the smj orthography! Most of the words are marked with +Err/Orth anyway :)

balláda # Even-syllable test examples:*
balláda balláda+N+Sg+Nom
balládas balláda+N+Sg+Ela

LEXICON C_ILL_IJ_BADASS Badly or wrongly assimilated words. Last letter is a consonant, no cg, no vowchange, with illative -ij. (Same as GAHPER.) Assimilated in a way that isn’t Lule Sami. Most of the words are Err/Orth-tagged with a standardized lemma. Some are Err/Lex-tagged, and some only receive the +Use/-Spell tag from the lexicon.

sentimehter # Odd-syllable test examples:*
sentimehter sentimehter+N+Sg+Nom
sentimehteris sentimehter+N+Sg+Ela
sentimehterij sentimehter+N+Sg+Ill

LEXICON C_ILL_AJ_BADASS Badly or wrongly assimilated words. Last letter is a consonant, no cg, no vowchange, with illative -aj. These should have been assimilated to even-syllable stems, but are used as odd-syllable stems, and are mostly assimilated just by changing a letter to á. So almost the same as CELSIUS_UNASS.

kálsium # Odd-syllable test examples:*
kálsium kálsium+N+Sg+Nom
kálsiumas kálsium+N+Sg+Ela
kálsiumaj kálsium+N+Sg+Ill

Unassimilated loanwords

LEXICON KINO_UNASS_CMP_INFL V-final unassimilated loanwords. Not lulesami. No diacritics whatsoever. Words that aren’t assimilated at all. Really just norwegian words with a kind of sami inflection. Get even syllable case marking. Are part of the spell checker.

netto # Even-syllable test examples:*
netto: netto+N+Sg+Nom
nettos: netto+N+Sg+Ela

LEXICON C_ILL_IJ_UNASS C-final unassimilated loanwords, gives illative -ij. Not Lule Sami. No diacritics whatsoever. Really just foreign words with a kind of Sami inflection. Odd-syllable case marking (like GAHPER). Are part of the spell checker.

sirkus # Even-syllable test examples:*
sirkus: sirkus+N+Sg+Nom
sirkusij: sirkus+N+Sg+Ill
sirkusis: sirkus+N+Sg+Ela

LEXICON C_ILL_AJ_UNASS C-final unassimilated loanwords, gives illative -aj. Also odd-syllable words ending in the letter i, such as selleri. Not Lule Sami. No diacritics whatsoever. Really just Norwegian words with a kind of Sami inflection. Case marking like standard even 4-syllable stems (see the proper nouns file on the case marking of foreign words with stressed last syllable). Are part of the spell checker.

aids # Even-syllable test examples:*
aids: aids+N+Sg+Nom
aidsaj: aids+N+Sg+Ill
aidsas: aids+N+Sg+Ela
aidsan: aids+N+Ess
aidsadak: aids+N+Abe
aidsada: aids+N+Abe
aidsasj: aids+N+Der/Dimin+N+Sg+Nom

+Der4+Der/ahtes:e»g AHTES ; Only for odd-syllable stems

4syll stems

LEXICON GÅNÅGIS Standard C-final 4-syllabic stems

rahtjamus # Even-syllable test examples:*
rahtjamus: rahtjamus+N+Sg+Nom
rahtjamussaj: rahtjamus+N+Sg+Ill
rahtjamusás: rahtjamus+N+Sg+Ela

LEXICON BERULASJ For words ending in -asj. Same as GÅNÅGIS but with strong essive and illative -adjtan and -adtjaj sub-tagged, same with PX “-adjtam”. These forms are barely used today. -lahttja is also Err/Orth-tagged.

LEXICON BEDNAGASJ Like BERULASJ, but for derived nouns in diminutive. No cg, no vowchange, no short Ess. Has only one dimin derivation since these words already are dimin, i.e. no double dim as for GAHPER. No abessive (not totally sure about this; I think the postposition dagi must be used with diminutives).

stiebil # Odd-syllable test examples:*
bednagasj: bednagasj+N+Sg+Nom
bednagattjas: bednagasj+N+Sg+Ela

LEXICON HÁVSAGUSJ Like BEDNAGASJ, but not diminutive. No cg, no vowchange, no short Ess. Has only one dimin derivation. No abessive (not totally sure about this; I think the postposition dagi must be used with diminutives).

LEXICON JIHPELIJ gen:jihpelahá

gehtsulij # Even-syllable test examples:*
gehtsulij: gehtsulij+N+Sg+Nom
gehtsulaháv: gehtsulij+N+Sg+Acc

LEXICON OARJJILIJ gen:oarjjilihá

allilij # Even-syllable test examples:*
allilij: allilij+N+Sg+Nom
allilihás: allilij+N+Sg+Ela

LEXICON VIESSOMUJ gen:viessumuhá

bårråmuj # Even-syllable test examples:*
bårråmuj: bårråmuj+N+Sg+Nom
bårråmuháj: bårråmuj+N+Sg+Ill

4 syllable plurals

LEXICON OADÁDAGÁ Plural forms of words like tjerastahka with short compound-form

látjádagá # Even-syllable test examples:*
látjádagá: látjádagá+N+Pl+Nom
látjádagájs: látjádagá+N+Pl+Ela

LEXICON BERRAHATTJA Plural stems. Like IEDNITJA, these do not have corresponding singular stems. Most stems here have the same form as the Pl Nom form of diminutive derivations, but (while it may have originated as a diminutive derivation) it is not the same derivation today and it does not have a singular form.

gahpanisá # Even-syllable test examples:*
gahpanisá: gahpanisá+N+Pl+Nom
gahpanisájda: gahpanisá+N+Pl+Ill
gahpanisájs: gahpanisá+N+Pl+Ela

LEXICON SIJDDALAHÁ Plurals

lullelahá # Even-syllable test examples:*
lullelahá: lullelahá+N+Pl+Nom
lullelahájt: lullelahá+N+Pl+Acc

LEXICON SISSNELUHÁ plurals. presently only for sissŋeluhá

sissŋeluhá # Even-syllable test examples:*
sissŋeluhá: sissŋeluhá+N+Pl+Nom
sissŋeluhájda: sissŋeluhá+N+Pl+Ill

LEXICON DAGI_SINGULAR Earlier we generated “bijladagi” and “bijlajdagi” as abessive. This has been fixed, but this lexicon is needed to be able to analyse what we generated earlier. Only singular. Gives an Err/ tag to “bijladagi” and makes “bijla dagi” correct.

LEXICON DAGI_PLURAL Earlier we generated “bijladagi” and “bijlajdagi” as abessive. This has been fixed, but this lexicon is needed to be able to analyse what we generated earlier. Only plural. Gives an Err/ tag to “bijlajdagi” and makes “bijlaj dagi” correct.

Adjectival sublexica. Give 4-syllable adjectives inflection

LEXICON SURGULASJ-EVEN

LEXICON N-EVENWEAKSTEM-NO-ABE Same as N-EVENWEAKSTEM but without abessive (abessive is Err/Infl-tagged). Used for 4-syllable nouns.

Compound lexica

Odd-syllable stems

without cg

LEXICON GAHPER Odd-syllable C-final noun without cg, no vowchange, no short Ess. Spiik A3

stiebil # Odd-syllable test examples:*
stiebil: stiebil+N+Sg+Nom
stiebilis: stiebil+N+Sg+Ela

with cg

LEXICON ÅRES Odd-syllable C-final noun with CG, 2ndsyll vowchange. Long and short essive. Spiik A1

sjattos # Odd-syllable test examples:*
sjattos: sjattos+N+Sg+Nom
sjaddusis: sjattos+N+Sg+Ela
sjaddusin: sjattos+N+Ess
sjatton: sjattos+N+Ess

LEXICON SÅHKÅR Odd-syllable C-final noun with CG and 2ndsyll vowelchange. Has only long essive. Spiik 2b

spiger # Odd-syllable test examples:*
spiger: spiger+N+Sg+Nom
spihkáris: spiger+N+Sg+Ela
spihkárin: spiger+N+Ess

LEXICON SJÁŊŊAR Same as SÅHKÅR, but for words with strong to extra-strong CG (nn>n’n and so on). The only difference from SÅHKÅR is the missing Err/Orths, because the cg doesn’t show.

LEXICON GÁMAS Odd-syllable C-final noun with CG, no 2ndsyll vowchange (OBS: a does not change). Long and short essive. Spiik A2

sjábtjas # Odd-syllable test examples:*
sjábtjas: sjábtjas+N+Sg+Nom
sjábttjasis: sjábtjas+N+Sg+Ela

LEXICON BENA Odd-syllable V-final noun with cg, no 2ndsyll vowchange. Deletes g. Long and short essive. Spiik 2a

galma # Odd-syllable test examples:*
galma: galma+N+Sg+Nom
galmmagis: galma+N+Sg+Ela

LEXICON MIEHTAR Only for word “miehtar”. Same as SÅHKÅR but with Area-differences and a lot of Err/Orths.

miehtar # Odd-syllable test examples:*
miehtar: miehtar+N+Sg+Nom
mehtar: miehtar+N+Sg+Nom
miehttaris: miehtar+N+Sg+Ela
mähttaris: miehtar+N+Sg+Ela
mähttarin: miehtar+N+Ess
miehttarin: miehtar+N+Ess

Irregular stems

LEXICON SUOBDE gen: suobddega. Presently only for “suobde”. For some reason -e doesn’t become á, so it is not in lexicon BENA. Long and short essive.

ságe # Odd-syllable test examples:*
ságe: ságe+N+Sg+Nom
sáhkahav: ságe+N+Sg+Acc

LEXICON SÁGE gen: sáhkaha. Presently only for “ságe”. Long and short essive.

ságe # Odd-syllable test examples:*
ságe: ságe+N+Sg+Nom
sáhkahav: ságe+N+Sg+Acc

LEXICON BAVSEV Ends on -v and last vowel changes to i: bavsev:baksIma. Not like gierkav gierkkAma and birev birEma.

sievtev # Odd-syllable test examples:*
sievtev: sievtev+N+Sg+Nom
siektimis: sievtev+N+Sg+Ela

LEXICON RÁBEV rábev:ráhpuga. Presently only for “rábev”.

rábev # Odd-syllable test examples:*
rábev: rábev+N+Sg+Nom
ráhpugis: rábev+N+Sg+Ela

LEXICON RITJAS ! Like GÁMAS but without stem a-lengthening for grade I (underlying long -i-). presently only for “ritjas”.

ritjas # Odd-syllable test examples:*
ritjas: ritjas+N+Sg+Nom
rihtjasis: ritjas+N+Sg+Ela

LEXICON SÅGAS gen: sågaska. Presently only for “sågas”.

sågas # Odd-syllable test examples:*
sågas: sågas+N+Sg+Nom
sågaskav: sågas+N+Sg+Acc

LEXICON SJUVÁJ Presently only for “sjuváj”. sjuváj-sjuvvaga. Only this word

sjuváj # Odd-syllable test examples:*
sjuváj: sjuváj+N+Sg+Nom
sjuvvagis: sjuváj+N+Sg+Ela

LEXICON BØSOJ Because of bösoj in O.Korhonen, and bæsoj-bessuga. Only for these two words. J becomes g.

LEXICON GUOVSOJVUOJOJ vuojoj:vuodjom. Presently only for “guovsojvuojoj”.

guovsojvuojoj # Odd-syllable test examples:*
guovsojvuojoj: guovsojvuojoj+N+Sg+Nom
guovsojvuodjomav: guovsojvuojoj+N+Sg+Acc

LEXICON BUTJES butjes-buttjása. Presently only for “butjes”. This is a sub form. Korhonen has this form, but Grundström has buttjes-budtjasa. It must be a typo in Korhonen, because ttj-tj doesn’t exist in smj. This form is tagged as an error/sub form in the stems file.

LEXICON TJÅLKES tjålkes:tjoalkkas-. Presently only for “tjålkes” and “tsålkes”. This must be wrong, and it doesn’t exist in Grundström: å in the first syllable isn’t possible with e in the second syllable. It must be tjoalkes-tjoalkkása or tjålkas-tjoalkkasa. This form is tagged as an error/sub form in the stems file.

tsålkes # Odd-syllable test examples:*
★tsålkes: tsålkes+N+Sg+Nom (is not standard language)
★tsoalkkasav: tsålkes+N+Sg+Acc (is not standard language)
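
Some test lines, like the two above, carry a leading ★ and a trailing note in parentheses, which appear to flag forms that are not standard language. A small Python sketch of parsing both the plain and the flagged shape is shown here; the regular expression and the function name `parse_test_line` are hypothetical and only assume the line shapes visible in this document.

```python
import re

# Optional star, surface form, colon, analysis string, optional "(note)".
LINE_RE = re.compile(
    r"^(?P<star>★)?\s*(?P<surface>[^:]+?)\s*:\s*(?P<analysis>\S+)"
    r"(?:\s+\((?P<note>[^)]*)\))?\s*$"
)


def parse_test_line(line):
    """Return the parts of one test line as a dict, or raise ValueError."""
    m = LINE_RE.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognised test line: {line!r}")
    return {
        "surface": m.group("surface"),
        "analysis": m.group("analysis"),
        "flagged": m.group("star") is not None,
        "note": m.group("note"),
    }


flagged = parse_test_line("★tsålkes: tsålkes+N+Sg+Nom (is not standard language)")
plain = parse_test_line("sjuváj: sjuváj+N+Sg+Nom")
print(flagged)
print(plain)
```

Keeping the flag and the note as separate fields lets a checker treat non-standard forms differently from ordinary test pairs.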

LEXICON VÁJES vájes:vádjas-. Presently only for “báhkovájes”. It’s a sub form: second-syllable e doesn’t become a. It must be vájes-vádjása or vájas-vádjasa. The second is used in NT, so I believe that’s the right one. This form is tagged as an error/sub form in the stems file.

Derived stems

LEXICON BADJEL Derived nouns with acc -elav, ill -elij, elat -elas, etc. These were previously categorized as adpositions and adverbs, but according to Bruce Morén-Duolljá (2014) they are actually case forms of nouns derived from certain location nouns. Derived from even strong stems (badje -> badjel). Odd syllable inflection, but only singular nominative-elative (not clear if they take comitative and essive case). With comparatives. No Px.

allel # Odd-syllable test examples:*
allel: allel+N+Sg+Nom
allelis: allel+N+Sg+Ela
allelabbo: allel+N+Der/Comp+A+Sg+Nom

LEXICON BÁRNEP bárnep:bárnebu-. Comparison of nouns. No -ahtá abessive.

iednep # Odd-syllable test examples:*
iednep: iednep+N+Sg+Nom
iednebuv: iednep+N+Sg+Acc

LEXICON OAPPÁSJ Like GAHPER, but for derived nouns in diminutive that have an underived form. Doesn’t get abessive -ahtá or -ahtes derivation. Odd-syllable, no cg, no vowchange, no short Ess. Has only one dimin derivation since these words already are dimin, i.e. no double dim as in GAHPER.

oappásj # Odd-syllable test examples:*
oappásj: oappásj+N+Sg+Nom
oappátjis: oappásj+N+Sg+Ela

LEXICON FIERUN Like GAHPER, but instruments derived from verbs. Fierrot>fierun. No short essive.

fierun # Odd-syllable test examples:*
fierun: fierun+N+Sg+Nom
fierunis: fierun+N+Sg+Ela

LEXICON GUOLLÁR Like GAHPER, but actor derived from contracted verbs (ACTOR for evensyll verbs). Guollit>guollár. No short essive.

LEXICON DUODJÁR Like GAHPER, but for Sem/Hum words that are not ár-derivations, just lookalikes. Moved to a separate lexicon for sorting purposes only. Odd-syllable C-final noun without cg, no vowchange, no short Ess. Spiik A3

stiebil # Odd-syllable test examples:*
dájddár: dájddár+N+Sg+Nom
dájddáris: dájddár+N+Sg+Ela

LEXICON IELLEM Nomen actionis derived from even verbs. Earlier these went directly to VSBST-ODD; now they get the tag Gram/NomAct before going there. It can’t be put in the VSBST-ODD lexicon because of the paths from the verb lexicons.

LEXICON TJIEKTJAMA Plural nomen actionis derived from even verbs. Earlier these went directly to VSBST-ODD-PL; now they get the tag Gram/NomAct before going there. It can’t be put in the VSBST-ODD-PL lexicon because of the paths from the verb lexicons.

LEXICON AKTIDIBME Nomen actionis derived from uneven verbs, ending in DIBME. Earlier these went directly to VSBST-EVEN; now they get the tag Gram/NomAct before going there. It can’t be put in the VSBST-ODD lexicon because of the paths from the verb lexicons.

LEXICON BERUSTIBME Nomen actionis derived from uneven verbs, ending in STIBME; DIBME is Err/Orth-tagged. Earlier these went directly to VSBST-EVEN; now they get the tag Gram/NomAct before going there. It can’t be put in the VSBST-ODD lexicon because of the paths from the verb lexicons.

Plural odd-syll

LEXICON DÁRBBAGA Like BENA, but plural. Presently only for “dárbbaga”, has singular stem counterpart.

dárbbaga # Odd-syllable test examples:*
dárbbaga: dárbbaga+N+Pl+Nom
dárbbagijt: dárbbaga+N+Pl+Acc

LEXICON BÆLLJASA Like GÁMAS, but plural. These have corresponding singular stems.

jiednabælljasa # Odd-syllable test examples:*
jiednabælljasa: jiednabælljasa+N+Pl+Nom
jiednabälljasa: jiednabælljasa+N+Pl+Nom
jiednabælljasijt: jiednabælljasa+N+Pl+Acc
jiednabälljasijt: jiednabælljasa+N+Pl+Acc

LEXICON IEDNITJA Odd-syllable plural forms only. These do not have a singular form.

jáhkoguojmitja # Odd-syllable test examples:*
jáhkoguojmitja: jáhkoguojmitja+N+Pl+Nom
jáhkoguojmitjijt: jáhkoguojmitja+N+Pl+Acc

LEXICON SNJIERÁGA Odd-syllable plural forms only. These have corresponding singular stems.

guovlloådåsa # Odd-syllable test examples:*
guovlloådåsa: guovlloådåsa+N+Pl+Nom
guovlloådåsijt: guovlloådåsa+N+Pl+Acc

LEXICON MANEBU Odd-syllable plural only. Presently only for “maŋebu”.

maŋebu # Odd-syllable test examples:*
maŋebu: maŋebu+N+Pl+Nom
maŋebujt: maŋebu+N+Pl+Acc

Contracted stems

LEXICON SUOLOJ C-final with cg II-III: ålmåj:ålmmå

njurgoj # Contracted test examples:*
njurgoj: njurgoj+N+Sg+Nom
njurgguv: njurgoj+N+Sg+Acc

LEXICON ÅLMÅJ_LOAN Same as SUOLOJ, only for loan words. Follows Ráhka/Mikkelsen’s Bårjås 2014. C-final with cg II-III: ålmåj:ålmmå

bistroj # Contracted test examples:*
bistroj: bistroj+N+Sg+Nom
bisstruv: bistroj+N+Sg+Acc
bisstruhav: bistroj+N+Sg+Acc

LEXICON GUOMOJ C-final with cg I-III: guomoj:guobbmu

ænoj # Contracted test examples:*
ænoj: ænoj+N+Sg+Nom
æddnuv: ænoj+N+Sg+Acc
änoj: ænoj+N+Sg+Nom
äddnuv: ænoj+N+Sg+Acc

LEXICON SARVES C-final with cg II-III. sarves:sarvvá

moarmes # Contracted test examples:*
moarmes: moarmes+N+Sg+Nom
moarmmáv: moarmes+N+Sg+Acc

LEXICON SVÁLES C-final with cg I-III. sváles:svállá (lºl)

sváles # Contracted test examples:*
sváles: sváles+N+Sg+Nom
sválláv: sváles+N+Sg+Acc

LEXICON GÅHKES C-final with cg II-III with vowel harmony (a/á=å). gåhkes:gåhkkå. Presently only for “gåhkes”.

gåhkes # Contracted test examples:*
gåhkes: gåhkes+N+Sg+Nom
gåhkkåv: gåhkes+N+Sg+Acc

LEXICON SJUOKKAJ sjuokkaj:sjuoggá. Presently only for “sjuokkaj”.

sjuokkaj # Contracted test examples:*
sjuokkaj: sjuokkaj+N+Sg+Nom
sjuoggáv: sjuokkaj+N+Sg+Acc

LEXICON GISTÁ gistá:gisstá. Presently only for “gistá”.

gistá # Contracted test examples:*
gistá: gistá+N+Sg+Nom
gisstáv: gistá+N+Sg+Acc

Contracted stems sublexica

Px lexica

LEXICON DUOLMUN Fierrot>fierun, instruments derived from verbs, used only for verb derivation, not for lexicalized lemmas. No short essive.

This (part of) documentation was generated from src/fst/morphology/affixes/nouns.lexc

src-fst-morphology-affixes-pronouns.lexc.md

+Use/NG+Gen:n NAMÁK ; ! adjectival -k derivation does not take pronouns
+Use/NG+Ela:sstága K ; !Can’t find this anywhere. Maybe this is really dástága/dastagá? in “dáhtakcas”

+Use/NG+Gen: NAMÁK ; ! adjectival -k derivation does not take pronouns

+Use/NG+Gen:aj NAMÁK ; ! adjectival -k derivation does not take pronouns
+Ine:a%>jna K-s ;
+Abe+Use/NG:a%>jdak K ; ! covered in non-idiosync
+Abe+Use/NG:a%>jdagi K ; ! covered in non-idiosync
+Abe+Use/NG:a%>jdagá K ; ! covered in non-idiosync
+Abe+Use/NG:a%>jtagá K ; ! covered in non-idiosync

This (part of) documentation was generated from src/fst/morphology/affixes/pronouns.lexc

src-fst-morphology-affixes-propernouns.lexc.md

Lule Sáme proper noun morphology

Even syllable proper nouns

Unstressed last syllable

Words in ACCRA lexicons end in a vowel, have no CG and get “even-syllable” case marking where case suffixes are added directly. Illative e:i, but not o:u. Last syllable is unstressed. Both non-assimilated and assimilated stems (although not all are fully, or correctly, assimilated).
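As a rough illustration of the ACCRA pattern described above, the following Python sketch derives singular illative and elative forms by plain suffixation. The function name `accra_forms` and the suffixes used are assumptions made for this example, read off the test examples in this file (Ill -j with e:i, Ela -s). The real behaviour is defined by the lexc continuation lexicons and the phonological rules, not by this code.

```python
def accra_forms(stem: str) -> dict[str, str]:
    """Hypothetical helper: Sg Nom/Ill/Ela for a vowel-final ACCRA-type name."""
    # Illative: stem-final e changes to i before -j; other vowels (including o) stay.
    ill_stem = stem[:-1] + "i" if stem.endswith("e") else stem
    return {
        "Sg+Nom": stem,
        "Sg+Ill": ill_stem + "j",
        "Sg+Ela": stem + "s",  # no consonant gradation, suffix added directly
    }


print(accra_forms("Janne"))
print(accra_forms("Gestapo"))
```

The sketch deliberately ignores derivations such as Der/k and Der/lasj and covers only the three cases shown in the test examples.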

LEXICON ACCRA-ani Vowel-final names where case endings are added directly, no cg. Illative e changes to i. Animals.

Tjuorri # Even-syllable test examples:*
Tjuorri: Tjuorri+N+Prop+Sem/Ani+Sg+Nom
Tjuorrij: Tjuorri+N+Prop+Sem/Ani+Sg+Ill
Tjuorris: Tjuorri+N+Prop+Sem/Ani+Sg+Ela

LEXICON ACCRA-obj Vowel-final names where case endings are added directly, no cg. Object names

Gestapo # Even-syllable test examples:*
Gestapo: Gestapo+N+Prop+Sem/Obj+Sg+Nom
Gestapoj: Gestapo+N+Prop+Sem/Obj+Sg+Ill
Gestapos: Gestapo+N+Prop+Sem/Obj+Sg+Ela

LEXICON ACCRA-org Vowel-final names where case endings are added directly, no cg. Organizations

Giellatekno # Even-syllable test examples:*
Giellatekno: Giellatekno+N+Prop+Sem/Org+Sg+Nom
Giellateknoj: Giellatekno+N+Prop+Sem/Org+Sg+Ill
Giellateknos: Giellatekno+N+Prop+Sem/Org+Sg+Ela

LEXICON ACCRA-mal Vowel-final names where case endings are added directly, no cg. Male names

Antonio # Even-syllable test examples:*
Antonio: Antonio+N+Prop+Sem/Mal+Sg+Nom
Antonioj: Antonio+N+Prop+Sem/Mal+Sg+Ill
Antonios: Antonio+N+Prop+Sem/Mal+Sg+Ela

LEXICON ACCRA-fem Vowel-final names where case endings are added directly, no cg. Female names

Barbro # Even-syllable test examples:*
Barbro: Barbro+N+Prop+Sem/Fem+Sg+Nom
Barbroj: Barbro+N+Prop+Sem/Fem+Sg+Ill
Barbros: Barbro+N+Prop+Sem/Fem+Sg+Ela

LEXICON ACCRA-femsur Vowel-final names where case endings are added directly, no cg. Female names also used as surnames

Sara # Even-syllable test examples:*
Sara: Sara+N+Prop+Sem/Fem+Sg+Nom
Saraj: Sara+N+Prop+Sem/Fem+Sg+Ill
Saraj: Sara+N+Prop+Sem/Sur+Sg+Ill
Saras: Sara+N+Prop+Sem/Fem+Sg+Ela

LEXICON ACCRA-malfem Vowel-final names where case endings are added directly, no cg. Names that can be both female and male names

Janne # Even-syllable test examples:*
Janne: Janne+N+Prop+Sem/Mal+Sg+Nom
Janne: Janne+N+Prop+Sem/Fem+Sg+Nom
Jannij: Janne+N+Prop+Sem/Mal+Sg+Ill
Jannes: Janne+N+Prop+Sem/Mal+Sg+Ela

LEXICON ACCRA-objplc Vowel-final names where case endings are added directly, no cg. Names that can be both objects and place names

Soria-Moria # Even-syllable test examples:*
Soria-Moria: Soria-Moria+N+Prop+Sem/Obj+Sg+Nom
Soria-Moriaj: Soria-Moria+N+Prop+Sem/Obj+Sg+Ill
Soria-Morias: Soria-Moria+N+Prop+Sem/Obj+Sg+Ela
Soria-Moriaj: Soria-Moria+N+Prop+Sem/Plc+Sg+Ill
soria-moriak: Soria-Moria+N+Prop+Sem/Plc+Der/k+N+Sg+Nom
soria-morialasj: Soria-Moria+N+Prop+Sem/Plc+Der/lasj+N+Sg+Nom

LEXICON ACCRA-femplc Vowel-final names where case endings are added directly, no cg. Names that can be both female and place names

Salla # Even-syllable test examples:*
Salla: Salla+N+Prop+Sem/Fem+Sg+Nom
Salla: Salla+N+Prop+Sem/Plc+Pl+Nom
Sallaj: Salla+N+Prop+Sem/Fem+Sg+Ill
Sallas: Salla+N+Prop+Sem/Fem+Sg+Ela
sallak: Salla+N+Prop+Sem/Plc+Der/k+N+Sg+Nom
sallalasj: Salla+N+Prop+Sem/Plc+Der/lasj+N+Sg+Nom

LEXICON ACCRA-sur Vowel-final names where case endings are added directly, no cg. Surnames

Tønne # Even-syllable test examples:*
Tønne: Tønne+N+Prop+Sem/Sur+Sg+Nom
Tønnij: Tønne+N+Prop+Sem/Sur+Sg+Ill
Tønnes: Tønne+N+Prop+Sem/Sur+Sg+Ela

LEXICON ACCRA-malsur Vowel-final names where case endings are added directly, no cg. Names that can be both male- and surnames

Valio # Even-syllable test examples:*
Valio: Valio+N+Prop+Sem/Sur+Sg+Nom
Valio: Valio+N+Prop+Sem/Mal+Sg+Nom
Valioj: Valio+N+Prop+Sem/Sur+Sg+Ill
Valios: Valio+N+Prop+Sem/Sur+Sg+Ela

LEXICON ACCRA-plc Vowel-final names where case endings are added directly, no cg. Place names

Burma # Even-syllable test examples:*
Burma: Burma+N+Prop+Sem/Plc+Sg+Nom
Burmaj: Burma+N+Prop+Sem/Plc+Sg+Ill
Burmas: Burma+N+Prop+Sem/Plc+Sg+Ela
burmak: Burma+N+Prop+Sem/Plc+Der/k+N+Sg+Nom
burmalasj: Burma+N+Prop+Sem/Plc+Der/lasj+N+Sg+Nom

LEXICON ACCRA_MWE-plc Vowel-final names where case endings are added directly, no cg. Place names

LEXICON GIRUNA-plc For proper Kiruna. Same as ACCRA. Different lexicon because of sma.

Veitsiluoto # Even-syllable test examples:*
Veitsiluoto: Veitsiluoto+N+Prop+Sem/Plc+Sg+Nom
Veitsiluoto: Veitsiluoto+N+Prop+Sem/Plc+Sg+Nom
Veitsiluotoj: Veitsiluoto+N+Prop+Sem/Plc+Sg+Ill
Veitsiluotos: Veitsiluoto+N+Prop+Sem/Plc+Sg+Ela
veitsiluotok: Veitsiluoto+N+Prop+Sem/Plc+Der/k+N+Sg+Nom
veitsiluotolasj: Veitsiluoto+N+Prop+Sem/Plc+Der/lasj+N+Sg+Nom

LEXICON ACCRA-LOAN-org Only nominatives. Vowel-final names where case endings are added directly, no cg. organizations

Samenes Idrettsforbund-Norge # Even-syllable test examples:*
Samenes Idrettsforbund-Norge: Samenes Idrettsforbund-Norge+N+Prop+Sem/Org+Sg+Nom
★Samenes Idrettsforbund-Norge: Samenes Idrettsforbund-Norge+N+Prop+Sem/Org+Sg+Ela (is not standard language # negative test)

LEXICON ACCRA-LOAN-obj Only nominatives. Vowel-final names where case endings are added directly, no cg. Object names

The Norwegian Sami Experience # Even-syllable test examples:*
The Norwegian Sami Experience: The Norwegian Sami Experience+N+Prop+Sem/Obj+Sg+Nom
★The Norwegian Sami Experience: The Norwegian Sami Experience+N+Prop+Sem/Obj+Sg+Ill (is not standard language # negative test)

LEXICON ACCRA-LOAN-plc Only nominatives. Vowel-final names where case endings are added directly, no cg. Place names

Kautokeino # Even-syllable test examples:*
Kautokeino: Kautokeino+N+Prop+Sem/Plc+Sg+Nom
★Kautokeinos: Kautokeino+N+Prop+Sem/Org+Sg+Ela (is not standard language # negative test)
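The test example lines in this file pair a surface form with its expected analysis, and a leading ★ marks a form that is expected not to be accepted. The Python sketch below shows one way such a line could be split into its parts. The names `TestLine` and `parse_test_line` are hypothetical and are not part of the documented toolchain; the line format is inferred from the examples shown here.

```python
from typing import NamedTuple


class TestLine(NamedTuple):
    form: str        # surface form, e.g. "Kautokeinos"
    lemma: str       # lemma, e.g. "Kautokeino"
    tags: list[str]  # analysis tags, e.g. ["N", "Prop", "Sem/Plc", "Sg", "Nom"]
    negative: bool   # True when the line starts with the star marker


def parse_test_line(line: str) -> TestLine:
    """Split 'form: lemma+Tag+Tag (remark)' into its parts (assumed format)."""
    negative = line.startswith("★")
    body = line.lstrip("★").strip()
    form, _, analysis = body.partition(": ")
    # Drop a trailing parenthesised remark such as "(is not standard language)".
    analysis = analysis.split(" (", 1)[0].strip()
    lemma, *tags = analysis.split("+")
    return TestLine(form, lemma, tags, negative)


print(parse_test_line("Kautokeino: Kautokeino+N+Prop+Sem/Plc+Sg+Nom"))
```

The sketch assumes that neither the lemma nor the form contains a `+` or the sequence `": "`, which holds for the examples in this section.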

In smj RONDANE is the same as ACCRA, in use in smi because of differences in sme. No -lasj or -k. Last syllable is unstressed. Non-assimilated stems.

LEXICON RONDANE-plc E-final names, with no cg. elative -s, ill -ij. Place names

Bakkane # Even-syllable test examples:*
Bakkane: Bakkane+N+Prop+Sem/Plc+Sg+Nom
Bakkane: Bakkane+N+Prop+Sem/Plc+Sg+Gen
Bakkanev: Bakkane+N+Prop+Sem/Plc+Sg+Acc
Bakkanen: Bakkane+N+Prop+Sem/Plc+Sg+Ine
Bakkanij: Bakkane+N+Prop+Sem/Plc+Sg+Ill
Bakkanes: Bakkane+N+Prop+Sem/Plc+Sg+Ela
Bakkanijn: Bakkane+N+Prop+Sem/Plc+Sg+Com
Bakkanijda: Bakkane+N+Prop+Sem/Plc+Pl+Ill
Bakkanijs: Bakkane+N+Prop+Sem/Plc+Pl+Ela
★bakkanek: Bakkane+N+Prop+Sem/Plc+Der/k+N+Sg+Nom (is not standard language)
★bakkanelasj: Bakkane+N+Prop+Sem/Plc+Der/lasj+N+Sg+Nom (is not standard language)

LEXICON RONDANE-SG-plc E-final names, with no cg. elative -s, ill -ij. Place names

Bakkane # Even-syllable test examples:*
Bakkane: Bakkane+N+Prop+Sem/Plc+Sg+Nom
Bakkane: Bakkane+N+Prop+Sem/Plc+Sg+Gen
Bakkanev: Bakkane+N+Prop+Sem/Plc+Sg+Acc

LEXICON RONDANE-LOAN Only nominative. Place names

Azorene # Even-syllable test examples:*
Azorene: Azorene+N+Prop+Sem/Plc+Sg+Nom
★Azorenij: Azorene+N+Prop+Sem/Plc+Sg+Ill (is not standard language # Negative test)

LEXICON RONDANE-SG-LOAN Only nominative. Place names

LEXICON RONDANE-sur Surnames

Benneche # Even-syllable test examples:*
Benneche: Benneche+N+Prop+Sem/Sur+Sg+Nom
Bennechij: Benneche+N+Prop+Sem/Sur+Sg+Ill
Benneches: Benneche+N+Prop+Sem/Sur+Sg+Ela

LEXICON RONDANE-obj Objects

Office # Even-syllable test examples:*
Office: Office+N+Prop+Sem/Obj+Sg+Nom
Office: Office+N+Prop+Sem/Obj+Sg+Gen
Officev: Office+N+Prop+Sem/Obj+Sg+Acc
Officen: Office+N+Prop+Sem/Obj+Sg+Ine
Officij: Office+N+Prop+Sem/Obj+Sg+Ill
Offices: Office+N+Prop+Sem/Obj+Sg+Ela
Officijn: Office+N+Prop+Sem/Obj+Sg+Com

LEXICON RONDANE-org Organizations

Picture # Even-syllable test examples:*
Picture: Picture+N+Prop+Sem/Org+Sg+Nom
Picturij: Picture+N+Prop+Sem/Org+Sg+Ill
Pictures: Picture+N+Prop+Sem/Org+Sg+Ela

LEXICON RONDANE-mal Male names

Lawrence # Even-syllable test examples:*
Lawrence: Lawrence+N+Prop+Sem/Mal+Sg+Nom
Lawrencij: Lawrence+N+Prop+Sem/Mal+Sg+Ill
Lawrences: Lawrence+N+Prop+Sem/Mal+Sg+Ela

LEXICON RONDANE-fem Female names

Jannike # Even-syllable test examples:*
Jannike: Jannike+N+Prop+Sem/Fem+Sg+Nom
Jannikij: Jannike+N+Prop+Sem/Fem+Sg+Ill
Jannikes: Jannike+N+Prop+Sem/Fem+Sg+Ela

These sublexica are irrelevant for ACCRA, but added for the sake of the lexicon MARJA

GATA are Norwegian place names that end in -gata. They get even-syllable case marking. Last syllable is unstressed. Non-assimilated stems.

LEXICON GATA-plc Norwegian place names that end in -gata. They get even-syllable case marking. Last syllable is unstressed.

Munkegata # Even-syllable test examples:*
Munkegata: Munkegata+N+Prop+Sem/Plc+Sg+Nom
Munkegataj: Munkegata+N+Prop+Sem/Plc+Sg+Ill
Munkegatas: Munkegata+N+Prop+Sem/Plc+Sg+Ela

Words in MARJA end in a vowel, with CG, even-syllable case marking. In the illative, e changes to á, while i stays i. Last syllable is unstressed. Real Lule Sami stems.
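The illative vowel change described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `marja_illative` is hypothetical, the -j suffix is taken from the test examples in this file, and consonant gradation (which affects other cases such as the elative) is not modelled at all.

```python
def marja_illative(stem: str) -> str:
    """Hypothetical helper: Sg Ill of a vowel-final MARJA-type name."""
    # Stem-final e changes to á in the illative; i and other vowels stay as they are.
    if stem.endswith("e"):
        stem = stem[:-1] + "á"
    return stem + "j"


for name in ("Dundarevuobme", "Gávrásski", "Gáddjá"):
    print(f"{marja_illative(name)}: {name}+N+Prop+Sg+Ill")
```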

LEXICON MARJA-fem Odd-syllable with cg. Female names

Gáddjá # Even-syllable test examples:*
Gáddjá: Gáddjá+N+Prop+Sem/Fem+Sg+Nom
Gáddjáj: Gáddjá+N+Prop+Sem/Fem+Sg+Ill
Gádjás: Gáddjá+N+Prop+Sem/Fem+Sg+Ela

LEXICON MARJA-ani Animal names

Vilge # Even-syllable test examples:*
Gávrásski: Gávrásski+N+Prop+Sem/Ani+Sg+Nom
Gávrásskij: Gávrásski+N+Prop+Sem/Ani+Sg+Ill
Gávráskis: Gávrásski+N+Prop+Sem/Ani+Sg+Ela

LEXICON MARJA-mal Male names

Biet-Ánnda # Even-syllable test examples:*
Biet-Ánnda: Biet-Ánnda+N+Prop+Sem/Mal+Sg+Nom
Biet-Ánndaj: Biet-Ánnda+N+Prop+Sem/Mal+Sg+Ill
Biet-Ándas: Biet-Ánnda+N+Prop+Sem/Mal+Sg+Ela

LEXICON MARJA-obj Objects

Bern-Konvensjåvnnå # Even-syllable test examples:*
Bern-Konvensjåvnnå: Bern-Konvensjåvnnå+N+Prop+Sem/Obj+Sg+Nom
Bern-Konvensjåvnnåj: Bern-Konvensjåvnnå+N+Prop+Sem/Obj+Sg+Ill
Bern-Konvensjåvnås: Bern-Konvensjåvnnå+N+Prop+Sem/Obj+Sg+Ela

LEXICON MARJA-org Organizations

Domænadoajmma # Even-syllable test examples:*
Domænadoajmma: Domænadoajmma+N+Prop+Sem/Org+Sg+Nom
Domænadoajmmaj: Domænadoajmma+N+Prop+Sem/Org+Sg+Ill
Domænadåjmas: Domænadoajmma+N+Prop+Sem/Org+Sg+Ela

LEXICON MARJA-plc Vowel final names with Gradation and Ill change (place names)

Dundarevuobme # Even-syllable test examples:*
Dundarevuobme: Dundarevuobme+N+Prop+Sem/Plc+Sg+Nom
Dundarevuobmáj: Dundarevuobme+N+Prop+Sem/Plc+Sg+Ill
Dundarevuomes: Dundarevuobme+N+Prop+Sem/Plc+Sg+Ela

LEXICON MARJA-sur Surnames

Skálltje # Even-syllable test examples:*
Skálltje: Skálltje+N+Prop+Sem/Sur+Sg+Nom
Skálltjáj: Skálltje+N+Prop+Sem/Sur+Sg+Ill
Skáltjes: Skálltje+N+Prop+Sem/Sur+Sg+Ela

LEXICON MARJA-plc-der = place name derivations and corresponding flag. Presently not used in SMJ.

LEXICON SUOBMA-plc Placenames. Like MARJA but no derivation

Suobma # Even-syllable test examples:*
Suobma: Suobma+N+Prop+Sem/Plc+Sg+Nom
Suobmaj: Suobma+N+Prop+Sem/Plc+Sg+Ill
Suomas: Suobma+N+Prop+Sem/Plc+Sg+Ela

LEXICON SUOBMA-org Placenames. Like MARJA but no derivation

Stressed last syllable

These proper nouns are in essence partly assimilated loanwords. Foreign words with a stressed last syllable are assimilated to Sami by (often adapting the stressed syllable vowel, and) adding an unstressed syllable consisting of adapted (or if necessary added) consonants and ending in the vowel a (Morén-Duollja 2014). Proper nouns are only partly assimilated in that the stressed syllable vowel is not adapted in any way, nor are consonants inserted; only the final “a” remains. These proper nouns therefore work like regular a-stem nouns and get even-syllable case marking.

Words in lexicon NYSTØ end in a vowel, no cg. Non-assimilated stems

LEXICON NYSTØ-fem Female names

Britney # Even-syllable test examples:*
Britney: Britney+N+Prop+Sem/Fem+Sg+Nom
Britneyav: Britney+N+Prop+Sem/Fem+Sg+Acc
Britneyaj: Britney+N+Prop+Sem/Fem+Sg+Ill
Britneyas: Britney+N+Prop+Sem/Fem+Sg+Ela

LEXICON NYSTØ-mal Male name

Taneli # Even-syllable test examples:*
Taneli: Taneli+N+Prop+Sem/Mal+Sg+Nom
Taneliav: Taneli+N+Prop+Sem/Mal+Sg+Acc
Taneliaj: Taneli+N+Prop+Sem/Mal+Sg+Ill
Tanelias: Taneli+N+Prop+Sem/Mal+Sg+Ela

LEXICON NYSTØ-obj Objects

Infiniti # Even-syllable test examples:*
Infiniti: Infiniti+N+Prop+Sem/Obj+Sg+Nom
Infinitiav: Infiniti+N+Prop+Sem/Obj+Sg+Acc
Infinitiaj: Infiniti+N+Prop+Sem/Obj+Sg+Ill
Infinitias: Infiniti+N+Prop+Sem/Obj+Sg+Ela

LEXICON NYSTØ-org Organizations

Kulturby # Even-syllable test examples:*
Kulturby: Kulturby+N+Prop+Sem/Org+Sg+Nom
Kulturbyav: Kulturby+N+Prop+Sem/Org+Sg+Acc
Kulturbyaj: Kulturby+N+Prop+Sem/Org+Sg+Ill
Kulturbyas: Kulturby+N+Prop+Sem/Org+Sg+Ela

LEXICON NYSTØ-LOAN-org Organizations loan

Sameby # Even-syllable test examples:*
Sameby: Sameby+N+Prop+Sem/Org+Sg+Nom
★Samebyaj: Sameby+N+Prop+Sem/Org+Sg+Ill (is not standard language)
★Samebyas: Sameby+N+Prop+Sem/Org+Sg+Ela (is not standard language)

LEXICON NYSTØ-sur Surnames

Sandoz # Even-syllable test examples:*
Sandoz: Sandoz+N+Prop+Sem/Sur+Sg+Nom
Sandozav: Sandoz+N+Prop+Sem/Sur+Sg+Acc
Sandozaj: Sandoz+N+Prop+Sem/Sur+Sg+Ill
Sandozas: Sandoz+N+Prop+Sem/Sur+Sg+Ela
Teigmo # Even-syllable test examples:*
Teigmo: Teigmo+N+Prop+Sem/Plc+Sg+Nom
Teigmoav: Teigmo+N+Prop+Sem/Plc+Sg+Acc
Teigmoaj: Teigmo+N+Prop+Sem/Plc+Sg+Ill
Teigmoas: Teigmo+N+Prop+Sem/Plc+Sg+Ela

LEXICON NYSTØ-LOAN-plc Place names loan

Bodø # Even-syllable test examples:*
Bodø: Bodø+N+Prop+Sem/Plc+Sg+Nom
★Bodøaj: Bodø+N+Prop+Sem/Plc+Sg+Ill (is not standard language)
★Bodøas: Bodø+N+Prop+Sem/Plc+Sg+Ela (is not standard language)

LEXICON NYSTØ-plc Place names

Borgå # Even-syllable test examples:*
Borgå: Borgå+N+Prop+Sem/Plc+Sg+Nom
Borgåav: Borgå+N+Prop+Sem/Plc+Sg+Acc
Borgåaj: Borgå+N+Prop+Sem/Plc+Sg+Ill
Borgåas: Borgå+N+Prop+Sem/Plc+Sg+Ela

LEXICON NYSTØ_MWE-plc Place names

Words in DUBAI lexicon end in vowel+vowel and have no cg. Last syllable is stressed. They get even-syllable case marking. Non-assimilated stems. Not sure if this lexicon is necessary, at least for smj’s sake.

LEXICON DUBAI-fem I-final names. No cg. Female names

Mai # Even-syllable test examples:*
Mai: Mai+N+Prop+Sem/Fem+Sg+Nom
Maiaj: Mai+N+Prop+Sem/Fem+Sg+Ill
Maias: Mai+N+Prop+Sem/Fem+Sg+Ela

LEXICON DUBAI-obj I-final names. No cg. Object names

Hyundai # Even-syllable test examples:*
Hyundai: Hyundai+N+Prop+Sem/Obj+Sg+Nom
Hyundaiaj: Hyundai+N+Prop+Sem/Obj+Sg+Ill
Hyundaias: Hyundai+N+Prop+Sem/Obj+Sg+Ela

LEXICON DUBAI-org Organizations

Khoi # Even-syllable test examples:*
Khoi: Khoi+N+Prop+Sem/Org+Sg+Nom
Khoiaj: Khoi+N+Prop+Sem/Org+Sg+Ill
Khoias: Khoi+N+Prop+Sem/Org+Sg+Ela

LEXICON DUBAI-mal Male names

Kublai # Even-syllable test examples:*
Kublai: Kublai+N+Prop+Sem/Mal+Sg+Nom
Kublaiaj: Kublai+N+Prop+Sem/Mal+Sg+Ill
Kublaias: Kublai+N+Prop+Sem/Mal+Sg+Ela

LEXICON DUBAI-sur Surnames

Maarthai # Even-syllable test examples:*
Maarthai: Maarthai+N+Prop+Sem/Sur+Sg+Nom
Maarthaiaj: Maarthai+N+Prop+Sem/Sur+Sg+Ill
Maarthaias: Maarthai+N+Prop+Sem/Sur+Sg+Ela

LEXICON DUBAI-plc Place names

Madurai # Even-syllable test examples:*
Madurai: Madurai+N+Prop+Sem/Plc+Sg+Nom
Maduraiaj: Madurai+N+Prop+Sem/Plc+Sg+Ill
Maduraias: Madurai+N+Prop+Sem/Plc+Sg+Ela
maduraiak: Madurai+N+Prop+Sem/Plc+Der/k+N+Sg+Nom

Words in lexicon BERN end in a consonant, no cg, even-syllable case marking with -av, -aj, -as, etc. Last syllable is stressed. Both assimilated and non-assimilated stems.
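The BERN pattern amounts to adding a-initial suffixes to an unchanged consonant-final stem. The following Python sketch spells that out. The names `BERN_SUFFIXES` and `bern_forms` are assumptions made for this illustration, and the suffix table only contains the three endings named in the description above.

```python
# Suffixes named in the description above (acc -av, ill -aj, ela -as).
BERN_SUFFIXES = {"Sg+Acc": "av", "Sg+Ill": "aj", "Sg+Ela": "as"}


def bern_forms(stem: str) -> dict[str, str]:
    """Hypothetical helper: singular forms of a consonant-final BERN-type name."""
    forms = {"Sg+Nom": stem}
    for tag, suffix in BERN_SUFFIXES.items():
        # No consonant gradation and no stem change: plain concatenation.
        forms[tag] = stem + suffix
    return forms


for tag, form in bern_forms("Eystein").items():
    print(f"{form}: Eystein+N+Prop+Sem/Mal+{tag}")
```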

LEXICON BERN-ani Animals

Lillemor # Even-syllable test examples:*
Lillemor: Lillemor+N+Prop+Sem/Ani+Sg+Nom
Lillemoraj: Lillemor+N+Prop+Sem/Ani+Sg+Ill
Lillemoras: Lillemor+N+Prop+Sem/Ani+Sg+Ela

LEXICON BERN-mal Male names

Eystein # Even-syllable test examples:*
Eystein: Eystein+N+Prop+Sem/Mal+Sg+Nom
Eysteinaj: Eystein+N+Prop+Sem/Mal+Sg+Ill
Eysteinas: Eystein+N+Prop+Sem/Mal+Sg+Ela

LEXICON BERN-surmal Names that are both surnames and male names

Pipin # Even-syllable test examples:*
Pipin: Pipin+N+Prop+Sem/Sur+Sg+Nom
Pipinaj: Pipin+N+Prop+Sem/Sur+Sg+Ill
Pipinas: Pipin+N+Prop+Sem/Sur+Sg+Ela

LEXICON BERN-fem Female name

Ragnfrid # Even-syllable test examples:*
Ragnfrid: Ragnfrid+N+Prop+Sem/Fem+Sg+Nom
Ragnfridaj: Ragnfrid+N+Prop+Sem/Fem+Sg+Ill
Ragnfridas: Ragnfrid+N+Prop+Sem/Fem+Sg+Ela

Different lexicon for female persons. Audhild.

LEXICON BERN-sur Surnames

Lind # Even-syllable test examples:*
Lind: Lind+N+Prop+Sem/Sur+Sg+Nom
Lindaj: Lind+N+Prop+Sem/Sur+Sg+Ill
Lindas: Lind+N+Prop+Sem/Sur+Sg+Ela

LEXICON BERN-plc Placenames

Beijing # Even-syllable test examples:*
Beijing: Beijing+N+Prop+Sem/Plc+Sg+Nom
Beijingaj: Beijing+N+Prop+Sem/Plc+Sg+Ill
Beijingas: Beijing+N+Prop+Sem/Plc+Sg+Ela

LEXICON BERN_MWE-plc Placenames

LEXICON BERN-objsur Names used as both objects and surnames.

Stenbukk # Even-syllable test examples:*
Stenbukk: Stenbukk+N+Prop+Sem/Obj+Sg+Nom
Stenbukkaj: Stenbukk+N+Prop+Sem/Obj+Sg+Ill
Stenbukkas: Stenbukk+N+Prop+Sem/Obj+Sg+Ela

LEXICON BERN-orgsur Names used for both organizations and surnames.

Nord # Even-syllable test examples:*
Nord: Nord+N+Prop+Sem/Org+Sg+Nom
Nordaj: Nord+N+Prop+Sem/Org+Sg+Ill
Nordas: Nord+N+Prop+Sem/Org+Sg+Ela

LEXICON BERN-obj Objects. Obs: Different lexicon for organisations. Microsoft.

Sult # Even-syllable test examples:*
Sult: Sult+N+Prop+Sem/Obj+Sg+Nom
Sultaj: Sult+N+Prop+Sem/Obj+Sg+Ill
Sultas: Sult+N+Prop+Sem/Obj+Sg+Ela

LEXICON BERN-org Organizations

Laks # Even-syllable test examples:*
Laks: Laks+N+Prop+Sem/Org+Sg+Nom
Laksaj: Laks+N+Prop+Sem/Org+Sg+Ill
Laksas: Laks+N+Prop+Sem/Org+Sg+Ela

LEXICON BERN-LOAN-org Organizations loan.

Reinsamelag # Even-syllable test examples:*
Reinsamelag: Reinsamelag+N+Prop+Sem/Org+Sg+Nom
★Reinsamelagaj: Reinsamelag+N+Prop+Sem/Org+Sg+Ill (is not standard language)
★Reinsamelagas: Reinsamelag+N+Prop+Sem/Org+Sg+Ela (is not standard language)

LEXICON BERN-LOAN-plc Placenames loan.

Mehamn # Even-syllable test examples:*
Mehamn: Mehamn+N+Prop+Sem/Plc+Sg+Nom
★Mehamnaj: Mehamn+N+Prop+Sem/Plc+Sg+Ill (is not standard language)
★Mehamnas: Mehamn+N+Prop+Sem/Plc+Sg+Ela (is not standard language)

LEXICON BERN-LOAN-obj Objects loan.

Verneplan # Even-syllable test examples:*
Verneplan: Verneplan+N+Prop+Sem/Obj+Sg+Nom
★Verneplanaj: Verneplan+N+Prop+Sem/Obj+Sg+Ill (is not standard language)
★Verneplanas: Verneplan+N+Prop+Sem/Obj+Sg+Ela (is not standard language)

Different lexicon for names that are both surnames and places.

Lexicons OY work as BERN lexicons

Words in LONDONBERN are sent to both LONDON and BERN lexicons. Non-assmilated stems.

4-syllable stems

Words in lexicon BASUDIS are trisyllabic in sg nom, and work like standard 4-syllable nouns. They end in a consonant and have cg. Even-syllable case marking with acc -áv, ill -áj, ela -ás, etc. Real Lule Sami stems.

LEXICON BASUDIS-org Only singular. Placenames

LEXICON BASUDIS-mal Male names

Ájluhasj # 4-syllable test examples:*
Ájluhasj: Ájluhasj+N+Prop+Sem/Mal+Sg+Nom
Ájluhadtjaj: Ájluhasj+N+Prop+Sem/Mal+Sg+Ill (should add dummy to prevent unusual dtj-stem?)
Ájluhattjas: Ájluhasj+N+Prop+Sem/Mal+Sg+Ela

LEXICON BASUDIS-plc Place names

Ulldevis # 4-syllable test examples:*
Ulldevis: Ulldevis+N+Prop+Sem/Plc+Sg+Nom
Ulldevissaj: Ulldevis+N+Prop+Sem/Plc+Sg+Ill
Ulldevisás: Ulldevis+N+Prop+Sem/Plc+Sg+Ela

Plurals

Words in lexicon VARGGAT are even-syllable Sámi plurals.

LEXICON VARGGAT-plc Plural stems, sáme names. Place names

LEXICON VARGGAT-org Plural stems, sáme names.

Bieva # test examples:*
★Bievva: Bieva+N+Prop+Sem/Plc+Sg+Nom (is not standard language)
Bieva: Bieva+N+Prop+Sem/Plc+Pl+Nom
Bievajda: Bieva+N+Prop+Sem/Plc+Pl+Ill
Bievajs: Bieva+N+Prop+Sem/Plc+Pl+Ela

Words in lexicon ALEUHTAT are even-syllable assimilated plurals.

LEXICON ALEUHTAT-plc Plural names, not sami names. like -váre, -gårtje

Odd syllable case marking

Words in lexicon LONDON end in a consonant, no cg, case marking with -av, -ij, -is, etc. Last syllable is unstressed. They get regular odd-syllable case marking. Real Lule Sami stems, assimilated stems and non-assimilated stems.

LEXICON LONDON-sur Odd-syllable. Surnames. Final foot structure (X.) and (X..) => Loc:%>is

Åstot # Odd-syllable test examples:*
Åstot: Åstot+N+Prop+Sem/Sur+Sg+Nom
Åstodij: Åstot+N+Prop+Sem/Sur+Sg+Ill
Åstodis: Åstot+N+Prop+Sem/Sur+Sg+Ela

LEXICON LONDON-ani Animals

Jubmel # Odd-syllable test examples:*
Jubmel: Jubmel+N+Prop+Sem/Ani+Sg+Nom
Jubmelij: Jubmel+N+Prop+Sem/Ani+Sg+Ill
Jubmelis: Jubmel+N+Prop+Sem/Ani+Sg+Ela

LEXICON LONDON-org Only singular Organizations

Klassekampen # Odd-syllable test examples:*
Klassekampen: Klassekampen+N+Prop+Sem/Org+Sg+Nom
Klassekampenij: Klassekampen+N+Prop+Sem/Org+Sg+Ill
Klassekampenis: Klassekampen+N+Prop+Sem/Org+Sg+Ela

LEXICON LONDON-mal Male names

Mátteus # Odd-syllable test examples:*
Matteus: Matteus+N+Prop+Sem/Mal+Sg+Nom
Matteusij: Matteus+N+Prop+Sem/Mal+Sg+Ill
Matteusis: Matteus+N+Prop+Sem/Mal+Sg+Ela

LEXICON LONDON-malsur Names that can be both male- and surnames. Not used in smj-propernouns

Timeus # Odd-syllable test examples:*
Timeus: Timeus+N+Prop+Sem/Mal+Sg+Nom
Timeusij: Timeus+N+Prop+Sem/Mal+Sg+Ill
Timeusis: Timeus+N+Prop+Sem/Mal+Sg+Ela

LEXICON LONDON-fem Female names

Luhták # Odd-syllable test examples:*
Luhták: Luhták+N+Prop+Sem/Fem+Sg+Nom
Luhtágij: Luhták+N+Prop+Sem/Fem+Sg+Ill
Luhtágis: Luhták+N+Prop+Sem/Fem+Sg+Ela

LEXICON LONDON-malfem Names that can be both male and female names. Not used in smj-propernouns

Robin # Odd-syllable test examples:*
Robin: Robin+N+Prop+Sem/Fem+Sg+Nom
Robinij: Robin+N+Prop+Sem/Fem+Sg+Ill
Robinis: Robin+N+Prop+Sem/Fem+Sg+Ela

LEXICON LONDON-malplc Names that can be both male names and placenames. Not used in smj-propernouns

Jergol # Odd-syllable test examples:*
Jergol: Jergol+N+Prop+Sem/Mal+Sg+Nom
Jergolij: Jergol+N+Prop+Sem/Mal+Sg+Ill
Jergolis: Jergol+N+Prop+Sem/Mal+Sg+Ela

LEXICON LONDON-plc Only singular. Placenames

Njierek # Odd-syllable test examples:*
Njierek: Njierek+N+Prop+Sem/Plc+Sg+Nom
Njieregij: Njierek+N+Prop+Sem/Plc+Sg+Ill
Njieregis: Njierek+N+Prop+Sem/Plc+Sg+Ela

LEXICON TJIERREK-plc Only singular. Placenames. Without cg. Same as LONDON, but does not get Sem/Sur tag, since it is not usual for SMJ place names to become surnames.

Njierek # Odd-syllable test examples:*
Njierek: Njierek+N+Prop+Sem/Plc+Sg+Nom
Njieregij: Njierek+N+Prop+Sem/Plc+Sg+Ill
Njieregis: Njierek+N+Prop+Sem/Plc+Sg+Ela

LEXICON DILLÁRASJ-fem Only singular. Names with odd-syll diminutive.

LEXICON DILLÁRASJ-mal Only singular. Names with odd-syll diminutive.

LEXICON LONDON-orgsur Names that can be both organizations and surnames. Not used in smj-propernouns

Rieser # Odd-syllable test examples:*
Rieser: Rieser+N+Prop+Sem/Sur+Sg+Nom
Rieserij: Rieser+N+Prop+Sem/Sur+Sg+Ill
Rieseris: Rieser+N+Prop+Sem/Sur+Sg+Ela

LEXICON LONDON-obj Objects.

Rovdjursutredningen # Odd-syllable test examples:*
Rovdjursutredningen: Rovdjursutredningen+N+Prop+Sem/Obj+Sg+Nom
Rovdjursutredningenij: Rovdjursutredningen+N+Prop+Sem/Obj+Sg+Ill
Rovdjursutredningenis: Rovdjursutredningen+N+Prop+Sem/Obj+Sg+Ela

LEXICON LONDON-LOAN-obj Objects loan. Not used in smj-propernouns

Sameloven # Odd-syllable test examples:*
Sameloven: Sameloven+N+Prop+Sem/Obj+Sg+Nom
★Samelovenij: Sameloven+N+Prop+Sem/Obj+Sg+Ill (is not standard language)
★Samelovenis: Sameloven+N+Prop+Sem/Obj+Sg+Ela (is not standard language)

LEXICON LONDON-LOAN-plc Only nominatives. Placenames loan. Not used in Smj-propernouns

Jordandalen # Odd-syllable test examples:*
Jordandalen: Jordandalen+N+Prop+Sem/Plc+Sg+Nom
★Jordandalenij: Jordandalen+N+Prop+Sem/Plc+Sg+Ill (is not standard language)
★Jordandalenis: Jordandalen+N+Prop+Sem/Plc+Sg+Ela (is not standard language)

LEXICON LONDON-LOAN-org Only nominative. Organizations loan. Not used in smj-propernouns

Samfunnsavdelingen # Odd-syllable test examples:*
Samfunnsavdelingen: Samfunnsavdelingen+N+Prop+Sem/Org+Sg+Nom
★Samfunnsavdelingenij: Samfunnsavdelingen+N+Prop+Sem/Org+Sg+Ill (is not standard language)
★Samfunnsavdelingenis: Samfunnsavdelingen+N+Prop+Sem/Org+Sg+Ela (is not standard language)

JOKULL-plc are placenames. Lexicon added to make the code compile (?)

+N+Prop+Sem/Plc: LONDONDECL-PLC-SUR ; Placenames. NB added to make the code compile, needs revision. Gets an odd syllable case marking. Non-assimilated stems.

Drangajökull # Even-syllable test examples:*
Drangajökull: Drangajökull+N+Prop+Sem/Plc+Sg+Nom
Drangajökullij: Drangajökull+N+Prop+Sem/Plc+Sg+Ill
Drangajökullis: Drangajökull+N+Prop+Sem/Plc+Sg+Ela

Words in lexicon ANAR end in a consonant, no cg, case marking with ill -ij, ela -is. They get odd-syllable case marking. Lule Sami stems.

LEXICON ANAR-mal Male names.

LEXICON ANAR-plc Place names

Guhttás # Odd-syllable test examples:*
Guhttás: Guhttás+N+Prop+Sem/Plc+Sg+Nom
Guhttásij: Guhttás+N+Prop+Sem/Plc+Sg+Ill
Guhttásis: Guhttás+N+Prop+Sem/Plc+Sg+Ela

Words in PIPPI lexicons are i-final, have no cg, no second syllable vowel change, and get odd syllable case marking with acc -hav, ill -hij, elat -his, etc. Works as “riebij”, but without the -j in nominative (it should maybe be Sirij and Pippij in nom?) and without cg. The last syllable is unstressed. Non-assimilated stems.
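To make the PIPPI pattern concrete, the Python sketch below generates test lines in the same `form: lemma+tags` layout used in this file, with the h-initial suffixes named above. The names `PIPPI_SUFFIXES` and `pippi_test_lines` are hypothetical and introduced only for this illustration; the suffix list is limited to the endings mentioned in the description.

```python
# Endings named in the description above (acc -hav, ill -hij, ela -his).
PIPPI_SUFFIXES = [
    ("Sg+Nom", ""),
    ("Sg+Acc", "hav"),
    ("Sg+Ill", "hij"),
    ("Sg+Ela", "his"),
]


def pippi_test_lines(lemma: str, sem: str) -> list[str]:
    """Hypothetical helper: build 'form: analysis' lines for an i-final name."""
    return [
        # The stem is unchanged (no cg, no vowel change); only the suffix is added.
        f"{lemma}{suffix}: {lemma}+N+Prop+Sem/{sem}+{tag}"
        for tag, suffix in PIPPI_SUFFIXES
    ]


for line in pippi_test_lines("Audi", "Obj"):
    print(line)
```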

LEXICON PIPPI-ani Vowel-final names where case endings are added directly, no cg. Animals.

Rullahuuli # Even-syllable test examples:*
Rullahuuli: Rullahuuli+N+Prop+Sem/Ani+Sg+Nom
Rullahuulihij: Rullahuuli+N+Prop+Sem/Ani+Sg+Ill
Rullahuulihis: Rullahuuli+N+Prop+Sem/Ani+Sg+Ela

LEXICON PIPPI-obj Vowel-final names where case endings are added directly, no cg. Object names

Audi # Even-syllable test examples:*
Audi: Audi+N+Prop+Sem/Obj+Sg+Nom
Audihij: Audi+N+Prop+Sem/Obj+Sg+Ill
Audihis: Audi+N+Prop+Sem/Obj+Sg+Ela

LEXICON PIPPI-org Vowel-final names where case endings are added directly, no cg. Organizations

Kon-Tiki # Even-syllable test examples:*
Kon-Tiki: Kon-Tiki+N+Prop+Sem/Org+Sg+Nom
Kon-Tikihij: Kon-Tiki+N+Prop+Sem/Org+Sg+Ill
Kon-Tikihis: Kon-Tiki+N+Prop+Sem/Org+Sg+Ela

LEXICON PIPPI-mal Vowel-final names where case endings are added directly, no cg. Male names

Gianni # Even-syllable test examples:*
Gianni: Gianni+N+Prop+Sem/Mal+Sg+Nom
Giannihij: Gianni+N+Prop+Sem/Mal+Sg+Ill
Giannihis: Gianni+N+Prop+Sem/Mal+Sg+Ela

LEXICON PIPPI-fem Vowel-final names where case endings are added directly, no cg. Female names

Guri # Even-syllable test examples:*
Guri: Guri+N+Prop+Sem/Fem+Sg+Nom
Gurihij: Guri+N+Prop+Sem/Fem+Sg+Ill
Gurihis: Guri+N+Prop+Sem/Fem+Sg+Ela

LEXICON PIPPI-femsur Vowel-final names where case endings are added directly, no cg. Female names also used as surnames

Turi # Even-syllable test examples:*
Turi: Turi+N+Prop+Sem/Fem+Sg+Nom
Turihij: Turi+N+Prop+Sem/Fem+Sg+Ill
Turihij: Turi+N+Prop+Sem/Sur+Sg+Ill
Turihis: Turi+N+Prop+Sem/Fem+Sg+Ela

LEXICON PIPPI-malfem Vowel-final names where case endings are added directly, no cg. Names that can be both female and male names

Kari # Even-syllable test examples:*
Kari: Kari+N+Prop+Sem/Mal+Sg+Nom
Kari: Kari+N+Prop+Sem/Fem+Sg+Nom
Karihij: Kari+N+Prop+Sem/Mal+Sg+Ill
Karihis: Kari+N+Prop+Sem/Mal+Sg+Ela

LEXICON PIPPI-sur Vowel-final names where case endings are added directly, no cg. Surnames

Sammallahti # Even-syllable test examples:*
Sammallahti: Sammallahti+N+Prop+Sem/Sur+Sg+Nom
Sammallahtihij: Sammallahti+N+Prop+Sem/Sur+Sg+Ill
Sammallahtihis: Sammallahti+N+Prop+Sem/Sur+Sg+Ela

LEXICON PIPPI-plc Vowel-final names where case endings are added directly, no cg. Place names

Lapinlampi # Even-syllable test examples:*
Lapinlampi: Lapinlampi+N+Prop+Sem/Plc+Sg+Nom
Lapinlampihij: Lapinlampi+N+Prop+Sem/Plc+Sg+Ill
Lapinlampihis: Lapinlampi+N+Prop+Sem/Plc+Sg+Ela

LEXICON PIPPI-LOAN-plc Only nominatives. Vowel-final names where case endings are added directly, no cg. Place names

Haltiatunturi # Even-syllable test examples:*
Haltiatunturi: Haltiatunturi+N+Prop+Sem/Plc+Sg+Nom
★Haltiatunturijij: Haltiatunturi+N+Prop+Sem/Plc+Sg+Ill (is not standard language)
★Haltiatunturijis: Haltiatunturi+N+Prop+Sem/Plc+Sg+Ela (is not standard language)

Words in lexicon DUORTNUS end in a consonant, have cg and second syllable vowel change o:u, e:á. Odd-syllable case marking. Real Lule Sami or one non-assimilated stem.

LEXICON DUORTNUS-mal Male names

Mihkal # Odd-syllable test examples:*
Mihkal: Mihkal+N+Prop+Sem/Mal+Sg+Nom
Mihkkalij: Mihkal+N+Prop+Sem/Mal+Sg+Ill
Mihkkalis: Mihkal+N+Prop+Sem/Mal+Sg+Ela

LEXICON DUORTNUS-sur Surnames

Vándar # Odd-syllable test examples:*
Vándar: Vándar+N+Prop+Sem/Sur+Sg+Nom
Vánndarij: Vándar+N+Prop+Sem/Sur+Sg+Ill
Vánndaris: Vándar+N+Prop+Sem/Sur+Sg+Ela

LEXICON DUORTNUS-org Odd-syllable ending in a consonant, with cg. Organizations

LEXICON DUORTNUS-plc Odd-syllable ending in a consonant, with cg. Placenames

Hardangerduottar # Odd-syllable test examples:*
Hardangerduottar: Hardangerduottar+N+Prop+Sem/Plc+Sg+Nom
Hardangerduoddara: Hardangerduottar+N+Prop+Sem/Plc+Sg+Gen
Hardangerduoddarij: Hardangerduottar+N+Prop+Sem/Plc+Sg+Ill
Hardangerduoddaris: Hardangerduottar+N+Prop+Sem/Plc+Sg+Ela

LEXICON TIEMPEL-obj Same as DUORTNUS, only without second-syllable vowel change. Odd-syllable case marking. Lexicon presently only for two -tiempel-final words. Lule Sami stems.

Artemistiempel # Odd-syllable test examples:*
Artemistiempel: Artemistiempel+N+Prop+Sem/Obj+Sg+Nom
Artemistiemmpelij: Artemistiempel+N+Prop+Sem/Obj+Sg+Ill
Artemistiemmpelin: Artemistiempel+N+Prop+Sem/Obj+Sg+Ine
Artemistiemmpelis: Artemistiempel+N+Prop+Sem/Obj+Sg+Ela

LEXICON TIEMPEL-org Same as DUORTNUS, only without second-syllable vowel change. Odd-syllable case marking. Lexicon presently only for two -tiempel-final words. Lule Sami stems.

Samovarteáhtar # Odd-syllable test examples:*
Samovarteáhtar: Samovarteáhtar+N+Prop+Sem/Org+Sg+Nom
Samovarteáhttarij: Samovarteáhtar+N+Prop+Sem/Org+Sg+Ill
Samovarteáhttarin: Samovarteáhtar+N+Prop+Sem/Org+Sg+Ine
Samovarteáhttaris: Samovarteáhtar+N+Prop+Sem/Org+Sg+Ela

Lexicon HEANDARAT is not in use in smj

+Pl+Nom:aQ1 K ; +Pl+Gen:aQ1j K ; +Pl+Gen:aQ1j RHyph ; +Pl+Acc:aQ1jt K ; +Pl+Ill:aQ1jda K ; +Pl+Ine:aQ1jn K ; +Pl+Ela:aQ1js K ; +Pl+Com:aQ1j K ;

Words in lexicon EATNAMAT are odd-syllable plurals. Lule sami stems and non-assimilated stems.

LEXICON EATNAMAT-plc Place names. Presently only for Vuolleednama

Vuolleednama # Contracted test examples:*
★Vuolleednam: Vuolleednama+N+Prop+Sem/Plc+Sg+Nom (is not standard language)
Vuolleednama: Vuolleednama+N+Prop+Sem/Plc+Pl+Nom
Vuolleednamijda: Vuolleednama+N+Prop+Sem/Plc+Pl+Ill
Vuolleednamijs: Vuolleednama+N+Prop+Sem/Plc+Pl+Ela

LEXICON EATNAMAT-org Organizations

Contracted proper nouns

Words in lexicon DAVVISUOLLU are contracted proper nouns ending on -åj/-oj. Lule Sami stems.

LEXICON DAVVISUOLU-plc Contracted stems ending on -oj. Place names.

Victoriasuoloj # Contracted test examples:*
Victoriasuoloj: Victoriasuoloj+N+Prop+Sem/Plc+Sg+Nom
Victoriasuolluj: Victoriasuoloj+N+Prop+Sem/Plc+Sg+Ill
Victoriasuollus: Victoriasuoloj+N+Prop+Sem/Plc+Sg+Ela

Words in lexicon GEAVNNIS are contracted proper nouns ending on -s.

LEXICON GEAVNNIS-plc Contracted stems ending on -es. Place names. Lule sami stems.

Gaza-Sárges # Contracted test examples:*
Gaza-Sárges: Gaza-Sárges+N+Prop+Sem/Plc+Sg+Nom
Gaza-Sárggáj: Gaza-Sárges+N+Prop+Sem/Plc+Sg+Ill
Gaza-Sárggás: Gaza-Sárges+N+Prop+Sem/Plc+Sg+Ela

Words in lexicon SUOLLOT are contracted plurals. Lule sami stems.

LEXICON SULLOT-plc Plural names, only names ending on -suollu.

Falklandsuollu # test examples:*
★Falklandsuoloj: Falklandsuollu+N+Prop+Sem/Plc+Sg+Nom (is not standard language)
Falklandsuollu: Falklandsuollu+N+Prop+Sem/Plc+Pl+Nom
Falklandsuollujda: Falklandsuollu+N+Prop+Sem/Plc+Pl+Ill
Falklandsuollujs: Falklandsuollu+N+Prop+Sem/Plc+Pl+Ela

Lexicons only used in sme/sma and that are sent to other lexicons in smj

ERVASTI is only used in smi-propernouns. Ervasti names are 3-syllable and are needed as a separate lexicon because of sma. ERVASTI is the same as ACCRA in smj and gets even-syllable case marking.

MAKI and NIEMI are only used in smi-propernouns. Maki names are even-syllable Finnish names and are needed as a separate lexicon because of sma. MÄKI is the same as ACCRA in smj and gets even-syllable case marking.

HANNOLA is the same as ACCRA

This (part of) documentation was generated from src/fst/morphology/affixes/propernouns.lexc

src-fst-morphology-affixes-symbols.lexc.md

Symbol affixes

This (part of) documentation was generated from src/fst/morphology/affixes/symbols.lexc

src-fst-morphology-affixes-verbs.lexc.md

Sublexica for Verb

Table of contents:

The auxiliaries
Negation verb
Copula
Modals
Main verbs
Even syllable stems
Intransitives
Transitives
Odd syllable stems
Intransitives
Transitives
Contracted stems
Intransitives
Transitives
Assimilated loan verbs
Intransitives
Transitives
Main inflectional categories
Suffix lexica
Verb derivation

IV means intransitive verbs, TV means transitive verbs.

Auxiliary verbs

Negation verb

LEXICON NEG

Copula

LEXICON ÅRROT

LEXICON LIEHKET

LEXICON LULU

Modals

LEXICON GALGGAT_IV even-syllable modal verbs.

soajttet # Even-syllable test examples:*
soajtáv: soajttet+V+IV+Ind+Prs+Sg1
soajttiv: soajttet+V+IV+Ind+Prt+Sg1
såjtijma: soajttet+V+IV+Ind+Prt+Pl1

LEXICON VIERTTIT_IV Contracted modal verbs.

hæhttut # Even-syllable test examples:*
hæhttut: hæhttut+V+IV+Inf
hähttut: hæhttut+V+IV+Inf
hæhttuv: hæhttut+V+IV+Ind+Prs+Sg1
hähttuv: hæhttut+V+IV+Ind+Prs+Sg1
hæhttujma: hæhttut+V+IV+Ind+Prt+Pl1
hähttujma: hæhttut+V+IV+Ind+Prt+Pl1
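
The listing above gives each analysis twice, once for the æ spelling and once for the tolerated ä spelling. As a rough illustration, the sketch below groups surface forms by their analysis so that such spelling variants end up side by side. The function name `group_by_analysis` is an assumption made for this example only.

```python
from collections import defaultdict

# A few of the test lines listed above, copied verbatim.
LINES = [
    "hæhttut: hæhttut+V+IV+Inf",
    "hähttut: hæhttut+V+IV+Inf",
    "hæhttuv: hæhttut+V+IV+Ind+Prs+Sg1",
    "hähttuv: hæhttut+V+IV+Ind+Prs+Sg1",
]


def group_by_analysis(lines):
    """Map each analysis string to the surface forms that share it."""
    groups = defaultdict(list)
    for line in lines:
        surface, _, analysis = line.partition(":")
        groups[analysis.strip()].append(surface.strip())
    return dict(groups)


groups = group_by_analysis(LINES)
for analysis, forms in groups.items():
    print(analysis, "<-", ", ".join(forms))
```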

Ordinary main verbs

Even-syllable stems

Intransitives

LEXICON GALSSJOT_IV Impersonal o-verbs

hærmmot # Even-syllable test examples:*
hærmmu: hærmmot+V+IV+Ind+Prs+Sg3
härmmu: hærmmot+V+IV+Ind+Prs+Sg3
hærmoj: hærmmot+V+IV+Ind+Prt+Sg3
härmoj: hærmmot+V+IV+Ind+Prt+Sg3

LEXICON BÅRSSJOT_IV o-verbs with

hæssot # Even-syllable test examples:*
hæsov: hæssot+V+IV+Ind+Prs+Sg1
häsov: hæssot+V+IV+Ind+Prs+Sg1
hessuv: hæssot+V+IV+Ind+Prt+Sg1
hæsojma: hæssot+V+IV+Ind+Prt+Pl1
häsojma: hæssot+V+IV+Ind+Prt+Pl1

LEXICON VILSSJOT_IV o-verbs as BÅRSSJOT but without derivations -stit, -stallat, -stahttet, -stasstet. With dim -astit that are hardcoded

libjjot # Even-syllable test examples:*
libjov: libjjot+V+IV+Ind+Prs+Sg1
libjjuv: libjjot+V+IV+Ind+Prt+Sg1
libjojma: libjjot+V+IV+Ind+Prt+Pl1

LEXICON BUOLLET_IV e-verbs

liddet # Even-syllable test examples:*
littáv: liddet+V+IV+Ind+Prs+Sg1
liddiv: liddet+V+IV+Ind+Prt+Sg1
littijma: liddet+V+IV+Ind+Prt+Pl1

LEXICON BOAHTET_IV e-verbs like BUOLLET_IV without passive

boahtet # Even-syllable test examples:*
boadáv: boahtet+V+IV+Ind+Prs+Sg1
båhtiv: boahtet+V+IV+Ind+Prt+Sg1
bådijma: boahtet+V+IV+Ind+Prt+Pl1

LEXICON VIEDJET_IV e-verbs GRADE II-I WITH IE DIPHT.

biehket # Even-syllable test examples:*
bægáv: biehket+V+IV+Ind+Prs+Sg1
bägáv: biehket+V+IV+Ind+Prs+Sg1
biehkiv: biehket+V+IV+Ind+Prt+Sg1
biegijma: biehket+V+IV+Ind+Prt+Pl1

LEXICON ASSTAT_IV only for asstat, no passive

asstat # Even-syllable test examples:*
astav: asstat+V+IV+Ind+Prs+Sg1
asstiv: asstat+V+IV+Ind+Prt+Sg1
astajma: asstat+V+IV+Ind+Prt+Pl1

LEXICON RAVGGAT_IV a- and å-verbs only Sg3 passive.

bivvat # Even-syllable test examples:*
biváv: bivvat+V+IV+Ind+Prs+Sg1
bivviv: bivvat+V+IV+Ind+Prt+Sg1
bivájma: bivvat+V+IV+Ind+Prt+Pl1

LEXICON BIEGGAT_IV Impersonals

dednjat # Even-syllable test examples:*
dædnjá: dednjat+V+IV+Ind+Prs+Sg3
dädnjá: dednjat+V+IV+Ind+Prs+Sg3
denjaj: dednjat+V+IV+Ind+Prt+Sg3

LEXICON RAVGGALASSTET_IV Like RAVGGAT for already derived words (except words ending -uššat) - no actio as first part of compounds, but reintroduced

dehpudallat # Even-syllable test examples:*
dehpudaláv: dehpudallat+V+IV+Ind+Prs+Sg1
dehpudalliv: dehpudallat+V+IV+Ind+Prt+Sg1
dehpudalájma: dehpudallat+V+IV+Ind+Prt+Pl1

LEXICON BIEKKASTALLAT_IV Already derived impersonals

duhpárasstet # Even-syllable test examples:*
duhpárasstá: duhpárasstet+V+IV+Ind+Prs+Sg3
duhpárastij: duhpárasstet+V+IV+Ind+Prt+Sg3

LEXICON GUOTTEDALLAT_IV passives on -allat - no actio as first part of compounds, but reintroduced

duolmudallat # Even-syllable test examples:*
duolmudaláv: duolmudallat+V+IV+Ind+Prs+Sg1
duolmudalliv: duolmudallat+V+IV+Ind+Prt+Sg1
duolmudalájma: duolmudallat+V+IV+Ind+Prt+Pl1

LEXICON HIEBADUVVAT_IV passives on -uvvat - no actio as first part of compounds, but reintroduced

duostoduvvat # Even-syllable test examples:*
duostoduváv: duostoduvvat+V+IV+Ind+Prs+Sg1
duostoduvviv: duostoduvvat+V+IV+Ind+Prt+Sg1
duostoduvájma: duostoduvvat+V+IV+Ind+Prt+Pl1

Transitives

LEXICON MÁHTTET_TV verbs without personal passive

jáhkket # Even-syllable test examples:*
jáhkáv: jáhkket+V+TV+Ind+Prs+Sg1
jáhkkiv: jáhkket+V+TV+Ind+Prt+Sg1
jáhkijma: jáhkket+V+TV+Ind+Prt+Pl1

LEXICON BASSAT_TV a- and å-verbs. Three passives

jåksåt # Even-syllable test examples:*
jåvsåv: jåksåt+V+TV+Ind+Prs+Sg1
jåksiv: jåksåt+V+TV+Ind+Prt+Sg1
jåvsåjma: jåksåt+V+TV+Ind+Prt+Pl1

LEXICON BASSALASSTET_TV Like BASSAT for already derived words (except words ending -uššat) - no actio as first part of compounds, but reintroduced. Three passives

jårgudallat # Even-syllable test examples:*
jårgudaláv: jårgudallat+V+TV+Ind+Prs+Sg1
jårgudalliv: jårgudallat+V+TV+Ind+Prt+Sg1
jårgudalájma: jårgudallat+V+TV+Ind+Prt+Pl1

LEXICON HIEJTEDAHTTET_TV Like BASSALASSTET_TV, but for words ending in -ahttet. The difference is Use/NG and Use/-Spell for NomAg “hiejedahttijn”, since this is rarely used and is mixed up with the gerund “hiejtedattijn”. Like BASSAT for already derived words (except words ending -uššat) - no actio as first part of compounds, but reintroduced. Three passives

jårgudallat # Even-syllable test examples:*
jårgudaláv: jårgudallat+V+TV+Ind+Prs+Sg1
jårgudalliv: jårgudallat+V+TV+Ind+Prt+Sg1
jårgudalájma: jårgudallat+V+TV+Ind+Prt+Pl1

LEXICON JUHKAT_TV a-verbs like BASSAT_TV but without derivations -stit, -stallat, -stahttet, -stasstet. Dim -istit that are hardcoded. Three passives

njammat # Even-syllable test examples:*
njamáv: njammat+V+TV+Ind+Prs+Sg1
njammiv: njammat+V+TV+Ind+Prt+Sg1
njamájma: njammat+V+TV+Ind+Prt+Pl1

LEXICON LÁHPPET_TV e-verbs. Three passives

oajttet # Even-syllable test examples:*
oajtáv: oajttet+V+TV+Ind+Prs+Sg1
oajttiv: oajttet+V+TV+Ind+Prt+Sg1
åjtijma: oajttet+V+TV+Ind+Prt+Pl1

LEXICON JIEHKET_TV e-verbs GRADE II-I WITH IE DIPHT. Three passives

sievvet # Even-syllable test examples:*
sæváv: sievvet+V+TV+Ind+Prs+Sg1
säváv: sievvet+V+TV+Ind+Prs+Sg1
sievviv: sievvet+V+TV+Ind+Prt+Sg1
sievijma: sievvet+V+TV+Ind+Prt+Pl1

LEXICON DIEHTET_TV Only this one word, unusual diphthong behavior. No passive

diehtet # Even-syllable test examples:*
diedáv: diehtet+V+TV+Ind+Prs+Sg1
diehtiv: diehtet+V+TV+Ind+Prt+Sg1
diedijma: diehtet+V+TV+Ind+Prt+Pl1

LEXICON GÁDJOT_TV o-verbs. only duvvat passive.

sjpædtjot # Even-syllable test examples:*
sjpættjov: sjpædtjot+V+TV+Ind+Prs+Sg1
sjpättjov: sjpædtjot+V+TV+Ind+Prs+Sg1
sjpædtjuv: sjpædtjot+V+TV+Ind+Prt+Sg1
sjpädtjuv: sjpædtjot+V+TV+Ind+Prt+Sg1
sjpættjojma: sjpædtjot+V+TV+Ind+Prt+Pl1
sjpättjojma: sjpædtjot+V+TV+Ind+Prt+Pl1

LEXICON JÅRGGOT_TV o-verbs with dim -astit that are hardcoded. Duvvat and dallat passive.

boarkkot # Even-syllable test examples:*
boarkov: boarkkot+V+TV+Ind+Prs+Sg1
boarkkuv: boarkkot+V+TV+Ind+Prt+Sg1
boarkojma: boarkkot+V+TV+Ind+Prt+Pl1

Odd-syllable stems

This is just awaiting a manual classification

Intransitives

LEXICON BIEKKASTIT_IV Impersonals, only Sg3

LEXICON JÅRGESTIT_IV ONLY FOR -STIT verbs, makes Err/Orth jårgest, and with other verbs in this lexicon they will get Err/Orth Prs Sg3 even if it’s the same as the correct Prs Sg3

doalvestit # Odd-syllable test examples:*
doalvestav: doalvestit+V+IV+Ind+Prs+Sg1
doalvestiv: doalvestit+V+IV+Ind+Prt+Sg1
doalvestijma: doalvestit+V+IV+Ind+Prt+Pl1

LEXICON MÅRIJDIT_IV ONLY FOR Words ending -IJDIT. Same as BEGATJIT, but a common error is to write “jårgidit”, so the whole -ijdit and Err/Orth -idit is made in this lexicon

LEXICON BEGATJIT_IV Words ending -tjit, -jdit, reciprocals on -dit, momentatives on -dit, -edit, continuatives on -ldit, -nit, essives on -hit and 5-syllables - no actio cmps, but only Sg3 passive reintroduced

duojkkuhit # Odd-syllable test examples:*
duojkkuhav: duojkkuhit+V+IV+Ind+Prs+Sg1
duojkkuhiv: duojkkuhit+V+IV+Ind+Prt+Sg1
duojkkuhijma: duojkkuhit+V+IV+Ind+Prt+Pl1

LEXICON VÁNDARDIT_IV At the moment IV

vándardit # Odd-syllable test examples:*
vándardav: vándardit+V+IV+Ind+Prs+Sg1
vándardiv: vándardit+V+IV+Ind+Prt+Sg1
vándardijma: vándardit+V+IV+Ind+Prt+Pl1

LEXICON BALÁDIT_IV continuatives on -dit, frequentatives on -odit, reciprocals, momentatives and frequentatives ending -alit - actio cmps, only Sg3 passive

lihtudit # Odd-syllable test examples:*
lihtudav: lihtudit+V+IV+Ind+Prs+Sg1
lihtudiv: lihtudit+V+IV+Ind+Prt+Sg1
lihtudijma: lihtudit+V+IV+Ind+Prt+Pl1

LEXICON SUOGNALIT_IV Trisyllabic Verbs ending -lit. only Sg3 passive

loavkkalit # Odd-syllable test examples:*
loavkkalav: loavkkalit+V+IV+Ind+Prs+Sg1
loavkkaliv: loavkkalit+V+IV+Ind+Prt+Sg1
loavkkalijma: loavkkalit+V+IV+Ind+Prt+Pl1

LEXICON LASSÁNIT_IV verbs ending -nit, -sit, no passive

rievddánit # Odd-syllable test examples:*
rievddánav: rievddánit+V+IV+Ind+Prs+Sg1
rievddániv: rievddánit+V+IV+Ind+Prt+Sg1
rievddánijma: rievddánit+V+IV+Ind+Prt+Pl1

LEXICON BÁHTARIT_IV verbs ending -rit. only Sg3 passive

sjtávttjurit # Odd-syllable test examples:*
sjtávttjurav: sjtávttjurit+V+IV+Ind+Prs+Sg1
sjtávttjuriv: sjtávttjurit+V+IV+Ind+Prt+Sg1
sjtávttjurijma: sjtávttjurit+V+IV+Ind+Prt+Pl1

Transitives

LEXICON FÁRMASTIT_TV ONLY FOR verbs ending in -stit. Makes Err/Orth jårgest, and with other verbs in this lexicon they will get Err/Orth Prs Sg3 even if it’s the same as the correct Prs Sg3. All -uvvat passives.

LEXICON HÁLIJDIT_TV ONLY FOR Words ending -IJDIT. Same as MUJTATJIT, but a common error is to write “hálidit”, so the whole -ijdit and Err/Orth -idit is made in this lexicon

LEXICON UNNEDIT_TV All -uvvat passives.

nuoledit # Odd-syllable test examples:*
nuoledav: nuoledit+V+TV+Ind+Prs+Sg1
nuolediv: nuoledit+V+TV+Ind+Prt+Sg1
nuoledijma: nuoledit+V+TV+Ind+Prt+Pl1

LEXICON MUJTATJIT_TV Words ending -tjit, -jdit, reciprocals on -dit, momentatives on -dit, -edit, continuatives on -ldit, -nit, essives on -hit and 5-syllables - no actio cmps, but reintroduced. All -uvvat passives

nårddådit # Odd-syllable test examples:*
nårddådav: nårddådit+V+TV+Ind+Prs+Sg1
nårddådiv: nårddådit+V+TV+Ind+Prt+Sg1
nårddådijma: nårddådit+V+TV+Ind+Prt+Pl1

LEXICON BÅNJÅDIT_TV continuatives on -dit, frequentatives on -odit, reciprocals, momentatives and frequentatives ending -alit - actio cmps. All -uvvat passives.

tsirggalit # Odd-syllable test examples:*
tsirggalav: tsirggalit+V+TV+Ind+Prs+Sg1
tsirggaliv: tsirggalit+V+TV+Ind+Prt+Sg1
tsirggalijma: tsirggalit+V+TV+Ind+Prt+Pl1

LEXICON VUORDDELIT_TV Trisyllabic Verbs ending -lit. All -uvvat passives

tsåggålit # Odd-syllable test examples:*
tsåggålav: tsåggålit+V+TV+Ind+Prs+Sg1
tsåggåliv: tsåggålit+V+TV+Ind+Prt+Sg1
tsåggålijma: tsåggålit+V+TV+Ind+Prt+Pl1

Contracted stems

LEXICON SJIERRIT_IV Impersonals

boavddit # Contracted test examples:*
boavddi: boavddit+V+IV+Ind+Prs+Sg3
boavddij: boavddit+V+IV+Ind+Prt+Sg3

LEXICON BASSUT_IV Passives

buvvut # Contracted test examples:*
buvvuv: buvvut+V+IV+Ind+Prs+Sg1
buvvujiv: buvvut+V+IV+Ind+Prt+Sg1
buvvujma: buvvut+V+IV+Ind+Prt+Pl1

LEXICON OADDÁT_IV Inceptive (doarrut, jåhttåt). Only Sg3 passive. Does not make nouns via -ár derivation.

bæhkkát # Contracted test examples:*
bæhkkáv: bæhkkát+V+IV+Ind+Prs+Sg1
bähkkáv: bæhkkát+V+IV+Ind+Prs+Sg1
bæhkkájiv: bæhkkát+V+IV+Ind+Prt+Sg1
bähkkájiv: bæhkkát+V+IV+Ind+Prt+Sg1
bæhkkájma: bæhkkát+V+IV+Ind+Prt+Pl1
bähkkájma: bæhkkát+V+IV+Ind+Prt+Pl1

LEXICON DULLUT_IV Does not make nouns via -ár derivation. Only Sg3 passive.

dussut # Contracted test examples:*
dussuv: dussut+V+IV+Ind+Prs+Sg1
dussujiv: dussut+V+IV+Ind+Prt+Sg1
dussujma: dussut+V+IV+Ind+Prt+Pl1

LEXICON TJUOLLÁT_TV Inceptive. All passives. Does not make nouns via -ár derivation (gullát, bårråt).

gajkkát # Contracted test examples:*
gajkkáv: gajkkát+V+TV+Ind+Prs+Sg1
gajkkájiv: gajkkát+V+TV+Ind+Prt+Sg1
gajkkájma: gajkkát+V+TV+Ind+Prt+Pl1

LEXICON STRÁFFUT_TV Does not make nouns via -ár derivation. All duvvat passives.

gáhpput # Contracted test examples:*
gáhppuv: gáhpput+V+TV+Ind+Prs+Sg1
gáhppujiv: gáhpput+V+TV+Ind+Prt+Sg1
gáhppujma: gáhpput+V+TV+Ind+Prt+Pl1

LEXICON TSIEGGIT_TV Makes nouns via -ár derivation. All duvvat passives.

gámmpit # Contracted test examples:*
gámmpiv: gámmpit+V+TV+Ind+Prs+Sg1
gámmpijiv: gámmpit+V+TV+Ind+Prt+Sg1
gámmpijma: gámmpit+V+TV+Ind+Prt+Pl1
gámmpár: gámmpit+V+TV+Der/r+N+Sg+Nom

LEXICON VALLIT_TV Makes nouns via -ár derivation. Gets only passive Sg3

hinnit # Contracted test examples:*
hinniv: hinnit+V+TV+Ind+Prs+Sg1
hinnijiv: hinnit+V+TV+Ind+Prt+Sg1
hinnijma: hinnit+V+TV+Ind+Prt+Pl1
hinnár: hinnit+V+TV+Der/r+N+Sg+Nom

Contracted verbs, assimilated and outside the main pattern.

LEXICON PLÁNIT_TV Transitive two-syllable contracted words that are not in third grade, as contracted verbs have otherwise been. Two-syllable transitive NEW loan verbs. Makes nouns via -ár derivation. All passives.

bloaggit # Contracted test examples:*
bloaggiv: bloaggit+V+TV+Ind+Prs+Sg1
blåggiv: bloaggit+V+TV+Ind+Prs+Sg1
bloaggijiv: bloaggit+V+TV+Ind+Prt+Sg1
blåggijiv: bloaggit+V+TV+Ind+Prt+Sg1
bloaggijma: bloaggit+V+TV+Ind+Prt+Pl1
blåggijma: bloaggit+V+TV+Ind+Prt+Pl1
bloaggár: bloaggit+V+TV+Der/r+N+Sg+Nom
blåggår: bloaggit+V+TV+Der/r+N+Sg+Nom

LEXICON SLEDUT_IV Intransitive two-syllable contracted words that are not in third grade, as contracted verbs have otherwise been. Only Sg3 passive.

håŋŋlit # Contracted test examples:*
håŋŋliv: håŋŋlit+V+IV+Ind+Prs+Sg1
håŋŋlijiv: håŋŋlit+V+IV+Ind+Prt+Sg1
håŋŋlijma: håŋŋlit+V+IV+Ind+Prt+Pl1

LEXICON BADASS_TV NEW badly assimilated two-syllable transitive loan verbs. Makes nouns via -ár derivation. All passives. Err/Orth tagged in stem file.

LEXICON BADASS_IV NEW badly assimilated two-syllable intransitive loan verbs. Makes nouns via -ár derivation. Only Sg3 passive. Err/Orth tagged in stem file.

LEXICON ABBONERE_TV Transitive loan words with more than two syllables with -erit/-ierit endings. Duvvat passives. Does not make nouns via -ár derivation. Only the two last syllables are assimilated to Sami. Long -e is assimilated in different ways in Norway and Sweden: in Norway it becomes -ie, and in Sweden -e.

LEXICON BRILJERE_IV Intransitive loan words with more than two syllables with -erit/-ierit endings. Does not make nouns via -ár derivation. Only the two last syllables are assimilated to Sami. Long -e is assimilated in different ways in dialects in Norway and Sweden: in Norway it often becomes -ie, while in Sweden it's usually -e.

LEXICON BRILJERE_IV_INFL

briljierit # Contracted test examples:*
briljieriv: briljierit+V+IV+Ind+Prs+Sg1
briljeriv: briljierit+V+IV+Ind+Prs+Sg1
briljierijiv: briljierit+V+IV+Ind+Prt+Sg1
briljerijiv: briljierit+V+IV+Ind+Prt+Sg1
briljierijma: briljierit+V+IV+Ind+Prt+Pl1
briljerijma: briljierit+V+IV+Ind+Prt+Pl1

LEXICON ABBONERE_TV_INFL

abbonierit # Contracted test examples:*
abbonieriv: abbonierit+V+TV+Ind+Prs+Sg1
abboneriv: abbonierit+V+TV+Ind+Prs+Sg1
abbonierijiv: abbonierit+V+TV+Ind+Prt+Sg1
abbonerijiv: abbonierit+V+TV+Ind+Prt+Sg1
abbonierijma: abbonierit+V+TV+Ind+Prt+Pl1
abbonerijma: abbonierit+V+TV+Ind+Prt+Pl1
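
The paired forms in the two listings above differ only in how long -e is rendered (-ie vs. -e). The sketch below shows one way to derive the -e variant from the -ie form for these loan verbs. The function name `e_variant` and the "replace the last `ier`" heuristic are assumptions for illustration. The real analyser handles this in the lexc/twolc sources, not with string replacement.

```python
def e_variant(form):
    """Return the form with its last 'ier' sequence written as 'er'.

    Forms without 'ier' are returned unchanged.
    """
    head, sep, tail = form.rpartition("ier")
    if not sep:
        return form
    return head + "er" + tail


for ie_form in ["briljieriv", "briljierijiv", "abbonierijma"]:
    print(ie_form, "->", e_variant(ie_form))
```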

This (part of) documentation was generated from src/fst/morphology/affixes/verbs.lexc

src-fst-morphology-compounding.lexc.md

This (part of) documentation was generated from src/fst/morphology/compounding.lexc

src-fst-morphology-phonology.twolc.md

Lule Sámi morphophonological rule set

This file documents the phonology.twolc file

The file contains the rule set for the non-segmental Lule Sámi morphophonological rules

Background

The file is modeled upon the corresponding file for North Sámi, but has been revised and differs from it on several issues. The grammatical sources are Spiik 1989: Lulesamisk grammatik and Nystø and Johnsen 2001: Sámásta 2.

The rule file has the sections Alphabet, Sets, Definitions and Rules. The rules are ordered thematically, with 3 main sections: Consonant alternations (except CG), vowel alternations, and consonant gradation.

Declarations and definitions

The Alphabet section

The real Lule Sámi Alphabet

All Lule Sámi letters are listed. The Lule Sámi ENG sound is represented as ñ. The Lule Sámi letter repertoire is not fully standardised. In the source code we write (and you shall write!) æ; ø; ŋ, but the parser tolerates input written with the letters ä; ö; ń, ñ (cf. the 4 rules in the file smj/src/orthography/spellrelax.regex).
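
To make the letter correspondences concrete, here is a small sketch that rewrites the tolerated letters into the spelling used in the source code. It is not how `spellrelax.regex` works (that file relaxes the analyser so that it accepts the variants). The table name `RELAXED_TO_SOURCE`, the function name, and the capital-letter mappings are assumptions added for this example.

```python
import unicodedata

# Tolerated input letter -> letter used in the source files.
# Capital-letter pairs are an assumption; the text above names only the small letters.
RELAXED_TO_SOURCE = str.maketrans({
    "ä": "æ", "Ä": "Æ",
    "ö": "ø", "Ö": "Ø",
    "ń": "ŋ", "Ń": "Ŋ",
    "ñ": "ŋ", "Ñ": "Ŋ",
})


def to_source_orthography(text):
    """Rewrite tolerated letters to the source spelling (æ, ø, ŋ)."""
    # Compose first so that 'a' + combining diaeresis is treated as 'ä'.
    composed = unicodedata.normalize("NFC", text)
    return composed.translate(RELAXED_TO_SOURCE)


print(to_source_orthography("hähttut"))
```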

small letters = a á b c d e f g h i j k l m n ñ ń ŋ o p q r s t u v w x y z æ:æä ä:æä ø ö å %- é ó ú í à è ò ù ì ë ü ï â ê ô û î ã ý ç č đ ð š ŧ þ ß ª
capital letters = A Á B C D E F G H I J K L M N Ñ Ń Ŋ O P Q R S T U V W X Y Z Æ:ÆÄ Ä:ÆÄ Ø Ö Å É Ó Ú Í À È Ò Ù Ì Ë Ü Ï Â Ê Ô Û Î Ã Ý Ç Č Đ Ð Š Ŧ þ

The 3rd degree mark º is never realized, hence declared as º:0.

º:0 = Gradation mark
%/ = Literal /, not the TWOLC reserved symbol
‘:’ = Apostrophe

Literal quotes and angles must be escaped (cf morpheme boundaries further down):

h2, g2 etc. are consonants deleted in the Nom. m3, d3 etc. (?) are consonants that undergo certain processes word-finally. This issue should be looked into. Perhaps the two sets can be unified. The reason why there are more distinctions than for sme, is that the cns deletion process is more phonological in sme.

’:’ = Morphophonemes in sme, here temporarily due to common propernoun file
’:’ = these are deleted in nom
’:’ = these can not occur before #
’:’ = Non-sámi cons clusters
’:’ = Do not change these where they would normally undergo umlaut etc

The Dummy symbols

The Dummy symbols are taken from the sme file for convenience. Only a small part of them are actually used; they are defined in the Sets section along the way, included there as soon as they are used. The set of actually used Dummy symbols is thus the set declared in “Dummy”. The Dummy symbols trigger morphophonological rules. X is used for nouns and adjectives, Y for verbs and Q for processes common to all. The symbols themselves are used in the following way:

OBS: the definitions are not all correct or sufficiently specific

**X1:0**: Deletes final consonants in short essive of odd syllables
**X2:0**: WeG and neutralization of g8, etc. (hivsik-hivsiga)
**X3:0**: WeG and deletion of g8, etc. (bena-bednaga)
**X4:0**: e:á and e:å in illatives and px. a:á and o:u in Px and ill of a-stem actors and o-stems
**X5:0**: e:á, e:å and o:u in odd-syllable nouns, but also for some even nouns (e.g. o:u)
**X6:0**: Deviant III-I consonant gradation (in contracted stems, guobbmu:guomoj)
**X7:0**: WeG and e:á, e:å, o:á, o:u in front of diminutives, e:å in -lasj der
**X8:0**: Stem vowel alternations in Px
**X9:0**: Stem-vowel and central consonant shortening in first part(s) of compounds
**Q1:0**: The general weak grade trigger. Stem vowel change e:i and o:u in front of j. Dipht. simpl. Any environment #only# demanding WeG shall use Q1.
**Q2:0**: Vowel harmony: 2nd syll e realized as å whenever 1st syll is å.
**Q3:0**: WeG in contracted, also does not trigger Dipht simpl.
**Q4:0**: Stem vowel change e:i and o:u in front of j. Dipht. simpl. Like Q1 but strong grade.
**Q5:0**: e:á stem vowel change for word diehtet. Weak grade.
**Q6:0**: e:á stem vowel change for word diehtet. Strong grade.
**Q7:0**: e:á stem vowel change for word diehtet. Extra strong grade.
**Q8:0**: Stem vowel deletion, impII of verbs.
**Q9:0**: TBW
**Y1:0**: Stem vowel deletion, imp 3sg, 3du, 2pl, 3pl of verbs
**Y2:0**: “Indicative Present Singular 3rd Final Vowel in verbs”
**Y3:0**: PrsPrc
**Y4:0**: e > u in front of dersuff, o > u and e > á in front of dersuffix -alla
**Y5:0**: e > a, i > á, o > u, e > å in verb derivation
**Y6:0**: “Consonant insertion as II-III strengthening gradation”, verbs +PrsPrt and +Imprt+Du2
**Y7:0**: “Consonant insertion as II-III strengthening gradation”, nouns and propernouns
**Y8:0**: “Stem vowel deletion in even-syllable verbs, imp 1du, 1pl”
**Y9:0**: “Stem vowel deletion in short passives of even-syllable verbs”
**Z1:0**: TBW “i:á in Verb Derivation guollir>guollár”
**Z2:0**: e:å, o:u in -lasj der
**Z3:0**: weak grade trigger fºf:f. Stem vowel change e:i and o:u in front of j.
**Z4:0**: weak grade trigger fºf:f and e:á, e:å, o:á, o:u in front of diminutives, e:å in -lasj der
**Ø1:0**: optional Word Final Cluster Simplification. Not smj grammar, made only for Err/Orths ! málestit:málest instead of the norm máles
**Ø2:0**: optional e:i when followed by any cons (not only j). Not smj grammar, made only for Err/Orths ! “iednida”
**Ø3:0**: Not smj grammar, made only for nominative Err/Orths: málles>mállás e:á, e:å, o:u,
**Ø4:0**: Not smj grammar, made only for Err/Orths: changes h:j, b:p, t:d, m:v in contexts where it isn’t supposed to change, making err/orths such as “årojij” (not g:k because of the BENA lexicon)
**Ø5:0**: Not smj grammar, made only for Err/Orths: third extra long grade to first grade: l’l>l, m’m>m and so on

Morpheme boundaries:

** « **: Derivational prefix
** » **: Derivational suffix
** %< **: Inflectional prefix
** %> **: Inflectional suffix
** # **: Word boundary for both lexicalised and dynamic compounds
** %^ **: (exceptional) soft hyphenation point
** % **: a space
** ∑ **: mark before # to indicate dynamic compounds

The Sets section

These are the sets:

Vow: the vowels
Cns: the consonants
StemCns: consonants that may occur in stem-final position
DelCns: the consonants that are deleted in nominative
Dummy: the set of dummy symbols, they are there to trigger certain morphophonological rules
WeG: the dummy symbols that trigger weak grade
Vow = a á e i o u y æ ä ø ö å æä
A Á E I O U Y Æ Ä Ø Ä Å ÆÄ
é ó ú í à è ò ù ì ë ü ï â ê ô û î ã ý
É Ó Ú Í À È Ò Ù Ì Ë Ü Ï Â Ê Ô Û Î Ã Ý
a9 e9 o9 æ9 ä9
a9 e9 o9 æ9 ä9
É Ó Ú Í À È Ò Ù Ì Ë Ü Ï Â Ê Ô Û Î Ã Ý ;
CapCns = B C D F G H J K L M N Ñ Ń Ŋ P Q
R S T V W X Z Ç Č Đ Ð Š Ŧ þ ;
Cns = b b9 c d d9 f g g8 g9 h h8 h9 j j9 k l l9 m m8 m9 n n8 n9 ŋ ñ ń p q r r9 s t v w x z z9 º ; = All consonants
Cns7 = c f h j l m n ŋ ñ ń p q r s t v w x z ; = Surface cons excl 1st members of xx-type G3
Cns8 = b c d f g h j k l m n ŋ ñ ń p q r s t v w x z ; = All surface consonants
Cns9 = b9 d9 g8 g9 h8 h9 j9 l9 m8 m9 n8 n9 r9 º ; = Underlying consonants
Cns4 = f l m n ŋ ñ ń r v ; = Don’t remember …?
StemCns = b b9 d d9 g g8 g9 h h8 h9 j j9 l l9 m m8 m9 n n8 n9 ŋ ñ ń r r9 s ; = Can occur stem-finally
DelCns = g8 h8 m8 n8 ; = deleted in nom…
WeG = X2 X3 X7 Y5 Q1 Q2 Q3 Q6 Z3 Z4 ;
Dummy = X2 X3 X4 X5 X6 X7 X8 Y1 Y2 Y3 Y4 Y5 Y6 Y7 Y8 Y9 Q1 Q2 Q3 Q4 Q5 Q6 Q7 Z1 Z2 Z3 Z4 %> » %^ ;
Err = Ø1 Ø2 Ø3 Ø4 Ø5 ; Only for making err/orth inflections that don’t follow smj-rules
Hyph = %- ;
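
As a quick consistency check on the declarations above, the sketch below copies the `WeG` and `Dummy` sets into Python and reports which symbols are missing from `Dummy`. The helper name `not_declared` and the `DESCRIBED` list (the X, Q, Y and Z symbols described in the Dummy symbol list earlier in this section) are assumptions made for this illustration. The twolc compiler performs no such check itself.

```python
# Copied from the Sets section above.
WEG = "X2 X3 X7 Y5 Q1 Q2 Q3 Q6 Z3 Z4".split()
DUMMY = (
    "X2 X3 X4 X5 X6 X7 X8 Y1 Y2 Y3 Y4 Y5 Y6 Y7 Y8 Y9 "
    "Q1 Q2 Q3 Q4 Q5 Q6 Q7 Z1 Z2 Z3 Z4 %> » %^"
).split()

# Symbols that have a description in the list of dummy symbols:
# X1-X9, Q1-Q9, Y1-Y9 and Z1-Z4 (the Ø symbols belong to the Err set).
DESCRIBED = [
    f"{letter}{n}"
    for letter, top in (("X", 9), ("Q", 9), ("Y", 9), ("Z", 4))
    for n in range(1, top + 1)
]


def not_declared(symbols, declared):
    """Return the symbols (in order) that do not occur in `declared`."""
    declared_set = set(declared)
    return [s for s in symbols if s not in declared_set]


print("WeG triggers missing from Dummy:", not_declared(WEG, DUMMY))
print("Described but not in Dummy:", not_declared(DESCRIBED, DUMMY))
```

With the sets as listed here, every weak-grade trigger is also a member of `Dummy`, while a few described symbols are not declared in it, which matches the remark that only part of the inherited symbols are in use.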

The Definitions section

In this section, the consonants are defined. This includes consonant clusters in the various grades and consonant alternations.

G3 vs G2

The alternation patterns according to Spiik’s alternation series, here named S4, S5, … for “Spiik alternation series 4, 5, etc.” as they are presented in his grammar.

| Class | Alternation | Series |
| --- | --- | --- |
| S7 | kkn:k0n | series 1 |
| S8 | fºf:f0f | series 2 |
| S9 | jgg:j0g | series 3 |
| S4 | hkk:h0k | series 4 |
| S5 | xy:zy (no zeros) | series 5 |
| S6 | xx:yy (no zeros) | series 6 |
| S7 | xy:zy (no zeros) | series 7 |
| S8 | (no cg) | series 8 |
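
The alternations for series 1 to 4 are written as equal-length strings on both sides of the colon, with `0` marking a deleted position. A minimal sketch of reading them position by position follows. The function names are hypothetical and only serve this illustration.

```python
def split_alternation(alternation):
    """Turn 'hkk:h0k' into position-wise (strong, weak) symbol pairs."""
    strong, weak = alternation.split(":")
    if len(strong) != len(weak):
        raise ValueError(f"sides differ in length: {alternation!r}")
    return list(zip(strong, weak))


def deleted_positions(alternation):
    """Return the 0-based positions that are realised as zero in the weaker grade."""
    return [i for i, (_, weak) in enumerate(split_alternation(alternation)) if weak == "0"]


for series in ["kkn:k0n", "fºf:f0f", "jgg:j0g", "hkk:h0k"]:
    print(series, split_alternation(series), deleted_positions(series))
```

In all four patterns it is the middle symbol that is realised as zero, whereas series 5 to 7 are described as having no zeros at all.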

Definition of gradation symbols:

LowerG2: A definition of Grade2 consonant sequences referring mostly to the surface level
LowerG1: A definition of Grade 1 consonant sequences
LowerG12: A definition of Grade 1 or 2 consonant sequences
G32: A definition of Grade 3 or 2 consonant sequences
G31: A definition of Grade 3 or 1 consonant sequences which is not SMJ-grammar only for err/orths
G3: A definition of Grade 3 consonant sequences

The Rules section

Overview

The rules section has the following chapters: Consonant alternations in certain pos, vowel lengthening, diphthong simplification, stem vowel alternations, consonant gradation rules

Consonant alternations in certain pos

All rules deal with word-final position.

★a (is not standard language)
★b (is not standard language)

**Word Final Devoicing of Certain Single Consonants d9 etc. **

iemed9#
iemet#

**Word Final Devoicing of Certain Single Consonants m9-v ** ! Split up because of err/orths ending on v, gierkav> we want err/orth gierkkam

gierkkam9>Q1
gier0kav>0
gierkkam9>a
gierkkam>a
gierkkam9>ij
gierkkam>ij
gierkkam9>Q1X1n
gier0ka0>00n

**Err/Orths. **

Word final weakening -tj and -ttj to -sj part 1

Word final weakening -tj and -ttj to -sj part 2

jågåtj
jågåsj
gålºleX7tj
gål0lå0sj

Word Final Deletion of n8 m8 g8 h8

loavddag8X3#
l0åv0da00#

Word Final Neutralization of g8, h8, m8

Deleting Final h9 in Short Essive of Uneven Syllables

Deleting Final l9 in Short Essive of Uneven Syllables

Deleting Final m9 in Short Essive of Uneven Syllables

Deleting Final n9 in Short Essive of Uneven Syllables

Deleting Final r9 in Short Essive of Uneven Syllables

málest#
máles0#

Vowel lengthening

The second syllable vowel a is lengthened to á whenever the stem consonants are in grade 1 and the first syllable vowel is short. Short vowels cannot both precede and follow a single intervocalic consonant.

Compulsatory lengthening in grade I even-syllables

gussaQ1#
gu0sá0#
skihpaQ1s#
ski0bá0s#
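
The two pairs above are lexical/surface alignments: the first line is the lexical string, the second the surface string, with `0` standing for "not realised". The sketch below lines the two up symbol by symbol and lists the positions where they differ. The tokenising rule (a letter followed by a digit 1-9 is one lexical symbol, such as `Q1` or `g8`) is an assumption made for these examples. It does not cover multichar surface symbols such as æä.

```python
import re
import unicodedata

# Assumption: a lexical symbol is a letter plus one digit 1-9 (Q1, X3, g8, ...)
# or, failing that, any single character.
LEXICAL_SYMBOL = re.compile(r"[^\W\d_][1-9]|.", re.DOTALL)


def align(lexical, surface):
    """Pair each lexical symbol with the surface character in the same position."""
    lex = LEXICAL_SYMBOL.findall(unicodedata.normalize("NFC", lexical))
    srf = list(unicodedata.normalize("NFC", surface))
    if len(lex) != len(srf):
        raise ValueError("lexical and surface strings do not line up")
    return list(zip(lex, srf))


def changed(pairs):
    """Keep only the pairs where the lexical and surface symbols differ."""
    return [(lex, srf) for lex, srf in pairs if lex != srf]


def surface_word(surface):
    """Drop zeros and word boundaries to get the plain word form."""
    return surface.replace("0", "").replace("#", "")


print(changed(align("gussaQ1#", "gu0sá0#")))
print(surface_word("gu0sá0#"), surface_word("ski0bá0s#"))
```

In both pairs the a:á lengthening appears together with a consonant realised as zero and the `Q1` trigger, which is the grade 1 context described above.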

Diphthong simplification

The diphthong simplification handles oa:å and æ:e. Phonologically, these are identical processes, but since the diphthong is written with two letters in the former case and with one letter in the latter, the alternations must be handled separately. This section also handles ie:æ; these are in principle the same as oa:å, but the alternation does not occur in as many contexts.
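
The difference between the two-letter and the one-letter case shows up directly in the aligned test pairs: oa:å appears as two symbol pairs (o:0 followed by a:å), while æ:e is a single pair. The following sketch detects both in a lexical/surface pair. The function names and the tokenising rule (a letter plus a digit 1-9 is one lexical symbol) are assumptions for this illustration only.

```python
import re

# Assumption: symbols such as Q1 or X4 are a letter followed by a digit 1-9.
LEXICAL_SYMBOL = re.compile(r"[^\W\d_][1-9]|.", re.DOTALL)


def pairs(lexical, surface):
    """Align lexical symbols with surface characters position by position."""
    lex = LEXICAL_SYMBOL.findall(lexical)
    if len(lex) != len(surface):
        raise ValueError("lexical and surface strings do not line up")
    return list(zip(lex, surface))


def diphthong_changes(lexical, surface):
    """List the simplifications visible in one aligned pair."""
    aligned = pairs(lexical, surface)
    found = []
    for i, pair in enumerate(aligned):
        if pair == ("æ", "e"):
            found.append("æ:e")  # one letter, one symbol pair
        if pair == ("o", "0") and aligned[i + 1 : i + 2] == [("a", "å")]:
            found.append("oa:å")  # two letters, two symbol pairs
    return found


print(diphthong_changes("boalloX4j", "b0ållu0j"))
print(diphthong_changes("roavggoX4j", "roavggu0j"))
print(diphthong_changes("Lævnnja>Q1", "Lev0nja>0"))
```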

**oa:å Diphtong Simplification Part I **

oa:å Diphtong Simplification Part II

toahkkeY6X5jn
toahkki00jn
★toahkkeY6X5jn (is not standard language)
★t0åhkki00jn (is not standard language)
boalloX4j
b0ållu0j
roavggoX4j
roavggu0j
★roavggoX4j (is not standard language)
★r0åvggu0j (is not standard language)
toasºsoQ1X5jn
t0ås0su00jn
★toasºsoQ1X5jn (is not standard language)
★toas0su00jn (is not standard language)
★moasºsoX5jn (is not standard language)
★m0ås0su0jn (is not standard language)
moasºsoX5jn
moas0su0jn
goarºroY6X5jn
goar0ru00jn
goarroY6X5jn
goarru00jn
★goarºroY6X5jn (is not standard language)
★g0år0ru00jn (is not standard language)
★goarºroY2 (is not standard language)
★g0år0ru0 (is not standard language)
goarroY2
g0årru0
doad0jeY6
doaddje0
★doad0jeY6 (is not standard language)
★d0åddje0 (is not standard language)
goarºroY5d9it
g0år0ru0dit
★goarºroY5d9it (is not standard language)
★goar0ru0dit (is not standard language)
toab0moY6X4j
toabbmu00j
toabmoX4j
t0åbmu0j
★toa0mboY6X4j (is not standard language)
★t0åbbmu00j (is not standard language)
toabmoX7dallat
t0å0mu0dallat
★toabmoX7dallat (is not standard language)
★toa0mu0dallat (is not standard language)
oaddoY6X4j
oaddu00j
boassjkoQ1X5jn
b0å0sjku00jn
★boassjkoQ1X5jn (is not standard language)
★boas0jku00jn (is not standard language)
boajsstoQ1X5jn
b0åj0stu00jn
★boajsstoQ1X5jn (is not standard language)
★boaj0stu00jn (is not standard language)
boaggoQ1X5jn
b0åkku00jn
★boaggoQ1X5jn (is not standard language)
★boakku00jn (is not standard language)

**æ:e Diphthong Simplification **

hærránis
hæärránis
hærránis#gæhttjalibme>
hæärránis#gæähttjalibme>
pasiænnta>Q1
pasien0ta>0
patænnta>Q1
paten0ta>0
kvotiænnta>Q1
kvotien0ta>0
kliænnta>Q1
klien0ta>0
Lævnnja>Q1
Lev0nja>0
a^dræssa#sáhtso>
a^dræässa#sáhtso>
★a^dræssa#sáhtso> (is not standard language)
★a^dressa#sáhtso> (is not standard language)
vædtsag8>X3
vettsa0>0

**ie:æä Diphthong Simplification Part I **

ielvveY9ut
0æälvv00ut
iehttseY1up
0æähtts00up
giesseQ8us
g0ess00us

ie:æä Diphthong Simplification Part II The multichar æä is always the only option

jeht0sa>Y6
jæähttse>0
jeht0sa>Y6
jæähttse>0
gierre»X7dalla>t
g0æä0rá»0dalla>t
boarkkaQ1
b0år0ka0
loavddag8X3#
l0åv0da00#

Vowel-change oa:å for verbs part I

Vowel-change oa:å for verbs part II

hå0llaY2
hoallá0
gå0d0naY6
goaddne0
★hållaY2 (is not standard language)
★hållá0 (is not standard language)
gå0ht0saY6
goahttse0

Stem vowel alternations

This section is divided according to stem vowels: a-, e-, o-, å-stems.

a-stem alternations

For a-stems, there is a:e and a:i. Each alternation is triggered by a combination of phonological content and dummy symbols.

a:e in Present Participle of even-syllable verbs

bassa>Y6
basse>0

a:i in Prs Prc of even-syllable verbs

basºsaY6jt#
bas0si0jt#

a-stem vowel deletion

giedjeg9>a#
giedjeg>a#

e-stem alternations

For e-stems, there is e:i, e:á, e:å, e:u and e:a. Each alternation is triggered by a combination of phonological content and dummy symbols.

e:i in e-stems

manasseQ4j
manassi0j
biesseQ1j
bie0si0j
boaht0eY6j
boahtti0j
gálleQ1tj
gá0li0sj
gálleQ1tjav
gá0li0tjav
gálleQ1tjin
gá0li0tjin
gálleQ1tjihpit
gá0li0tjihpit
gálleQ1tjibá
gá0li0tjibá
gálleQ1tjip
gá0li0tjip
gálleQ1tja
gá0li0tja
gierre>Q1tja
gie0ri>0tja
gierre>Q1tj
gie0ri>0sj

The following two rules constitute a <= / => rule pair.

e:á in certain stem types 1

e:á in certain stem types 2

bárnneX4m
bárnná0m
★bárnneX4m (is not standard language)
★bárnne0m (is not standard language)
bálggeX4v
bálggá0v
gálleY3m#
gállá0m#
gálleQ2v#
gá0lá0v#
báhkoX7tj#
bá0gu0sj#
goahteX7tj#
goa0dá0sj#
★goahteX7tj# (is not standard language)
★go00dá0sj# (is not standard language)

e:å in certain stem types with å as root vowel

gådeQ2v
gådå0v
jåhteQ2v
jå0då0v
gådeY2
gådå0
jåhteY2
jåhtå0
jåhteY3m
jåhtå0m
låhkkeY7tj#
låhkkå0sj#

e-stem vowel deletion

ielvveY9ut
0æälvv00ut

i-stem alternations

For i-stems, there is i:á. The alternation is triggered by a combination of phonological content and dummy symbols.

i:á in Verb Derivation

o-stem alternations

The duplicates of the three lines of the two following rules are there to resolve the => conflict between the two rules.

o:u in certain stem types 1

o:u in certain stem types 2

u:o in contracted nouns

o-stem vowel deletion

å-stem alternations

For å-stems, there are the alternations å:e and å:i, as well as vowel deletion. Each alternation is triggered by a combination of phonological content and dummy symbols.

å:e in Present Participle of even-syllable verbs

å:i in Actor nouns of even-syllable verbs

å-stem vowel deletion

Alternations valid for several stem types

Stem vowel deletion in even-syllable verbs, imp 3sg, 3du, 2pl, 3pl

ielvveY1up
0æälvv00up
giessaY1up
giess00up
bårråY1up
bårr00up

Consonant gradation rules

The consonant gradation rules differ considerably from the corresponding rules for North Sámi. Instead of generalizing over sets of consonants (Cx:Cy <=> …), each rule contains the alternation for one consonant only, and all the contexts where that alternation appears are listed to the right of the <=> arrow. The disadvantage of this method is that shared contexts must be repeated: if, for example, p, t and k are all deleted in the same contexts, each of these contexts must be written once for each consonant. The advantage is that there are no conflicts during compilation, and compilation takes 10 seconds rather than 3 minutes. The earlier North-Sámi-style rule set was ordered according to consonant gradation (CG) pattern. This pattern is still visible in the new rules, via the reference S1-3 etc. (Spiik’s Series 1, 3-letter pattern, etc.) behind each subrule.

This opens the way for a migration to an xfst rule file instead of the current twolc format, since the one thing xfst really cannot do is generalize over sets (Cx:Cy etc.). This is an issue for future revisions to decide.

The rules are divided in two subsections, deletion rules and change (alternation) rules.
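
When reading the test pairs in this section, it can help to align the lexical and surface strings symbol by symbol and sort the non-identity pairs into deletions (`x:0`) and changes (`x:y`). The following Python sketch is only a reading aid, not part of the twolc source. The multichar symbol pattern (dummies such as `Q1`, `X3`, `Y6`, the marked consonants `g8`/`g9`, and `æä`) is an assumption inferred from the test pairs shown in this documentation, and `tokenize` and `align` are hypothetical helper names.

```python
import re
import unicodedata

# Assumed multichar symbols, inferred from the test pairs in this section:
# dummies like Q1/X3/Y6, the marked consonants g8/g9, and the multichar æä.
SYMBOL = re.compile(unicodedata.normalize("NFC", r"[XQY][0-9]|g[89]|æä|."))


def tokenize(form):
    """Split a lexical or surface string into two-level symbols."""
    return SYMBOL.findall(unicodedata.normalize("NFC", form))


def align(lexical, surface):
    """Pair the symbols and split non-identity pairs into deletions and changes."""
    lex, srf = tokenize(lexical), tokenize(surface)
    if len(lex) != len(srf):
        raise ValueError("forms do not align symbol by symbol")
    pairs = list(zip(lex, srf))
    # A deletion realises a lexical symbol as 0 on the surface.
    deletions = [(l, s) for l, s in pairs if s == "0" and l != "0"]
    # Everything else that is not an identity pair counts as a change.
    changes = [(l, s) for l, s in pairs if s != "0" and l != s]
    return deletions, changes


print(align("bednag8>X3", "be0na0>0"))
print(align("oabbáQ1", "oappá0"))
```

For the pair bednag8>X3 / be0na0>0 this reports the deletions d:0, g8:0 and X3:0, and for oabbáQ1 / oappá0 it reports the change b:p twice plus the deleted dummy Q1:0.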

Deletion rules

The b, d and g deletion rules are similar: via the optional ( b ) etc. in front of the “_” symbol, both the bm:m and the bbm:bm alternations are covered. The contexts differ to a certain extent. For b and d, the special III-I gradation bbm:m is covered by two separate rules and a special dummy (X6) that is not part of the ordinary WeG set.

Note that one of the rules for t:0 refers to #: as part of its context. As soon as clitics are added to the word form, this rule will thus not be triggered. Look into this when the clitics are added.

Consonant gradation b:0 deletes b in S7 and S9 contexts

Consonant gradation d:0 … etc.

bednag8>X3
be0na0>0

Consonant gradation g:0

Consonant gradation k:0

Consonant gradation l:0

Consonant gradation m:0

Consonant gradation n:0

Consonant gradation p:0

Consonant gradation s:0

russjpeQ1
ru0sjpe0
★russjpeQ1 (is not standard language)
★russjpe0 (is not standard language)

Consonant gradation ŋ:0

Consonant gradation f:0

Consonant gradation r:0

Consonant gradation v:0

Consonant gradation j:0

Consonant gradation t:0

oajváladtj#
oajvála0sj#

Gradation Series 4, II-I, tj and ts

Change rules

The Cx:Cy format was kept for hk:g, hp:b, ht:d, since the left context h:0 was unique, and no compilation conflict thus arose.

The bb:pp, gg:kk, dd:tt alternations were split into three rules, since keeping them in one Cx:Cy rule created compilation conflicts. Also, d:t contains a rule not found for the other two…

Gradation Series 4, II-I

bb:pp

oabbáQ1
oappá0

gg:kk

vággeQ1
vákke0
★vággeQ1 (is not standard language)
★vágge0 (is not standard language)

g:k change for clitic -ge

dd:tt and dtj, dts

Gradation Series 7, III-II, ks(t), kt, ktj, kts

Exceptional II-III inverse gradation in present participles

This gradation is only for II-I syllable verbs that get III as present participles.

Candidates:

bbm - bm - m
ddn - dn - n
ddnj- dnj- nj
ggŋ - gŋ - ŋ
ddj - dj - dj
hkk - hk - g
hpp - hp - b
htt - ht - d
httj- htj- tj
htts- hts- ts

Strategy: Do insertion rule for the initial element.

Consonant insertion as II-III strengthening gradation with bm, gŋ

Consonant insertion as II-III strengthening gradation with dn/j + as I-III strengthening gradation with d

Consonant insertion as II-III strengthening gradation with hk, hp,

Consonant insertion as II-III strengthening gradation with htt(j/s)

Debugging of twol-rules

All rule conflicts have been successfully resolved. The rule file should be kept that way. Look out for conflicts in the compilation process, and resolve them as they appear!

This (part of) documentation was generated from src/fst/morphology/phonology.twolc

src-fst-morphology-root.lexc.md

Lule Sámi morphological analyser

Definitions for Multichar_Symbols

Tags for sub-POS

+Prop = Proper nouns
+ACR = Acronym @CODE = telephone number (beta testings)
+URL = Code for urls

Pronoun subtypes

+Pers = Personal pronoun
+Dem = Demonstrative pronoun
+Interr = Interrogative pronoun
+Refl = reflexive pronoun
+Recipr = reciprocal pronoun
+Rel = relative pronoun
+Indef = indefinite pronoun
+Coll = collective numerals
+Arab = arabic numerals
+Rom = Roman numerals

Error tags

All Err-tags must have a normative form as lemma except Err/Lex

+Err/Orth = Substandard. An ungrammatical, non-normative form of normative lemma.
+Err/Lex = No normative lemma
+Err/Cmp = No normative compound, often wrong first part of cmp (in a case other than nom or gen, or an adverb or pronoun)
+Err/Der = Lemmas that break with regular derivation rules, both morphologically and semantically
+Err/Infl = Lemmas that break with normative inflection rules, often used with morphological changes
+Err/MissingSpace = indicates that there is a missing space, causing an orthographic error. Used for “goadedagi”, when it should be “goade dagi”
+Err/Hyph = when there is a hyphen where none should have been
+Err/SpaceCmp = used for compounds written apart - only retained in the HFST Grammar Checker disambiguation analyser
+Err/Spellrelax = used to tag spellrelaxed typos (tag is inserted via flag diacritics)

Usage restriction tags

+Use/Marg – Marginal, but normative lemmas. Not in speller.
+Use/-Spell – Excluded from speller
+Use/-PLX – Excluded from PLX speller
+Use/-PMatch – Do not include in fst’s made for hfst-pmatch
+Use/SpellNoSugg – Recognized, but not suggested in speller
+Use/Circ – Circular path
+Use/CircN – Circular number path
+Use/Ped – Remove from pedagogical speller
+Use/NG – Do not generate, only for Oahpa and MT. In speller.
+Use/MT – Generate for MT only, for restricting analyses needed
+Use/NGminip – Not for miniparadigm in VD dicts
+Use/NotDNorm – For words without formal normalization. Divvun suggests that this shouldn’t be normative.
+Use/DNorm – For words without formal normalization. Divvun suggests that this should be normative. Included in speller.
+Use/PMatch – Only use in fst’s targeted for hfst-pmatch
+Use/Derrog – Derogatory word. Recognized, but not suggested in speller, same as SpellNoSugg
+Use/GC – only retained in the HFST Grammar Checker disambiguation analyser
+Use/-GC – never retained in the HFST Grammar Checker disambiguation analyser
+Use/TTS – only retained in the HFST Text-To-Speech disambiguation tokeniser
+Use/-TTS – never retained in the HFST Text-To-Speech disambiguation tokeniser
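
As a rough illustration of how the speller-related tags above could be read by a build step, the Python sketch below classifies an entry by its tag set. It only encodes what the descriptions say (Marg and -Spell are kept out of the speller; SpellNoSugg and Derrog are recognized but not suggested). The function name `speller_status` and the three status labels are hypothetical, and the actual filtering in the build system may work differently.

```python
# Tags whose descriptions above say the entry is not in the speller.
EXCLUDED = {"+Use/Marg", "+Use/-Spell"}
# Tags whose descriptions say "recognized, but not suggested".
NO_SUGGEST = {"+Use/SpellNoSugg", "+Use/Derrog"}


def speller_status(tags):
    """Classify an entry for the speller from its usage tags (sketch only)."""
    tags = set(tags)
    if tags & EXCLUDED:
        return "excluded"
    if tags & NO_SUGGEST:
        return "recognized, not suggested"
    return "recognized and suggested"


print(speller_status(["+N", "+Use/Derrog"]))
```

Exclusion is checked first, so an entry carrying both kinds of tag stays out of the speller in this sketch.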

Dialect and Area tags

+Area/SE - Used in Sweden only
+Area/NO - Used in Norway only
+Dial/N Used in the northern areas. Some might say that these are North Sámi (sme) words, but they are used by Lule Sámi speakers in the northern part of the dialect area, e.g. “válmas”
+Dial/S Used in the southern areas
+Dial/SH Short forms

Compounding tags

The tags are of the following form:

+CmpNP/xxx - Normative (N), Position (P), i.e. the tag describes what position the tagged word can be in in a compound
+CmpN/xxx - Normative (N) form, i.e. the tag describes what form the tagged word should use when making compounds
+Cmp/xxx - Descriptive compounding tags, i.e. tags that describe what form a word actually is using in a compound

Normative/prescriptive compounding tags

These govern compound behaviour for normative tools like the speller, ie. what a compound SHOULD BE.

The first part of the compound may be ..

+CmpN/Sg = Singular
+CmpN/SgN = Singular Nominative
+CmpN/SgG = Singular Genitive
+CmpN/PlG = Plural Genitive
+CmpN/Attr = Attributive

This part of the compound can ..

+CmpNP/All - … be in all positions, default, this tag does not have to be written
+CmpNP/First - … only be first part in a compound or alone
+CmpNP/Pref - … only first part in a compound, NEVER alone
+CmpNP/Last - … only be last part in a compound or alone
+CmpNP/Suff - … only last part in a compound, NEVER alone
+CmpNP/None - … not take part in compounds
+CmpNP/Only - … only be part of a compound, i.e. can never be used alone, but can appear in any position
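
The position tags above can be summarised as a small lookup table. The following Python sketch is one possible reading of the list, not code from the analyser: the position labels (`first`, `middle`, `last`, `alone`) and the function `may_occupy` are hypothetical, and treating +CmpNP/All as also allowing the word to stand alone is an assumption.

```python
# Allowed uses per +CmpNP tag, transcribed from the list above.
# The position labels are this sketch's own, not tags from the lexicon.
ALLOWED = {
    "+CmpNP/All":   {"first", "middle", "last", "alone"},  # "alone" is assumed
    "+CmpNP/First": {"first", "alone"},
    "+CmpNP/Pref":  {"first"},
    "+CmpNP/Last":  {"last", "alone"},
    "+CmpNP/Suff":  {"last"},
    "+CmpNP/None":  {"alone"},
    "+CmpNP/Only":  {"first", "middle", "last"},
}


def may_occupy(tags, position):
    """Return True if a word with these tags may appear in the given position."""
    cmpnp = [t for t in tags if t.startswith("+CmpNP/")]
    # +CmpNP/All is the default and does not have to be written.
    tag = cmpnp[0] if cmpnp else "+CmpNP/All"
    return position in ALLOWED[tag]


print(may_occupy(["+N", "+CmpNP/Pref"], "alone"))
```

An untagged word falls back to +CmpNP/All, matching the note that the default tag does not have to be written.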

The second part of the compound may require that the previous (left part) is (and thus overrides the regular CmpN tags):

+CmpN/SgLeft Singular to the left
+CmpN/SgNomLeft Singular nominative to the left
+CmpN/SgGenLeft Singular genitive to the left
+CmpN/PlGenLeft Plural genitive to the left

But these tags can again be overridden by the first word in a compound, if this part of the compound is tagged with a def tag:

+CmpN/Def Left override
+CmpN/DefSgGen Overrides left tag, requires SgGen form
+CmpN/DefPlGen Overrides left tag, requires PlGen form
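
The override chain described above (the first part's own CmpN tag, then a Left tag on the second part, then a Def tag on the first part) can be sketched as a precedence lookup. This Python sketch is an illustration under assumptions: the function `required_first_form` is hypothetical, the form names simply reuse the descriptive labels (Sg, SgNom, SgGen, PlGen, Attr), and reading the bare +CmpN/Def as "fall back to the first part's own form" is a guess, since the list only says "Left override".

```python
OWN_TO_FORM = {
    "+CmpN/Sg": "Sg", "+CmpN/SgN": "SgNom", "+CmpN/SgG": "SgGen",
    "+CmpN/PlG": "PlGen", "+CmpN/Attr": "Attr",
}
LEFT_TO_FORM = {
    "+CmpN/SgLeft": "Sg", "+CmpN/SgNomLeft": "SgNom",
    "+CmpN/SgGenLeft": "SgGen", "+CmpN/PlGenLeft": "PlGen",
}
DEF_TO_FORM = {"+CmpN/DefSgGen": "SgGen", "+CmpN/DefPlGen": "PlGen"}


def required_first_form(first_tags, second_tags):
    """Resolve which form the first compound part should take (sketch only)."""
    own = next((OWN_TO_FORM[t] for t in first_tags if t in OWN_TO_FORM), None)
    # 1. A Def tag on the first part overrides any Left tag.
    for tag in first_tags:
        if tag in DEF_TO_FORM:
            return DEF_TO_FORM[tag]
    if "+CmpN/Def" in first_tags:
        return own  # assumption: bare Def restores the first part's own form
    # 2. A Left tag on the second part overrides the regular CmpN tag.
    for tag in second_tags:
        if tag in LEFT_TO_FORM:
            return LEFT_TO_FORM[tag]
    # 3. Otherwise the first part's own CmpN tag decides.
    return own


print(required_first_form(["+CmpN/SgN", "+CmpN/DefSgGen"], ["+CmpN/PlGenLeft"]))
```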

Descriptive compounding tags

Tags for compound analysis - this is what a compound actually is. Some of these tags are also used in combination with the above normative tags to actually enforce compound restrictions in the fst.

+Cmp/Sg Singular
+Cmp/SgNom Singular Nominative
+Cmp/SgGen Singular Genitive
+Cmp/PlGen Plural Genitive
+Cmp/PlNom Plural Nominative
+Cmp/Attr Attribute
+Cmp/Sh testing ShCmp
+Cmp/Long Long form of SgNom when short form is default, tjåhkaNIBMEbievddegirjje and bierjjeDAHKAiehket
+Cmp/Unass Unassimilated as first part of Cmp, e.g. telegram- instead of telegrámma-
+Cmp Dynamic compound - this tag should always be part of a dynamic compound. It is important for Apertium, and useful in other cases as well.
+Cmp/SplitR This is a split compound with the other part to the right: “Arbeids- og inkluderingsdepartementet” => Arbeids- = +Cmp/SplitR
+Cmp/SplitL This is a split compound with the other part to the left
+Cmp/Hyph Compound with (optional) hyphen, heavily weighted
+Cmp/OblHyph Obligatory hyphen, not as heavily weighted as +Cmp/Hyph
+Cmp/Cit - Tags citation compounds, which can in principle cover any word. Requires a hyphen.

Inflectional Tags

Adjective specific tags

+Attr Attribute
+Card
+Ord CHECK THIS! In closed-sme there are +Ord entries without circ. tag

Verbal inflection

+Ind Indicative mood
+Prs Present tense
+Prt Past tense
+Pot Potential mood
+Cond conditional mood
+Imprt Imperative mood
+Sg1 singular first person
+Sg2 singular second person
+Sg3 singular third person
+Du1 dual first person
+Du2 dual second person
+Du3 dual third person
+Pl1 plural first person
+Pl2 plural second person
+Pl3 plural third person
+Inf infinitive
+Ger gerundium
+ConNeg = the main verb form used with negation verb. Like bårå in Iv bårå guolev.
+Neg negation verb
+ImprtII second imperative mood
+PrsPrc present participle
+PrfPrc past participle
+Sup supinum
+VGen verb genitive
+VAbess verb abessive
+Actio Actio

Other tags

+ABBR
+ACR
+CLB
+PUNCT
+LEFT
+RIGHT
+MIDDLE
+CLBfinal Sentence final abbreviated expression ending in full stop, so that the full stop is ambiguous
^GUESSNOUNROOT
+TV Transitivity tag: transitive verb
+IV Transitivity tag: intransitive verb
+G3 Grade 2-3 for homonymies with grade 1-2, +N+G3
+G7 Grade 3, no consonant gradation, +N+G7
+Multi Multiword phrase tag
+Guess for the name guesser
+NomAg Actor Noun From Verb - Nomen Agentis

Lexeme disambiguation = homonym tags

+Hom1: Homonymy
+Hom2: Homonymy

Stem variant tags

+v1 - variant 1
+v2 - variant 2
+v3 - variant 3
+v4 - variant 4
+v5 - variant 5

Question and Focus particles:

+Qst
+Clt
+SentInit Sentence initial form of “liehket”
+Cltl
+Foc These two are only found in SMJ - do we need them?
+Foc/ge
+Foc/gen
+Foc/ga
+Foc/Neg-k
+Foc/Pos-k

Other tags

+MWE multi word expressions, goes to abbr
+Span - used for numerical expressions denoting spans or intervals, like 5-10, 2012-2015, etc
+Sh Short form
+Long Long form
+Gram/Comp = Inherent comp, lexicalized derivation
+Gram/Superl = Inherent superl, lexicalized derivation
+Gram/Dimin = Inherent diminutive, lexicalized derivation
+Gram/NomAg = Inherent Actor Noun From Verb - Nomen Agentis, lexicalized derivation
+Gram/r = Inherent -r derivation. guollit-guollár
+Gram/NomAct = Inherent Actio Noun From Verb - Nomen Actionis, lexicalized derivation
+Gram/NomInstr = Inherent Instrumental Noun From Verb - Nomen Instrumentalis, lexicalized derivation
+Gram/TAbbr: Transitive abbreviation (it needs an argument)
+Gram/NoAbbr: Intransitive abbreviations that are homonymous
+Gram/TNumAbbr: Transitive abbreviation if the following
+Gram/NumNoAbbr: Transitive abbreviations for which numerals
+Gram/TIAbbr: Both transitive and intransitive abbreviation
+Gram/IAbbr: Intransitive abbreviation (it takes no argument)
+Gram/3syll: trisyllabic verbs
!+Gram/SentInit: copula verb le-, tag is replaced by +SentInit

Semantic tags to help disambiguation & syntactic analysis

These tags should always be located just before the POS tag.

+Sem/Act = Activity; cleaning, work, occupation, project, photosynthesis
+Sem/Adr = Webadr
+Sem/Amount = Amount; pile, amount of lynx, amount of food, one fifteenth, ten percent
+Sem/Ani = Animate; dog, reindeer, teddy bear, dragon, lice
+Sem/Ani-fish = Fish
+Sem/Aniprod = Animal Product; sweat, reindeer skin, lice egg, blood for making dumplings, pee
+Sem/Body = Bodypart; ear, bone, ear canal, artificial leg, mustache, nervous system
+Sem/Body-abstr = Non-physical body part; spirit, reason, soul, voice, eyesight, conscience
+Sem/Build = Building; house, museum, Sámi tent, nest, sandcastle
+Sem/Build-room = Room in a building, typically place to be
+Sem/Buildpart = Part of building; closet, room, door, balcony, pool, office
+Sem/Cat = Category; name, subjunction, suffix, password, existential sentence
+Sem/Clth = Clothes/Clothing; shirt, hat, theater costume, shawl, seatbelt, diaper
+Sem/Clth-jewl = Jewelery and similar; watch, sunglasses, ring, necklace, crown
+Sem/Clthpart = Part of clothes; pocket (NB! Ctain-abstr), bottom part of an article of clothing, upper part of trousers, seam, button
+Sem/Ctain = Container; suitcase, terrarium, closet, container, gas tank
+Sem/Ctain-abstr = Abstract container; bank account, fund, account, loan fund, pot
+Sem/Ctain-clth = E.g. pocket
+Sem/Curr = Currency; not money, euro, US dollar, denar, Danish crown, currency
+Sem/Dance = Dance; swing, rumba, ballet, belly dance, war dance
+Sem/Date = Date
+Sem/Dir = Direction; GPS course, stock exchange price, graph, tendency, starboard
+Sem/Domain = Domain like politics, reindeerherding (a system of actions), anthropology, punk rock, biology, linguistics, medicine
+Sem/Drink = Drink; tea, beer, UHT milk, soda, communion wine
+Sem/Dummytag = Default tag for uncategorized nouns
+Sem/Edu = Educational event; skiing high school skiing academy, course, music lesson, lesson, master
+Sem/Event = Event; wedding, meeting, competition, election, festival
+Sem/Feat = Feature; Árvu, age difference, homosexuality, femininity, identity, congruence
+Sem/Feat-measr = Measurable feature; radius, diameter, volume, circumference, perimeter, frequency
+Sem/Feat-phys = Physiological feature; size, color, height, shape, weight, horsepower
+Sem/Feat-psych = Psychological feature; authority, nature, childishness, creativity, arrogance
+Sem/Fem = Female name
+Sem/Food = Food; bread, vegetarian food, flour, tobacco, salt
+Sem/Food-med = Medicine; birth-control pill, asthma medicine, medicine, penicillin, vaccine
+Sem/Fruit = Fruit and fruit-like edibles
+Sem/Furn = Furniture; throne, chair, table, altar, trampoline
+Sem/Game = Game; biŋgo bingo, TV game, flipper, paintball, chess
+Sem/Geom = Geometrical object; triangle, triangle, tetrahedron, asymptote, star
+Sem/Group = Animal or Human Group; family, herd, group, indigenous people, delegation
+Sem/Hum = Human
+Sem/Hum-abstr = Human abstract
+Sem/ID = ID
+Sem/Ideol = Ideology; nomadism, buddhism, feminism, christianity, fanaticism
+Sem/Lang = Language; South Sámi, mother tongue, Maori, Yiddish, second language
+Sem/Mal = Male name
+Sem/Mat = Material for producing things; paper, steel, wood, leather, wool
+Sem/Measr = Measure; unit of volume, quarter meter, hour, minus degree, wátta watt
+Sem/Money = Has to do with money; wages, not Curr(ency), treasure, belongings, debt, food price, bill, pension
+Sem/Obj = Object; thing, cup, thing, toy, painting
+Sem/Obj-catch =
+Sem/Obj-clo = Cloth; carpet, flag, curtain, silk sheets, napkin
+Sem/Obj-cogn =
+Sem/Obj-el = (Electrical) machine or apparatus; player, lamp, TV, radio, oven
+Sem/Obj-ling = Object with something written on it
+Sem/Obj-play = Play object
+Sem/Obj-rope = flexible ropelike object; barbed wire, thread, rope, cable, dental floss
+Sem/Obj-surfc = Surface object; blackboard, drawing paper, canvas, board (for playing board games), (money) bill
+Sem/Org = Organisation; newspaper, supreme court, company, museum, administration
+Sem/Part = Feature, part of something; half, part, percent, rest, tenth
+Sem/Perc-cogn = Cloth
+Sem/Perc-emo = Emotional perception; fear, feeling of identity, empathy, sadness, working motivation
+Sem/Perc-phys = Physical perception; shoulder pain, gas smell, sleep (during the night), need of sleep, hunger
+Sem/Perc-psych = Psychic perception
+Sem/Plant = Plant; lichen, plant, cucumber, bluebell, poisonous mushroom
+Sem/Plantpart = Plant part; twig, leaf, root, seed, pine trunk
+Sem/Play = Play
+Sem/Plc = Place; world, nature, airport, workplace, fireplace
+Sem/Plc-abstr = Abstract place; bachelor level, job market, third place, Troms website, address
+Sem/Plc-elevate = Elevated place; mountain, peak, mountain top, scree, volcano
+Sem/Plc-line = Place limitations; national border, border, finish line, stop line, equator
+Sem/Plc-water = Water; river, lake, polar sea, sea, well
+Sem/Pos = Position (as in social position job); 50% position, presidency, fixed position, kindergarten place, role in society
+Sem/Process = Process
+Sem/Prod = Product
+Sem/Prod-audio = Audio product; yoik, roar, Beatles song, Bible psalm, blues
+Sem/Prod-cogn = Cognition product; thought, decision, knowledge, lack of understanding, requirement
+Sem/Prod-ling = Linguistic product; message, question, agreement, translation, criticism
+Sem/Prod-vis = Visual product; picture, film, TV series, documentary, art
+Sem/Rel = Relation; relation, dependency, subordination, analogy, equivalence
+Sem/Route = Route-like place; street, path, corridor, bridge, winter path
+Sem/Rule = Rule or convention; cultural tradition, abortion law, EU rule, law of cosines, fair play
+Sem/Semcon = Abstract semantic concept; result, objective, reason, alternative, case
+Sem/Sign = Sign (e.g. numbers, punctuation); ID number, ČSV letters, quotation mark, hieroglyph, symbol
+Sem/Sport = Sport; table tennis, judo, motor cross, ice hockey, floorball
+Sem/State = State; hurry, captivity, anarchy, biodiversity, chaos
+Sem/State-sick = Illness; allergy, cold, autism, dementia, somnambulism
+Sem/Substnc = Substance; Water, sand, air, smoke, carbohydrate, vitamin, dust
+Sem/Sur = Surname
+Sem/Symbol = Symbol
+Sem/Time = Time; áigi time, April, half an hour, Tuesday, deadline
+Sem/Time-clock = Time
+Sem/Tool = Prototypical tool for repairing things; axe, knife, fire striker, plastic hammer, wrench
+Sem/Tool-catch = Tool used for catching; artificial fly, fishing net for cod, fishing rod, lasso used in wintertime, mouse trap
+Sem/Tool-clean = Tool used for cleaning; broom, vegetable brush, toothbrush, cleaning equipment
+Sem/Tool-it = Tool used in IT/tool within IT; IT infrastructure, analyzer, searching function, computer program, browser
+Sem/Tool-measr = Tool used for measuring; barometer, hourglass, ruler, spirit level, scale
+Sem/Tool-music = Musical instrument; shaman drum, guitar, violin, musical instrument, jaw harp
+Sem/Tool-write = Writing tool; colored pen, pencil, chalk, paintbrush, paint
+Sem/Txt = Written document; paper, book, letter, e-mail, song
+Sem/Veh = Vehicle; car, boat, sled, vehicle, draft reindeer, bicycle
+Sem/Wpn = Weapon; rifle, bow, sword, arrow, war axe
+Sem/Wthr = The Weather or the state of ground; cloudy weather, wind, driving conditions, night sunlight, rain shower
+Sem/Year = Year

Multiple Semantic tags:

+Sem/Act_Clth =
+Sem/Act_Domain =
+Sem/Act_Domain_Measr =
+Sem/Act_Event =
+Sem/Act_Feat =
+Sem/Act_Feat-psych =
+Sem/Act_Fruit =
+Sem/Act_Group =
+Sem/Act_Hum =
+Sem/Act_Hum_Obj =
+Sem/Act_Money =
+Sem/Act_Obj =
+Sem/Act_Obj-play =
+Sem/Act_Org =
+Sem/Act_Perc-emo =
+Sem/Act_Plc = A persons job is an activity, and a place as well
+Sem/Act_Prod-audio =
+Sem/Act_Prod-vis =
+Sem/Act_Route = Activity and Route, ie johtolat
+Sem/Act_Semcon =
+Sem/Act_State =
+Sem/Act_Time =
+Sem/Act_Tool-it =
+Sem/Act_Txt =
+Sem/Amount_Build =
+Sem/Amount_Semcon =
+Sem/Ani_Body =
+Sem/Ani_Body-abstr_Hum =
+Sem/Ani_Build =
+Sem/Ani_Build_Hum_Txt =
+Sem/Ani_Buildpart =
+Sem/Ani_Cat =
+Sem/Ani_Clth =
+Sem/Ani_Feat_Hum =
+Sem/Ani_Feat_Plant =
+Sem/Ani_Food =
+Sem/Ani_Group =
+Sem/Ani_Group_Hum =
+Sem/Ani_Group_Prod-vis =
+Sem/Ani_Hum =
+Sem/Ani_Hum_Plc =
+Sem/Ani_Hum_Time =
+Sem/Ani_Obj =
+Sem/Ani_Org =
+Sem/Ani_Plc =
+Sem/Ani_Plc_Txt =
+Sem/Ani_State =
+Sem/Ani_Substnc =
+Sem/Ani_Time =
+Sem/Ani_Veh =
+Sem/Aniprod_Hum =
+Sem/Aniprod_Mat =
+Sem/Aniprod_Obj =
+Sem/Aniprod_Obj-clo =
+Sem/Aniprod_Perc-phys =
+Sem/Aniprod_Plant =
+Sem/Aniprod_Plc =
+Sem/Aniprod_Plc_Route =
+Sem/Aniprod_Substnc =
+Sem/Aniprod_Substnc_Wthr =
+Sem/Body-abstr_Feat-psych =
+Sem/Body-abstr_Prod-audio_Semcon =
+Sem/Body_Body-abstr =
+Sem/Body_Buildpart =
+Sem/Body_Clth =
+Sem/Body_Clthpart =
+Sem/Body_Food =
+Sem/Body_Fruit =
+Sem/Body_Group_Hum =
+Sem/Body_Group_Hum_Time =
+Sem/Body_Hum =
+Sem/Body_Mat =
+Sem/Body_Measr =
+Sem/Body_Obj =
+Sem/Body_Obj_Tool-catch =
+Sem/Body_Org =
+Sem/Body_Part =
+Sem/Body_Plc =
+Sem/Body_Plc-elevate =
+Sem/Body_Plc_State =
+Sem/Body_State =
+Sem/Body_Time =
+Sem/Build-room_Furn =
+Sem/Build-room_Org =
+Sem/Build_Build-room =
+Sem/Build_Buildpart =
+Sem/Build_Clthpart =
+Sem/Build_Edu_Org =
+Sem/Build_Event_Org =
+Sem/Build_Obj =
+Sem/Build_Org =
+Sem/Build_Plc =
+Sem/Build_Route =
+Sem/Build_Tool =
+Sem/Build_Veh =
+Sem/Buildpart_Ctain_Obj =
+Sem/Buildpart_Obj =
+Sem/Buildpart_Obj_Plc =
+Sem/Buildpart_Part =
+Sem/Buildpart_Plc =
+Sem/Buildpart_Prod-audio =
+Sem/Cat_Edu =
+Sem/Cat_Feat =
+Sem/Cat_Feat-psych_Plc =
+Sem/Cat_Group_Hum =
+Sem/Cat_Hum =
+Sem/Cat_Measr =
+Sem/Cat_Obj =
+Sem/Cat_Plantpart =
+Sem/Cat_Plantpart_Semcon =
+Sem/Cat_Txt =
+Sem/Clth-jewl_Curr =
+Sem/Clth-jewl_Fruit =
+Sem/Clth-jewl_Money =
+Sem/Clth-jewl_Obj =
+Sem/Clth-jewl_Org =
+Sem/Clth-jewl_Plant =
+Sem/Clth_Hum =
+Sem/Clth_Obj =
+Sem/Clth_Part =
+Sem/Clth_Sur =
+Sem/Clthpart_Plc =
+Sem/Ctain-abstr_Org =
+Sem/Ctain-clth_Plant =
+Sem/Ctain-clth_Veh =
+Sem/Ctain_Feat-phys =
+Sem/Ctain_Furn =
+Sem/Ctain_Obj-surfc =
+Sem/Ctain_Plc =
+Sem/Ctain_Tool =
+Sem/Ctain_Tool-measr =
+Sem/Ctain_Txt =
+Sem/Curr_Org =
+Sem/Dance_Org =
+Sem/Dance_Prod-audio =
+Sem/Dir_Geom =
+Sem/Domain_Edu =
+Sem/Domain_Feat =
+Sem/Domain_Feat-phys =
+Sem/Domain_Food-med =
+Sem/Domain_Hum =
+Sem/Domain_Ideol =
+Sem/Domain_Obj =
+Sem/Domain_Org =
+Sem/Domain_Org_Plc-abstr =
+Sem/Domain_Perc-emo =
+Sem/Domain_Prod-audio =
+Sem/Domain_State-sick =
+Sem/Domain_Txt =
+Sem/Drink_Plant =
+Sem/Drink_Plc =
+Sem/Drink_Plc_Substnc =
+Sem/Edu_Event =
+Sem/Edu_Geom =
+Sem/Edu_Geom_Plc-line =
+Sem/Edu_Group_Hum =
+Sem/Edu_Hum =
+Sem/Edu_Mat =
+Sem/Edu_Org =
+Sem/Edu_Txt =
+Sem/Event_Food =
+Sem/Event_Hum =
+Sem/Event_Org =
+Sem/Event_Plc =
+Sem/Event_Plc-elevate =
+Sem/Event_Time =
+Sem/Feat-measr_Plc =
+Sem/Feat-phys_Food_State =
+Sem/Feat-phys_Hum =
+Sem/Feat-phys_Obj =
+Sem/Feat-phys_Plc-line =
+Sem/Feat-phys_Tool-write =
+Sem/Feat-phys_Veh =
+Sem/Feat-phys_Wthr =
+Sem/Feat-psych_Hum =
+Sem/Feat-psych_Plc =
+Sem/Feat_Hum =
+Sem/Feat_Plant =
+Sem/Food_Part =
+Sem/Food_Perc-phys =
+Sem/Food_Plant =
+Sem/Food_Substnc =
+Sem/Food_Time =
+Sem/Fruit_Sport =
+Sem/Furn_Ctain-abstr =
+Sem/Game_Obj-play =
+Sem/Geom_Obj =
+Sem/Geom_Plc =
+Sem/Group_Hum =
+Sem/Group_Hum_Org =
+Sem/Group_Hum_Plc =
+Sem/Group_Hum_Prod-vis =
+Sem/Group_Org =
+Sem/Group_Plc =
+Sem/Group_Prod-vis_Txt_Veh =
+Sem/Group_Sign =
+Sem/Group_State =
+Sem/Group_Txt =
+Sem/Hum-abstr =
+Sem/Hum_Lang =
+Sem/Hum_Lang_Plc =
+Sem/Hum_Lang_Time =
+Sem/Hum_Mat_Tool =
+Sem/Hum_Money =
+Sem/Hum_Obj =
+Sem/Hum_Obj_Plc =
+Sem/Hum_Org =
+Sem/Hum_Part =
+Sem/Hum_Plant =
+Sem/Hum_Plc =
+Sem/Hum_Rule =
+Sem/Hum_State =
+Sem/Hum_Tool =
+Sem/Hum_Tool-catch =
+Sem/Hum_Veh =
+Sem/Hum_Wthr =
+Sem/Lang_Tool =
+Sem/Lang_Tool-catch =
+Sem/Mat_Obj =
+Sem/Mat_Part =
+Sem/Mat_Plant =
+Sem/Mat_Plantpart =
+Sem/Mat_Plc =
+Sem/Mat_Tool =
+Sem/Mat_Tool-catch =
+Sem/Mat_Txt =
+Sem/Measr_Plc_Time =
+Sem/Measr_Sign =
+Sem/Measr_Time =
+Sem/Money_Obj =
+Sem/Money_Plc =
+Sem/Money_Txt =
+Sem/Obj-ling_Obj-surfc =
+Sem/Obj-play =
+Sem/Obj-play_Sport =
+Sem/Obj_Part_Sign =
+Sem/Obj_Perc-emo =
+Sem/Obj_Plantpart =
+Sem/Obj_Plc =
+Sem/Obj_Plc-abstr =
+Sem/Obj_Plc_Semcon =
+Sem/Obj_Process =
+Sem/Obj_Prod-audio =
+Sem/Obj_Semcon =
+Sem/Obj_Sign =
+Sem/Obj_State =
+Sem/Obj_Tool-music =
+Sem/Obj_Tool-write =
+Sem/Obj_Txt =
+Sem/Obj_Veh =
+Sem/Org_Play =
+Sem/Org_Plc =
+Sem/Org_Prod-audio =
+Sem/Org_Prod-cogn =
+Sem/Org_Prod-vis =
+Sem/Org_Rule =
+Sem/Org_State =
+Sem/Org_Txt =
+Sem/Org_Veh =
+Sem/Part_Plc =
+Sem/Part_Plc_Prod-audio =
+Sem/Part_Prod-cogn =
+Sem/Part_Substnc =
+Sem/Part_Txt =
+Sem/Perc-emo_Plc =
+Sem/Perc-emo_State =
+Sem/Perc-emo_Wthr =
+Sem/Plant_Plantpart =
+Sem/Plant_Plc =
+Sem/Plant_Time_Wthr =
+Sem/Plant_Tool =
+Sem/Plant_Tool-measr =
+Sem/Plc-abstr_Rel_State =
+Sem/Plc-abstr_Route =
+Sem/Plc-abstr_Rule =
+Sem/Plc-abstr_State =
+Sem/Plc-abstr_Txt =
+Sem/Plc_Pos =
+Sem/Plc_Route =
+Sem/Plc_State =
+Sem/Plc_Substnc =
+Sem/Plc_Substnc_Wthr =
+Sem/Plc_Time =
+Sem/Plc_Time_Wthr =
+Sem/Plc_Tool-catch =
+Sem/Plc_Txt =
+Sem/Plc_Wthr =
+Sem/Prod-audio_Prod-vis =
+Sem/Prod-audio_Substnc =
+Sem/Prod-audio_Txt =
+Sem/Prod-cogn_Txt =
+Sem/Route_State =
+Sem/Route_Txt =
+Sem/Rule_Txt =
+Sem/Semcon_State =
+Sem/Semcon_Txt =
+Sem/State-sick_Substnc =
+Sem/State_Veh =
+Sem/Substnc_Wthr =
+Sem/Time_Wthr =
+Sem/Tool-music =
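
The two conventions in this section, that a multiple tag joins its categories with `_` and that the semantic tag sits just before the POS tag, can be checked mechanically. The Python sketch below is a minimal illustration. The analysis strings are made up for the example, `POS_TAGS` is an assumed and incomplete set, and both function names are hypothetical.

```python
import re

SEM = re.compile(r"\+Sem/([^+]+)")
# Assumed subset of POS tags; the real inventory is larger.
POS_TAGS = {"+N", "+A", "+V", "+Adv"}


def sem_categories(analysis):
    """Return the categories of the +Sem/ tag, splitting multiple tags on '_'."""
    match = SEM.search(analysis)
    return match.group(1).split("_") if match else []


def sem_precedes_pos(analysis):
    """Check that a +Sem/ tag, if present, is directly followed by a POS tag."""
    tags = ["+" + t for t in analysis.split("+")[1:]]
    for i, tag in enumerate(tags):
        if tag.startswith("+Sem/"):
            return i + 1 < len(tags) and tags[i + 1] in POS_TAGS
    return True  # nothing to check when there is no semantic tag


# Hypothetical analysis strings, for illustration only.
print(sem_categories("lemma+Sem/Body-abstr_Feat-psych+N+Sg+Nom"))
print(sem_precedes_pos("lemma+N+Sem/Ani+Sg+Nom"))
```

Hyphens inside a category name (as in `Body-abstr`) are kept; only the underscore separates categories.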

Not sure which section this goes in: (before POS)

+Allegro from LEXICON GOADE-IU-

Derivation tags

The following tags are used to describe the dynamic derivational system in Lule Sámi as encoded in this lexical description. The tags are classified according to a positional system, where each tag can be in one and only one position, and can only combine with tags from an earlier / lower position. This is done to avoid possible overgeneration in the derivational system.

+Der1 +Der2 +Der3 +Der4 +Der5: - positional tags, precede the actual der tag
Der#1 tags - tags in first position
+Der/Caus VV - previously Der/ahtte
+Der/PassD VV - dallat passive
+Der/PassL VV - long passive láhpeduvvat
+Der/PassS VV - Short passive láhpput
+Der/adda VV
+Der/ahtja VV - only odd syll verbs take this der
+Der/ahttjá VV - only odd syll verbs take this der
+Der/alla VV
+Der/asste VV
+Der/d VV
+Der/dalla VV
+Der/dasste VV
+Der/l VV
+Der/ladda VV
+Der/lahtte VV
+Der/lasste VV
+Der/st VV
+Der/stahtte VV
+Der/stalla VV
+Der/stasste VV
+Der/tj VV
+Der/u/a/åd VV
+Der/r VN - NomAg contracted verbs - guollit-guollár
+Der/NomInstr VN - even verbs:fierrot-fierun
+Der/Dimin NN
+Der/gusj Prop -I
+Der/lasj NN
+Der/k NN / NA
+Der/Car NA - abessive. Only even/contr. Morén-Duolljá:caritive
+Der/ferjak NA Adjectival -k der (from ?)
+Der/lasj NA - status uncertain in the source: Der/lasj is defined twice, once as a noun derivation and once as an adjective derivation
+Der/n NA. Denominal -n adjective (similar to -k adj)
+Der/sasj NA
+Der/segak NA Adj. -k der from?
+Der/Comp AA
+Der/Superl AA

Der#2 tags - tags in second position

+Der/dahtte VV
+Der/duhtte VV
+Der/ahkes VA
+Der/NomAct VN
Der#3 tags - tags in third position
+Der/duvva VV
+Der/InchL VV (previously Der/goahte)
+Der/mus VN
+Der/NomAg VN -diddje
+Der/dahka VN
+Der/NomAct VN Realised in two different ways. This realisation is Der3. Commented out to avoid defining the tag twice, but kept here for documentation purposes.
+Der/lis VA
Der#4 tags - tags in fourth position
+Der/ahtes NA ! Abessive. Even verbs>NomAct>ahtes and odd-syll nouns. Morén-Duolljá:Caritive
Der#5 tags - tags in fifth position
+Der/AAdv NA AAdv, previously +Der/at
+Der/akti NumAdv Adv
+Der/vuota NA AN (tag harmonization: previously Der/vuohta)
Der#other tags - tags that can be in any position

There are no such tags in SMJ, but for symmetry and code coherence with SME the class is still kept.
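
The positional system described at the start of this section can be pictured as a check that the positions in a derivation chain strictly increase. The Python sketch below is an illustration only: `POSITION` lists just a handful of the tags from the lists above, `valid_derivation_chain` is a hypothetical name, and the sketch ignores the part-of-speech signatures (VV, VN, NA, ...). Reading "earlier / lower position" as ruling out two tags from the same position is this sketch's interpretation.

```python
# A few tags per position, copied from the lists above (not exhaustive).
POSITION = {
    "+Der/Caus": 1, "+Der/PassL": 1, "+Der/PassS": 1, "+Der/Dimin": 1,
    "+Der/dahtte": 2, "+Der/duhtte": 2, "+Der/ahkes": 2,
    "+Der/duvva": 3, "+Der/InchL": 3, "+Der/NomAg": 3,
    "+Der/ahtes": 4,
    "+Der/AAdv": 5, "+Der/vuota": 5,
}


def valid_derivation_chain(tags):
    """True if each tag's position is strictly higher than the previous one."""
    positions = [POSITION[t] for t in tags]
    return all(a < b for a, b in zip(positions, positions[1:]))


print(valid_derivation_chain(["+Der/Caus", "+Der/NomAg"]))
print(valid_derivation_chain(["+Der/NomAg", "+Der/Caus"]))
```

Rejecting non-increasing chains is what keeps the derivational system from looping back and overgenerating.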

Tags for originating language

The following tags are used to guide conversion to IPA: loan words and foreign names are usually pronounced (approximately) as in the originating (majority) language. Instead of trying to identify the correct pronunciation based on phonotactics (orthotactics actually), we tag all words that can’t be correctly transcribed using the SME transcriber with source language codes. Once tagged, it is possible to split the lexical transducer into smaller ones according to language, and to apply a different IPA conversion to each of them. The principle of tagging is that we only tag to the extent needed, following this priority:

any untagged word is pronounced with SME orthographic conventions
NNO and NOB have identical pronunciation, NNO is only used if different in spelling from NOB
SWE has mostly the same pronunciation as NOB, and is only used if different in spelling from NOB
Occasionally even SME (the default) may be tagged, to block other languages from being specified, mainly during semi-automatic language tagging sessions

All in all, we want to get as much as possible correctly transcribed to IPA with as little work as possible. On the other hand, if more words are tagged than strictly needed, this should pose no problem as long as the IPA conversion is correct - at least some words will get the same pronunciation whether read as SME or NOB/NNO/SWE.

- +OLang/SME - North Sámi
- +OLang/SMS - Skolt Sámi
- +OLang/SMA - South Sámi
- +OLang/SJE - Pite Sámi
- +OLang/FIN - Finnish
- +OLang/SWE - Swedish
- +OLang/NOB - Norw. bokmål
- +OLang/NNO - Norw. nynorsk
- +OLang/ENG - English
- +OLang/HUN - Hungarian
- +OLang/RUS - Russian
- +OLang/UND - Undefined
- +OLang/PARA - parallel names; the name should not be transferred to other Sámi languages
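
The idea of splitting analyses by originating language, with untagged words defaulting to SME, can be sketched as follows. This Python sketch works on plain analysis strings rather than on the transducer itself. The example strings and the position of the +OLang tag inside them are invented for illustration, and `source_language` and `split_by_language` are hypothetical names.

```python
import re

OLANG = re.compile(r"\+OLang/([A-Z]+)")


def source_language(analysis, default="SME"):
    """Return the +OLang code of an analysis, or the default if untagged."""
    match = OLANG.search(analysis)
    return match.group(1) if match else default


def split_by_language(analyses):
    """Group analyses by originating language, one group per IPA converter."""
    groups = {}
    for analysis in analyses:
        groups.setdefault(source_language(analysis), []).append(analysis)
    return groups


# Hypothetical analysis strings; the tag order is an assumption.
print(split_by_language([
    "goahte+N+Sg+Nom",
    "Oslo+N+Prop+OLang/NOB+Sg+Nom",
]))
```

Each group could then be handed to the IPA conversion for its language.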

Flag diacritics

Tags from SME, coming to SMJ via proper nouns.

Flag diacritics

We have manually optimised the structure of our lexicon using the following flag diacritics to restrict morphological combinatorics - only allow compounds with verbs if the verb is further derived into a noun again:

Flag diacritic	Explanation
@P.NeedNoun.ON@	(Dis)allow compounds with verbs unless nominalised
@D.NeedNoun.ON@	(Dis)allow compounds with verbs unless nominalised
@C.NeedNoun@	(Dis)allow compounds with verbs unless nominalised
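
As a rough illustration of how such flags restrict a path through the lexicon, here is a minimal Python sketch of the standard P, D, R, C and U flag operations. It is a toy evaluator, not the HFST implementation; negative setting (N) is left out, and the path strings with `verb` and `nominaliser` are hypothetical placeholders.

```python
import re

# One flag diacritic: @OP.Feature@ or @OP.Feature.Value@
FLAG = re.compile(r"@([PDRCU])\.([^.@]+)(?:\.([^.@]+))?@")


def path_is_allowed(path):
    """Evaluate the flag diacritics on one lexicon path, left to right."""
    state = {}  # feature name -> current value
    for op, feature, value in FLAG.findall(path):
        current = state.get(feature)
        if op == "P":    # positive set
            state[feature] = value
        elif op == "C":  # clear
            state.pop(feature, None)
        elif op == "D":  # disallow: fail if the feature has this value (or any value)
            if current is not None and (value == "" or current == value):
                return False
        elif op == "R":  # require: fail unless the feature has this value (or any value)
            if current is None or (value != "" and current != value):
                return False
        elif op == "U":  # unify: set if unset, otherwise the values must agree
            if current is None:
                state[feature] = value
            elif current != value:
                return False
    return True


blocked = path_is_allowed("@P.NeedNoun.ON@verb@D.NeedNoun.ON@")
allowed = path_is_allowed("@P.NeedNoun.ON@verb+nominaliser@C.NeedNoun@@D.NeedNoun.ON@")
```

In the first path the flag set by the verb is still active when `@D.NeedNoun.ON@` is reached, so the path is rejected; in the second the nominalisation clears the flag first.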

Flag diacritic	Explanation
@P.Pmatch.Loc@	Used on multi-token analyses; tells hfst-tokenise/pmatch where in the form/analysis the token should be split. Used e.g. in `bijladagi` to split `bijla` from `dagi`, or in abbreviations ending in a full stop, just before the full stop, to allow an alternative `+CLB` analysis of it in case of a sentence-final abbreviation. NB! This gives a faulty lemma for the abbreviation, as it will not include the full stop. This can lead to other issues, but at present we have no other solution if we want to keep the full stop as a separate token. We could leave a full stop at the end of the abbreviation lemma as well (but not on the input side, where we only have one full stop). That must be tested; it could work, but it would require special attention when generating suggestions in e.g. grammar checkers, which should not generate two full stops.
@P.Pmatch.Backtrack@	Used on single-token analyses; tells hfst-tokenise/pmatch to backtrack by reanalysing the substrings before and after this point in the form (to find combinations of shorter analyses that would otherwise be missed)
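
The effect of the split marker can be illustrated with a toy Python function. This is not how hfst-tokenise works internally; it only assumes, for the sake of the example, that the marker is visible as a literal string inside the form, and the abbreviation `abbr` is a hypothetical placeholder.

```python
LOC = "@P.Pmatch.Loc@"


def split_at_loc(marked_form):
    """Split a form at every @P.Pmatch.Loc@ marker, dropping empty pieces."""
    return [piece for piece in marked_form.split(LOC) if piece]


compound = split_at_loc("bijla@P.Pmatch.Loc@dagi")
abbreviation = split_at_loc("abbr@P.Pmatch.Loc@.")
```

The second call shows the abbreviation case: the full stop becomes its own token, which is what allows the alternative `+CLB` analysis, and also why the abbreviation lemma ends up without its full stop.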

Flag diacritic	Explanation
@D.ErrOrth.ON@	To be written
@R.ErrOrth.ON@	To be written
@C.ErrOrth@	To be written
@P.ErrOrth.ON@	To be written

For languages that allow compounding, the following flag diacritics are needed to control position-based compounding restrictions for nominals. Their use is handled automatically if combined with +CmpN/xxx tags. If not used, they will do no harm.

Flag diacritic	Explanation
@P.CmpFrst.FALSE@	Require that words tagged as such only appear first
@D.CmpPref.TRUE@	Block such words from entering ENDLEX
@P.CmpPref.FALSE@	Block these words from making further compounds
@D.CmpLast.TRUE@	Block such words from entering R
@D.CmpNone.TRUE@	Combines with the next tag to prohibit compounding
@U.CmpNone.FALSE@	Combines with the prev tag to prohibit compounding
@U.CmpNone.TRUE@	Combines with the two previous ones to block compounding
@P.CmpOnly.TRUE@	Sets a flag to indicate that the word has passed R
@D.CmpOnly.FALSE@	Disallow words coming directly from root.
@U.CmpHyph.FALSE@	Flag to control hyphenated compounds like proper nouns
@U.CmpHyph.TRUE@	Flag to control hyphenated compounds like proper nouns
@C.CmpHyph@	Flag to control hyphenated compounds like proper nouns

Use the following flag diacritics to control downcasing of derived proper nouns (e.g. Finnish Pariisi -> pariisilainen). See e.g. North Sámi for how to use these flags. There exists a ready-made regex that will do the actual down-casing given the proper use of these flags.

Flag diacritic	Explanation
@U.Cap.Obl@	Disallow downcasing of names when not derived: Deatnu
@U.Cap.Opt@	Allow downcasing of derived names: deatnulasj
@P.Px.add@	Allow Px suffixes (all except Nom 3.p)
@R.Px.add@	Require the P.Px.add flag for Px suffixes (all except Nom 3.p)
@P.Nom3Px.add@	Allow Px suffixes for Nom 3.p
@R.Nom3Px.add@	Require the P.Nom3Px.add flag for Px suffixes for Nom 3.p
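
The intended effect of the two Cap flags can be sketched as follows. The ready-made regex itself is not reproduced in this documentation, so the Python below is only an assumption about the outcome (which surface variants are accepted), with a hypothetical function name.

```python
def surface_variants(form, flags):
    """Return the capitalisation variants permitted by the Cap flags on a path."""
    if "@U.Cap.Opt@" in flags and form[:1].isupper():
        # Derived name: both the capitalised and the downcased initial are allowed.
        return [form, form[:1].lower() + form[1:]]
    # @U.Cap.Obl@ (or no Cap flag): keep the form as written.
    return [form]


underived = surface_variants("Deatnu", {"@U.Cap.Obl@"})
derived = surface_variants("Deatnulasj", {"@U.Cap.Opt@"})
```

Here `underived` contains only `Deatnu`, while `derived` contains both `Deatnulasj` and `deatnulasj`.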

@R.SpellRlx.ON@ Flag used to tag spell-relax-analysed strings (and only those).
@D.SpellRlx.ON@ Flag used to tag spell-relax-analysed strings (and only those).
@C.SpellRlx@ Flag used to tag spell-relax-analysed strings (and only those).
@R.SpaceCmp.ON@ Flag to tag compounds written with a space
@D.SpaceCmp.ON@ Flag to tag compounds written with a space
@C.SpaceCmp@ Flag to tag compounds written with a space

Flag diacritic	Explanation
@U.number.one@	Flag used to give Arabic numerals in smj different cases
@U.number.two@	Flag used to give Arabic numerals in smj different cases
@U.number.three@	Flag used to give Arabic numerals in smj different cases
@U.number.four@	Flag used to give Arabic numerals in smj different cases
@U.number.five@	Flag used to give Arabic numerals in smj different cases
@U.number.six@	Flag used to give Arabic numerals in smj different cases
@U.number.seven@	Flag used to give Arabic numerals in smj different cases
@U.number.eight@	Flag used to give Arabic numerals in smj different cases
@U.number.nine@	Flag used to give Arabic numerals in smj different cases
@U.number.zero@	Flag used to give Arabic numerals in smj different cases

@P.number.one@	Flag used to give Arabic numerals in smj different cases
@P.number.two@	Flag used to give Arabic numerals in smj different cases
@P.number.three@	Flag used to give Arabic numerals in smj different cases
@P.number.four@	Flag used to give Arabic numerals in smj different cases
@P.number.five@	Flag used to give Arabic numerals in smj different cases
@P.number.six@	Flag used to give Arabic numerals in smj different cases
@P.number.seven@	Flag used to give Arabic numerals in smj different cases
@P.number.eight@	Flag used to give Arabic numerals in smj different cases
@P.number.nine@	Flag used to give Arabic numerals in smj different cases
@P.number.ten@	Flag used to give Arabic numerals in smj different cases

Lexicon `Root`

The beginning of everything. Every FST defined in LexC must start with the reserved lexicon name Root.

LEXICON Acronym
LEXICON ProperNoun

Lexicon ENDLEX

And this is the ENDLEX of everything:

@D.CmpOnly.FALSE@@D.CmpPref.TRUE@@D.NeedNoun.ON@ ENDLEX2 ;

The @D.CmpOnly.FALSE@ flag diacritic is used to prevent words tagged with +CmpNP/Only from ending here. The @D.NeedNoun.ON@ flag diacritic is used to block illegal compounds.
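
The check performed by the three D flags on the ENDLEX entry can be written out as a small Python sketch. It assumes that the flag state has already been built up by earlier P or U flags on the path; the function name and the dictionary representation are hypothetical.

```python
# D flags on the ENDLEX entry quoted above: (feature, disallowed value).
ENDLEX_DISALLOW = [("CmpOnly", "FALSE"), ("CmpPref", "TRUE"), ("NeedNoun", "ON")]


def may_enter_endlex(state):
    """True if none of the D flags on the ENDLEX entry blocks this flag state."""
    return all(state.get(feature) != value for feature, value in ENDLEX_DISALLOW)


plain_word = may_enter_endlex({})
compound_only = may_enter_endlex({"CmpOnly": "FALSE"})
after_compounding = may_enter_endlex({"CmpOnly": "TRUE"})
```

A word with no flags set passes, a word still carrying `CmpOnly = FALSE` is stopped, and the same word passes once its flag has been set to `TRUE`.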

ENDLEX2

ENDLEX3

ENDLEX4

This (part of) documentation was generated from src/fst/morphology/root.lexc

src-fst-morphology-stems-adjectives.lexc.md

vájnno vájnno vájnno

This (part of) documentation was generated from src/fst/morphology/stems/adjectives.lexc

src-fst-morphology-stems-adverbs.lexc.md

sme mojonjálmmiid

This (part of) documentation was generated from src/fst/morphology/stems/adverbs.lexc

src-fst-morphology-stems-nouns.lexc.md

**LEXICON Noun ** dividing in NounNoPx, NounPx (with a P.Px.add flag) and NounPxKin (with a P.Nom3Px.add flag)

LOAN SWE altar https://satni.uit.no/termwiki/index.php?title=Dihtorteknologiija_ja_diehtoteknihkka:fáddágilkor https://satni.uit.no/termwiki/index.php?title=Ásttoáigi_ja_faláštallan:gilbossjalljo https://satni.uit.no/termwiki/index.php?title=Ásttoáigi_ja_faláštallan:gilbossjalljo https://satni.uit.no/termwiki/index.php?title=Ásttoáigi_ja_faláštallan:gilbossjalljo LOAN LOAN LOAN

This (part of) documentation was generated from src/fst/morphology/stems/nouns.lexc

src-fst-morphology-stems-numerals.lexc.md

XXXtuvsánat

LEXICON ARABIC-COLL

This (part of) documentation was generated from src/fst/morphology/stems/numerals.lexc

src-fst-morphology-stems-pronouns.lexc.md

Reciprocal pronouns as multiword expression

This (part of) documentation was generated from src/fst/morphology/stems/pronouns.lexc

src-fst-morphology-stems-smj-abbreviations.lexc.md

File containing Lule Sami abbreviations

Lexica for adding tags and periods

Splitting in 4 + 1 groups, because of the preprocessor

**LEXICON Abbreviation-smj **
1. The ITRAB ; lexicon (intransitive abbrs)
2. The TRNUMAB ; lexicon (abbrs trans wrt. numerals)
3. The TRAB ; lexicon (transitive abbrs)
4. The NOAB ; lexicon (not really abbrs)
5. The NUMNOAB ; lexicon (not behaving as abbr before num)

The abbreviation lexicon itself

**LEXICON ITRAB ** are intransitive abbreviations, A.S. etc.
**LEXICON NOAB ** du, gen, jur

This class contains homonyms, which are both intransitive abbreviations and normal words. The abbreviation usage is less common, and thus only the occurrences in the middle of the sentence (when the next word is in lower case) can be considered true cases.

**LEXICON TRNUMAB ** contains abbreviations that are transitive in front of numerals

For abbrs for which numerals are complements, but other words not necessarily are. This group treats arabic numerals as if it were transitive but letters as if it were intransitive.

**LEXICON TRAB ** contains transitive abbreviations

This lexicon is for abbrs that always have a constituent following them.

**LEXICON NUMNOAB ** su, dii

This class contains homonyms, which are both abbrs for which numerals are complements and normal words. The abbreviation usage is less common, and thus only the occurrences in the middle of the sentence can be considered true cases.
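
To make the class distinctions concrete, here is a hedged Python sketch of how a preprocessor might decide whether the full stop after an abbreviation also ends the sentence. The decision rules are a simplified reading of the class descriptions above (NUMNOAB is left out), and the function name and the capital-letter heuristic are assumptions, not the actual preprocessor logic.

```python
def period_is_sentence_final(abbr_class, next_token):
    """Guess whether the full stop after an abbreviation also ends the sentence."""
    starts_lower = next_token[:1].islower()
    starts_digit = next_token[:1].isdigit()
    if abbr_class == "TRAB":
        # Transitive: a constituent always follows, so the sentence continues.
        return False
    if abbr_class == "TRNUMAB" and starts_digit:
        # Transitive in front of numerals.
        return False
    if abbr_class in ("ITRAB", "TRNUMAB", "NOAB"):
        # Assumption: a following lower-case word means we are mid-sentence.
        return not starts_lower
    raise ValueError("class not covered by this sketch: " + abbr_class)
```

For example, `period_is_sentence_final("TRNUMAB", "5")` returns `False`, while the same class followed by a capitalised word is treated like an intransitive abbreviation and returns `True`.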

This (part of) documentation was generated from src/fst/morphology/stems/smj-abbreviations.lexc

src-fst-phonetics-acro2ipa.xfscript.md

Phonological ACRO converter for Julev Sámi

Converts ACROs to IPA. Intended for use with TTS. `>` marks the underlying morpheme boundary between lemma and inflectional suffix; `:` marks the same boundary in the surface orthography. The idea is that the pronunciation of the last letter sound (like e: when reading the letter P) can differ depending on whether a case ending follows. If that is not the case, the system can be simplified.

Default, letter by letter pronunciation

This (part of) documentation was generated from src/fst/phonetics/acro2ipa.xfscript

src-fst-phonetics-smj2sampa-from-old-infra.xfscript.md

retroflex plosive, voiceless	t`	ʈ	0288, 648 (` = ASCII 096)
retroflex plosive, voiced	d`	ɖ	0256, 598
labiodental nasal	F	ɱ	0271, 625
retroflex nasal	n`	ɳ	0273, 627
palatal nasal	J	ɲ	0272, 626
velar nasal	N	ŋ	014B, 331
uvular nasal	N\	ɴ	0274, 628

bilabial trill	B\	ʙ	0299, 665
uvular trill	R\	ʀ	0280, 640
alveolar tap	4	ɾ	027E, 638
retroflex flap	r`	ɽ	027D, 637
bilabial fricative, voiceless	p\	ɸ	0278, 632
bilabial fricative, voiced	B	β	03B2, 946
dental fricative, voiceless	T	θ	03B8, 952
dental fricative, voiced	D	ð	00F0, 240
postalveolar fricative, voiceless	S	ʃ	0283, 643
postalveolar fricative, voiced	Z	ʒ	0292, 658
retroflex fricative, voiceless	s`	ʂ	0282, 642
retroflex fricative, voiced	z`	ʐ	0290, 656
palatal fricative, voiceless	C	ç	00E7, 231
palatal fricative, voiced	j\	ʝ	029D, 669
velar fricative, voiced	G	ɣ	0263, 611
uvular fricative, voiceless	X	χ	03C7, 967
uvular fricative, voiced	R	ʁ	0281, 641
pharyngeal fricative, voiceless	X\	ħ	0127, 295
pharyngeal fricative, voiced	?\	ʕ	0295, 661
glottal fricative, voiced	h\	ɦ	0266, 614

alveolar lateral fricative, vl.	K
alveolar lateral fricative, vd.	K\

labiodental approximant	P (or v\)
alveolar approximant	r\
retroflex approximant	r\`
velar approximant	M\

retroflex lateral approximant	l`
palatal lateral approximant	L
velar lateral approximant	L\

Clicks

bilabial	O\ (O = capital letter)
dental	|\
(post)alveolar	!\
palatoalveolar	=\
alveolar lateral	|\|\

Ejectives, implosives

ejective	_>	e.g. ejective p: p_>
implosive	_<	e.g. implosive b: b_<

Vowels

close back unrounded	M
close central unrounded	1
close central rounded	}
lax i	I
lax y	Y
lax u	U

close-mid front rounded	2
close-mid central unrounded	@\
close-mid central rounded	8
close-mid back unrounded	7

schwa	@

open-mid front unrounded	E
open-mid front rounded	9
open-mid central unrounded	3
open-mid central rounded	3\
open-mid back unrounded	V
open-mid back rounded	O

ash (ae digraph)	{
open schwa (turned a)	6

open front rounded	&
open back unrounded	A
open back rounded	Q

Other symbols

voiceless labial-velar fricative	W
voiced labial-palatal approx.	H
voiceless epiglottal fricative	H\
voiced epiglottal fricative	<\
epiglottal plosive	>\

alveolo-palatal fricative, vl.	s\
alveolo-palatal fricative, voiced	z\
alveolar lateral flap	l\
simultaneous S and x	x\
tie bar	_

Suprasegmentals

primary stress	"
secondary stress	%
long	:
half-long	:\
extra-short	_X
linking mark	-\

Tones and word accents

level extra high	_T
level high	_H
level mid	_M
level low	_L
level extra low	_B
downstep	!
upstep	^ (caret, circumflex)

contour, rising	_R
contour, falling	_F
contour, high rising	_H_T
contour, low rising	_B_L
contour, rising-falling	_R_F

(NB Instead of being written as diacritics with _, all prosodic marks can alternatively be placed in a separate tier, set off by < >, as recommended for the next two symbols.)

global rise	<R>
global fall	<F>

Diacritics

voiceless	_0 (0 = figure), e.g. n_0
voiced	_v
aspirated	_h
more rounded	_O (O = letter)
less rounded	_c
advanced	_+
retracted	_-
centralized	_"
syllabic	= (or _=), e.g. n= (or n_=)
non-syllabic	_^
rhoticity	`

breathy voiced	_t
creaky voiced	_k
linguolabial	_N
labialized	_w
palatalized	' (or _j), e.g. t' (or t_j)
velarized	_G
pharyngealized	_?\

dental	_d
apical	_a
laminal	_m
nasalized	~ (or _~), e.g. A~ (or A_~)
nasal release	_n
lateral release	_l
no audible release	_}

velarized or pharyngealized	_e
velarized l, alternatively	5
raised	_r
lowered	_o
advanced tongue root	_A
retracted tongue root	_q
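
The correspondence between SAMPA symbols, IPA characters and Unicode code points in the symbol list above can be checked with a short Python sketch. It uses only a handful of the listed consonants, converts in the SAMPA-to-IPA direction purely for illustration, and is not the transducer defined in the xfscript file; the function and variable names are hypothetical.

```python
# (SAMPA symbol, Unicode code point) pairs copied from the symbol list above.
SAMPA_CODEPOINTS = [
    ("N\\", 0x0274),  # uvular nasal
    ("N", 0x014B),    # velar nasal
    ("J", 0x0272),    # palatal nasal
    ("S", 0x0283),    # postalveolar fricative, voiceless
    ("Z", 0x0292),    # postalveolar fricative, voiced
    ("T", 0x03B8),    # dental fricative, voiceless
    ("D", 0x00F0),    # dental fricative, voiced
]

SAMPA_TO_IPA = {sampa: chr(code) for sampa, code in SAMPA_CODEPOINTS}
# Try longer symbols first so that "N\" is not read as "N" followed by "\".
SYMBOLS_BY_LENGTH = sorted(SAMPA_TO_IPA, key=len, reverse=True)


def sampa_to_ipa(text):
    """Replace every known SAMPA symbol by its IPA character, longest match first."""
    out = []
    i = 0
    while i < len(text):
        for symbol in SYMBOLS_BY_LENGTH:
            if text.startswith(symbol, i):
                out.append(SAMPA_TO_IPA[symbol])
                i += len(symbol)
                break
        else:
            out.append(text[i])  # not in the table: copy unchanged
            i += 1
    return "".join(out)
```

Characters that are not in the small table, such as plain vowels, are passed through unchanged.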

This (part of) documentation was generated from src/fst/phonetics/smj2sampa-from-old-infra.xfscript

src-fst-phonetics-txt2ipa.xfscript.md

Phonological converter for Julev Sámi

Converts to IPA. Mainly intended for use with TTS.

This (part of) documentation was generated from src/fst/phonetics/txt2ipa.xfscript

src-fst-transcriptions-clock-from-old-infra.lexc.md

At some points we will need the genitives, for approximate numbers. Here they are.

avta guovte gålmå nielje vidá gudá gietja gávtse avtse låge lågenanavta lågenanguovte
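
A minimal Python lookup over the forms above, assuming (this is not stated explicitly in the source) that they are listed in order from 1 to 12; the function name is hypothetical.

```python
# Genitive forms as listed above, assumed to correspond to the numbers 1..12.
GENITIVES = (
    "avta guovte gålmå nielje vidá gudá gietja gávtse avtse låge "
    "lågenanavta lågenanguovte"
).split()


def hour_genitive(hour):
    """Return the genitive form for a clock hour between 1 and 12."""
    if not 1 <= hour <= 12:
        raise ValueError("hour must be between 1 and 12")
    return GENITIVES[hour - 1]
```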

This (part of) documentation was generated from src/fst/transcriptions/clock-from-old-infra.lexc

src-fst-transcriptions-transcriptor-abbrevs2text.lexc.md

We describe here how abbreviations in Lule Sami are read out, e.g. for text-to-speech systems.

For abbrs for which numerals are complements, but other words not necessarily are. This group treats arabic numerals as if it were transitive but letters as if it were intransitive.

This lexicon is for abbrs that always have a constituent following it

This (part of) documentation was generated from src/fst/transcriptions/transcriptor-abbrevs2text.lexc

src-fst-transcriptions-transcriptor-acro2text.lexc.md

We describe here how abbreviations in Lule Sami are read out, e.g. for text-to-speech systems.

This (part of) documentation was generated from src/fst/transcriptions/transcriptor-acro2text.lexc

src-fst-transcriptions-transcriptor-date-digit2text.lexc.md

This is still a dummy file.

This (part of) documentation was generated from src/fst/transcriptions/transcriptor-date-digit2text.lexc

src-fst-transcriptions-transcriptor-numbers-digit2text.lexc.md

This (part of) documentation was generated from src/fst/transcriptions/transcriptor-numbers-digit2text.lexc

src-fst-transcriptions-transcriptor-symbols2text.lexc.md

This file contains mappings from abbreviations and some acronyms to full forms for text-to-speech purposes. This is a supplement to the analyser; the analyser must tag the strings as +ABBR or similar for the transcriptions to work. The resulting full forms must be lemmas known to the analyser, for further processing.

We describe here how abbreviations in Lule Sami are read out, for text-to-speech systems.

The file contains:

miscellaneous symbols

Smileys

Clause boundary symbols
Single punctuation marks
Paired punctuation marks

This (part of) documentation was generated from src/fst/transcriptions/transcriptor-symbols2text.lexc

src-fst-transcriptions-transcriptor-ttsdate-digit2text.lexc.md

This is still a dummy file.

This (part of) documentation was generated from src/fst/transcriptions/transcriptor-ttsdate-digit2text.lexc

tools-grammarcheckers-grammarchecker.cg3.md

LULE SAAMI GRAMMAR CHECKER

Lule Sami language model documentation

src-cg3-disambiguator.cg3.md

Semantic tags

Removing Err/Orth

src-fst-morphology-affixes-abbreviations.lexc.md

Continuation lexicons for abbreviations

Lexica for adding tags and periods

The sublexica

Continuation lexicons for abbrs both with and without final period

Lexicons without final period

Lexicons with final period

src-fst-morphology-affixes-adjectives.lexc.md

Sublexica for Adjective

Even-syllable stems

Correctly assimilated loanwords, derived from real noun.

Badly assimilated loanwords, some against norm others with no norm

Odd-syllable stems

With CG Sorted by attr

Contracted stems

Comparison

Superlative

Derivation of adjectives

src-fst-morphology-affixes-nouns.lexc.md

Sublexica for Noun

Even-syllable stems

2syll stems

Even-syllable test examples:*

Nouns with comparatives

Plural stems

Partially assimilated loanwords. The first part of the word is “citation borrowed” and keeps its Norwegian/Swedish orthography; only the last two syllables are adapted to Sami.

Loanwords becoming odd-syll

Loanwords becoming contracted-syll

See further down: ÅLMÅJ_LOAN

Error-lexicons, made to not get too many entries with both Err/Orth and correct

Badly assimilated loanwords

Unassimilated loanwords

4syll stems

4 syllable plurals

Adjectival sublexica. Give 4 syll adjectives inflection

Compound lexica

Odd-syllable stems

without cg

with cg

Irregular stems

Plural odd-syll

Contracted stems

Contracted stems sublexica

Px lexica

src-fst-morphology-affixes-pronouns.lexc.md

src-fst-morphology-affixes-propernouns.lexc.md

Lule Sáme Proper noun morphology

Even syllable proper nouns

Odd syllable case marking

Even-syllable test examples:*

Even-syllable test examples:*

Even-syllable test examples:*

Even-syllable test examples:*

Even-syllable test examples:*

Even-syllable test examples:*

Contracted proper nouns

Lexicons only used in sme/sma and that are sent to other lexicons in smj

src-fst-morphology-affixes-symbols.lexc.md

Symbol affixes

src-fst-morphology-affixes-verbs.lexc.md

Sublexica for Verb

Auxiliary verbs

Negation verb

Copula

Modals

Ordinary main verbs

Even-syllable stems

Odd-syllable stems

Contracted stems

src-fst-morphology-compounding.lexc.md

src-fst-morphology-phonology.twolc.md

Lule Sámi morphophonological rule set

Background

Declarations and definitions

The Alphabet section

The real Lule Sámi Alphabet

Lexicon `Root`