North Sami NLP Grammar

Finite state and Constraint Grammar based analysers, proofing tools and other resources

Background

The immediate background for this list is the meeting of the Sámi language board in Guovdageaidnu in October 2005. The list will remain relevant after that meeting, though. It is a key document for the normativity issues faced by the Sámi spellchecker program, and it lists the open normativity issues.

In some cases it is also a question of digging up old documents and finding earlier decisions, because a normativity question may already have been decided without really being enforced, so that practice is unstable.

In a mail from the administration of SGL dated 09.03.2006 it is stated that words already existing in dictionaries must be accepted: “Muhtinráje luoikkassániiin leat juo anus dihto čállinvugiin, vrd čállinvuogi sátnegirjjiin, iige daid sániid čállinvuogi leat nu álki rievdadišgoahtit. Mii čujuhit dohkkehuvvon tearbmalisttuide, sátnegirjjiide, ja giellalávdegotti mearrádusaide.” (≈ To some degree loanwords are already in use and written in certain ways, for example in dictionaries, and it is not so easy to change the way they should be written. We refer to accepted terminology lists, dictionaries and decisions made by SGL.)

We interpret this to mean that Nickel’s grammar should also be followed regarding specific forms.

Orthographical conventions

Short passive forms

Should the speller accept short passive forms like this:

mannojun vs. mannojuvvon

And short forms like this:

čállon vs. čállojuvvon

Status/actions:

  1. Fully described with examples
  2. Sent to Giellalávdegoddi
  3. Decided on Giellalávdegoddi meeting 131205
  4. Form -jun is not acceptable, čállon is acceptable

Gerund forms -dettiin vs. -diin/-din

Should the speller accept short gerunds?

manadiin/manadin

Nickel’s grammar, for example, shows both -dettiin and -diin/-din, with the long form slightly predominating.

Status/actions:

  1. Fully described with examples
  2. Sent to Giellalávdegoddi
  3. Decided on Giellalávdegoddi meeting 131205
  4. All/Both forms acceptable

Placename+noun: Kárášjoh nieida vs. Kárášjohnieida

Should the place name and the noun be written separately or not? Nickel does not separate them:

romssabárdni

Status/actions:

  1. Fully described with examples
  2. Sent to Giellalávdegoddi
  3. Decided on Giellalávdegoddi meeting 131205
  4. Should be written Kárášjoga nieida

Verb genitive and neg. and imp. 2p. forms of verbs ending in -idit

Verbs ending in -idit could theoretically have verb genitive and neg. and imp. 2p.sg forms ending either in -d or in -t:

skilaidit → skilait, háliidit → háliit

skilaidit → skilaid, háliidit → háliid

In our corpus there are both forms.

Status/actions:

  1. Fully described with examples
  2. Decided on Sámiid X. konferánsa 20-220678 and Giellalávdegoddi and Sámiráđđi meeting 18-191084
  3. End consonant is always -t, except in acc/gen pl.
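
The decided spelling can be illustrated with a minimal Python sketch. It assumes, based only on the two examples given above (skilaidit, háliidit), that the form is obtained by replacing the final -dit of the infinitive with -t. The function name `short_form_idit` is hypothetical and is not taken from the speller source.

```python
def short_form_idit(infinitive: str) -> str:
    """Return the verb genitive / neg. / imp. 2p.sg form of an -idit verb.

    Assumption: the form is the infinitive with final -dit replaced by -t,
    which matches the two examples in the text.
    """
    if not infinitive.endswith("idit"):
        raise ValueError(f"not an -idit verb: {infinitive!r}")
    # Drop the final "dit" and add the decided end consonant -t.
    return infinitive[:-3] + "t"


for verb in ("skilaidit", "háliidit"):
    print(verb, "->", short_form_idit(verb))
```

The acc/gen pl. forms in -d mentioned in the decision are noun forms and are not covered by this sketch.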

Adjectival forms -eabbo/-abbo vs. -eabbu/-abbu

This concerns the second stem vowel in comparative forms of odd-stem adjectives:

garraseabbo/garrasabbo vs. garraseabbu/garrasabbu

In the eighties these forms were written predominantly with -u, but newer normative publications, like Nickel’s grammar from 1993, show predominantly -o.

Status/actions:

  1. Fully described with examples
  2. Should be written -eabbo/-abbo. This is how it is written in Nickel’s grammar (see “Background”).

Writing of acronyms: NRK:as

This is actually two sub-issues:

Status/actions:

  1. Fully described with examples
  2. Sent to Giellalávdegoddi
  3. Partially decided (the acronyms) on Giellalávdegoddi meeting 131205
  4. Nom. NSR, Acc. NSR, Gen. NSR, Ill. NSR:i, Loc. NSR:s, Com. NSR:in, Ess. NSR:n
  5. Numerals will be written in the same way as acronyms.
  6. Loanwords ending in certain light vowels will be treated as indigenous words, e.g. Bedriftshelsiin.
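
The acronym paradigm in the decision can be written down as a small lookup table. The following Python sketch is only an illustration of that table: the dictionary `ACRONYM_SUFFIXES`, the function `inflect_acronym` and the English case labels are assumptions made for this example, not part of the speller.

```python
# Case suffixes for acronyms as listed in the decision.
# An empty suffix means the bare acronym is used.
ACRONYM_SUFFIXES = {
    "Nom": "",
    "Acc": "",
    "Gen": "",
    "Ill": "i",
    "Loc": "s",
    "Com": "in",
    "Ess": "n",
}


def inflect_acronym(acronym: str, case: str) -> str:
    """Attach the case suffix to an acronym, separated by a colon."""
    suffix = ACRONYM_SUFFIXES[case]
    return f"{acronym}:{suffix}" if suffix else acronym


for case in ACRONYM_SUFFIXES:
    print(case, inflect_acronym("NSR", case))
```

Only the seven forms listed in the decision are covered here.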

Second syllable deletion in compounds

There have not been any normative decisions regarding second syllable deletion in compounding. Čállinrávagirji of the Norwegian Sámediggi says that forms with second syllable deletion should be avoided when compounding, and gives the example johgáddi vs. johkagáddi. Čállinrávagirji only touches on the subject in a few lines, with just this one example. At the same time there are definitely cases where the forms with second syllable deletion have become the norm, as in sotnabeaiskuvla (sotnabeaivviskuvlla). The question is where to draw the line. Should second syllable deletion occur only in specific compounds, in specific words or in specific consonant clusters? If so: in which compounds, words or clusters, and how? Pekka Sammallahti’s article on compounding (Sátnegoallosteapmi ja čállinvuohki, in Tearbmasympisia raporta, Dieđut nr. 3 1994, p. 35 ff. Sámi Instituhtta, Alta: 1994) has many examples of forms with a deleted second syllable that have become the norm.

1. bálddoalgái vs. bálddaoalgái……………4A

2. čázoaivi vs. čázeoaivi……………………….2

3. gabboaivi vs. gabbaoaivi………………….1

4. garroaivi vs. garraoaivi……………………..1

5. gaskoapmi vs. gaskaoapmi………………4A

6. guoddolggoš vs. *guoddáolggoš……..1

7. guoikkoaivi vs. guoikkaoaivi…………….4A

8. mieseadni vs. mieseeadni………………..1

9. muorroaivi vs. muorraoaivi……………….1

10. námmoaivi vs. námmeoaivi……………1

11. šlubboaivi vs. šlubbooaivi………………1

12. liigieres vs. liigegieres……………………4A

13. niiboagán vs. niibeboagán…………….4A

14. risbárdni vs. ristabárdni………………….4A

15. oaivvuloš vs. oaivevuloš………………..4A

16. njárgeahči vs. njárgageahči…………..4A

17. lihdoaŋggat vs. lihpedoaŋggat………2

18. čipbealli vs. cippebealli………………….1

19. vuohppbealli vs. vuohppebealli………2

Using Nickel’s system (Samisk Grammatikk p. 27 ff. Davvi Girji o.s. 1994), the consonant clusters are divided into different groups. These are indicated by the numbers in the right column. Here we can see that second syllable deletion does not always behave the same way within a group: clusters belonging to one group are not pronounced/written in the same way from compound to compound. Difference in case is the most obvious cause of this, as in (3): nominative case vs. (8): genitive case. But the difference in pronunciation/spelling also depends on the overall phonotactics of the larger environment, as can be seen in (5): nominative case vs. (16): nominative case.

In our corpus we see that second syllable deletion is not at all unusual. Here are a few examples that reinforce what has already been said regarding the consonant cluster types and the differences in pronunciation/spelling:

20. guovttgielat vs. guovttigielat………….4D

21. guoihgáddi vs. guoikagáddi………….4A

22. gaskabeai- vs. gaskabeaivvi………..4A

23. geassesaj- vs. geassesaji…………….3A

24. maŋŋegeaš- vs. maŋŋegeahči……..2

We see here that there are more consonant cluster types than in Sammallahti’s examples. The corpus also shows that there is some confusion about how second syllable deletion should be written, for example (21): guoikgáddi (?).

Other examples of this phenomenon can be found in the grammars of Konrad Nielsen (Lærebok i Lappisk, Bind 1, Grammatikk p. 287 ff. Universitetsforlaget Oslo 1979) and Klaus Peter Nickel (Samisk Grammatikk p. 387 Davvi Girji o.s. 1994). Nickel, however, reviews second syllable deletion only in the second part of three-part compounds.

Status/actions:

  1. Fully described with examples
  2. Decision on Giellalávdegoddi meeting 181194: “not necessary to use apostrophe, possible to write čipbealit, niiboagán, vilbealle. h-like sound should be written -h: čiehgahpir, johgáddi, vuohbealle”.
  3. Following the mail from the administration of SGL 09.03.2006 (see “Background”), most of the words listed above will be accepted by the speller. The speller will hence accept individual compound words with second syllable deletion, and the acceptance will not be phonologically based on certain clusters.

Second syllable deletion in specific numerals

This issue is very similar to the issue of second syllable deletion in compounds:

1. vihttanuppelot vs. vihttanuppelohkái………..2

2. viđanuppeloht vs. viđanuppelohkái………….2

Should the speller accept the deletion in these cases? How should it be written? What about when the word is predicative, and when it is an attribute?

Status/actions:

  1. Fully described with examples
  2. vihttanuppelot in Nickel (see “Background”) as attribute, hence can be used

Dialectal variation

Diphthong simplification -eddjiid vs. no diphthong simplification -eaddjiid

Usually there is diphthong simplification in the first syllable when there is -ii in the second syllable:

geavtit → gevttii, beahci → beziid

In Eastern dialects, however, this rule does not apply when the central consonant cluster is in the strongest grade, grade III, in all inflectional forms:

oahpaheaddji → oahpaheaddjiid

Western dialects here show the forms with diphthong simplification:

oahpaheaddji → oahpaheddjiid

In grade III, Western dialects show diphthong simplification before -ui in the second syllable as well:

goargŋu → gorgŋuid

Here too the Eastern dialects lack diphthong simplification:

goargŋu → goargŋuid

Normative publications (for example Nickel 1993) show predominantly the Western forms.

Status/actions:

  1. Fully described with examples
  2. Sent to Giellalávdegoddi
  3. Decided on Giellalávdegoddi meeting 131205
  4. Forms with diphthong simplification shall be used
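
The relation between the Eastern and the Western (normative) forms in the grade III examples above can be sketched as a toy string rule in Python. This is an illustration only: it covers just the two vowel pairs seen in the examples (ea → e, oa → o), it ignores consonant gradation, and it is not how a finite-state analyser would model the alternation. The names `SIMPLIFIED` and `simplify_last_diphthong` are hypothetical.

```python
# Diphthong -> simplified vowel, limited to the pairs seen in the examples.
SIMPLIFIED = {"ea": "e", "oa": "o"}


def simplify_last_diphthong(form: str) -> str:
    """Simplify the right-most diphthong in `form` (toy rule, see lead-in)."""
    # Find the right-most occurrence among the known diphthongs.
    pos, diphthong = max((form.rfind(d), d) for d in SIMPLIFIED)
    if pos < 0:
        return form  # no known diphthong, nothing to simplify
    return form[:pos] + SIMPLIFIED[diphthong] + form[pos + len(diphthong):]


for eastern in ("oahpaheaddjiid", "goargŋuid"):
    print(eastern, "->", simplify_last_diphthong(eastern))
```

Applied to the Eastern forms oahpaheaddjiid and goargŋuid, the rule yields the Western forms oahpaheddjiid and gorgŋuid listed above.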

Adjectival forms -at/-ut vs. -et/-it

These are the attributive short forms of odd-stem adjectives, and the division is between Western and Eastern dialects:

Western: garraset/garrasit

Eastern: garrasat/garrasut

Nickel’s grammar shows all these forms.

Status/actions:

  1. Fully described with examples
  2. All forms are in Nickel’s grammar (see “Background”) and are hence acceptable.

Consonant h vs. š between second and third syllable

The normal spelling is with š: jurddašit and dárbbašit. Dictionaries often give both variants, though, for example jurddahit (but not dárbbahit). Should the speller accept just š or both?

Status/actions:

  1. Fully described with examples
  2. Sent to Giellalávdegoddi
  3. Decided on Giellalávdegoddi meeting 131205
  4. Both forms are acceptable

Conditional forms -lit vs. -šit

These are conditional forms and the division is again between Western and Eastern dialects:

Eastern: manašit

Western: manalit

Here Nickel’s grammar shows both forms, with the Eastern form slightly predominating.

Status/actions:

  1. Fully described with examples
  2. Both forms are in Nickel’s grammar (see “Background”) and are hence acceptable.

Conditional forms with diphthong and long -u stem vowel

Usually there is diphthong simplification in u-verbs in the conditional forms. This is because the long -u shortens to -o.

soabbut → soppolin/soppošin

Some Western dialects, however, lack this shortening of the stem vowel and hence also the diphthong simplification:

soabbut → soappulin/soappušin

Nickel’s grammar does not have these Western forms.

Status/actions:

  1. Fully described with examples
  2. Not in Nickel (see “Background”), hence not acceptable

Conditional alternative form livččon vs. livččen

These are different conditional forms of the verb leat/leahkit. Nickel’s grammar shows both forms, with a predominance of livččen.

Status/actions:

  1. Fully described with examples
  2. Sent to Giellalávdegoddi
  3. Decided on Giellalávdegoddi meeting 131205
  4. Only “normal” forms are acceptable: livččen

Plosives in beginning of word: pievlat vs. bievlat and politihkka vs. bolitihkka

Unvoiced plosives are frequent in the Eastern dialects. Nickel only mentions them when he is discussing indigenous words:

pievlat

When discussing loanwords, though, he writes them with the original unvoiced plosives:

kušta, poasta, teaksta

So this is actually two sub-issues:

  1. indigenous words
  2. loanwords

Status/actions:

  1. Fully described with examples
  2. Decided on Sámiid X. konferánsa 20-220678 and Giellalávdegoddi and Sámiráđđi meeting 18-191084
  3. Consonants at the beginning of words are b-, d-, g-. In recent loanwords it is acceptable to write p-, t-, k- too: kirku.

Central consonant softening: rievan vs. rieban

In Eastern dialects central consonant -b- softens to -v-:

rievan

Nickel only mentions this.

Status/actions:

  1. Fully described with examples
  2. Sent to Giellalávdegoddi
  3. Decided on Sámiid X. konferánsa 20-220678 and Giellalávdegoddi and Sámiráđđi meeting 18-191084
  4. No softening

Du1p of odd-syllable verbs

Du1p of odd-syllable verbs shows many parallel forms:

hálede, hálededne, háledetne

Nickel has all three forms.

Status/actions:

  1. Fully described with examples
  2. Sent to Giellalávdegoddi
  3. Decided on Giellalávdegoddi meeting 131205
  4. Only one form should be allowed: -detne

Imperative forms

Imperative forms differ quite a lot:

Even-syllable verbs (Nickel shows all forms):

Pl1: bohtot (Western), boahttout (Eastern)

Pl2: bohtet (Western), boahttit (Eastern)

Odd-syllable verbs (Nickel shows some):

Du1: háliideadnu, háliideahkku

Pl1: háliideadnot, háliideahkkot, háliidehkot, háliidetnot

Pl2: háliideahkket, háliidehket

Leat/Leahkit (Nickel shows all forms):

Du1: leadnu, leahkku

Pl1: lehkot, leatnot

Status/actions:

  1. Fully described with examples
  2. Forms that are in Nickel (see “Background”) can be used.

Noun shortforms in acc/gen

In the Eastern dialects odd-syllable nouns have a short form in acc/gen:

beakkán, luopmán

The Western form, which is also found in Nickel, is:

beakkána, luopmána

Status/actions:

  1. Fully described with examples
  2. Only long forms are acceptable, according to Nickel’s grammar (see “Background”)

Caritative nom. form -hin vs. -heapmi

Some Western dialects have a short nominative caritative form:

jeagohin

Nickel here only gives the long counterpart -heapmi.

Status/actions:

  1. Fully described with examples
  2. The form that should be used is -heapmi, since -hin is not in Nickel’s grammar (see “Background”)

Hyphenation

dj and lj

Nickel p. 33: “dj og lj bør helst ikke deles, selv om de betegner en dobbeltkonsonant.” (≈ dj and lj should preferably not be split, even though they represent a double consonant.)

Status/actions:

  1. Fully described with examples
  2. Sent to Giellalávdegoddi
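
Nickel's recommendation could be applied as a filter on candidate hyphenation points. The Python sketch below is a minimal illustration under stated assumptions: the function names are hypothetical, the candidate positions in the example are arbitrary indices chosen for demonstration (not real hyphenation points), and the issue itself is still undecided according to the status above.

```python
from __future__ import annotations

# Letter pairs that should preferably not be split, following Nickel p. 33.
UNSPLITTABLE = ("dj", "lj")


def allowed_break(word: str, pos: int) -> bool:
    """Return False if a break before index `pos` would split dj or lj."""
    pair = word[pos - 1:pos + 1]  # the letters on either side of the break
    return pair not in UNSPLITTABLE


def filter_breaks(word: str, candidates: list[int]) -> list[int]:
    """Keep only the candidate break positions that do not split dj or lj."""
    return [p for p in candidates if allowed_break(word, p)]


# Arbitrary candidate positions, for illustration only.
print(filter_breaks("oahpaheaddji", [3, 9, 10]))
```

In this example position 10 falls between d and j and is therefore dropped, while the other two candidates are kept.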