Lule Sami NLP Grammar

Finite state and Constraint Grammar based analysers, proofing tools and other resources

View the project on GitHub giellalt/lang-smj


Background

The immediate background for this list is the meeting of the Lule Sámi language board. The list will remain relevant after that meeting, though. It is a key document for the normativity issues faced by the Sámi spellchecker program, and it lists the open normativity issues.

Even and contracted stem types in nom. and gen./acc. when joined with a possessive suffix

The normative grammars are far from precise on this issue with respect to stem-vowel change and central consonant gradation in each case. We are currently using a system developed for the translation of the Bible.

Status/actions:

  1. Fully described with examples
  2. Sent to Giellalávdegoddi

Loan word integration

What policy should be followed here? No normative decisions have been made. The writers of Sámásta say that principles have been developed over time, but the practice is unstable and there seems to be a great deal of confusion (p. 272 ff.). The book gives some suggestions for how to incorporate loanwords, but evidently no normative body has been involved.

Status/actions:

  1. Fully described with examples
  2. Sent to Giellalávdegoddi

Dialect specific forms

According to two main sources, the same word can be written differently in Sweden and Norway, for example “luohttedahtte” and “kasus” in Sweden and “luohtedahtte” and “kásus” in Norway. We have added both forms, but this causes some problems. If someone in Norway writes “luohttedahtte”, the speller does not mark it in red, even though the usual form in Norway is “luohtedahtte”. The same problem arises in Sweden, in the opposite direction. This means that the speller does not correct words that are written differently in Sweden and Norway, even when the form is “wrong” for the country of the writer.
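The problem can be illustrated with a minimal sketch in Python. This is not how the actual speller is built; the toy lexicon, the country codes `SE` and `NO`, and the function names `accepted_merged` and `accepted_for` are assumptions made up for the illustration. Only the four word forms come from the text above.

```python
# Hypothetical toy lexicon: each spelling is mapped to the set of countries
# where, according to the sources mentioned above, it is the usual form.
LEXICON = {
    "luohttedahtte": {"SE"},
    "luohtedahtte": {"NO"},
    "kasus": {"SE"},
    "kásus": {"NO"},
}


def accepted_merged(word):
    """Behaviour described above: any listed form is accepted everywhere."""
    return word in LEXICON


def accepted_for(word, country):
    """Possible alternative: accept a form only where it is the usual one."""
    return country in LEXICON.get(word, set())


# A writer in Norway types the Swedish form.
print(accepted_merged("luohttedahtte"))     # accepted, so not marked
print(accepted_for("luohttedahtte", "NO"))  # rejected for Norway
print(accepted_for("luohtedahtte", "NO"))   # accepted for Norway
```

Whether a country-specific lexicon of this kind is desirable is exactly the open normativity question. The sketch only shows why a merged word list cannot flag the "other" form.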

Status/actions:

  1. Fully described with examples

Short forms

Some words are being shortened, for example riek (riekta) and guok (guokta). “Tjuohte” is also shortened to “tjuot” in compounds. To some degree these shortened words are used in dictionaries and grammars. Is it allowed to use these shortened forms in writing?

Status/actions:

  1. Fully described with examples

Case forms of abbreviations, acronyms and numerals

Should the case forms of abbreviations, acronyms and numerals be written with a colon, a hyphen or an apostrophe? Example: NRK:as vs NRK-as vs NRK’as

Should consonant-final abbreviations, acronyms and numerals spell out the epenthetic -a- or not? Example: NRK:s vs NRK:as
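The two questions combine into a small set of candidate spellings per word form. The following Python sketch enumerates them for a consonant-final abbreviation; the function name `candidate_forms` and its parameters are hypothetical, and the sketch makes no claim about which candidate is correct.

```python
def candidate_forms(abbreviation, suffix, epenthetic="a"):
    """List every separator choice, with and without the epenthetic vowel.

    Illustration only: the separators are the three mentioned in the text
    (colon, hyphen, apostrophe), and the epenthetic vowel defaults to -a-.
    """
    forms = []
    for separator in (":", "-", "’"):
        forms.append(f"{abbreviation}{separator}{suffix}")
        forms.append(f"{abbreviation}{separator}{epenthetic}{suffix}")
    return forms


for form in candidate_forms("NRK", "s"):
    print(form)
```

A normative decision would reduce these six candidates to one, which is the form the speller could then accept and suggest.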

Status/actions:

  1. Fully described with examples

Proper name + noun

How should compounds with various kinds of proper names be written? With a hyphen (Windows-prográmma, MEGA-bargge, Elle-áhkko, Finlandia-dåhpe, Adidas-gáma, Oslo-biila, Vuona-ládde etc.) or in another way (such as Oslo biila, Vuona ládde etc.)? Is it different for different kinds of first parts: place, organisation, personal name, surname, object etc.? Is it different for different kinds of second parts?

Status/actions:

  1. Fully described with examples