GiellaLT

GiellaLT provides an infrastructure for rule-based language technology aimed at minority and indigenous languages, and streamlines building anything from keyboards to speech technology.

Historical Sámi orthographies

Overview

Linguistically, there are differences within several of the orthographies listed below; the grouping here is made from an OCR point of view.

  1. The Leem orthography (Norway, until 1832)
  2. The Stockfleth/Friis orthography (Norway, 1832-2000)
  3. The Nielsen orthography (Konrad Nielsen's publications)
  4. The Ravila/Itkonen orthography (Finland, 1934-1978)
  5. The Bergsland/Ruong orthography (Norway-Sweden, 1948-1978)
  6. The present 1979 orthography

Priority: We have important dictionaries in orthographies 1, 2, 3, 5 and 6. The important text corpora are in orthographies 2, 4 and 6.

TODO: Find out how many OCR models we need, and make them.

The Leem orthography (Norway, until 1832)

This orthography uses the Danish letters plus palatalisation accents.

The Stockfleth/Friis orthography (Norway, 1832-2000)

This was the dominant orthography until 1948. After 1949, its use was restricted to religious literature ("Nuorttanaste" and related texts).

Alphabet

A a	B b	C c	Č č	D d	Đ đ	E e	F f
G g	Ǥ ǥ	H h	I i	J j	K k	L l	M m
N n	Ƞ ƞ	O o	P p	R r	S s	Š š	T t
Ŧ ŧ	U u	V v	Ʒ ʒ	Å å	Æ æ	Ø ø

There are training data for this orthography in tesstrain. TODO: The Stockfleth dictionary.

The Nielsen orthography (Konrad Nielsen's publications)

Alphabet

Nothing done so far.

The Ravila/Itkonen orthography (Finland, 1934-1978)

Alphabet

Nothing done so far.

The Bergsland/Ruong orthography (Norway-Sweden, 1948-1978)

Most letters are the same as for Friis (ǥ is gone), but many glyphs differ from those of the 19th century. Both the dialect basis and the orthographic rules are new, and the bigram pattern is thus new as well.

Alphabet

A a	Á á	B b	C c	Č č	D d	Đ đ	E e
F f	G g	H h	I i	J j	K k	L l	M m
N n	Ƞ ƞ	O o	P p	R r	S s	Š š	T t
Ŧ ŧ	U u	V v	Z z	Ž ž	Æ æ	(Ø ø)	Å å

TODO: Frette.

The present 1979 orthography

The letters and the glyphs are the same as for Bergsland/Ruong, but the bigram pattern is different.
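
To make the notion of a "bigram pattern" concrete, here is a minimal sketch of how character bigram profiles of two text samples could be compared. The function names (bigram_counts, relative_frequencies, profile_distance) and the "#" word-boundary marker are assumptions made for illustration; they are not part of the GiellaLT infrastructure or of tesstrain.

```python
from collections import Counter


def bigram_counts(text: str) -> Counter:
    """Count character bigrams per word, with '#' marking word boundaries."""
    counts = Counter()
    for word in text.lower().split():
        padded = f"#{word}#"  # hypothetical boundary marker
        counts.update(padded[i:i + 2] for i in range(len(padded) - 1))
    return counts


def relative_frequencies(counts: Counter) -> dict:
    """Turn raw bigram counts into relative frequencies."""
    total = sum(counts.values())
    if total == 0:
        return {}
    return {bigram: n / total for bigram, n in counts.items()}


def profile_distance(a: dict, b: dict) -> float:
    """Sum of absolute frequency differences over all bigrams in either profile."""
    return sum(abs(a.get(k, 0.0) - b.get(k, 0.0)) for k in set(a) | set(b))
```

Two samples written with the same letters but different spelling rules would then yield a non-zero profile_distance, which is one rough way to judge whether a separate OCR model might be worthwhile.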

Alphabet

А а	Á á	B b	C c	Č č	D d	Đ đ	E e
F f	G g	H h	I i	J j	K k	L l	M m
N n	Ŋ ŋ	O o	P p	R r	S s	Š š	T t
Ŧ ŧ	U u	V v	Z z	Ž ž
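
An alphabet table like the one above can be used to check whether a text, or OCR output, stays within a given orthography. The following is a minimal sketch under stated assumptions: the names ALPHABET_1979 and letters_outside are hypothetical, and the letter set is simply the lowercase letters from the table above.

```python
import unicodedata

# Lowercase letters of the present 1979 orthography, taken from the table above.
ALPHABET_1979 = set("aábcčdđefghijklmnŋoprsštŧuvzž")


def letters_outside(text: str, alphabet: set) -> list:
    """Return the sorted letters in `text` that are not in `alphabet`.

    The text is NFC-normalised first, so that a base letter followed by a
    combining accent compares equal to the precomposed letter.
    Digits, punctuation and whitespace are ignored.
    """
    normalised = unicodedata.normalize("NFC", text).lower()
    return sorted({ch for ch in normalised if ch.isalpha() and ch not in alphabet})
```

An empty result means every letter belongs to the alphabet. A result containing, for example, æ or ø would suggest that the text is written in one of the older orthographies, or that it contains non-Sámi material.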