Skolt Sami NLP Grammar

Finite state and Constraint Grammar based analysers, proofing tools and other resources

View the project on GitHub giellalt/lang-sms

Skolt Sami language model documentation

All doc-comment documentation in one large file.


src-cg3-disambiguator.cg3.md

Skolt Sámi disambiguator

Note: This documentation file is still work-in-progress, and should not yet be used. Read the source file instead.

Delimiters

DELIMITERS = "<.>" "<!>" "<?>" "<…>" "<¶>";
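As a rough illustration of what the delimiter list does, the sketch below cuts a stream of wordforms into windows, closing a window after each delimiter. It is a minimal Python approximation and not part of the grammar; the function name `split_windows` and the sample wordforms are hypothetical.

```python
# Wordforms that close a window, copied from the DELIMITERS line above.
DELIMITERS = {"<.>", "<!>", "<?>", "<…>", "<¶>"}


def split_windows(wordforms):
    """Group wordforms into windows, ending each window at a delimiter."""
    windows, current = [], []
    for wordform in wordforms:
        current.append(wordform)
        if wordform in DELIMITERS:
            windows.append(current)
            current = []
    if current:  # trailing material without a closing delimiter
        windows.append(current)
    return windows


if __name__ == "__main__":
    # Placeholder wordforms, not real Skolt Sami tokens.
    print(split_windows(["<a>", "<.>", "<b>", "<?>", "<c>"]))
```

Disambiguation rules normally see one window at a time, which is why the delimiter inventory matters for rule contexts.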

Tags and sets

We declare BOS, EOS and all the tags from the fst.

Disambiguation

Cycle 0, rules without context

Possessive suffix

Probably exists only for Refl and for kinship terms. In Skolt Sami, possessive suffixes ARE USED (Jaska, 2020-11-08).

Pronouns and nouns

Postpositions

Short Pronouns

No rules.

Proper nouns

Cycle 1

Numerals

Trivialia

Nouns

Nominative plural

Genitive

Verbs

Imperative

There can be Interj, VOC,

Genitive modifier

Subject

M A P P I N G

CC- and CS-Mapping

CASES

PrfPrc

Person

Nomen

Verb or Noun

Dem

No rules

CC and CS or Adv

Adj or Adv

Grammatical word or N or A

N or Adj

N or V

Ger or Der/NomAct

Adj or Indef

Num

Rel or Interr

Interj

no rule

Po or Pr

Adv or Po/Pr

Com

Accusative or illative

Accusative or Genitive

Indef or Adv

special lemmas

no rules

Verb person vs. Inf – moved here in order to have the pronouns disambiguated first.

Proper nouns

Rule set taken from sme

Substituting Prop tags

Prop or not

Removing proper nouns that are lookalikes

Particular proper nouns

Todo: sms-ify.

Mapping rules

SAFE RULES

subject rules and spred rules

Removing Err/Orth

This rule removes Err/Orth when the lemma is the same, even if the morphology is different.
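The behaviour described for this rule can be approximated outside CG3. The following Python sketch is only an illustration under stated assumptions: readings are modelled as dictionaries with hypothetical `lemma` and `tags` keys, and the real rule in disambiguator.cg3 may carry extra contextual conditions that are not reproduced here.

```python
def remove_err_orth(readings):
    """Drop Err/Orth readings when a non-Err/Orth reading has the same lemma.

    The morphological tags of the two readings are allowed to differ.
    """
    # Lemmas that have at least one reading without the Err/Orth tag.
    clean_lemmas = {r["lemma"] for r in readings if "Err/Orth" not in r["tags"]}
    return [
        r for r in readings
        if not ("Err/Orth" in r["tags"] and r["lemma"] in clean_lemmas)
    ]


if __name__ == "__main__":
    # Placeholder lemma and tags, chosen only to exercise the function.
    cohort = [
        {"lemma": "x", "tags": ["N", "Sg", "Nom"]},
        {"lemma": "x", "tags": ["N", "Sg", "Gen", "Err/Orth"]},
    ]
    print(remove_err_orth(cohort))
```

Because a reading is removed only when a clean reading with the same lemma exists, the sketch never empties a cohort.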


This (part of) documentation was generated from src/cg3/disambiguator.cg3


src-fst-morphology-affixes-acronyms.lexc.md

Skolt Saami acronyms

The lexica giving tags and suffixes to the acronyms

+N+ABBR+Sg+Gen:%> # ;

+N+ABBR+Sg+Loc:%> # ;

+N+ABBR+Ess:%> # ;

+N+ABBR+Par:%> # ;

+N+ABBR+Pl+Nom:%> # ;

+N+Prop:%> ACCRADECL ;

+N+Prop:%> BERN-UCASE ;

+N+Prop:%> LONDON-UCASE ;

+N+Prop:%> NYSTØ-OBL ;
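Each entry above pairs an analysis side with a surface side and names a continuation lexicon. To make that three-part shape explicit, here is a minimal Python sketch that splits one entry into its parts. The function name `parse_lexc_entry` is hypothetical, and the sketch deliberately ignores lexc escaping beyond what these entries need (for example, an escaped colon would not be handled).

```python
def parse_lexc_entry(entry):
    """Split 'UPPER:LOWER CONTLEX ;' into (upper, lower, continuation lexicon)."""
    body = entry.strip().rstrip(";").strip()
    # The continuation lexicon is the last whitespace-separated field.
    pair, contlex = body.rsplit(None, 1)
    # The first colon separates the analysis side from the surface side.
    upper, _, lower = pair.partition(":")
    return upper, lower, contlex


if __name__ == "__main__":
    for line in ["+N+ABBR+Sg+Gen:%> # ;", "+N+Prop:%> NYSTØ-OBL ;"]:
        print(parse_lexc_entry(line))
```

A continuation of `#` ends the word, while a named continuation such as `NYSTØ-OBL` hands the entry on to another lexicon.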


This (part of) documentation was generated from src/fst/morphology/affixes/acronyms.lexc


src-fst-morphology-affixes-adjectives.lexc.md

Skolt Saami adjective declension

These lexica come directly from the XML-to-lexc XSL transformation.

CLASS 1 HIGH VOWEL, NO PALATALIZATION NOMINALS

CLASS 1 LOW VOWEL, NO PALATALIZATION NOMINALS

CLASS 1 LOW VOWEL, PALATALIZATION, ILLATIVE IN U NOMINALS

CLASS 1 HIGH VOWEL, PALATALIZATION NOMINALS

CLASS 1 LOW VOWEL, PALATALIZATION NOMINALS

no separate attribute form 2018-10-13 Russian loanword

CLASS denominals in -i, cf. Feist (2012: 198-199). These will need their own expansions. HOW DOES JIÕʹNNI decline?

CLASS

-õs ending

WORK HERE 2015-10-14 deverbals

check this 2015-11-10

2. WORDS WITH MULTI-SYLLABLE NOMINATIVE SINGULARS (2009: 293)

2.3 Sg.Loc in -est. e-stems (Sg.Loc, Ess, Par).

2.3.2 Sg.Ill in -a

2.3.2.1 Has Gradation

2.3.2.1.1 Second syllable vowel loss (Sg.Ill, Sg.Loc, Sg.Com; Ess, Par; Pl.Obl)

CLASS 11 ADJECTIVES

säʹmmlaž:säʹmml

ânnʼjõž:ânnʼj

muõrâž:muõr

Class 12 Feist 163

Sammallahti 2010: 151

N›A derivation in +Der+Der/N2A

1A (Feist 2011: 198-199)

1B (Feist 2011: 198-199)

1C (Feist 2011: 198-199)

determiner

determiner

determiner

determiner determiner

CLASS 1 HIGH VOWEL, NO PALATALIZATION NOMINALS

pa%{a0%}%{ʹ0%}%{p0%}p V%{V0%}%{ʹ0%}%{C0%}C

Sg_Nom: high-vowel=yes monophthong=yes long-vowel=yes palatalization=no consonantism=quant-gem long-consonant=yes

1.1.1.1.1.1. Sg_Nom="short_vowel geminate" Sg_Gen="long_vowel geminate"

1. WORDS WITH SINGLE-SYLLABLE NOMINATIVE SINGULARS (2009: 167)

1.1 Sg.Loc in -âst (no vowel shift, all raised)

â-stems (Sg.Loc, Ess, Par).

1.1.1 Sg.Ill vowel -e

1.1.1.1 Has Palatalization

1.1.1.1[1] (Palatalization pattern)

Palatalized: Sg.Ill Not Palatalized: ELSE Sg.Ill in palatalization and -e

1.1.1.1[1].1 Lacks Specifically Pedagogical Gradation

1.1.1.1[1].1.1 Has Orthographic Gradation

1.1.1.1[1].1.1[] (Monophthong + Consonant Geminate alternation)

Extra strong grade: Sg.Nom, Ess, Par Extra strong grade: Sg.Ill Strong grade: Pl.Nom, Sg.Loc, Sg.Com
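Grade summaries like the line above run several "grade: slots" groups together. To read them as a mapping from paradigm slot to grade, a small Python sketch such as the following could be used. The function name `parse_grade_line` is hypothetical, and the sketch assumes the only labels are "Extra strong", "Strong" and "Weak", which is what this documentation shows.

```python
import re

# Matches the grade labels used in the summaries (case-sensitive).
GRADE_LABEL = re.compile(r"(Extra strong|Strong|Weak) grade:")


def parse_grade_line(line):
    """Map each paradigm slot in a grade summary line to its grade label."""
    # re.split with one capturing group yields ['', label, slots, label, slots, ...].
    parts = GRADE_LABEL.split(line)
    mapping = {}
    for label, slots in zip(parts[1::2], parts[2::2]):
        for slot in slots.split(","):
            slot = slot.strip()
            if slot:
                mapping[slot] = label.lower()
    return mapping


if __name__ == "__main__":
    summary = (
        "Extra strong grade: Sg.Nom, Ess, Par Extra strong grade: Sg.Ill "
        "Strong grade: Pl.Nom, Sg.Loc, Sg.Com"
    )
    for slot, grade in parse_grade_line(summary).items():
        print(slot, grade)
```

The same function applies unchanged to the other grade summaries in this file, since they differ only in which label each group carries.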

FORMS

strong_geminate, long_vowel

e.g. +Use/NG+Sg+Loc+PxSg3

Sg_Nom: vow_mono:vow_short:vow_high:pal_no:cns_xyy

similar_to: N_TAALKYS, N_KOONTYR

1.1.1.1.1.1. Sg_Nom="short_vowel|long_cluster" Sg_Gen="long_vowel|short_cluster"

1. WORDS WITH SINGLE-SYLLABLE NOMINATIVE SINGULARS (2009: 167)

1.1 Sg.Loc in -âst (no vowel shift, all raised)

â-stems (Sg.Loc, Ess, Par).

1.1.1 Sg.Ill vowel -e

1.1.1.1 Has Palatalization

1.1.1.1[1] (Palatalization pattern)

Palatalized: Sg.Ill Not Palatalized: ELSE Sg.Ill in palatalization and -e

1.1.1.1[1].1 Lacks Specifically Pedagogical Gradation

1.1.1.1[1].1.1 Has Orthographic Gradation

1.1.1.1[1].1.1[] (Monophthong + Consonant Cluster alternation)

Strong grade: Sg.Nom, Ess, Par Strong grade: Sg.Ill Weak grade: Pl.Nom, Sg.Loc, Sg.Com

FORMS

Sg_Nom: vow_mono:vow_short:vow_high:pal_no:cns_vyy

1.1.1.1.1.1. Sg_Nom="short_vowel|long_V-cluster" Sg_Gen="long_vowel|short_V-cluster"

1. WORDS WITH SINGLE-SYLLABLE NOMINATIVE SINGULARS (2009: 167)

1.1 Sg.Loc in -âst (no vowel shift, all raised)

â-stems (Sg.Loc, Ess, Par).

1.1.1 Sg.Ill vowel -e

1.1.1.1 Has Palatalization

1.1.1.1[1] (Palatalization pattern)

Palatalized: Sg.Ill Not Palatalized: ELSE Sg.Ill in palatalization and -e

1.1.1.1[1].1 Lacks Specifically Pedagogical Gradation

1.1.1.1[1].1.1 Has Orthographic Gradation

1.1.1.1[1].1.1[] (Monophthong + Consonant Cluster alternation)

Strong grade: Sg.Nom, Ess, Par Strong grade: Sg.Ill Weak grade: Pl.Nom, Sg.Loc, Sg.Com

FORMS

Sg_Nom: vow_di:vow_high:pal_no:cns_xyy

1.1.1.1.1.1. Sg_Nom="diphthong|long_cluster" Sg_Gen="diphthong|short_cluster"

1. WORDS WITH SINGLE-SYLLABLE NOMINATIVE SINGULARS (2009: 167)

1.1 Sg.Loc in -âst (no vowel shift, all raised)

â-stems (Sg.Loc, Ess, Par).

1.1.1 Sg.Ill vowel -e

1.1.1.1 Has Palatalization

1.1.1.1[1] (Palatalization pattern)

Palatalized: Sg.Ill Not Palatalized: ELSE Sg.Ill in palatalization and -e

1.1.1.1[1].1 Lacks Specifically Pedagogical Gradation

1.1.1.1[1].1.1 Has Orthographic Gradation

1.1.1.1[1].1.1[] (Monophthong + Consonant Cluster alternation)

Strong grade: Sg.Nom, Ess, Par Strong grade: Sg.Ill Weak grade: Pl.Nom, Sg.Loc, Sg.Com

FORMS

Sg_Nom: vow_di:vow_high:pal_no:cns_xyy

1.1.1.1.1.1. Sg_Nom="diphthong|long_cluster" Sg_Gen="diphthong|short_cluster" Sg_Ill="diphthong|vowel_e-coloration|long_cluster"

1. WORDS WITH SINGLE-SYLLABLE NOMINATIVE SINGULARS (2009: 167)

1.1 Sg.Loc in -âst (no vowel shift, all raised)

â-stems (Sg.Loc, Ess, Par).

1.1.1 Sg.Ill vowel -e

1.1.1.1 Has Palatalization

1.1.1.1[1] (Palatalization pattern)

Palatalized: Sg.Ill Not Palatalized: ELSE Sg.Ill in palatalization and -e

1.1.1.1[1].1 Lacks Specifically Pedagogical Gradation

1.1.1.1[1].1.1 Has Orthographic Gradation

1.1.1.1[1].1.1[] (Monophthong + Consonant Cluster alternation)

Strong grade: Sg.Nom, Ess, Par Strong grade: Sg.Ill Weak grade: Pl.Nom, Sg.Loc, Sg.Com

FORMS

Sg_Nom: vow_mono:vow_short:vow_high:pal_no:cns_gem

1.1.1.1.1.2. Sg_Nom="short_vowel|geminate" Sg_Gen="long_vowel|single_consonant"

See also: NMN_TOLL-PLC, which is the same, but minus PL forms and certain cases

1. WORDS WITH SINGLE-SYLLABLE NOMINATIVE SINGULARS (2009: 167)

1.1 Sg.Loc in -âst (no vowel shift, all raised)

â-stems (Sg.Loc, Ess, Par).

1.1.1 Sg.Ill vowel -e

1.1.1.1 Has Palatalization

1.1.1.1[1] (Palatalization pattern)

Palatalized: Sg.Ill Not Palatalized: ELSE Sg.Ill in palatalization and -e

1.1.1.1[1].1 Lacks Specifically Pedagogical Gradation

1.1.1.1[1].1.1 Has Orthographic Gradation

1.1.1.1[1].1.1[] (Monophthong + Consonant and Consonant Geminate alternation)

Extra strong grade: Sg.Nom, Ess, Par Extra strong grade: Sg.Ill Weak grade: Pl.Nom, Sg.Loc, Sg.Com

FORMS

THIS IS NOT THE SAME AS N_MUORR

Sg_Nom: vow_di:vow_high:pal_no:cns_gem

1.1.1.1.1.1. Sg_Nom="diphthong|geminate" Sg_Gen="diphthong|single_consonant"

1. WORDS WITH SINGLE-SYLLABLE NOMINATIVE SINGULARS (2009: 167)

1.1 Sg.Loc in -âst (no vowel shift, all raised)

â-stems (Sg.Loc, Ess, Par).

1.1.1 Sg.Ill vowel -e

1.1.1.1 Has Palatalization

1.1.1.1[1] (Palatalization pattern)

Palatalized: Sg.Ill Not Palatalized: ELSE Sg.Ill in palatalization and -e

1.1.1.1[1].2 Has Specifically Pedagogical Gradation

Sg.Ill:

1.1.1.1[1].2.1 Has Orthographic Gradation

1.1.1.1[1].2.1[] (Diphthong + Consonant and Consonant Geminate alternation)

Strong grade: Sg.Nom, Ess, Par Extra strong grade: Sg.Ill Weak grade: Pl.Nom, Sg.Loc, Sg.Com

FORMS

FORMS

similar_to: N_VUYHSS

Sg_Ill="palatalization e-final"

2. WORDS WITH TWO-SYLLABLE NOMINATIVE SINGULARS (2009: 288)

2.1 Sg.Loc in -âst. â-stems (Sg.Loc, Ess, Par).

2.1.3 Sg.Ill in palatalization and -e

2.1.3.3 Lacks Gradation (in last syllable)

2.1.3.3.1 Monophthong

2.1.3.3.1.3 Consonant always short

2.1.3.3.1.3.4 Sg.Nom long vowel AND Short consonant

2.1.3.3.1.3.4.1 Sg.Gen Weak Grade

2.1.3.3.1.3.4.1.3 Sg.Ill Weak Grade

plaan:plaan

CLASS 1 LOW VOWEL, MONOPHTHONG, NO PALATALIZATION NOMINALS

a-stems

Sg_Nom: vow_mono:vow_short:vow_low:pal_no:cns_gem

1.1.1.2.1. stem_with_gradation: yes

1.1.1.2.1.1. Sg_Nom="short_vowel|geminate" Sg_Gen="long_vowel|geminate"

Is for nouns with -ast Loc

1. WORDS WITH SINGLE-SYLLABLE NOMINATIVE SINGULARS (2009: 180)

1.2 Sg.Loc in -ast (vowel shift)

Raised: Sg.Ill Lowered: ELSE a-stems (Sg.Loc, Ess, Par).

1.2.2 Sg.Ill vowel -u

1.2.2.2 Lacks Palatalization

1.2.2.2.1 Lacks Specifically Pedagogical Gradation

1.2.2.2.1.1 Has Orthographic Gradation

1.2.2.2.1.1[] (Monophthong + Consonant Geminate alternation)

Extra strong grade: Sg.Nom, Ess, Par Extra strong grade: Sg.Ill Strong grade: Pl.Nom, Sg.Loc, Sg.Com

FORMS

N_A-URaise3-32

strong_geminate, short_vowel, no_palatalization, low_stem_vowel

strong_geminate, short_vowel, no_palatalization, low_stem_vowel

strong_geminate, short_vowel, no_palatalization, low_stem_vowel, stem_vowel: a

strong_geminate, long_vowel, no_palatalization, low_stem_vowel

strong_geminate, long_vowel, no_palatalization, high_stem_vowel

Sg_Nom: vow_mono:vow_long:vow_low:pal_no:cns_gem Is for nouns with -ast Loc

1. WORDS WITH SINGLE-SYLLABLE NOMINATIVE SINGULARS (2009: 180)

1.2 Sg.Loc in -ast (vowel shift)

Raised: Sg.Ill Lowered: ELSE a-stems (Sg.Loc, Ess, Par).

1.2.2 Sg.Ill vowel -u

1.2.2.2 Lacks Palatalization

1.2.2.2.1 Lacks Specifically Pedagogical Gradation

1.2.2.2.1.1 Has Orthographic Gradation

1.2.2.2.1.1[] (Monophthong + Consonant Geminate alternation)

Strong grade: Sg.Nom, Ess, Par Extra strong grade: Sg.Ill Weak grade: Pl.Nom, Sg.Loc, Sg.Com

FORMS

N_A-URaise3-32

grade: weakened (long_vowel, short_cluster), vowel: neutral (low)

grade: strengthened (short_vowel, long_cluster), vowel: raised

grade: strengthened (short_vowel, long_cluster), vowel: neutral (low)

grade: neutral (short_vowel, long_cluster), vowel: neutral (low), stem_vowel: a

grade: weakened (long_vowel, short_cluster), vowel: raised, stem_vowel: u/a

grade: neutral (short_vowel, long_cluster), vowel: neutral (low)

grade: weakened (long_vowel, short_cluster), vowel: neutral (low)

grade: strengthened (short_vowel, long_cluster), vowel: raised

grade: strengthened (short_vowel, long_cluster), vowel: neutral (low)

grade: neutral (short_vowel, long_cluster), vowel: neutral (low), stem_vowel: a

grade: weakened (long_vowel, short_cluster), vowel: raised, stem_vowel: u/a

grade: allegro (short_vowel, short_cluster), vowel: neutral (low)

CLASS 1 LOW VOWEL, DIPHTHONG, NO PALATALIZATION NOMINALS

Sg_Nom: vow_di:vow_short:vow_low:pal_no:cns_gem_long Is for nouns with -ast Loc

1. WORDS WITH SINGLE-SYLLABLE NOMINATIVE SINGULARS (2009: 180)

1.2 Sg.Loc in -ast (vowel shift)

Raised: Sg.Ill Lowered: ELSE a-stems (Sg.Loc, Ess, Par).

1.2.2 Sg.Ill vowel -u

1.2.2.0 Lacks Palatalization

1.2.2.0.2 Has Specifically Pedagogical Gradation

1.2.2.0.2.0 Lacks Orthographic Gradation

1.2.2.0.2.0[] (Diphthong + Consonant Geminate alternation)

Extra strong grade: Sg.Nom, Ess, Par Extra strong grade: Sg.Ill Strong grade: Pl.Nom, Sg.Loc, Sg.Com

FORMS

similar_to: N_PLAAN gradation: no vowel_shift: no Is for nouns with -ast Loc a-stems (Sg.Loc, Ess, Par).

Sg_Nom: vow_mono:vow_long:vow_low:pal_no:cns_gem_dd_type Is for nouns with -ast Loc, No gradation N_A-U1-11

1. WORDS WITH SINGLE-SYLLABLE NOMINATIVE SINGULARS (2009: 180)

1.2 Sg.Loc in -ast (no vowel shift, all lowered)

a-stems (Sg.Loc, Ess, Par).

1.2.2 Sg.Ill vowel -u

1.2.2.2 Lacks Palatalization

1.2.2.2.1 Lacks Specifically Pedagogical Gradation

1.2.2.2.1.2 Lacks Orthographic Gradation

1.2.2.2.1.2[] (Monophthong + Consonant)

Weak grade: Sg.Nom, Ess, Par Weak grade: Sg.Ill Weak grade: Pl.Nom, Sg.Loc, Sg.Com

FORMS

CLASS 1 HIGH VOWEL TYPE, NO PALATALIZATION NOMINALS

CLASS 1 HIGH VOWEL, PALATALIZATION NOMINALS

Sg_Nom: vow_mono:vow_short:vow_high_u:pal_yes:cns_gem_dd_type
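Descriptor lines of this kind pack several features into one colon-separated string. The Python sketch below shows one way to unpack them, grouping the features by their prefix. The function name `parse_descriptor` is hypothetical, and reading the prefixes as vowel (`vow`), palatalization (`pal`) and consonantism (`cns`) is an assumption based on the surrounding headings, not something this file states.

```python
def parse_descriptor(line):
    """Split 'Slot: key_value:key_value:...' into (slot, {key: [values]})."""
    slot, _, features = line.partition(": ")
    parsed = {}
    for item in features.split(":"):
        # Only the first underscore separates the prefix from the value,
        # so values such as 'gem_dd_type' stay intact.
        key, _, value = item.partition("_")
        parsed.setdefault(key, []).append(value)
    return slot, parsed


if __name__ == "__main__":
    print(parse_descriptor(
        "Sg_Nom: vow_mono:vow_short:vow_high_u:pal_yes:cns_gem_dd_type"
    ))
```

Grouping by prefix makes it easier to compare two lexica, for instance to see that they differ only in the `cns` value.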

1.3 Sg.Loc in -est (vowel shift)

Raised: Sg.Com, Pl.Obl Lowered: ELSE e-stems (Sg.Loc, Ess, Par).

1.3.2 Sg.Ill vowel -a

1.3.2.1 Has Palatalization

Palatalized: ELSE Not Palatalized: Sg.Ill

1.3.2.1.1 Lacks Specifically Pedagogical Gradation

1.3.2.1.1.1 Has Orthographic Gradation

1.3.2.1.1.1[] (Monophthong + Consonant Geminate Variation)

Strong grade: Sg.Nom, Ess, Par Extra strong grade: Sg.Ill Weak grade: Pl.Nom, Sg.Loc, Sg.Com

FORMS

NumContLex="1.113"

Bahuvrihi: årddnjuuʹnn

FORMS

(1) Sg.Nom: juʹvjj

(2) Pl.Nom: juuʹj

(3) Sg.Ill: joujja

(4) Sg.Loc: juuʹjest

(5) Sg.Com: juuʹjin

(6) Ess: juʹvjjen

(7) Par: juʹvjjed

(8) Pl.Acc: juuʹjid

(9) Der/Dimin.N.Sg.Nom: joujjaž

NumContLex="1.113" Is for nouns with -est Loc, Extra long vowel

1. WORDS WITH SINGLE-SYLLABLE NOMINATIVE SINGULARS (2009: 180, 197-199)

1.3 Sg.Loc in -est (no vowel shift, all lowered)

e-stems (Sg.Loc, Ess, Par).

1.3.2 Sg.Ill vowel -a

1.3.2.1 Has Palatalization

Palatalized: ELSE Not Palatalized: Sg.Ill

1.3.2.1.2 Has Specifically Pedagogical Gradation

1.3.2.1.1.1 Has Orthographic Gradation

1.3.2.1.1.1[] (Diphthong + Consonant and Geminate variation)

Strong grade: Sg.Nom, Ess, Par Extra strong grade: Sg.Ill Weak grade: Pl.Nom, Sg.Loc, Sg.Com, Dim

FORMS

(1) Sg.Nom: kueʹll

(2) Pl.Nom: kueʼl

(3) Sg.Ill: kuâlˈla

(4) Sg.Loc: kueʹlest

(5) Sg.Com: kuõʹlin ~ kueʹlin

(6) Ess: kueʹllen

(7) Par: kueʹlled

(8) Pl.Acc: kuõʹlid ~ kueʹlid

(9) Der/Dimin.N.Sg.Nom: kuâlaž
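The numbered FORMS lists can be turned into a lookup table, which is handy when checking generated forms against the documented paradigm. The Python sketch below is a minimal illustration; the function name `parse_forms` is hypothetical, and the sketch assumes that `~` separates free variants, as in the Sg.Com and Pl.Acc lines above.

```python
import re

# '(5) Sg.Com: form ~ variant' -> number, slot label, form(s)
FORM_LINE = re.compile(r"^\((\d+)\)\s+([^:]+):\s+(.+)$")


def parse_forms(lines):
    """Return {slot: [form, variant, ...]} for the numbered FORMS lines."""
    paradigm = {}
    for line in lines:
        match = FORM_LINE.match(line.strip())
        if match:
            variants = [v.strip() for v in match.group(3).split("~")]
            paradigm[match.group(2)] = variants
    return paradigm


if __name__ == "__main__":
    sample = [
        "(1) Sg.Nom: kueʹll",
        "(5) Sg.Com: kuõʹlin ~ kueʹlin",
        "(9) Der/Dimin.N.Sg.Nom: kuâlaž",
    ]
    print(parse_forms(sample))
```

Lines that do not match the numbered pattern, such as the FORMS heading itself, are skipped silently.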

1. WORDS WITH SINGLE-SYLLABLE NOMINATIVE SINGULARS (2009: 180)

1.3 Sg.Loc in -est (vowel shift)

Raised: Sg.Com, Pl.Obl Lowered: ELSE e-stems (Sg.Loc, Ess, Par).

1.3.2 Sg.Ill vowel -a

1.3.2.1 Has Palatalization

Palatalized: ELSE Not Palatalized: Sg.Ill

1.3.2.1.1 Lacks Specifically Pedagogical Gradation

1.3.2.1.1.1 Has Orthographic Gradation

1.3.2.1.1.1[] (Monophthong + Consonant Cluster)

Strong grade: Sg.Nom, Ess, Par Strong grade: Sg.Ill Weak grade: Pl.Nom, Sg.Loc, Sg.Com

FORMS

NumContLex="1.113" Is for nouns with -est Loc, Extra long vowel

CLASS 1 LOW VOWEL, PALATALIZATION, ILLATIVE IN U NOMINALS

WORK NEEDED

CLASS 1 LOW VOWEL, PALATALIZATION, ILLATIVE IN A NOMINALS

e-stems

similar_to: N_PAPP vowel: monophthong vowel_shift: yes consonantism: geminate

1. WORDS WITH SINGLE-SYLLABLE NOMINATIVE SINGULARS (2009: 204)

1.3 Sg.Loc in -est (vowel shift)

Raised: Sg.Com, Pl.Obl Lowered: ELSE e-stems (Sg.Loc, Ess, Par).

1.3.2 Sg.Ill vowel -a

1.3.2.1 Has Palatalization

Palatalized: ELSE Not Palatalized: Sg.Ill

1.3.2.1.1 Lacks Specifically Pedagogical Gradation

1.3.2.1.1.1 Has Orthographic Gradation

1.3.2.1.1.1[] (Monophthong + Consonant Geminate Variation)

Strong grade: Sg.Nom, Ess, Par Extra strong grade: Sg.Ill Weak grade: Pl.Nom, Sg.Loc, Sg.Com

FORMS

NumContLex="1.113"

similar_to: N_PAPP vowel: monophthong vowel_shift: ?? consonantism: geminate

1. WORDS WITH SINGLE-SYLLABLE NOMINATIVE SINGULARS (2009: 204)

1.3 Sg.Loc in -est (vowel shift)

Raised: Sg.Com, Pl.Obl Lowered: ELSE e-stems (Sg.Loc, Ess, Par).

1.3.2 Sg.Ill vowel -a

1.3.2.1 Has Palatalization

Palatalized: ELSE Not Palatalized: Sg.Ill

1.3.2.1.1 Lacks Specifically Pedagogical Gradation

1.3.2.1.1.1 Has Orthographic Gradation

1.3.2.1.1.1[] (Monophthong + Consonant Geminate Variation)

Strong grade: Sg.Nom, Ess, Par Extra strong grade: Sg.Ill Weak grade: Pl.Nom, Sg.Loc, Sg.Com

FORMS

NumContLex="1.113"

1. WORDS WITH SINGLE-SYLLABLE NOMINATIVE SINGULARS (2009: 204)

1.3 Sg.Loc in -est (vowel shift)

Raised: Sg.Com, Pl.Obl Lowered: ELSE e-stems (Sg.Loc, Ess, Par).

1.3.2 Sg.Ill vowel -a

1.3.2.1 Has Palatalization

Palatalized: ELSE Not Palatalized: Sg.Ill

1.3.2.1.1 Lacks Specifically Pedagogical Gradation

1.3.2.1.1.1 Has Orthographic Gradation

1.3.2.1.1.1[] (Monophthong + Consonant Geminate Variation)

Strong grade: Sg.Nom, Ess, Par Extra strong grade: Sg.Ill Weak grade: Pl.Nom, Sg.Loc, Sg.Com

FORMS

NumContLex="1.113"

1. WORDS WITH SINGLE-SYLLABLE NOMINATIVE SINGULARS (2009: 204)

1.3 Sg.Loc in -est (vowel shift)

Raised: Sg.Com, Pl.Obl Lowered: ELSE e-stems (Sg.Loc, Ess, Par).

1.3.2 Sg.Ill vowel -a

1.3.2.1 Has Palatalization

Palatalized: ELSE Not Palatalized: Sg.Ill

1.3.2.1.1 Lacks Specifically Pedagogical Gradation

1.3.2.1.1.1 Has Orthographic Gradation

1.3.2.1.1.1[] (Monophthong + Consonant Geminate Variation)

Strong grade: Sg.Nom, Ess, Par Extra strong grade: Sg.Ill Weak grade: Pl.Nom, Sg.Loc, Sg.Com

FORMS

NumContLex="1.113"

1. WORDS WITH SINGLE-SYLLABLE NOMINATIVE SINGULARS (2009: 180)

1.3 Sg.Loc in -est (vowel shift)

Raised: Sg.Com, Pl.Obl Lowered: ELSE e-stems (Sg.Loc, Ess, Par).

1.3.2 Sg.Ill vowel -a

1.3.2.1 Has Palatalization

Palatalized: ELSE Not Palatalized: Sg.Ill

1.3.2.1.1 Lacks Specifically Pedagogical Gradation

1.3.2.1.1.1 Has Orthographic Gradation

1.3.2.1.1.1[] (Monophthong + Consonant Cluster)

Strong grade: Sg.Nom, Ess, Par Strong grade: Sg.Ill Weak grade: Pl.Nom, Sg.Loc, Sg.Com

FORMS

NumContLex="1.113" Is for nouns with -est Loc, Extra long vowel

1. WORDS WITH SINGLE-SYLLABLE NOMINATIVE SINGULARS (2009: 180)

1.3 Sg.Loc in -est (NO vowel shift)

Raised: Sg.Com, Pl.Obl Lowered: ELSE e-stems (Sg.Loc, Ess, Par).

1.3.2 Sg.Ill vowel -a

1.3.2.1 Has Palatalization

Palatalized: ELSE Not Palatalized: Sg.Ill

1.3.2.1.1 Lacks Specifically Pedagogical Gradation

1.3.2.1.1.1 Has Orthographic Gradation

1.3.2.1.1.1[] (Monophthong + Consonant Cluster)

Strong grade: Sg.Nom, Ess, Par Strong grade: Sg.Ill Weak grade: Pl.Nom, Sg.Loc, Sg.Com

FORMS

NumContLex="1.113" Is for nouns with -est Loc, Extra long vowel

1. WORDS WITH SINGLE-SYLLABLE NOMINATIVE SINGULARS (2009: 180)

1.3 Sg.Loc in -est (NO vowel shift)

Raised: Sg.Com, Pl.Obl Lowered: ELSE e-stems (Sg.Loc, Ess, Par).

1.3.2 Sg.Ill vowel -a

1.3.2.1 Has Palatalization

Palatalized: ELSE Not Palatalized: Sg.Ill

1.3.2.1.1 Lacks Specifically Pedagogical Gradation

1.3.2.1.1.1 Has Orthographic Gradation

1.3.2.1.1.1[] (Monophthong + Consonant Cluster)

Strong grade: Sg.Nom, Ess, Par Strong grade: Sg.Ill Weak grade: Pl.Nom, Sg.Loc, Sg.Com

FORMS

Is for nouns with -est Loc, Extra long vowel

1. WORDS WITH SINGLE-SYLLABLE NOMINATIVE SINGULARS (2009: 180)

1.3 Sg.Loc in -est (no vowel shift, all lowered)

e-stems (Sg.Loc, Ess, Par).

1.3.2 Sg.Ill vowel -a

1.3.2.1 Has Palatalization

Palatalized: ELSE Not Palatalized: Sg.Ill

1.3.2.1.1 Lacks Specifically Pedagogical Gradation

1.3.2.1.1.1 Has Orthographic Gradation

1.3.2.1.1.1[] (Monophthong + Consonant Cluster variation)

Strong grade: Sg.Nom, Ess, Par Strong grade: Sg.Ill Weak grade: Pl.Nom, Sg.Loc, Sg.Com

FORMS

1. WORDS WITH SINGLE-SYLLABLE NOMINATIVE SINGULARS (2009: 204)

1.3 Sg.Loc in -est (vowel shift)

Raised: Sg.Com, Pl.Obl Lowered: ELSE ẹ, ä e-stems (Sg.Loc, Ess, Par).

1.3.2 Sg.Ill vowel -a

1.3.2.1 Has Palatalization

Palatalized: ELSE Not Palatalized: Sg.Ill

1.3.2.1.1 Lacks Specifically Pedagogical Gradation

1.3.2.1.1.1 Has Orthographic Gradation

1.3.2.1.1.1[] (Monophthong + Consonant Geminate Variation)

Strong grade: Sg.Nom, Ess, Par Extra strong grade: Sg.Ill Weak grade: Pl.Nom, Sg.Loc, Sg.Com

FORMS

NumContLex="1.113"

1. WORDS WITH SINGLE-SYLLABLE NOMINATIVE SINGULARS (2009: 204)

1.3 Sg.Loc in -est (vowel shift)

Raised: Sg.Com, Pl.Obl Lowered: ELSE ẹ, ä e-stems (Sg.Loc, Ess, Par).

1.3.2 Sg.Ill vowel -a

1.3.2.1 Has Palatalization

Palatalized: ELSE Not Palatalized: Sg.Ill

1.3.2.1.1 Lacks Specifically Pedagogical Gradation

1.3.2.1.1.1 Has Orthographic Gradation

1.3.2.1.1.1[] (Monophthong + Consonant Geminate Variation)

Strong grade: Sg.Nom, Ess, Par Extra strong grade: Sg.Ill Weak grade: Pl.Nom, Sg.Loc, Sg.Com

FORMS

NumContLex="1.113"

1. WORDS WITH SINGLE-SYLLABLE NOMINATIVE SINGULARS (2009: 204)

1.3 Sg.Loc in -est (vowel shift)

Raised: Sg.Com, Pl.Obl Lowered: ELSE ẹ, ä e-stems (Sg.Loc, Ess, Par).

1.3.2 Sg.Ill vowel -a

1.3.2.1 Has Palatalization

Palatalized: ELSE Not Palatalized: Sg.Ill

1.3.2.1.1 Lacks Specifically Pedagogical Gradation

1.3.2.1.1.1 Has Orthographic Gradation

1.3.2.1.1.1[] (Monophthong + Consonant Geminate Variation)

Strong grade: Sg.Nom, Ess, Par Extra strong grade: Sg.Ill Weak grade: Pl.Nom, Sg.Loc, Sg.Com

FORMS

(1) Sg.Nom: vuẹiʹvv

(2) Pl.Nom: vuẹiʼv

(3) Sg.Ill: vuäivva

(4) Sg.Loc: vuẹiʹvest

(5) Sg.Com: vueiʹvin

(6) Ess: vuẹiʹvven

(7) Par: vuẹiʹvved

(8) Pl.Acc: vueiʹvid

(9) Der/Dimin.N.Sg.Nom: vuäivaž

NumContLex="1.113"

1. WORDS WITH SINGLE-SYLLABLE NOMINATIVE SINGULARS (2009: 204)

1.3 Sg.Loc in -est (vowel shift)

Raised: Sg.Com, Pl.Obl Lowered: ELSE ẹ, ä e-stems (Sg.Loc, Ess, Par).

1.3.2 Sg.Ill vowel -a

1.3.2.1 Has Palatalization

Palatalized: ELSE Not Palatalized: Sg.Ill

1.3.2.1.1 Lacks Specifically Pedagogical Gradation

1.3.2.1.1.1 Has Orthographic Gradation

1.3.2.1.1.1[] (Monophthong + Consonant Geminate Variation)

Strong grade: Sg.Nom, Ess, Par Extra strong grade: Sg.Ill Weak grade: Pl.Nom, Sg.Loc, Sg.Com

FORMS

(1) Sg.Nom: čuäʹrvv

(2) Pl.Nom: čuẹʼrv

(3) Sg.Ill: čuärvva

(4) Sg.Loc: čuẹʹrvest

(5) Sg.Com: čueʹrvin

(6) Ess: čuäʹrvven

(7) Par: čuäʹrvved

(8) Pl.Acc: čueʹrvid

(9) Der/Dimin.N.Sg.Nom: čuärvaž

NumContLex="1.113"

1. WORDS WITH SINGLE-SYLLABLE NOMINATIVE SINGULARS (2009: 180, 197-199)

1.3 Sg.Loc in -est (vowel shift)

Raised: ELSE Lowered: Sg.Ill e-stems (Sg.Loc, Ess, Par).

1.3.2 Sg.Ill vowel -a

1.3.2.1 Has Palatalization

Palatalized: ELSE Not Palatalized: Sg.Ill

1.3.2.1.1 Lacks Specifically Pedagogical Gradation

1.3.2.1.1.1 Has Orthographic Gradation

1.3.2.1.1.1[] (Diphthong + Consonant Geminate variation)

Strong grade: Sg.Nom, Ess, Par Extra strong grade: Sg.Ill Weak grade: Pl.Nom, Sg.Loc, Sg.Com

FORMS

1. WORDS WITH SINGLE-SYLLABLE NOMINATIVE SINGULARS (2009: 180, 197-199)

1.3 Sg.Loc in -est (no vowel shift, all lowered)

e-stems (Sg.Loc, Ess, Par).

1.3.2 Sg.Ill vowel -a

1.3.2.1 Has Palatalization

Palatalized: ELSE Not Palatalized: Sg.Ill

1.3.2.1.2 Has Specifically Pedagogical Gradation

1.3.2.1.1.1 Has Orthographic Gradation

1.3.2.1.1.1[] (Diphthong + Consonant and Geminate variation)

Strong grade: Sg.Nom, Ess, Par Extra strong grade: Sg.Ill Weak grade: Pl.Nom, Sg.Loc, Sg.Com, Dim

FORMS

WHAT IS THIS CLASS

e.g. +Use/NG+Sg+Loc+PxSg3, +Sg+Loc+PxSg1

CLASS 2 NOMINALS with high stem vowel and i-stems

CLASS 2 NOMINALS with low stem vowel and u-stems

u-stems

CLASS 2 NOMINALS with high stem vowel and â-stems

CLASS 3 HIGH VOWEL, MONOPHTHONG, NO PALATALIZATION NOMINALS

m-stems

CLASS 3

n-stems

CLASS 3

CLASS 4 BISYLLABIC, HIGH VOWEL, MONOPHTHONG, NO PALATALIZATION IN PENULTIMATE Â:0

N_GEN2X3-NOM2X1

CLASS 4 BISYLLABIC, HIGH VOWEL, DIPHTHONG, NO PALATALIZATION

CLASS 4 BISYLLABIC, LOW VOWEL, MONOPHTHONG, NO PALATALIZATION

stemtype n-stem jânnam:jânnam N_GEN2X3-NOM2X1

CLASS 4 BISYLLABIC, LOW VOWEL, DIPHTHONG, NO PALATALIZATION IN PENULTIMATE A:0

2. WORDS WITH TWO-SYLLABLE NOMINATIVE SINGULARS (2009: 252)

2.3 Sg.Loc in -est. e-stems (Sg.Loc, Ess, Par).

2.3.2 Sg.Ill in -a

2.3.2.2 LACKS Gradation

2.3.2.2.1 Penultimate stem vowel loss: (Sg.Ill, Sg.Loc, Sg.Com; Ess, Par; Pl.Gen, Pl.Acc, Pl.Ill, Pl.Loc, Pl.Com, Pl.Abe)

2.3.2.2.1.1 The Sg.Com vowel i appears before final n

CLASS 4 BISYLLABIC, LOW VOWEL, MONOPHTHONG, PALATALIZATION

CLASS 4 BISYLLABIC, LOW VOWEL, DIPHTHONG, PALATALIZATION IN PENULTIMATE E:0

-stems

e-a-stems

čâustõk+N+Sg+Gen:čâustõõǥǥ

Stem types from the grammar

These are still not fixed.

Class 5 according to Feist 152

k-stems

stemtype

Class 6 according to Feist 153-154 PRESENT A-02_PARTICIPLES

participles in -I from verbs in ʹ-ed

Class 7 according to Feist 154-155

i-stems

Class 8 according to Feist 155-157

Class 9 according to Feist 158

Diminutive derivations

(2009: 306)

(2009: 310)

Class 11 according to Feist 162

Class 12 Feist 163

Noun phrase heads

Pl

Number and case tags

Used with words like juurd: jurddǥatta

Sg_Nom: vow_di:vow_short:vow_low:pal_no:cns_gem Is for nouns with -ast Loc

1. WORDS WITH SINGLE-SYLLABLE NOMINATIVE SINGULARS (2009: 180)

1.2 Sg.Loc in -ast (vowel shift)

Raised: Sg.Ill Lowered: ELSE a-stems (Sg.Loc, Ess, Par).

1.2.2 Sg.Ill vowel -u

1.2.2.2 Lacks Palatalization

1.2.2.2.1 Lacks Specifically Pedagogical Gradation

1.2.2.2.1.1 Has Orthographic Gradation

1.2.2.2.1.1[] (Monophthong + Consonant Geminate alternation)

Strong grade: Sg.Nom, Ess, Par Extra strong grade: Sg.Ill Weak grade: Pl.Nom, Sg.Loc, Sg.Com uâ:uõ, eä:iâ

FORMS

N_A-URaise3-32

grade: weakened (long_vowel, short_cluster), vowel: neutral (low) Sg_Abe, Sg_Acc, Sg_Gen, Pl_Nom, +Use/NG_Sg_Loc_Px…, stem_vowel: a : Sg_Loc, Sg_Com, Pl_Gen, Pl_Acc, Pl_Ill, Pl_Loc, Pl_Com, Pl_Abe

grade: strengthened (short_vowel, long_cluster), vowel: raised Sg_Ill

grade: strengthened (short_vowel, long_cluster), vowel: neutral (low)

grade: neutral (short_vowel, long_cluster), vowel: neutral (low), stem_vowel: a

grade: weakened (long_vowel, short_cluster), vowel: raised, stem_vowel: u/a

grade: allegro (short_vowel, short_cluster), vowel: neutral (low)

Adjectives – to be moved


This (part of) documentation was generated from src/fst/morphology/affixes/adjectives.lexc


src-fst-morphology-affixes-adverbs.lexc.md

Skolt Saami adverbs


This (part of) documentation was generated from src/fst/morphology/affixes/adverbs.lexc


src-fst-morphology-affixes-nouns.lexc.md

Skolt Saami noun morphology

This file documents the Skolt Saami noun morphology, lexicon by lexicon.

Unclassified words

IRREGULAR NOUN nijdd

CLASS 1 HIGH VOWEL, NO PALATALIZATION NOMINALS

1.1.1.1.1.1. Sg_Nom="short_vowel geminate" Sg_Gen="long_vowel geminate"

1. WORDS WITH SINGLE-SYLLABLE NOMINATIVE SINGULARS (2009: 167)

1.1 Sg.Loc in -âst (no vowel shift, all raised)

â-stems (Sg.Loc, Ess, Par).

1.1.1 Sg.Ill vowel -e

1.1.1.1 Has Palatalization

1.1.1.1[1] (Palatalization pattern)

Palatalized: Sg.Ill Not Palatalized: ELSE Sg.Ill in palatalization and -e

1.1.1.1[1].1 Lacks Specifically Pedagogical Gradation

1.1.1.1[1].1.1 Has Orthographic Gradation

1.1.1.1[1].1.1[] (Monophthong + Consonant Geminate alternation)

Extra strong grade: Sg.Nom, Ess, Par Extra strong grade: Sg.Ill Strong grade: Pl.Nom, Sg.Loc, Sg.Com

FORMS

strong_geminate, short_vowel, palatalization

strong_geminate, long_vowel

e.g. +Use/NG+Sg+Loc+PxSg3, +Sg+Loc+PxSg3 pp:p papstes

1. WORDS WITH SINGLE-SYLLABLE NOMINATIVE SINGULARS (2009: 167)

1.1 Sg.Loc in -âst (no vowel shift, all raised)

â-stems (Sg.Loc, Ess, Par).

1.1.1 Sg.Ill vowel -e

1.1.1.1 Has Palatalization

1.1.1.1[1] (Palatalization pattern)

Palatalized: Sg.Ill Not Palatalized: ELSE Sg.Ill in palatalization and -e

1.1.1.1[1].1 Lacks Specifically Pedagogical Gradation

1.1.1.1[1].1.1 Has Orthographic Gradation

1.1.1.1[1].1.1[] (Monophthong + Consonant Cluster alternation)

Strong grade: Sg.Nom, Ess, Par Strong grade: Sg.Ill Weak grade: Pl.Nom, Sg.Loc, Sg.Com

FORMS

1. WORDS WITH SINGLE-SYLLABLE NOMINATIVE SINGULARS (2009: 167)

1.1 Sg.Loc in -âst (no vowel shift, all raised)

â-stems (Sg.Loc, Ess, Par).

1.1.1 Sg.Ill vowel -e

1.1.1.1 Has Palatalization

1.1.1.1[1] (Palatalization pattern)

Palatalized: Sg.Ill Not Palatalized: ELSE Sg.Ill in palatalization and -e

1.1.1.1[1].1 Lacks Specifically Pedagogical Gradation

1.1.1.1[1].1.1 Has Orthographic Gradation

1.1.1.1[1].1.1[] (Monophthong + Consonant Cluster alternation)

Strong grade: Sg.Nom, Ess, Par Strong grade: Sg.Ill Weak grade: Pl.Nom, Sg.Loc, Sg.Com

FORMS

See also: NMN_KUSS-PLC, which is the same, but minus PL forms and certain cases

1. WORDS WITH SINGLE-SYLLABLE NOMINATIVE SINGULARS (2009: 167)

1.1 Sg.Loc in -âst (no vowel shift, all raised)

â-stems (Sg.Loc, Ess, Par).

1.1.1 Sg.Ill vowel -e

1.1.1.1 Has Palatalization

1.1.1.1[1] (Palatalization pattern)

Palatalized: Sg.Ill Not Palatalized: ELSE Sg.Ill in palatalization and -e

1.1.1.1[1].1 Lacks Specifically Pedagogical Gradation

1.1.1.1[1].1.1 Has Orthographic Gradation

1.1.1.1[1].1.1[] (Monophthong + Consonant and Consonant Geminate alternation)

Extra strong grade: Sg.Nom, Ess, Par Extra strong grade: Sg.Ill Weak grade: Pl.Nom, Sg.Loc, Sg.Com similar_to:

FORMS

THIS IS NOT THE SAME AS N_MUORR

1. WORDS WITH SINGLE-SYLLABLE NOMINATIVE SINGULARS (2009: 167)

1.1 Sg.Loc in -âst (no vowel shift, all raised)

â-stems (Sg.Loc, Ess, Par).

1.1.1 Sg.Ill vowel -e

1.1.1.1 Has Palatalization

1.1.1.1[1] (Palatalization pattern)

Palatalized: Sg.Ill Not Palatalized: ELSE Sg.Ill in palatalization and -e

1.1.1.1[1].1 Lacks Specifically Pedagogical Gradation

1.1.1.1[1].1.1 Has Orthographic Gradation

1.1.1.1[1].1.1[] (Monophthong + Consonant Cluster alternation)

Strong grade: Sg.Nom, Ess, Par Strong grade: Sg.Ill Weak grade: Pl.Nom, Sg.Loc, Sg.Com

FORMS

1. WORDS WITH SINGLE-SYLLABLE NOMINATIVE SINGULARS (2009: 167)

1.1 Sg.Loc in -âst (no vowel shift, all raised)

â-stems (Sg.Loc, Ess, Par).

1.1.1 Sg.Ill vowel -e

1.1.1.1 Has Palatalization

1.1.1.1[1] (Palatalization pattern)

Palatalized: Sg.Ill Not Palatalized: ELSE Sg.Ill in palatalization and -e

1.1.1.1[1].1 Lacks Specifically Pedagogical Gradation

1.1.1.1[1].1.1 Has Orthographic Gradation

1.1.1.1[1].1.1[] (Monophthong + Consonant Cluster alternation)

Strong grade: Sg.Nom, Ess, Par Strong grade: Sg.Ill Weak grade: Pl.Nom, Sg.Loc, Sg.Com

FORMS

is for words with -âst Loc but -ense Ill 3rd grade

1. WORDS WITH SINGLE-SYLLABLE NOMINATIVE SINGULARS (2009: 167)

1.1 Sg.Loc in -âst (no vowel shift, all raised)

â-stems (Sg.Loc, Ess, Par).

1.1.1 Sg.Ill vowel -e

1.1.1.1 Has Palatalization

1.1.1.1[1] (Palatalization pattern)

Palatalized: Sg.Ill Not Palatalized: ELSE Sg.Ill in palatalization and -e Has Allophonic palatalization

1.1.1.1[1].2 Has Specifically Pedagogical Gradation

1.1.1.1[1].2.1 Has Orthographic Gradation

1.1.1.1[1].2.1[] (Diphthong + Consonant and Consonant Geminate alternation)

Strong grade: Sg.Nom, Ess, Par Extra strong grade: Sg.Ill Weak grade: Pl.Nom, Sg.Loc, Sg.Com

FORMS

(9) Der/Dimin.N.Sg.Nom: puåvâž

puåvv:puåvv

is for words with -âst Loc but -ense Ill 3rd grade

1. WORDS WITH SINGLE-SYLLABLE NOMINATIVE SINGULARS (2009: 167)

1.1 Sg.Loc in -âst (no vowel shift, all raised)

â-stems (Sg.Loc, Ess, Par).

1.1.1 Sg.Ill vowel -e

1.1.1.1 Has Palatalization

1.1.1.1[1] (Palatalization pattern)

Palatalized: Sg.Ill Not Palatalized: ELSE Sg.Ill in palatalization and -e Has Allophonic palatalization

1.1.1.1[1].2 Has Specifically Pedagogical Gradation

1.1.1.1[1].2.1 Has Orthographic Gradation

1.1.1.1[1].2.1[] (Diphthong + Consonant and Consonant Geminate alternation)

Strong grade: Sg.Nom, Ess, Par Extra strong grade: Sg.Ill Weak grade: Pl.Nom, Sg.Loc, Sg.Com

FORMS

(9) Der/Dimin.N.Sg.Nom: siâkkâž

siâkˈk:siâkˈk

1. WORDS WITH SINGLE-SYLLABLE NOMINATIVE SINGULARS (2009: 167)

1.1 Sg.Loc in -âst (no vowel shift, all raised)

â-stems (Sg.Loc, Ess, Par).

1.1.1 Sg.Ill vowel -e

1.1.1.1 Has Palatalization

1.1.1.1[1] (Palatalization pattern)

Palatalized: Sg.Ill; Not Palatalized: ELSE. Sg.Ill in palatalization and -e.

1.1.1.1[1].1 Lacks Specifically Pedagogical Gradation

1.1.1.1[1].1.1 Has Orthographic Gradation

1.1.1.1[1].1.1[] (Monophthong + Consonant Cluster alternation)

Strong grade: Sg.Nom, Ess, Par; Strong grade: Sg.Ill; Weak grade: Pl.Nom, Sg.Loc, Sg.Com

FORMS

See also: NMN_TOLL-PLC, which is the same, but minus PL forms and certain cases

1. WORDS WITH SINGLE-SYLLABLE NOMINATIVE SINGULARS (2009: 167)

1.1 Sg.Loc in -âst (no vowel shift, all raised)

â-stems (Sg.Loc, Ess, Par).

1.1.1 Sg.Ill vowel -e

1.1.1.1 Has Palatalization

1.1.1.1[1] (Palatalization pattern)

Palatalized: Sg.Ill; Not Palatalized: ELSE. Sg.Ill in palatalization and -e.

1.1.1.1[1].1 Lacks Specifically Pedagogical Gradation

1.1.1.1[1].1.1 Has Orthographic Gradation

1.1.1.1[1].1.1[] (Monophthong + Consonant and Consonant Geminate alternation)

Extra strong grade: Sg.Nom, Ess, Par; Extra strong grade: Sg.Ill; Weak grade: Pl.Nom, Sg.Loc, Sg.Com

FORMS

1. WORDS WITH SINGLE-SYLLABLE NOMINATIVE SINGULARS (2009: 167)

1.1 Sg.Loc in -âst (no vowel shift, all raised)

â-stems (Sg.Loc, Ess, Par).

1.1.1 Sg.Ill vowel -e

1.1.1.1 Has Palatalization

1.1.1.1[1] (Palatalization pattern)

Palatalized: Sg.Ill; Not Palatalized: ELSE. Sg.Ill in palatalization and -e.

1.1.1.1[1].1 Lacks Specifically Pedagogical Gradation

1.1.1.1[1].1.1 Has Orthographic Gradation

1.1.1.1[1].1.1[] (Monophthong + Consonant Cluster alternation)

Strong grade: Sg.Nom, Ess, Par; Strong grade: Sg.Ill; Weak grade: Pl.Nom, Sg.Loc, Sg.Com

FORMS

1. WORDS WITH SINGLE-SYLLABLE NOMINATIVE SINGULARS (2009: 167)

1.1 Sg.Loc in -âst (no vowel shift, all raised)

â-stems (Sg.Loc, Ess, Par).

1.1.1 Sg.Ill vowel -e

1.1.1.1 Has Palatalization

1.1.1.1[1] (Palatalization pattern)

Palatalized: Sg.Ill; Not Palatalized: ELSE. Sg.Ill in palatalization and -e.

1.1.1.1[1].1 Lacks Specifically Pedagogical Gradation

1.1.1.1[1].1.1 Has Orthographic Gradation

1.1.1.1[1].1.1[] (Monophthong + Consonant and Consonant Geminate alternation)

Strong grade: Sg.Nom, Ess, Par; Extra strong grade: Sg.Ill; Weak grade: Pl.Nom, Sg.Loc, Sg.Com

FORMS

THIS IS NOT THE SAME AS N_MUORR

1. WORDS WITH SINGLE-SYLLABLE NOMINATIVE SINGULARS (2009: 167)

1.1 Sg.Loc in -âst (no vowel shift, all raised)

â-stems (Sg.Loc, Ess, Par).

1.1.1 Sg.Ill vowel -e

1.1.1.1 Has Palatalization

1.1.1.1[1] (Palatalization pattern)

Palatalized: Sg.Ill; Not Palatalized: ELSE. Sg.Ill in palatalization and -e.

1.1.1.1[1].1 Lacks Specifically Pedagogical Gradation

1.1.1.1[1].1.1 Has Orthographic Gradation

1.1.1.1[1].1.1[] (Monophthong + Consonant and Consonant Geminate alternation)

Strong grade: Sg.Nom, Ess, Par; Extra strong grade: Sg.Ill; Weak grade: Pl.Nom, Sg.Loc, Sg.Com

FORMS

FORMS

1. WORDS WITH SINGLE-SYLLABLE NOMINATIVE SINGULARS (2009: 167)

1.1 Sg.Loc in -âst (no vowel shift, all raised)

â-stems (Sg.Loc, Ess, Par).

1.1.1 Sg.Ill vowel -e

1.1.1.1 Has Palatalization

1.1.1.1[1] (Palatalization pattern)

Palatalized: Sg.Ill; Not Palatalized: ELSE. Sg.Ill in palatalization and -e.

1.1.1.1[1].2 Has Specifically Pedagogical Gradation

Sg.Ill:

1.1.1.1[1].2.1 Has Orthographic Gradation

1.1.1.1[1].2.1[] (Diphthong + Consonant and Consonant Geminate alternation)

Strong grade: Sg.Nom, Ess, Par; Extra strong grade: Sg.Ill; Weak grade: Pl.Nom, Sg.Loc, Sg.Com

FORMS

ǩiõtt+N+Sg+Loc+PxSg1 hand,arm/käsi

similar_to: N_TUYJJ

FORMS

lomaakk:lomaakk

CLASS 1 LOW VOWEL, MONOPHTHONG, NO PALATALIZATION NOMINALS

a-stems

Is for nouns with -ast Loc

1. WORDS WITH SINGLE-SYLLABLE NOMINATIVE SINGULARS (2009: 180)

1.2 Sg.Loc in -ast (vowel shift)

Raised: Sg.Ill; Lowered: ELSE. a-stems (Sg.Loc, Ess, Par).

1.2.2 Sg.Ill vowel -u

1.2.2.2 Lacks Palatalization

1.2.2.2.1 Lacks Specifically Pedagogical Gradation

1.2.2.2.1.1 Has Orthographic Gradation

1.2.2.2.1.1[] (Monophthong + Consonant Geminate alternation)

Extra strong grade: Sg.Nom, Ess, Par; Extra strong grade: Sg.Ill; Strong grade: Pl.Nom, Sg.Loc, Sg.Com

FORMS

N_A-URaise3-32

strong_geminate, short_vowel, no_palatalization, low_stem_vowel

strong_geminate, short_vowel, no_palatalization, low_stem_vowel

strong_geminate, long_vowel, no_palatalization, low_stem_vowel

strong_geminate, long_vowel, no_palatalization, high_stem_vowel

grade: allegro (short_vowel, short_cluster), vowel: neutral (low)

FORMS

strong_geminate, short_vowel, no_palatalization, low_stem_vowel

grade: weakened (long_vowel, short_cluster), vowel: neutral (low)

grade: strengthened (short_vowel, long_cluster), vowel: raised

grade: strengthened (short_vowel, long_cluster), vowel: neutral (low)

grade: neutral (short_vowel, long_cluster), vowel: neutral (low), stem_vowel: a

grade: weakened (long_vowel, short_cluster), vowel: raised, stem_vowel: u/a

grade: allegro (short_vowel, short_cluster), vowel: neutral (low)

grade: weakened (long_vowel, short_cluster), vowel: neutral (low)

grade: strengthened (short_vowel, long_cluster), vowel: raised

grade: strengthened (short_vowel, long_cluster), vowel: neutral (low)

grade: neutral (short_vowel, long_cluster), vowel: neutral (low), stem_vowel: a

grade: weakened (long_vowel, short_cluster), vowel: raised, stem_vowel: u/a

grade: allegro (short_vowel, short_cluster), vowel: neutral (low)

grade: weakened (long_vowel, short_cluster), vowel: neutral (low)

grade: strengthened (short_vowel, long_cluster), vowel: raised

grade: strengthened (short_vowel, long_cluster), vowel: neutral (low)

grade: neutral (short_vowel, long_cluster), vowel: neutral (low), stem_vowel: a

grade: weakened (long_vowel, short_cluster), vowel: raised, stem_vowel: u/a

grade: allegro (short_vowel, short_cluster), vowel: neutral (low)

grade: weakened (long_vowel, short_cluster), vowel: neutral (low)

grade: neutral (short_vowel, long_cluster), vowel: neutral (low), stem_vowel: a

grade: weakened (long_vowel, short_cluster), vowel: neutral (low)

grade: strengthened (short_vowel, long_cluster), vowel: raised

grade: strengthened (short_vowel, long_cluster), vowel: neutral (low)

grade: neutral (short_vowel, long_cluster), vowel: neutral (low), stem_vowel: a

grade: weakened (long_vowel, short_cluster), vowel: raised, stem_vowel: u/a

grade: allegro (short_vowel, short_cluster), vowel: neutral (low)

grade: weakened (long_vowel, short_cluster), vowel: neutral (low)

grade: strengthened (short_vowel, long_cluster), vowel: raised

grade: strengthened (short_vowel, long_cluster), vowel: neutral (low)

grade: neutral (short_vowel, long_cluster), vowel: neutral (low), stem_vowel: a

grade: weakened (long_vowel, short_cluster), vowel: raised, stem_vowel: u/a

grade: allegro (short_vowel, short_cluster), vowel: neutral (low)
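
The `grade:` annotations above follow a regular pattern: a grade name, a (vowel length, cluster length) pair, a vowel quality with an optional detail in parentheses, and sometimes a stem vowel. As a rough illustration only, the following Python sketch reads one such line into a record. The names `GradeLabel` and `parse_grade_label` and the field names are assumptions made for this example; they are not part of the lexc source.

```python
import re
from typing import NamedTuple, Optional


class GradeLabel(NamedTuple):
    grade: str                   # e.g. "weakened"
    vowel_length: str            # e.g. "long_vowel"
    cluster_length: str          # e.g. "short_cluster"
    vowel: str                   # e.g. "neutral" or "raised"
    vowel_detail: Optional[str]  # e.g. "low", when given in parentheses
    stem_vowel: Optional[str]    # e.g. "a" or "u/a", when given


# Pattern for lines such as:
#   grade: weakened (long_vowel, short_cluster), vowel: raised, stem_vowel: u/a
_LABEL = re.compile(
    r"grade:\s*(\w+)\s*\((\w+),\s*(\w+)\),\s*"
    r"vowel:\s*(\w+)(?:\s*\((\w+)\))?"
    r"(?:,\s*stem_vowel:\s*(\S+))?"
)


def parse_grade_label(line: str) -> Optional[GradeLabel]:
    """Return the parsed label, or None if the line is not a grade annotation.

    Anything after the recognised part (for example a list of cells) is ignored.
    """
    m = _LABEL.match(line.strip())
    return GradeLabel(*m.groups()) if m else None
```

Calling `parse_grade_label` on a heading or a `FORMS` line returns `None`, so the function can be mapped over every line of a block to pick out only the grade annotations.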

CLASS 1 LOW VOWEL, DIPHTHONG, NO PALATALIZATION NOMINALS

1. WORDS WITH SINGLE-SYLLABLE NOMINATIVE SINGULARS (2009: 167)

1.1 Sg.Loc in -âst (no vowel shift, all raised)

â-stems (Sg.Loc, Ess, Par).

1.1.1 Sg.Ill vowel -e

1.1.1.1 Has Palatalization

1.1.1.1[1] (Palatalization pattern)

Palatalized: Sg.Ill; Not Palatalized: ELSE. Sg.Ill in palatalization and -e.

1.1.1.1[1].1 Lacks Specifically Pedagogical Gradation

1.1.1.1[1].1.1 Has Orthographic Gradation

1.1.1.1[1].1.1[] (Monophthong + Consonant Cluster alternation)

Strong grade: Sg.Nom, Ess, Par; Strong grade: Sg.Ill; Weak grade: Pl.Nom, Sg.Loc, Sg.Com

FORMS

(1) Sg.Nom: meârkk

(2) Pl.Nom: meârk

(3) Sg.Ill: mieʹrǩǩe

(4) Sg.Loc: meârkâst

(5) Sg.Com: meârkin

(6) Ess: meârkkân

(7) Par: meârkkâd

(8) Pl.Acc: meârkid

(9) Der/Dimin.N.Sg.Nom: meârkâž

meârkk:meârkk
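
The paradigm above can be checked against the stated grade assignment (strong: Sg.Nom, Ess, Par, Sg.Ill; weak: Pl.Nom, Sg.Loc, Sg.Com). The Python sketch below does this under one simplifying assumption that holds for this paradigm but is not claimed for others: the strong grade is spelled with a doubled consonant letter (kk, ǩǩ) and the weak grade with a single one. The names `FORMS`, `GRADE`, `has_double_consonant` and `check_paradigm` are illustrative only.

```python
import unicodedata

# Forms copied from the FORMS list above (lemma meârkk).
FORMS = {
    "Sg.Nom": "meârkk",
    "Pl.Nom": "meârk",
    "Sg.Ill": "mieʹrǩǩe",
    "Sg.Loc": "meârkâst",
    "Sg.Com": "meârkin",
    "Ess": "meârkkân",
    "Par": "meârkkâd",
}

# Grade assignment as stated in the gradation line of this class.
GRADE = {
    "Sg.Nom": "strong", "Ess": "strong", "Par": "strong", "Sg.Ill": "strong",
    "Pl.Nom": "weak", "Sg.Loc": "weak", "Sg.Com": "weak",
}

VOWELS = set("aâeioõuåä")


def has_double_consonant(form: str) -> bool:
    """True if two identical consonant letters stand next to each other."""
    s = unicodedata.normalize("NFC", form)  # make sure ǩ is a single code point
    return any(
        a == b and a.isalpha() and a not in VOWELS
        for a, b in zip(s, s[1:])
    )


def check_paradigm() -> dict:
    """Map each cell to True when its spelling agrees with the stated grade."""
    return {
        cell: has_double_consonant(form) == (GRADE[cell] == "strong")
        for cell, form in FORMS.items()
    }
```

A cell that maps to `False` would point either at a typo in the listed form or at a class where the doubled-letter assumption does not apply.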

grade: neutral (short_vowel, long_cluster), vowel: neutral (low)

grade: weakened (long_vowel, short_cluster), vowel: neutral (low)

grade: strengthened (short_vowel, long_cluster), vowel: raised

grade: strengthened (short_vowel, long_cluster), vowel: neutral (low)

grade: neutral (short_vowel, long_cluster), vowel: neutral (low), stem_vowel: a

grade: weakened (long_vowel, short_cluster), vowel: raised, stem_vowel: u/a

grade: allegro (short_vowel, short_cluster), vowel: neutral (low)

FORMS

grade: neutral (short_vowel, long_cluster), vowel: neutral (low)

grade: weakened (long_vowel, short_cluster), vowel: neutral (low)

grade: strengthened (short_vowel, long_cluster), vowel: raised

grade: strengthened (short_vowel, long_cluster), vowel: neutral (low)

grade: neutral (short_vowel, long_cluster), vowel: neutral (low), stem_vowel: a

grade: weakened (long_vowel, short_cluster), vowel: raised, stem_vowel: u/a

grade: allegro (short_vowel, short_cluster), vowel: neutral (low)

grade: weakened (long_vowel, short_cluster), vowel: neutral (low)

Sg_Abe, Sg_Acc, Sg_Gen, Pl_Nom, +Use/NG_Sg_Loc_Px…, stem_vowel: a : Sg_Loc, Sg_Com, Pl_Gen, Pl_Acc, Pl_Ill, Pl_Loc, Pl_Com, Pl_Abe

grade: strengthened (short_vowel, long_cluster), vowel: raised Sg_Ill

grade: strengthened (short_vowel, long_cluster), vowel: neutral (low)

grade: neutral (short_vowel, long_cluster), vowel: neutral (low), stem_vowel: a

grade: weakened (long_vowel, short_cluster), vowel: raised, stem_vowel: u/a

grade: allegro (short_vowel, short_cluster), vowel: neutral (low)

grade: neutral (short_vowel, long_cluster), vowel: neutral (low), stem_vowel: a

grade: weakened (long_vowel, short_cluster), vowel: neutral (low)

grade: strengthened (short_vowel, long_cluster), vowel: raised

grade: strengthened (short_vowel, long_cluster), vowel: neutral (low)

grade: neutral (short_vowel, long_cluster), vowel: neutral (low), stem_vowel: a

grade: weakened (long_vowel, short_cluster), vowel: raised, stem_vowel: u/a

grade: allegro (short_vowel, short_cluster), vowel: neutral (low)

grade: weakened (long_vowel, short_cluster), vowel: neutral (low)

grade: strengthened (short_vowel, long_cluster), vowel: neutral (low)

grade: neutral (short_vowel, long_cluster), vowel: neutral (low), stem_vowel: a

N_1A-VWCCC cf. _NEAVVV

grade: strengthened (short_vowel, long_cluster), vowel: neutral (low)

grade: neutral (short_vowel, long_cluster), vowel: neutral (low)

grade: neutral (short_vowel, long_cluster), vowel: neutral (low), stem_vowel: a

grade: weakened (long_vowel, short_cluster), vowel: neutral (low)

grade: weakened (long_vowel, short_cluster), vowel: raised, stem_vowel: u/a

grade: allegro (short_vowel, short_cluster), vowel: neutral (low)

grade: weakened (long_vowel, short_cluster), vowel: neutral (low)

grade: strengthened (short_vowel, long_cluster), vowel: raised

grade: strengthened (short_vowel, long_cluster), vowel: neutral (low)

grade: neutral (short_vowel, long_cluster), vowel: neutral (low), stem_vowel: a

grade: weakened (long_vowel, short_cluster), vowel: raised, stem_vowel: u/a

grade: allegro (short_vowel, short_cluster), vowel: neutral (low)

normative = PẸSS Sg_Nom: vow_mono:vow_short:vow_low:pal_no:cns_gem

grade: weakened (long_vowel, short_cluster), vowel: neutral (low)

Sg_Abe, Sg_Acc, Sg_Gen, Pl_Nom, +Use/NG_Sg_Loc_Px…, stem_vowel: a : Sg_Loc, Sg_Com, Pl_Gen, Pl_Acc, Pl_Ill, Pl_Loc, Pl_Com, Pl_Abe

grade: strengthened (short_vowel, long_cluster), vowel: raised

grade: strengthened (short_vowel, long_cluster), vowel: neutral (low) Ess_Px…, Sg_Ill…, N»A

grade: neutral (short_vowel, long_cluster), vowel: neutral (low), stem_vowel: a

grade: weakened (long_vowel, short_cluster), vowel: raised, stem_vowel: u/a

grade: allegro (short_vowel, short_cluster), vowel: neutral (low) Sg_Loc_Px..

grade: weakened (long_vowel, short_cluster), vowel: neutral (low)

Sg_Abe, Sg_Acc, Sg_Gen, Pl_Nom, +Use/NG_Sg_Loc_Px…, stem_vowel: a : Sg_Loc, Sg_Com, Pl_Gen, Pl_Acc, Pl_Ill, Pl_Loc, Pl_Com, Pl_Abe

grade: strengthened (short_vowel, long_cluster), vowel: raised

grade: strengthened (short_vowel, long_cluster), vowel: neutral (low)

grade: neutral (short_vowel, long_cluster), vowel: neutral (low), stem_vowel: a

grade: weakened (long_vowel, short_cluster), vowel: raised, stem_vowel: u/a

grade: allegro (short_vowel, short_cluster), vowel: neutral (low)

CLASS 1 HIGH VOWEL TYPE, NO PALATALIZATION NOMINALS

CLASS 1 HIGH VOWEL, PALATALIZATION NOMINALS

Sg_Nom: vow_mono:vow_short:vow_high_u:pal_yes:cns_xyy

1. WORDS WITH SINGLE-SYLLABLE NOMINATIVE SINGULARS (2009: 180)

1.3 Sg.Loc in -est (vowel shift)

Raised: Sg.Com, Pl.Obl; Lowered: ELSE. e-stems (Sg.Loc, Ess, Par).

1.3.2 Sg.Ill vowel -a

1.3.2.1 Has Palatalization

Palatalized: ELSE; Not Palatalized: Sg.Ill

1.3.2.1.1 Lacks Specifically Pedagogical Gradation

1.3.2.1.1.1 Has Orthographic Gradation

1.3.2.1.1.1[] (Monophthong + Consonant Cluster)

Strong grade: Sg.Nom, Ess, Par; Strong grade: Sg.Ill; Weak grade: Pl.Nom, Sg.Loc, Sg.Com

FORMS

(1) Sg.Nom: kuʹvǯǯ

(2) Pl.Nom: kuʼvǯ

(3) Sg.Ill: kouǯǯa

(4) Sg.Loc: kuʹvǯest

(5) Sg.Com: kuʹvǯin

(6) Ess: kuʹvǯǯen

(7) Par: kuʹvǯǯed

(8) Pl.Acc: kuʹvǯid

(9) Der/Dimin.N.Sg.Nom: kouǯaž

NumContLex="1.113" Is for nouns with -est Loc, Extra long vowel

Sg_Nom: vow_di:vow_short:vow_high_u:pal_yes:cns_xyy similar_to: N_CHUOSHKK

1. WORDS WITH SINGLE-SYLLABLE NOMINATIVE SINGULARS (2009: 180, 197-199)

1.3 Sg.Loc in -est (no vowel shift, all lowered)

e-stems (Sg.Loc, Ess, Par).

1.3.2 Sg.Ill vowel -a

1.3.2.1 Has Palatalization

Palatalized: ELSE; Not Palatalized: Sg.Ill

1.3.2.1.2 Has Specifically Pedagogical Gradation

1.3.2.1.2.1 Has Orthographic Gradation

1.3.2.1.2.1[] (Diphthong + Consonant and Geminate variation)

Strong grade: Sg.Nom, Ess, Par; Extra strong grade: Sg.Ill; Weak grade: Pl.Nom, Sg.Loc, Sg.Com, Dim

FORMS

e.g. +Sg+Acc+PxPl3 e.g. +Use/NG+Sg+Loc+PxSg1

CLASS 1 LOW VOWEL, PALATALIZATION, ILLATIVE IN U NOMINALS

CLASS 1 LOW VOWEL, PALATALIZATION, ILLATIVE IN A NOMINALS

e-stems

FORMS

NumContLex="1.113"

vowel: monophthong vowel_shift: no gradation: yes

FORMS

similar_to: KÅHTT

čuäʹrvv čuẹʼrv čuäʹrvv+N+Ess čuäʼrvven

WHAT IS THIS CLASS

gradation: yes

FORMS

1. WORDS WITH SINGLE-SYLLABLE NOMINATIVE SINGULARS (2009: 180)

1.3 Sg.Loc in -est (no vowel shift, all raised)

e-stems (Sg.Loc, Ess, Par).

1.3.2 Sg.Ill vowel -e

1.3.2.1 Has Palatalization (all palatalized)

1.3.2.1.1 Lacks Specifically Pedagogical Gradation

1.3.2.1.1.1 Has Orthographic Gradation

1.3.2.1.1.1[] (Monophthong + Consonant Cluster Variation)

Strong grade: Sg.Nom, Ess, Par; Strong grade: Sg.Ill; Weak grade: Pl.Nom, Sg.Loc, Sg.Com

FORMS

These must be checked 2017-04-03

FORMS

FORMS

e.g. e.g. +Use/NG+Sg+Loc+PxSg3 e.g. +Sg+Loc+PxSg1

e.g. e.g. +Use/NG+Sg+Loc+PxSg3 e.g. +Sg+Loc+PxSg1

e.g. e.g. +Use/NG+Sg+Loc+PxSg3 e.g. +Sg+Loc+PxSg1

CLASS 2 NOMINALS with high stem vowel and i-stems

CLASS 2 NOMINALS with low stem vowel and u-stems

u-stems

CLASS 2 NOMINALS with high stem vowel and â-stems

CLASS 3 HIGH VOWEL, MONOPHTHONG, NO PALATALIZATION NOMINALS

m-stems

CLASS 3

n-stems

CLASS 3

Are these actually necessary 2015-05-30

CLASS 4 BISYLLABIC, HIGH VOWEL, MONOPHTHONG, NO PALATALIZATION IN PENULTIMATE Â:0

N_GEN2X3-NOM2X1

N_GEN2X3-NOM2X1

CLASS 4 BISYLLABIC, HIGH VOWEL, DIPHTHONG, NO PALATALIZATION

CLASS 4 BISYLLABIC, LOW VOWEL, MONOPHTHONG, NO PALATALIZATION

stemtype n-stem jânnam:jânnam N_GEN2X3-NOM2X1

gradation: no

2. WORDS WITH TWO-SYLLABLE NOMINATIVE SINGULARS (2009: 252)

2.3 Sg.Loc in -est. e-stems (Sg.Loc, Ess, Par).

2.3.2 Sg.Ill in -a

2.3.2.2 LACKS Gradation

2.3.2.2.1 Penultimate stem vowel loss: (Sg.Ill, Sg.Loc, Sg.Com; Ess, Par; Pl.Gen, Pl.Acc, Pl.Ill, Pl.Loc, Pl.Com, Pl.Abe)

2.3.2.2.1.1 The Sg.Com vowel i appears before final n

CLASS 4 BISYLLABIC, LOW VOWEL, DIPHTHONG, NO PALATALIZATION IN PENULTIMATE A:0

CLASS 4 BISYLLABIC, LOW VOWEL, MONOPHTHONG, PALATALIZATION

s:z-stem type

CLASS 4 BISYLLABIC, LOW VOWEL, DIPHTHONG, PALATALIZATION IN PENULTIMATE E:0

-stems

e-a-stems

Nouns with -est Loc and -a Ill without ʹ lemma and stem 3rd grade; vowel raising

čâustõk+N+Sg+Gen:čâustõõǥǥ Sajos:Sajo%^1VOW%{ʹØ%}s

stemtype n-stem reʹppiǩ:reʹppiǩ Palatalization loss in Sg.Ill

PENULTIMATE O:0

VOWEL-FINAL STEMS

inflection_type="?"

Stem types from the grammar

These are still not fixed.

Class 5 according to Feist 152

k-stems

stemtype

stemtype

Class 6 according to Feist 153-154 PRESENT PARTICIPLES

participles in -I from verbs in -âd

participles in -AI from verbs in -ad

participles in -I from verbs in ʹ-ed

Class 7 according to Feist 154-155

i-stems

Class 8 according to Feist 155-157

Class 9 according to Feist 158

Diminutive derivations

Class 10 according to Feist 160-161

neǩ-stem

Sort of like 10

Class 11 according to Feist 162

Class 12 Feist 163

Noun phrase heads

Pl

Number and case tags

Used with words like juurd: jurddǥatta

POSSESSIVE DECLENSION

HatY-STEM-PX

A-STEM-PX

E-STEM-PX

LOAOADDAZH-STEM-PX

VOONYS-STEM-PX

Adjectives – to be moved

Vowel-final stem for PX


This (part of) documentation was generated from src/fst/morphology/affixes/nouns.lexc


src-fst-morphology-affixes-numerals.lexc.md

Inari Saami number <-> text


This (part of) documentation was generated from src/fst/morphology/affixes/numerals.lexc


src-fst-morphology-affixes-pronouns.lexc.md

Skolt Saami Pronoun Morphology

The lexicon PRON_ is actually not needed, as pronouns get the +Pron tag earlier.

Pronouns

Pointing to all the pronominal subgroups

Personal pronouns

Splitting according to person

Demonstrative pronouns

INDEFINITE PRONOUNS

REFLEXIVE PRONOUNS

jiõčč:

Completion needed 2015-09-19

Interrogative pronouns

LEXICON PRON-INTERR_ is referred to from the xml file, hence it does not assign the +Pron tag.

SPATIAL PRONOUNS

MISC


This (part of) documentation was generated from src/fst/morphology/affixes/pronouns.lexc


src-fst-morphology-affixes-propernouns.lexc.md

SKOLT SAAMI PROPERNOUN MORPHOLOGY

THE LEXICON OUTSIDE_LEXICONS ASSIGNS THE TAG +Attr

like KÕÕNJÂL

1. WORDS WITH SINGLE-SYLLABLE NOMINATIVE SINGULARS (2009: 167)

1.1 Sg.Loc in -âst (no vowel shift, all raised)

â-stems (Sg.Loc, Ess, Par).

1.1.1 Sg.Ill vowel -e

1.1.1.1 Has Palatalization

1.1.1.1[1] (Palatalization pattern)

Palatalized: Sg.Ill; Not Palatalized: ELSE. Sg.Ill in palatalization and -e.

1.1.1.1[1].1 Lacks Specifically Pedagogical Gradation

1.1.1.1[1].1.1 Has Orthographic Gradation

1.1.1.1[1].1.1[] (Monophthong + Consonant and Consonant Geminate alternation)

Extra strong grade: Sg.Nom, Ess, Par; Extra strong grade: Sg.Ill; Weak grade: Pl.Nom, Sg.Loc, Sg.Com

FORMS

1. WORDS WITH SINGLE-SYLLABLE NOMINATIVE SINGULARS (2009: 167)

1.1 Sg.Loc in -âst (no vowel shift, all raised)

â-stems (Sg.Loc, Ess, Par).

1.1.1 Sg.Ill vowel -e

1.1.1.1 Has Palatalization

1.1.1.1[1] (Palatalization pattern)

Palatalized: Sg.Ill; Not Palatalized: ELSE. Sg.Ill in palatalization and -e.

1.1.1.1[1].1 Lacks Specifically Pedagogical Gradation

1.1.1.1[1].1.1 Has Orthographic Gradation

1.1.1.1[1].1.1[] (Monophthong + Consonant and Consonant Geminate alternation)

Extra strong grade: Sg.Nom, Ess, Par; Extra strong grade: Sg.Ill; Weak grade: Pl.Nom, Sg.Loc, Sg.Com

FORMS

R ; = * LEXICON PROP_TOOBDYLM_mal kuss :%>â ESS ; = * LEXICON PROP_TOOBDYLM_mal kussân


This (part of) documentation was generated from src/fst/morphology/affixes/propernouns.lexc


src-fst-morphology-affixes-smi-propernouns.lexc.md

+Pl+Nom:%>jit K ;
ACCRA-DC ; :%>ji ACCRA-OBL_PLC-ORG ; :%>ji ACCRA-IICASE ;

ACCRA-IICASE ;

These sublexica are irrelevant for SIJTE, but are added for the sake of the lexicon MARJA! But see that comment… I do wonder about this a bit… Here, we allow for the Illative Sijtei.

These sublexica are irrelevant for SIJTE, but are added for the sake of the lexicon MARJA! But see that comment… I do wonder about this a bit…

For Finnish surnames. Itkonen.

For Finnish names with ending -nen. Kaustinen.

Itkosa as the oblique stem, with ordinary genitive Itkonena.

Different lexicon for female persons and place names.

Different lexicon for personal surnames. Blind

As aleuhtat, but with a marginal leakage to sg forms in some cases. (substandard?)


This (part of) documentation was generated from src/fst/morphology/affixes/smi-propernouns.lexc


src-fst-morphology-affixes-symbols.lexc.md

Symbol affixes


This (part of) documentation was generated from src/fst/morphology/affixes/symbols.lexc


src-fst-morphology-affixes-verbs.lexc.md

Skolt Saami verb morphology

First a lexicon V_ for still unclassified entries.

Irregular verbs

Then irregular verbs ij and the copula.

REGULAR VERBS

CLASS 1 HIGH VOWEL, NO PALATALIZATION

(10) Allegro for inchoative: piõg»

(10) Allegro for inchoative: ǩiõč»

CLASS 1 LOW VOWEL, NO PALATALIZATION

Even-syllable stems in -AD

(1) +V+Inf: tättad

(2) +V+Ind+Prs+Sg3: tätt

(3) +V+Ind+Prs+Pl3: tätta

(4) +V+Ind+Prt+Pl3: tattu

(5) +V+Imprt+Sg2: täätt

(5) ERR

(7) +V+Imprt+Sg3: täättas

(8) +V+Imprt+ConNegII: tattu

ExtraStrong-LowVowel-Palatalization

ExtraStrong-LowVowel-No-palatalization

ExtraStrong-LowVowel-Palatalization

ExtraStrong-RaisedVowel-Palatalization

Weak-StableV-No-Palatalization

Weak-StableVowel-No-Palatalization

Weak-LowVowel-No-palatalization Ind+Prs+Sg1, Ind+Prs+Sg2, Cond, Imprt+Sg3

ExtraStrong-RaisedVowel-No-palatalization

ExtraStrong-LowVowel-No-palatalization

(10) Allegro for inchoative: läul»

(10) Allegro for inchoative: särn»

V+Inf, Ind+Prs+Pl1, Ind+Prs+Pl2, Ind+Prt+ConNeg, Imprt+Pl1, Imprt+Pl2, Actio, ActEss, PrsPrc, NomAct in MOsh

+V+Ind+Prs+Sg3

+Ind+Prs+Pl3

+V+Ind+Prt+Pl3, Ind+Prt+Sg1, Ind+Prt+Sg2, Ind+Prt+Sg4

Imprt+ConNeg, Imprt+Sg2, Ind+Prs+ConNeg, Ind+Prs+Sg4, Ind+Prs+Sg1, Ind+Prs+Sg2, +Ger, +VAbess, +Pot, +Cond, +Imprt+Sg3

Imprt+ConNegII, Pass+PrfPrc

+Imprt+Pl3

(10) Allegro for inchoative: peit»

(10) Allegro for inchoative: påus»

(2) Ind.Prs.Sg3: kuästt

(3) Ind.Prs.Pl3: kuästta

(4) Ind.Prt.Pl3: kuõsttu

(5) Ind.Imprt.Sg2: kuäst-

(5) ERR

(7) Imprt.Sg3: kuästas

(8) Imprt.13.ConNeg: kuõsttu

(2) Ind.Prs.Sg3: veâhss

(3) Ind.Prs.Pl3: veâhssa

(4) Ind.Prt.Pl3: viõhssu

(5) Ind.Imprt.Sg2: veâus-

(5) ERR

(7) Ind.Imprt.Sg3: veâusas

(8) Imprt.13.ConNeg: viõhssu

CLASS 1 HIGH VOWEL, PALATALIZATION

Strong-HighVowel-Palatalization

Strong-LoweredVowel-No-palatalization

ExtraStrong-LoweredVowel-Palatalization

ExtraStrong-RaisedVowel-Palatalization

Weak-StableV-Palatalization

Weak-RaisedVowel-Palatalization

Weak-LoweredVowel-No-palatalization

ExtraStrong-RaisedVowel-No-palatalization

ExtraStrong-LoweredVowel-No-palatalization

(11) Present Participle:

(12) Weak-RaisedVowel-NoPalatalization

Strong-HighVowel-Palatalization

Strong-LoweredVowel-No-palatalization

ExtraStrong-LoweredVowel-Palatalization

ExtraStrong-RaisedVowel-Palatalization

Weak-StableV-Palatalization

Weak-RaisedVowel-Palatalization

Weak-LoweredVowel-No-palatalization

ExtraStrong-RaisedVowel-No-palatalization

ExtraStrong-LoweredVowel-No-palatalization

(11) Present Participle:

(12) Weak-RaisedVowel-NoPalatalization

Strong-LoweredVowel-No-palatalization

ExtraStrong-LoweredVowel-Palatalization

ExtraStrong-RaisedVowel-Palatalization

Weak-StableV-Palatalization

Weak-RaisedVowel-Palatalization

Weak-LoweredVowel-No-palatalization

ExtraStrong-RaisedVowel-No-palatalization

ExtraStrong-LoweredVowel-No-palatalization

(11) Present Participle:

(12) Weak-RaisedVowel-NoPalatalization

Strong-HighVowel-Palatalization

Strong-LoweredVowel-No-palatalization

ExtraStrong-LoweredVowel-Palatalization

ExtraStrong-RaisedVowel-Palatalization

Weak-StableV-Palatalization

Weak-RaisedVowel-Palatalization

Weak-LoweredVowel-No-palatalization

ExtraStrong-RaisedVowel-No-palatalization

ExtraStrong-LoweredVowel-No-palatalization (9)

Allegro (10) lieʹđ»

(11) Present Participle:

(12) Weak-RaisedVowel-NoPalatalization

Strong-LoweredVowel-No-palatalization

ExtraStrong-LoweredVowel-Palatalization

ExtraStrong-RaisedVowel-Palatalization

Weak-StableV-Palatalization

Weak-RaisedVowel-Palatalization

Weak-LoweredVowel-No-palatalization

ExtraStrong-RaisedVowel-No-palatalization

ExtraStrong-LoweredVowel-No-palatalization

(10) Allegro for inchoative: jueʹj» Secondary allegro for inchoative: juâǥ»

(11) Present Participle:

(12) Weak-RaisedVowel-NoPalatalization

Strong-HighVowel-Palatalization

Strong-LoweredVowel-No-palatalization

ExtraStrong-LoweredVowel-Palatalization

ExtraStrong-RaisedVowel-Palatalization

Weak-StableV-Palatalization

Weak-RaisedVowel-Palatalization

Weak-LoweredVowel-No-palatalization

ExtraStrong-RaisedVowel-No-palatalization

ExtraStrong-LoweredVowel-No-palatalization (9)

Allegro (10) kueʹd»

(11) Present Participle:

(12) Weak-RaisedVowel-NoPalatalization

Strong-HighVowel-Palatalization

Strong-LoweredVowel-No-palatalization

ExtraStrong-LoweredVowel-Palatalization

ExtraStrong-RaisedVowel-Palatalization

Weak-StableV-Palatalization

Weak-RaisedVowel-Palatalization

Weak-LoweredVowel-No-palatalization

ExtraStrong-RaisedVowel-No-palatalization

ExtraStrong-LoweredVowel-No-palatalization (9)

Allegro (10) šieʹt»

(11) Present Participle:

(12) Weak-RaisedVowel-NoPalatalization

Strong-LoweredVowel-No-palatalization

ExtraStrong-LoweredVowel-Palatalization Ind+Prs+Pl3

ExtraStrong-RaisedVowel-Palatalization

Weak-StableV-Palatalization

Weak-RaisedVowel-Palatalization

Weak-LoweredVowel-No-palatalization

ExtraStrong-RaisedVowel-No-palatalization

ExtraStrong-LoweredVowel-No-palatalization

(10) Allegro for inchoative: juʹrd»

(11) Present Participle:

(12) Weak-RaisedVowel-NoPalatalization

Strong-LoweredVowel-No-palatalization

ExtraStrong-LoweredVowel-Palatalization

ExtraStrong-RaisedVowel-Palatalization

Weak-StableV-Palatalization

Weak-RaisedVowel-Palatalization

Weak-LoweredVowel-No-palatalization

ExtraStrong-RaisedVowel-No-palatalization

ExtraStrong-LoweredVowel-No-palatalization

(10) Allegro for inchoative: puʹht»

(11) Present Participle:

(12) Weak-RaisedVowel-NoPalatalization

Strong-LoweredVowel-No-palatalization xyy2Vyy

ExtraStrong-LoweredVowel-Palatalization xyy2Vyy

ExtraStrong-RaisedVowel-Palatalization

Weak-StableV-Palatalization

Weak-RaisedVowel-Palatalization

Weak-LoweredVowel-No-palatalization

ExtraStrong-RaisedVowel-No-palatalization

ExtraStrong-LoweredVowel-No-palatalization xyy2Vyy

(10) Allegro for inchoative: uʹvd»

(11) Present Participle:

(12) Weak-RaisedVowel-NoPalatalization

Strong-LoweredVowel-No-palatalization

ExtraStrong-LoweredVowel-Palatalization

ExtraStrong-RaisedVowel-Palatalization

Weak-StableV-Palatalization

Weak-RaisedVowel-Palatalization

Weak-LoweredVowel-No-palatalization

ExtraStrong-RaisedVowel-No-palatalization

ExtraStrong-LoweredVowel-No-palatalization

(10) Allegro for inchoative: tieʹđ»

(11) Present Participle:

(12) Weak-RaisedVowel-NoPalatalization

Strong-Low-Vowel-No-palatalization

ExtraStrong-LoweredVowel-Palatalization

ExtraStrong-RaisedVowel-Palatalization

Weak-StableV-Palatalization

Weak-RaisedVowel-Palatalization

Weak-LoweredVowel-No-palatalization

ExtraStrong-RaisedVowel-No-palatalization

ExtraStrong-LoweredVowel-No-palatalization

(10) Allegro for inchoative: ǩieʹld»

(11) Present Participle:

(12) Weak-RaisedVowel-NoPalatalization

(11) Present Participle:

(12) Weak-RaisedVowel-NoPalatalization

XYY-HighVowel-No-palatalization Height=0, PAL=-, V=0, C=0, âae=e

XYY-HighVowel-Palatalization !lowered Height=-, PAL=+, V=-, C=+, âae=e

XYY-RaisedVowel-Palatalization Height=+, PAL=+, V=-, C=+, âae=e

2XY-StableV-Palatalization Height=0, PAL=+, V=+, C=-, âae=e

2XY-RaisedVowel-Palatalization Height=+, PAL=+, V=+, C=-, âae=e

2XY-LoweredVowel-No-palatalization Height=0, PAL=-, V=+, C=-, âae=e

XYY-RaisedVowel-No-palatalization Height=+, PAL=-, V=-, C=+, âae=e

XYY-HighVowel-No-palatalization Height=-, PAL=-, V=-, C=+, âae=e

(10) Allegro for inchoative: vueʹlj» Height=0, PAL=+, V=-, C=-, âae=e

(11) Present Participle: Height=+, PAL=+, V=0, C=0, âae=e

(12) Weak-RaisedVowel-NoPalatalization Height=+, PAL=-, V=+, C=-, âae=e

(13) vuõlggled, joottled -Âled be about to leave Height=+, PAL=-, V=0, C=0, âae=e
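
Each line of the block above pairs a stem-shape label with a set of `name=value` annotations (`Height`, `PAL`, `V`, `C`, `âae`). The source does not spell out what the individual values mean, so the Python sketch below only separates the label from the annotations and makes no claim about their interpretation. The function name `parse_feature_line` is an assumption made for this example.

```python
import re

# A feature is any run of non-space, non-comma characters around "=".
_PAIR = re.compile(r"([^\s,=]+)=([^\s,]+)")


def parse_feature_line(line: str):
    """Split an annotated line into (label, {feature: value}).

    Lines without any name=value pair are returned as (line, {}).
    """
    pairs = list(_PAIR.finditer(line))
    if not pairs:
        return line.strip(), {}
    # Everything before the first pair is treated as the label.
    label = line[: pairs[0].start()].strip()
    return label, {m.group(1): m.group(2) for m in pairs}
```

With the lines parsed this way, two stem shapes can be compared feature by feature, for instance to list which annotations differ between the `XYY-` and `2XY-` rows.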

XYY-HighVowel-No-palatalization

XYY-HighVowel-Palatalization

XYY-RaisedVowel-Palatalization

2XY-StableV-Palatalization

2XY-RaisedVowel-Palatalization

2XY-LoweredVowel-No-palatalization

XYY-RaisedVowel-No-palatalization

XYY-HighVowel-No-palatalization

(10) Allegro for inchoative: vueuʹs»

(11) Present Participle:

(12) Weak-RaisedVowel-NoPalatalization

CLASS 1 LOW VOWEL, PALATALIZATION

EVEN-SYLLABLE STEMS IN -ED

ExtraStrong-LowVowel-No-palatalization

Strong-LowVowel-Palatalization

ExtraStrong-RaisedVowel-Palatalization

Strong-StableV-Palatalization

Strong-RaisedVowel-Palatalization

Strong-LowVowel-No-palatalization

ExtraStrong-LowVowel-No-palatalization

(10) Allegro for inchoative: käʹđ»

(11) Present Participle:

(12) Weak-RaisedVowel-NoPalatalization

Strong-LowVowel-No-palatalization

ExtraStrong-LowVowel-Palatalization

ExtraStrong-RaisedVowel-Palatalization

Weak-StableV-Palatalization

Weak-RaisedVowel-Palatalization

Weak-LowVowel-No-palatalization

ExtraStrong-RaisedVowel-No-palatalization

ExtraStrong-LoweredVowel-No-palatalization

(11) Present Participle:

(12) Weak-RaisedVowel-NoPalatalization

ExtraStrong-LowVowel-No-palatalization

Strong-LowVowel-Palatalization

ExtraStrong-RaisedVowel-Palatalization CHECKME = 2015-10-11

Strong-StableV-Palatalization

Strong-RaisedVowel-Palatalization

Strong-LowVowel-No-palatalization

ExtraStrong-LowVowel-No-palatalization

(10) Allegro for inchoative: keʹt»

(11) Present Participle:

(12) Weak-RaisedVowel-NoPalatalization

Strong-LowVowel-No-palatalization

ExtraStrong-LowVowel-Palatalization

ExtraStrong-RaisedVowel-Palatalization

Weak-StableV-Palatalization

Weak-RaisedVowel-Palatalization

Weak-LowVowel-No-palatalization

ExtraStrong-RaisedVowel-No-palatalization

ExtraStrong-LoweredVowel-No-palatalization

(11) Present Participle:

(12) Weak-RaisedVowel-NoPalatalization

ExtraStrong-LowVowel-No-palatalization

ExtraStrong-LowVowel-Palatalization

ExtraStrong-RaisedVowel-Palatalization

Weak-StableV-Palatalization

Weak-RaisedVowel-Palatalization

Weak-LowVowel-No-palatalization

ExtraStrong-RaisedVowel-No-palatalization

ExtraStrong-LoweredVowel-No-palatalization

(10) Allegro for inchoative: ceʹps»

(11) Present Participle:

(12) Weak-RaisedVowel-NoPalatalization

(10) Allegro for inchoative: käʹd»

(11) Present Participle:

(12) Weak-RaisedVowel-NoPalatalization

Strong-LowVowel-No-palatalization

ExtraStrong-LowVowel-Palatalization

ExtraStrong-RaisedVowel-Palatalization

Weak-StableV-Palatalization

Weak-RaisedVowel-Palatalization

Weak-LoweredVowel-No-palatalization

ExtraStrong-RaisedVowel-No-palatalization

ExtraStrong-LoweredVowel-No-palatalization

(11) Present Participle:

(12) Weak-RaisedVowel-NoPalatalization

Strong-LoweredVowel-No-palatalization

ExtraStrong-LoweredVowel-Palatalization

ExtraStrong-RaisedVowel-Palatalization

Weak-StableV-Palatalization

Weak-RaisedVowel-Palatalization

Weak-LoweredVowel-No-palatalization

ExtraStrong-RaisedVowel-No-palatalization

ExtraStrong-LoweredVowel-No-palatalization

(10) Allegro for inchoative: kâʹǩ»

(11) Present Participle:

(12) Weak-RaisedVowel-NoPalatalization

Strong-HighVowel-Palatalization

Strong-LoweredVowel-No-palatalization

ExtraStrong-LoweredVowel-Palatalization

ExtraStrong-RaisedVowel-Palatalization

Weak-StableV-Palatalization

Weak-RaisedVowel-Palatalization

Weak-LoweredVowel-No-palatalization

ExtraStrong-RaisedVowel-No-palatalization

ExtraStrong-LoweredVowel-No-palatalization (9)

Allegro (10)

(11) Present Participle:

(12) Weak-RaisedVowel-NoPalatalization

Strong-HighVowel-Palatalization

Strong-LoweredVowel-No-palatalization

ExtraStrong-LoweredVowel-Palatalization

ExtraStrong-RaisedVowel-Palatalization

Weak-StableV-Palatalization

Weak-RaisedVowel-Palatalization

Weak-LoweredVowel-No-palatalization

ExtraStrong-RaisedVowel-No-palatalization

ExtraStrong-LoweredVowel-No-palatalization (9)

No Allegro (10) 2015-12-04 :%^VOWLower%^PAL%^CC2CAllegro%>e FOR-ALLEGRO-DEVERBAL-DERIVATION ;

(11) Present Participle:

Strong-HighVowel-Palatalization

Strong-LoweredVowel-No-palatalization

ExtraStrong-LoweredVowel-Palatalization

ExtraStrong-RaisedVowel-Palatalization

Weak-StableV-Palatalization

Weak-RaisedVowel-Palatalization

Weak-LoweredVowel-No-palatalization

ExtraStrong-RaisedVowel-No-palatalization

ExtraStrong-LoweredVowel-No-palatalization (9)

Allegro (10)

(11) Present Participle:

(12)

Strong-LoweredVowel-No-palatalization

ExtraStrong-LoweredVowel-Palatalization

ExtraStrong-RaisedVowel-Palatalization

Weak-StableV-Palatalization

Weak-RaisedVowel-Palatalization

Weak-LoweredVowel-No-palatalization

ExtraStrong-RaisedVowel-No-palatalization

ExtraStrong-LoweredVowel-No-palatalization

(10) Allegro for inchoative: pieʹj» 2015-09-03 IS THIS CORRECT

(11) Present Participle:

(12)

(11) Present Participle:

(12)

Strong-LoweredVowel-No-palatalization

ExtraStrong-LoweredVowel-Palatalization

ExtraStrong-RaisedVowel-Palatalization

Weak-StableV-Palatalization

Weak-RaisedVowel-Palatalization

Weak-LoweredVowel-No-palatalization

ExtraStrong-RaisedVowel-No-palatalization

ExtraStrong-LoweredVowel-No-palatalization (9)

Allegro (10) vuäˈđeškuätt

(11) Present Participle:

(12)

(11) Present Participle:

(12)

Allegro (10)

(11) Present Participle:

(11) Present Participle:

Allegro (10)

(11) Present Participle:

(2) +V+Ind+Prs+Sg3: meätˈt +V+Ind+Prs+Sg3, Ind.Prt.ConNeg, PrfPrc

(3) +V+Ind+Prs+Pl3: meäʹtˈte +V+Ind+Prs+Pl3

(4) +V+Ind+Prt+Pl3: mieʹtˈte +V+Ind+Prt+Pl3, Ind+Prt+Sg1, Ind+Prt+Sg2, Ind+Prt+Sg4

(5) +V+Imprt+Sg2:?? miẹʹtt Imprt+Sg2, Ind+Prs+ConNeg, Ind+Prs+Sg4, VAbess, GerTemp, GerInstr

(6) +V+Pot+Sg3: ??mieʹđež Ind+Prt+Sg3, Ind+Prt+Pl1, Ind+Prt+Pl2, Pot,

(7) +V+Imprt+Sg3: meättas Ind+Prs+Sg1, Ind+Prs+Sg2, Cond, Imprt+Sg3

(8) +V+Imprt+ConNegII: miâtˈtu Imprt+ConNeg, Pass+PrfPrc

CHECK FORM

(11) Present Participle: (11) +V+Act+PrsPrc: mieʹtˈti (12)

(1) +V+Inf: mäʹhssed V+Inf, Ind+Prs+Pl1, Ind+Prs+Pl2, Imprt+Pl1, Imprt+Pl2 Actio, ActEss, PrsPrc

(2) +V+Ind+Prs+Sg3: mähss +V+Ind+Prs+Sg3, Ind.Prt.ConNeg, PrfPrc

(3) +V+Ind+Prs+Pl3: mäʹhsse +V+Ind+Prs+Pl3

(4) +V+Ind+Prt+Pl3: maʹhsse +V+Ind+Prt+Pl3, Ind+Prt+Sg1, Ind+Prt+Sg2, Ind+Prt+Sg4

(5) +V+Imprt+Sg2: määuʹs Imprt+Sg2, Ind+Prs+ConNeg, Ind+Prs+Sg4, VAbess, GerTemp, GerInstr

(6) +V+Pot+Sg3: maauʹsež Ind+Prt+Sg3, Ind+Prt+Pl1, Ind+Prt+Pl2, Pot,

(7) +V+Imprt+Sg3: määusas Ind+Prs+Sg1, Ind+Prs+Sg2, Cond, Imprt+Sg3

(10) mäuʹs

(11) Present Participle: (11) +V+Act+PrsPrc: maʹhssi (12)

DERIVED VERBS WITH PENULTIMATE VOWEL LOSS AND CHANGE

CLASS 2 HIGH VOWEL, NO PALATALIZATION

CLASS 2 LOW VOWEL, NO PALATALIZATION

CLASS 2 HIGH VOWEL, PALATALIZATION

CLASS 2 LOW VOWEL, PALATALIZATION

CLASS 3 HIGH VOWEL, NO PALATALIZATION

CLASS 3 LOW VOWEL, NO PALATALIZATION

CLASS 3 HIGH VOWEL, PALATALIZATION

CLASS 3 LOW VOWEL, PALATALIZATION

CLASS 3 HIGH VOWEL, NO PALATALIZATION, GH

CLASS 3 LOW VOWEL, NO PALATALIZATION, GH

CLASS 4 HIGH VOWEL, NO PALATALIZATION

CLASS 4 LOW VOWEL, NO PALATALIZATION

CLASS 4 HIGH VOWEL, PALATALIZATION

CLASS 4 LOW VOWEL, PALATALIZATION

Not yet written

assuming stem kååʹmmerded

assuming stem kååʹmmerd

VSUF-I-POTKOND_YD, VSUF-I-POTKOND_AD and VSUF-POTENTIAL_ED come here


This (part of) documentation was generated from src/fst/morphology/affixes/verbs.lexc


src-fst-morphology-phonology.twolc.md

Skolt Sámi TWOLC file

This file documents the phonology.twolc file

Introduction

The twolc rule file for Skolt Saami is divided into 5 main sections:

  1. Alphabets, Sets and Definitions
  2. Consonant shift rules (tbw)
  3. Vowel alternation rules
  4. Consonant gradation rules
  5. Rules for cleaning up and composing end result

Alphabets, sets and definitions

Alphabet

Regular letters:

* a b c d e f g h i j k l m n o p q r s t u v w x y z
* A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
* ä å â õ
* Ä Å Â Õ
* č ǩ ǯ ǧ ž đ ǥ ʒ š ŋ
* Č Ǩ Ǯ Ǧ Ž Đ Ǥ Ʒ Š Ŋ
* ẹ Pedagogical purposes
* Ẹ Pedagogical purposes
* 
* æ ø ö á é í
* Æ Ø Ö Á
* É Ó Ú Í À È Ò Ù Ì Ë Ü Ï Ê Ô Û Î Ã Ý þ Ñ Ð
* é ó ú í à è ò ù ì ë ü ï ê ô û î ã ý þ ñ ð ß ª ß ç

Orthographic, suprasegmental markers:

Symbol pairs for consonant lengthening:

Symbol pairs for vowel length:

Symbol pairs for vowel height, by default vowels are low:

Trigger symbols:

Penultimate Palatalisation

Miscellaneous triggers:

CHARACTERISTIC BREAKDOWN 2015-02-17

Removal of suprasegmentals

This introduces a vertical bar after a diphthong before consonants

Various semi-vowel alternations

Gradation triggers 2015.01.23

Other vowel length and consonant length will be phased out

More triggers, possibly realised as a segment:

Hyphen at compound word boundary

Literal quotes and angles must be escaped (cf morpheme boundaries below):

Morpheme boundaries:

End of alphabet definitions

Sets

Definitions

Short consonant cluster

Onset consonant or word boundary OnSetC = [[%{XC%}:Cns|Cns:Cns] (Cns:|%{XC%}:Cns) |.#.|#:|%>|»] ;

Penultimate consonant PenUltCns = [Cns:|%{XC%}:] ;

following morpheme or word boundary

* RBound = [(%^Hyphen: %-|%^NoHyphen:|%{%-Ø%}:) (∑) #:|.#.|%>|»|%-] ;

possible triggers before VOWLower and VOWRaise

neutral to vowel length

neutral to vowel height and backness

possible triggers before PALE PALÄ

possible triggers between stem and PALNo and PAL

possible triggers between vowel length and consonant grade

NeutrVowHeightDiphPalAllegroPAL = [ (%^VOWRaise:|%^VOWLower:) ( ((%^PALÂ:|%^PALÕ:) (%^Allegro:) %^PALNo:|%^VOWLower: %^PALÄ:|(%^PALÕ:|%^PALE:|%^PALÄ:|%^PALẸ:) (%^Allegro:) %^PAL:)| (%^Allegro:) (%^PALÕ:|%^PALE:|%^PALÄ:|%^PALẸ:|%^PALÂ:) (%^PALNo:|%^PAL:) ) ] ;

possible triggers between word end and consonant grade

possible triggers between vowel length and Palatalization BetweenVowLengthAndPALNo = [(%^VOWLower:|%^VOWRaise:) (%^PALÄ:|%^PALE:|%^PALẸ:|%^PALÕ:|%^PALÂ:) ] ;

BetweenVowHeightAndConsGrade = [((%^PALE:|%^PALÄ:|%^PALẸ:|%^PALÕ:) (%^Allegro:) %^PAL:|(%^PALÂ:|%^PALÕ:) (%^Allegro:) %^PALNo:)] ;

BetweenStemAndRightArrow = [NeutrVowLenghtHeightDiphPalAllegroPAL BetweenPALNoAndMorphRightArrow] ;

Penultimate vowel centers possible triggers before VOWLower and VOWRaise

PenBetweenStemAndStemFinalVoicing = [PenBetweenStemAndVowelLoss (%^RmVow:|%^PenVow2a:)] ;

PenBetweenPALNoAndMorph = [(%^Pen: [(%^Allegro:) %^CC2C:|(%^Allegro:) %^CC2CAllegro:]|%^Pen: %^C2CC:|%^Pen: %^XYY2XY:|%^Pen: %^KK2GG:|%^Pen: %^CC2CCC:|%^Pen: %^CCC2C:|%^Pen: %^CCC2CC:|%^Pen: %^XYY2VY:|%^Pen: %^XYY2VYY:|%^Pen: %^KKK2GG:) RBound ] ;

used in compounding Cmp/SgNom and Cmp/SgGen SgNomGen = [((%^PALE: %^PAL:) %^CCC2C:|(%^PALE: %^PAL:) %^CCC2CC:|%^PALẸ:|[%^PALE:|%^PALÕ:] %^PAL: %^XYY2IY:|[%^PALẸ:|%^PALE:] %^PAL: %^XYY2XY:|((%^PALE:) %^PAL:) %^KK2GG:|(%^PALE:) %^PAL:| ((%^PALE:) %^PAL:) (%^J2I:) %^CC2C:|%^PAL: %^XYY2VY:)];

neutral to consonant length

CNeutrGrade = [([(%^Allegro:) %^CC2C:|(%^Allegro:) %^CC2CAllegro:] |[%^C2CC:] |%^CC2CCC: |%^KK2GG: |%^KKK2GG: |%^XYY2VY: |%^XYY2VYY:|%^CCC2CC: )] ;

neutral to vowel and consonant length

NeutrGrade = [VNeutrGrade |CNeutrGrade] ;

NoVowRaise = \[ %^VOWRaise: | #]* [#|.#.] ;

NoCnsDummy = \[ %^CC2C: | %^CCC2C: | %^CCC2CC: | %^XYY2IY: | %^XYY2XY: | %^KK2GG: | %^XYY2VY: | %^KKK2GG: | %^KKK2ZERO: | %^C2CC: | %^J2I: | %^RmCns: | %^K2GG: | # ]* ;

SurfaceDiphthong = [ :e :ä | :e :â | :i :õ | :i :â | :i :e | :i :ẹ | :u :â | :u :õ | :u :å | :u :ä | :u :e | :u :ẹ ] ;
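The SurfaceDiphthong set can be read as a plain list of two-letter surface sequences. The following is a minimal Python sketch, not part of the twolc file, that looks for those sequences in an already generated surface string. The function name `find_surface_diphthongs` is hypothetical, and the sketch ignores the two-level (lexical:surface) side of the definition.

```python
import unicodedata

# Surface vowel pairs copied from the SurfaceDiphthong definition.
SURFACE_DIPHTHONGS = {
    "eä", "eâ", "iõ", "iâ", "ie", "iẹ",
    "uâ", "uõ", "uå", "uä", "ue", "uẹ",
}


def find_surface_diphthongs(word: str) -> list[tuple[int, str]]:
    """Return (index, pair) for every listed diphthong found in `word`.

    The word is NFC-normalised first so that letters such as ä or ẹ
    are single code points and two-character slices line up with letters.
    """
    word = unicodedata.normalize("NFC", word)
    pairs = {unicodedata.normalize("NFC", p) for p in SURFACE_DIPHTHONGS}
    hits = []
    for i in range(len(word) - 1):
        pair = word[i:i + 2]
        if pair in pairs:
            hits.append((i, pair))
    return hits
```

For example, `find_surface_diphthongs("siõrrâd")` reports the pair `iõ` at index 1, while a word such as `mõõnnâd`, which has a long vowel rather than a diphthong, yields an empty list.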

Rules

Vowel shortening rules

Vowel shortening â:0 - used in

čââʹđ+N+Sg+Ill: heart/sydän

-â ǩeʹtted+V+Ind+Prs+ConNeg: cook/keittää

Vowel shortening ẹ:0 - used in +Ind+Prs+Pl3, +Ind+Prt+Pl3 teevvad+V+Prt+4:

Vowel shortening e:0 - used in +Ind+Prs+Pl3, +Ind+Prt+Pl3

eʹǩrded+V+Inf

Ââvel+N+Prop+Sg+Loc Ivalo

Jouste

Vowel shortening å:0 - used in +Ind+Prs+Pl3, +Ind+Prt+Pl3 jååʹtted+V+Ind+Prt+Pl3 trekk

sååbbar+N+Sg+Nom meeting

Vowel shortening õ:0 - used in +Ind+Prs+Pl3, +Ind+Prt+Pl3

kõõnjâl+N+Sg+Gen tear

Vowel shortening u:0 - used in +Ind+Prs+Pl3, +Ind+Prt+Pl3

mainstummuš+N+Sg+Ill: story telling/tarinointi

juurd+N+Ess thought

Oulu

Vowel shortening i:0 - used in +Ind+Prs+Pl3, +Ind+Prt+Pl3 viikkâd+V+Ind+Prs+Pl3

prääʹzniǩ+N+Sg+Ill: celebration/juhla

+Sg+Ill N_HÕʹPPI

Vowel shortening o:0 - used in +Ind+Prs+Pl3, +Ind+Prt+Pl3 šoomm+N+Sg+Ill

ooccâd+V+Imprt+Pl3

ooumaž+N+Sg+Nom

Vowel shortening a:0 - used in +Ind+Prs+Pl3, +Ind+Prt+Pl3

Aanar+N+Prop+Sg+Ill: Inari/Enare

mättʼted+V+Inf: teach/opettaa

-a

Vowel shortening ä:0 - used in +Ind+Prs+Pl3, +Ind+Prt+Pl3

%{ʼØ%} for modifier letter apostrophe - jieʹlli+N+Ess animal/eläin

Vowel shortening y:0 - used in PX -y

Vowel shortening ö:0 - used in PX -ö

ZERO to syllable mark 0:ʼ, same as modifier letter apostrophe - used in zeeʹtt+N+Sg+Abe: zeeʹttʼtää mättʼted+V+Inf: teach/opettaa

Vowel alternation rules

VOWEL SHIFT

RAISING

deriving +Ind+Prt+Sg1, +Ind+Prt+Sg2 +Ind+Prt+Pl3 in teevvad:tivvu

Vowel raising o:u - Adding +Prt+Pl3 for ed verbs, Removing u: and i: second element due to njeiddad njeiddu

Vowel raising %{õu%}:u - Adding tõlvvad+V+Pass+PrfPrc:

Vowel raising å:o - Adding

Vowel raising e:i - Adding +Prt+Pl3 for ed verbs, Removing u: and i: second element due to njeiddad njeiddu

reâugg+N+Sg+Ill

Vowel raising ẹ:i - used in

Vowel raising â:õ - Adding +Prt+Pl3 for ed verbs, Removing u: and i: second element due to njeiddad njeiddu mââʹnn+N+Pl+Acc: egg/muna

Diphthong raising beginning with u ä:å - kuärŋŋad+V+Ind+Prt+Pl3

Diphthong raising beginning with e:i ä:â - reäkkad+V+Ind+Prt+Pl3

Diphthong raising beginning with e:i ä:e - used in

Diphthong raising beginning with e:i ä:ẹ - used in

diphthong backing beginning with u ä:õ - used in

Even syllabic verbs I, diphthong raising beginning with ä:a u - deriving läullad > laullum

Even syllabic verbs I, vowel lowering o:å - deriving +Ind+Prs+Sg3, +Ind+Prs+Pl3 in poorrâd poorrâd+V+Ind+Prs+Pl3 eat/syödä

Even syllabic verbs I, vowel lowering i:e - viǯǯâd+V+Ind+Prs+Pl3 fetch/noutaa

siõrrâd+V+Ind+Prs+Pl3: play/leikkiä

cieʹǩǩes+N+Sg+Gen: trick, type of ear mark/tikki, pykälä

Even syllabic verbs I, vowel lowering i:ẹ - deriving +Ind+Prs+Sg3 in viǯǯâd viǯǯâd+V+Imprt+Sg2 fetch/noutaa

Vowel lowering a:ä -

Even syllabic verbs I, vowel lowering u:o - deriving +Ind+Prs+Sg3, +Ind+Prs+Pl3 in uʹvdded

uʹvdded+V+Cond+Pl3 to give

kuullâd+V+Ind+Prs+Pl3 hear/kuulla

stuuʹl+N+Sg+Ill: chair/tuoli

puuʹttes+A+Sg+Gen: bright/kirkas

suukkâd+V+Imprt+Sg3: row/soutaa

Even syllabic verbs I, vowel lowering u:õ - deriving

Even syllabic verbs I, vowel lowering õ:â - deriving +Ind+Prs+Sg3, +Ind+Prs+Pl3 in viǯǯâd riõkkâd+V+Ind+Prs+Sg3 to whip

vââid+N+Sg+Nom:

Diphthongs

Even syllabic verbs I, diphthong opening after u å:ä - deriving +Ind+Prs+Sg3, in kuåccâd kuäʹcce = a>ä lowering clockwise

Vowel in second syllable e:a - deriving cieʹǩǩes+N+Sg+Gen: ceäkˈkaz cieʹǩǩes+N+Sg+Gen: trick, type of ear mark/tikki, pykälä

Even syllabic verbs I, diphthong opening i:e â:ä for â:ä - deriving +Ind+Prs+Sg3, in čiõkkâd

Even syllabic verbs I, diphthong opening after i:e e:â - deriving +Ind+Prs+Sg3, in pi%{EÂ%}ʹǩǩ:peâkka

Vowel backing

u å:õ - used in = a>ä lowering clockwise

Vowel Palatalization

diphthong allophonic realization in palatalization u å:e - deriving e from å

diphthong allophonic realization in palatalization u å:ẹ - deriving

Vowel Lowering and Fronting

Even syllabic verbs I, diphthong opening õ:ä after i:e - deriving +Ind+Prs+Sg3, in čiõkkâd siõrrâd+V+Ind+Prs+Pl3: play/leikkiä

SECONDARY FRONTING

Even syllabic verbs I, secondary vowel fronting with PAL u õ:e - deriving +Ind+Prs+Pl3 in VIQQAD: kuõskkâd >kueʹsǩǩe

Even syllabic verbs I, secondary u > v

RELATIVE VOWEL LENGTHENING

vowel lengthening and consonant shortening, %^Pen: %^V2VV and %^CShort

Even syllabic verbs I, relative vowel lengthening %^1VOW:â - deriving +V+Inf in TIETTED: uudd > uʹvdded šõddâd+V+Imprt+Sg3:

nââʹer+N+Sg+Nom sleep

radio+N+Sg+Ill

%^1VOW:ẹ relative vowel lengthening - sẹẹr+N+Pl+Nom:

Even syllabic verbs I, relative vowel lengthening %^1VOW:e - deriving +V+Inf in TIETTED: uudd > uʹvdded

Jouste+N+Prop+Sg+Ill

Even syllabic verbs I, relative vowel lengthening %^1VOW:å - deriving sååbbar+N+Sg+Nom

Even syllabic verbs I, relative vowel lengthening %^1VOW:õ - deriving +V+Inf in TIETTED: uudd > uʹvdded âʹlǧǧ+N+Pl+Gen: boy/poika

relative vowel lengthening %{õuØ%}:õ

radio+N+Sg+Ill

Even syllabic verbs I, relative vowel lengthening %^1VOW:ä - deriving +V+Inf in TIETTED: uudd > uʹvdded

Määttä+N+Prop+Sg+Ill

Even syllabic verbs I, relative vowel lengthening %^1VOW:a - deriving +V+Inf in MAINSTED: maainstam, mainstam

mäʹhssed+V+Ind+Prt+Sg3: pay/maksaa

taalkâs+N+Sg+Nom

biologia+N+Sg+Ill

Relative vowel lengthening %^1VOW:o - simultaneous lengthening and lowering: juʹrdded > joordam juʹrdded+V+Ind+Prs+Sg1

Even syllabic nouns I, relative vowel lengthening i - ǩiđđ:ǩiiđ

leuʹdd+N+Pl+Gen

Terhi+N+Prop+Sg+Ill

Even syllabic nouns, relative vowel extra lengthening u not followed by v - declension of nouns kunn > kuun mainstummuš+N+Err/Orth+Sg+Gen: story telling/tarinointi

tuuibâl+N+Sg+Nom:

vuʹvdd+N+Err/Orth+Sg+Gen: area/alue

Word-final vowel ö - Enontekiö+N+Prop+Sg+Ill

simultaneous lengthening and raising, hmm: xfst ordering might be easier

VOWEL DUMMY LOSS

SEMI VOWELS

Even syllabic nouns, for j>i - sijdd > siid This will need a special extra-lengthening rule

Even syllabic verbs, for v>u - uvdd > ouʹdde

Even syllabic nouns, for h>u - luhtt s s: … uhss+N+Der/Dim+N+Sg+Gen door

piiutâs+N+Sg+Nom clothing/vaate

Even syllabic nouns, for h>i - kueʹhtt kueiʹt+Num+Sg+Gen two/kaksi

trisyllabic verbs and doer derivations, i>j - used in

peigg+N+Sg+Ill

VOWELS TENSE vs LAX 2012-11-28

Vowels for â:i - ǩeʹtted+V+Ind+Prs+ConNeg: cook/keittää

Vowels for â:e - miârr+N+Sg+Ill

pieʹhssed+V+Inf:

Even-syllabic nouns, for â:ẹ - used in pieʹll+N+Sg+Nom: half

Palatalization for ẹ:e - used in reʹhtt+N+Pl+Nom

VOWEL and ZERO ALTERNATION

Realization for â in a - used in

THE NON-ORTHOGRAPHIC SYLLABLE

Loss of ʼ when preceded by vowel - This is a temporary solution to “ʼ” in võʹllʼjed, it deletes softmark when preceded by vowel

%{A1%}:ʼ when subsequent syllable has vowel v - This is a temporary solution to “ʼ” in võʹllʼjed 2013-08-29

PALATALIZATION

%{ʹØ%}:ʹ as transfer from left of v:u and all instances of modifier letter prime - used +Ind+Prs+Pl3, uʹvdded+V+Ind+Prs+Pl3: ouʹdde

uʹvdded+V+Ind+Prs+Pl3: give/antaa

täʹhtt+N+Pl+Nom: bone/luu

čẹẹuʹres+N+Sg+Nom = otter

huʹvǧǧi+N+Sg+Nom: rattle/suhistin

d:đ in weak grade - used in

Even-syllabic verbs I, Palatalization of g:ǧ - used in reäiʹǧǧ+N+Sg+Nom: hole/reikä

bioloog+N+Sg+Ill biologist

huʹvǧǧi+N+Sg+Nom: rattle/suhistin

ääʹǧǧes+N+Sg+Nom:

Even-syllabic verbs I, Palatalization of k:ǩ here - used in

hääʹsǩ Perhaps the stem should simply be häskk

cieʹǩǩes+N+Sg+Ill: trick, type of ear mark/tikki, pykälä

mieʹlǩǩ+N+Sg+Acc: milk/maito

rääʹǩǩes+A+Sg+Nom beloved/rakas

tõiŋsǩed+V+Inf

Even-syllabic nouns I, Depalatalization of ǩ:k - used in

prääʹzniǩ+N+Sg+Ill: celebration/juhla

Even-syllabic nouns, removing palatalization in -est +Loc nouns - removing palatalization in +Sg+Ill, pieʹss:peässa

Consonant QUANTITY CHANGE gradation rules

Weakening Consonant Cluster

dealing with relative length changes mõõnnâd : mõʹnne : mõõn

Even syllabic verbs I, cg m:0 - used with +Imp+Sg2, +Ind+Prs+ConNeg, oolmaž+N+Sg+Nom: person/henkilö

Even syllabic verbs I, cg for b - used neiʹbb+N+Sg+Gen: knife/veitsi

Even syllabic verbs I, second consonant loss p:0 - used in

Even syllabic verbs I, cg v:0 - used with +Imp+Sg2, +Ind+Prs+ConNeg, teevvad > teev

f:0 - used in

kaaʹff+N+Sg+Gen coffee

Even syllabic verbs I, cg n:0 - used with +Imp+Sg2, +Ind+Prs+ConNeg, jiõnn:jiõn vueʹn+N+Sg+Nom: mother-in-law/anoppi

Even syllabic nouns I, with extra lengthening of vowel ij>ii/uv>uu and dd>d - used with +N+Sg+Nom > +N+Sg+Gen, sijdd > siid

uʹvdded+V+Ind+Prs+4:

Even syllabic verbs I, cg for ʒ - used with pääʹʒʒelm+N+Sg+Ill: päʹʒlma sauʒʒ+N+Pl+Nom sheep/lammas

ǯ:0 - used in kuʹvǯǯ+N+Sg+Gen

č:0 - used in

c:0 - used in

ž:0 - used in

väžsted+V+Inf

z:0 - used in

Even syllabic verbs I, cg for đ - used with ǩiđđ:ǩiiđ

Even syllabic verbs I, cg for r - used with võrr:võõr

Even syllabic verbs I, cg for l - used with vuʹvll+N+Sg+Gen: vuuʹl deriving

pääʹljes+N+Sg+Nom: path/polku

vuʹvll+N+Sg+Gen:

deriving kueʹll+N+Sg+Gen: kueʹl

j:0 - used in

ǩeʹrjj+N+Pl+Nom: book/kirja

Even syllabic verbs I, cg for g - used in cõõggâlm+N+Sg+Ill

äiʹǧǧ+N+Sg+Gen: time/aika

Allegro loss ǧ - used in

GEMINATE TO WEAK QUALITY GEMINATE

Even syllabic nouns I, with extra lengthening of vowel V>VV and KK>ǤǤ - used with +N+Sg+Nom > +N+Sg+Gen for cases like lookki > looǥǥi.

čâustõk+N+Sg+Gen

čuâǥǥas+N+Sg+Nom road

loǥškueʹtted begin to read

čõõǥǥâs

Even syllabic nouns I, with extra lengthening of vowel V>VV and k:j - used with

tuʹmstõk+N+Der/Dimin+N+Pl+Nom: decision/päätös, mietintö

  • tuʹmstõ^1VOW0k{XC}^V2VV^PAL^K2GG>e
  • tuʹmstõõʹjj000>e
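The two strings above are the lexical (upper) and surface (lower) sides of one two-level derivation, and they line up symbol by symbol once the multi-character symbols are treated as single units. The following Python sketch makes that alignment explicit. It is only an illustration: the `MULTICHAR` list holds just the symbols visible in this example (a real run would need the full inventory declared in the lexc and twolc files), and the function names are hypothetical.

```python
import re
import unicodedata

# Multi-character symbols seen in this one example only (assumption:
# the full inventory lives in the Multichar_Symbols / Alphabet sections).
MULTICHAR = ["^1VOW", "^V2VV", "^PAL", "^K2GG", "{XC}"]

# Longest symbols first, then fall back to any single character.
_TOKEN = re.compile(
    "|".join(re.escape(s) for s in sorted(MULTICHAR, key=len, reverse=True))
    + "|."
)


def tokenize(upper: str) -> list[str]:
    """Split a lexical-side string into symbols."""
    return _TOKEN.findall(unicodedata.normalize("NFC", upper))


def align(upper: str, lower: str) -> list[tuple[str, str]]:
    """Pair each lexical symbol with the surface character below it."""
    up = tokenize(upper)
    lo = list(unicodedata.normalize("NFC", lower))
    if len(up) != len(lo):
        raise ValueError(f"length mismatch: {len(up)} vs {len(lo)}")
    return list(zip(up, lo))


def changes(upper: str, lower: str) -> list[tuple[str, str]]:
    """Keep only the pairs where lexical and surface symbols differ."""
    return [(u, l) for u, l in align(upper, lower) if u != l]
```

Applied to the pair above, `changes("tuʹmstõ^1VOW0k{XC}^V2VV^PAL^K2GG>e", "tuʹmstõõʹjj000>e")` leaves seven non-identity pairs: `^1VOW:õ`, `0:ʹ`, `k:j`, `{XC}:j`, and the three triggers `^V2VV`, `^PAL` and `^K2GG` realised as `0`.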

with allegro

Even syllabic nouns I, with extra lengthening of vowel V>VV and ǩ:j - used in

š:0 - used in Allegro

šapšš+N+Sg+Gen white fish/siika

Even syllabic verbs I, Voicing š:ž - ss:zz, +Imp+Sg2, +Ind+Prs+ConNeg, double consonants at coda become voiced in gradation lookkmõš+N+Sg+Gen

mainstummuš+N+Sg+Gen:

Even syllabic verbs I, Voicing c:ʒ - cc:zz, +Imp+Sg2, +Ind+Prs+ConNeg, double consonants at coda become voiced in gradation

õõʒʒâs+N+Sg+Nom: high water/vuoksi

Even syllabic verbs I, Voicing č:j - ss:zz, +Imp+Sg2, +Ind+Prs+ConNeg, double consonants at coda become voiced in gradation

Even syllabic verbs I, Voicing after long vowel or diphthong s:z s:z - ss:zz, +Imp+Sg2, +Ind+Prs+ConNeg, double consonants at coda become voiced in gradation tääʹss+N+Sg+Gen: level/taso

cieʹǩǩes+N+Sg+Gen: trick, type of ear mark/tikki, pykälä

čårrõs+N+Sg+Gen

Even syllabic verbs I, second consonant loss t:0 - used in autt+N+Pl+Nom car/auto

Consonant loss s:0 - used in

ǩeeʹstes+N+Pl+Nom: glove/kinnas

uhss+N+Sg+Gen door

ǩeäsʼsted+V+Inf:

Consonant loss ŋ:0 - used in

Consonant loss h:0 - used in ruʹhss+N+Sg+Loc+PxSg3:

Even syllabic verbs I, t>đ - tt:đ ǩiõtt+N+Sg+Loc+PxSg1 hand,arm/käsi

Even syllabic nouns I, p>v - pp:v

Consonant quality change ǥ:j - used in

Even syllabic verbs I, second consonant loss k:0 - used in loǥškueʹtted begin to read

Even syllabic verbs I, second consonant loss ǩ:0 - used in

eʹǩrded+V+Inf

Even syllabic verbs I, second consonant loss ǥ:0 - used in

påǥsted+V+Inf

Even syllabic verbs I, lgg>lǥ vueʹlǧǧed+V+Ind+Prs+Sg2

even syllabic verbs I, ‘lgg>’lj - used in vueʹlǧǧed+V+Ind+Prt+Pl1

Rules for cleaning up and composing end result

Orthographic Consonant lengthening Weak to strong %{XC%}:Cx - used in

Orthographic Consonant lengthening Weak to strong %{XC%}:Cx for n and l - used in

suâl+N+Sg+Nom island

kååvas+N+Sg+Nom: kota

kõõnjâl+N+Sg+Nom tear

suâl+N+Nom island/saari

ǩeâlǥal+N+Sg+Nom kilkura

čårrõs+N+Sg+Gen

lookkmõš+N+Sg+Gen

čâustõk+N+Sg+Gen

čâustõk+N+Sg+Gen

CONSONANT QUALITY CHANGE

Pedagogical X3 length mark after diphthongs in vertical line ˈ

Adding X3 length mark

Consonant X3 lengthening after diphthong in vertical line ˈ character - used in +N+Sg+Ill jeäll:jiâlˈlu, b c č ǯ d đ g ǧ k ǩ l m n p r s š t v also htˈt nˈnj

Diphthong extra short marker in vertical line ˈ character - used in +N+Sg+Ill ciâlkâlm:ciâˈlklmest ciâlkâlm+N+Pl+Gen:

Removing X3 length mark

Removing Consonant X3 length mark after diphthong in vertical line ˈ LEFT ARROW - deriving b c č ǯ d đ g ǧ k ǩ l m n p r s š t v also htˈt nˈnj

Removing Consonant X3 length mark after diphthong in vertical line ˈ LEFT ARROW - $ Sakssa-jânnam

Hyphen for splitting between look-alikes - used in Kääzzkõsraajõstuâjj-joouk

Sakssajânnam+N+Prop+Sg+Nom: (∑) Germany/Saksa

koummlo-õhtt+Num+Sg+Nom: (∑) 21

tuâjj+N+Cmp/SgNom+Cmp#joukk+N+Sg+Nom: team/työryhmä

sääʹmm+N+Cmp/SgGen+Cmp#musikk+N+Sg+Nom: Skolt Sámi music/kolttamusiikki

  • sää0mm^PAL^CC2C{-Ø}#musikk
  • sääʹm000-#musikk


This (part of) documentation was generated from src/fst/morphology/phonology.twolc


src-fst-morphology-root.lexc.md

Skolt Sámi morphological analyser

This file contains all definitions of symbols written by more than one character, and it contains the initial Root lexicon.

Definitions for Multichar_Symbols

Grammatical tags

Tags for POS

Pre-derivational POS tags for CG processing

Tags for sub-POS

Types of adverbs

Number

Case

symbols ?

Possessive suffix

Adjective declension

Verb forms Veʹrbbååʹbleʹǩ

Valence

Person-number

Homonymy

Derivation

All non-positional derivations should be preceded by this tag, to make it possible to target regular expressions at all derivations in a language-independent way: just specify +Der|+Der1 .. +Der5 and you are set.
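As a rough illustration of targeting all derivations at once, the Python sketch below pulls every tag that begins with `+Der`, `+Der1` ... `+Der5` out of an analysis string. It assumes the `+Der/Xxx` tag shape seen in examples elsewhere in this documentation (e.g. `+Der/Dim`); the function name is hypothetical and the lemma is assumed to contain no `+`.

```python
import re

# "Der", optionally a position digit 1-5, optionally "/" plus a name.
_DER = re.compile(r"Der[1-5]?(/.+)?")


def derivation_tags(analysis: str) -> list[str]:
    """Return all derivation tags in an analysis such as lemma+N+Der/Dim+N+Sg+Gen."""
    # The first field is the lemma, the rest are tags.
    tags = analysis.split("+")[1:]
    return ["+" + t for t in tags if _DER.fullmatch(t)]
```

With the example used later for the h>u rule, `derivation_tags("uhss+N+Der/Dim+N+Sg+Gen")` gives `["+Der/Dim"]`, and an analysis without derivation gives an empty list.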

Verb derivation

Tags for originating language

The following tags are used to guide conversion to IPA: loan words and foreign names are usually pronounced (approximately) as in the originating (majority) language. Instead of trying to identify the correct pronunciation based on phonotactics (orthotactics, actually), we tag all words that can’t be correctly transcribed using the SME transcriber with source language codes. Once tagged, the lexical transducer can be split into smaller ones according to language, and a different IPA conversion applied to each of them.

The principle of tagging is that we only tag to the extent needed, and following a priority:

  1. any untagged word is pronounced with SME orthographic conventions
  2. NNO and NOB have identical pronunciation, NNO is only used if different in spelling from NOB
  3. SWE has mostly the same pronunciation as NOB, and is only used if different in spelling from NOB
  4. Occasionally even SME (the default) may be tagged, to block other languages from being specified, mainly during semi-automatic language tagging sessions

All in all, we want to get as much correctly transcribed to IPA with as little work as possible. On the other hand, if more words are tagged than strictly needed, this should pose no problem as long as the IPA conversion is correct - at least some words will get the same pronunciation whether read as SME or NOB/NNO/SWE.
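The splitting step described above can be pictured with a small Python sketch that groups analyses by source language, treating anything untagged as SME. The tag spelling `+OLang/XXX` is an assumption made for the example (the tag list itself is not reproduced in this documentation), and the function names are hypothetical.

```python
import re
from collections import defaultdict

# Assumed tag shape for the originating-language tags.
_OLANG = re.compile(r"\+OLang/([A-Z]{3})")


def source_language(analysis: str, default: str = "SME") -> str:
    """Return the language code of a tagged analysis, or the default."""
    m = _OLANG.search(analysis)
    return m.group(1) if m else default


def split_by_language(analyses: list[str]) -> dict[str, list[str]]:
    """Group analyses so each group can get its own IPA conversion."""
    groups: dict[str, list[str]] = defaultdict(list)
    for a in analyses:
        groups[source_language(a)].append(a)
    return dict(groups)
```

Because untagged words fall through to the default, only the words that the SME conventions would transcribe incorrectly need a tag, which matches the "tag only to the extent needed" principle.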

Government tags

Semantic tags

Multiple Semantic tags:

Clitic

Tags distinguishing different versions of the same lemma (before POS)

In the xml, the varid attribute is used in the st element with a mere numeric value, and an extra lemma attribute is inserted in the st element, e.g. lemma=”tõlvvad”.

Other tags

Punctuation

Letters

Skolt Saami letters

These definitions are probably not needed

Archiphonemes

These are for letters with special behaviour. Say that all m-s change to n in a given context, but not this m, because it is m2. In twolc these are then defined m2:m, etc, i.e. the m2 is an m, although it is a different m.
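The m/m2 idea can be sketched in a few lines of Python. The context used here (word-final m becomes n) is purely hypothetical and chosen only to make the contrast visible; it is not a rule of the Skolt Sami grammar.

```python
def realise(symbols: list[str]) -> str:
    """Map lexical symbols to a surface string.

    Hypothetical rule: plain "m" becomes "n" word-finally.
    The archiphoneme "m2" is exempt and always surfaces as "m" (m2:m).
    """
    out = []
    for i, sym in enumerate(symbols):
        is_last = i == len(symbols) - 1
        if sym == "m2":
            out.append("m")      # m2:m, never affected by the rule
        elif sym == "m" and is_last:
            out.append("n")      # the ordinary m undergoes the change
        else:
            out.append(sym)
    return "".join(out)
```

Here `realise(["a", "m"])` gives `an`, whereas `realise(["a", "m2"])` gives `am`: the two symbols are the same letter on the surface but behave differently in the rules.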

Diacritic marks

These symbols govern the way the morphophonological rules treat the affix string.

This project started out using arbitrary names, X1, X2…, but since they were hard to remember, we changed to (a bit) more transparent names (^DIADEL, …). On the TODO-list: Change all X1, X2, … to easy-to-remember names.

Special iterations

Consonant lengthening

Vowel length and height

for vowel height, by default vowels are low.

CHARACTERISTIC BREAKDOWN 2015-02-17

Gradation triggers 2015.01.23

Gradation triggers 2015.02.09 For Consonant Clusters

Diacritic with mnemonic names

Hyphen at compound word boundary

Escaped symbols

Symbols that need to be escaped on the lower side (towards twolc):

The usage extents are marked using the following tags:

Dialect tags:

Compounding

Flag diacritics

We have manually optimised the structure of our lexicon using the following flag diacritics to restrict morphological combinatorics - only allow compounds with verbs if the verb is further derived into a noun again:

| Flag | Explanation |
| --- | --- |
| @P.NeedNoun.ON@ | (Dis)allow compounds with verbs unless nominalised |
| @D.NeedNoun.ON@ | (Dis)allow compounds with verbs unless nominalised |
| @C.NeedNoun@ | (Dis)allow compounds with verbs unless nominalised |
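To show how P, D and C flags interact, here is a small Python sketch of the usual flag-diacritic semantics as documented for xfst and HFST: P sets a feature, D rejects the path if the feature has the given value, C clears it, R requires it, and U unifies. It is a simplified simulation under those assumptions, not the project's implementation, and where exactly the NeedNoun flags are placed in the lexc source is not shown here.

```python
import re

# @OP.FEATURE.VALUE@ or @OP.FEATURE@
_FLAG = re.compile(r"@([PDRCU])\.([^.@]+)(?:\.([^.@]+))?@")


def path_is_valid(symbols: list[str]) -> bool:
    """Walk a path left to right and apply flag diacritics.

    Non-flag symbols are ignored. Returns False as soon as a
    D, R or U check fails.
    """
    state: dict[str, str] = {}
    for sym in symbols:
        m = _FLAG.fullmatch(sym)
        if not m:
            continue
        op, feat, val = m.groups()
        cur = state.get(feat)
        if op == "P":                      # positive set
            state[feat] = val
        elif op == "C":                    # clear
            state.pop(feat, None)
        elif op == "D":                    # disallow
            if (val is None and cur is not None) or (val is not None and cur == val):
                return False
        elif op == "R":                    # require
            if (val is None and cur is None) or (val is not None and cur != val):
                return False
        elif op == "U":                    # unify
            if cur is None:
                state[feat] = val
            elif cur != val:
                return False
    return True
```

Under these semantics a path that sets `@P.NeedNoun.ON@` and then reaches `@D.NeedNoun.ON@` is rejected, unless a `@C.NeedNoun@` (for instance on a nominalising derivation) has cleared the feature in between.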

For languages that allow compounding, the following flag diacritics are needed to control position-based compounding restrictions for nominals. Their use is handled automatically if combined with +CmpN/xxx tags. If not used, they will do no harm.

| Flag | Explanation |
| --- | --- |
| @P.CmpFrst.FALSE@ | Require that words tagged as such only appear first |
| @D.CmpPref.TRUE@ | Block such words from entering ENDLEX |
| @P.CmpPref.FALSE@ | Block these words from making further compounds |
| @D.CmpLast.TRUE@ | Block such words from entering R |
| @D.CmpNone.TRUE@ | Combines with the next tag to prohibit compounding |
| @U.CmpNone.FALSE@ | Combines with the prev tag to prohibit compounding |
| @P.CmpOnly.TRUE@ | Sets a flag to indicate that the word has passed R |
| @D.CmpOnly.FALSE@ | Disallow words coming directly from root. |

Use the following flag diacritics to control downcasing of derived proper nouns (e.g. Finnish Pariisi -> pariisilainen). See e.g. North Sámi for how to use these flags. There exists a ready-made regex that will do the actual down-casing given the proper use of these flags.

| Flag | Explanation |
| --- | --- |
| @U.Cap.Obl@ | Allowing downcasing of derived names: deatnulasj. |
| @U.Cap.Opt@ | Allowing downcasing of derived names: deatnulasj. |
| @C.ErrOrth@ | tbw |
| @R.ErrOrth.ON@ | tbw |
| @D.ErrOrth.ON@ | tbw |
| @P.ErrOrth.ON@ | tbw |
| @P.Pmatch.Backtrack@ | tbw |

Basic lexica, pointing to the other lexicon files

INCOMING lemma:stem Contlex sets to be distinguished from glossing in progress

NounRoot

VerbRoot

INTERJ_ Interjections

CONJUNCTIONS INTERJ_

CS_ Subjunction

CS-TEMP_ when

NUM_ NUM_VAHTT

NUM_ALGG NUM_AUTT NUM_TOLL NUM_PAPP NUM_AELDD NUM_KUEQLL NUM_TAQHTT NUM_KAEAEUQC

NUM_AANAR

NUM_ATOM NUM_JEAQNNN

PCLE_ is here since Pcle_sms2x.xml wants it. It does nothing.

PCLE-NEG_ is here since Pcle_sms2x.xml wants it. It adds +Neg.

Postpositions with government tagging possible

ADP_

PO_tag is the lexicon adding the tag +Po

PO-ILL_

PO-LOC_

PO_ is a dummy lexicon not adding anything

ADP-GOV-LOC_

PO-GOV-GEN_

Prepositions with government tagging possible

PR_tag is the lexicon adding the tag +Pr

PR_ is a dummy lexicon not adding anything

PR-TEMP-GOV-LOC_

PREFIX/A_

SUF/A_


This (part of) documentation was generated from src/fst/morphology/root.lexc


src-fst-morphology-stems-abbreviations.lexc.md

File containing abbreviations

Lexica for adding tags and periods

Splitting in 4 + 1 groups, because of the preprocessor

The sublexica

Dividing between abbreviations with and without final period

ABBREVIATIONS these still need development 2015-09-11

The lexicons that add tags

+Adv+ABBR: RHyph ;

+ABBR:%.%> # ;

The abbreviation lexicon itself

This class contains homonyms, which are both intransitive abbreviations and normal words. The abbreviation usage is less common, and thus only the occurrences in the middle of the sentence (when the next word starts with a lowercase letter) can be considered true cases.

For abbrs for which numerals are complements, but other words not necessarily are. This group treats Arabic numerals as if it were transitive but letters as if it were intransitive.

This lexicon is for abbrs that always have a constituent following them.

This class contains homonyms, which are both abbrs for which numerals are complements and normal words. The abbreviation usage is less common, and thus only the occurrences in the middle of the sentence can be considered true cases.
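The mid-sentence criterion for this homonym class can be sketched as a simple check on the following token. This Python fragment is only an illustration of the stated heuristic; the actual decision is made by the preprocessor, whose logic is not reproduced here, and the function name and the placeholder tokens are hypothetical.

```python
def is_true_abbreviation(tokens: list[str], i: int) -> bool:
    """Heuristic for an ambiguous token at position i (e.g. "xx.").

    It counts as an abbreviation only when a following word exists
    and starts with a lowercase letter, i.e. the sentence continues.
    """
    if i + 1 >= len(tokens):
        return False             # sentence-final: read as a normal word
    nxt = tokens[i + 1]
    return bool(nxt) and nxt[0].islower()
```

So with placeholder tokens, `["xx.", "word"]` is read as an abbreviation followed by more of the sentence, while `["xx.", "Word"]` is read as an ordinary word ending a sentence.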


This (part of) documentation was generated from src/fst/morphology/stems/abbreviations.lexc


src-fst-morphology-stems-adjectives_newwords.lexc.md

This is where new words are added as lexc entries before they are added to the xml source files. slooman:slooman A_AANAR ;

ADD ADJECTIVES BELOW

Not added yet to wiki

2017-09-


This (part of) documentation was generated from src/fst/morphology/stems/adjectives_newwords.lexc


src-fst-morphology-stems-adpositions_newwords.lexc.md

This is where new words are added as lexc entries before they are added to the xml source files. Mättʼtõshalltõs:Mättʼtõshalltõs PROP_SAJOS “(eng) /(fin) /(rus) “ ;

ADD POSTPOSITIONS AND PREPOSITIONS BELOW

CODED BY EINO AND JASKA

POSTPOSITIONS

PREPOSITIONS


This (part of) documentation was generated from src/fst/morphology/stems/adpositions_newwords.lexc


src-fst-morphology-stems-adverbs_newwords.lexc.md

This is where new words are added as lexc entries before they are added to the xml source files. Mättʼtõshalltõs:Mättʼtõshalltõs PROP_SAJOS “(eng) /(fin) /(rus) “ ;

ADD ADVERBS BELOW

CODED BY EINO AND JASKA

perintökieli


This (part of) documentation was generated from src/fst/morphology/stems/adverbs_newwords.lexc


src-fst-morphology-stems-conjunctions_newwords.lexc.md

This is where new words are added as lexc entries before they are added to the xml source files. Mättʼtõshalltõs:Mättʼtõshalltõs PROP_SAJOS “(eng) /(fin) /(rus) “ ;

ADD PROPER ADVERBS BELOW


This (part of) documentation was generated from src/fst/morphology/stems/conjunctions_newwords.lexc


src-fst-morphology-stems-exceptions.lexc.md

Exceptions are quite strange word-forms, the ones that do not fit anywhere else. This file contains all enumerated word forms that cannot reasonably be created from lexical data by regular inflection. Usually there should be next to no exceptions: it’s always better to have a paradigm that covers only one or a few words than an exception, since exceptions will not work nicely with e.g. the compounding scheme or possibly many end applications.

IRREGULAR ADJECTIVES

IRREGULAR DETERMINERS

IRREGULAR NOUNS

Some verbs have variant forms:

The verb of negation

PREFIXES for nouns

Spelling errors

Foreign words

FROM FORMER .lexc CONTENT

Skolt Saami adjectives

OUR LONG-TERM GOAL IS NOT TO ADD STEMS MANUALLY TO THIS FILE. Instead we want to update the dictionary sms2X/src/a_sms2X.xml, from where the present lexc files will be regularly updated by exporting (we need a script for this).

Skolt Saami adpositions

Skolt Saami adverbs

Skolt Saami Conjunctions

The lexicon Conjunction lists the conjunction

Skolt Saami Interjections

The lexicon ij gives the tag +Interj

The lexicon Interjection lists the interjections

Skolt Saami Particles

List of particles in the lexicon Particle

ges+Pcle:ges PCLE_ ;

Propernoun lexicon, Skolt Sámi specific names

Subjunctions

The lexicon Subjunction lists the subjunctions

Verb roots

Here are the verb types so far:

TEST WORDS BEYOND THIS POINT

DO NOT ADD TRANSLATIONS DO NOT ADD NOTES

Skolt Saami Numerals

Lexicon Subjunction contains okta only. CODED BY JACK

BUT have most of their Contlex values THIS has a separate DB DON’T TRANSLATE


This (part of) documentation was generated from src/fst/morphology/stems/exceptions.lexc


src-fst-morphology-stems-nouns_newwords.lexc.md

This is where new words are added as lexc entries before they are added to the xml source files. ǩiõtt+N:ǩiõtt N_MUORR “(eng) /(fin) /(rus)” ;

ADD NOUNS BELOW Glossing 2015-12-02

Glossing

Newer words Contlex value missing


This (part of) documentation was generated from src/fst/morphology/stems/nouns_newwords.lexc


src-fst-morphology-stems-numerals.lexc.md

Skolt Saami Numerals


This (part of) documentation was generated from src/fst/morphology/stems/numerals.lexc


src-fst-morphology-stems-particles_newwords.lexc.md

This is where new words are added as lexc entries before they are added to the xml source files. Mättʼtõshalltõs:Mättʼtõshalltõs PROP_SAJOS “(eng) /(fin) /(rus) “ ;

ADD PARTICLES BELOW

CODED BY EINO AND JASKA

Lemmas:stems undesignated 2015-03-06 These have been commented out 2015-11-13


This (part of) documentation was generated from src/fst/morphology/stems/particles_newwords.lexc


src-fst-morphology-stems-pronouns_newwords.lexc.md

This is where new words are added as lexc entries before they are added to the xml source files. Mättʼtõshalltõs:Mättʼtõshalltõs PROP_SAJOS “(eng) /(fin) /(rus) “ ;

ADD PRONOUNS BELOW

CODED BY EINO AND JASKA


This (part of) documentation was generated from src/fst/morphology/stems/pronouns_newwords.lexc


src-fst-morphology-stems-propernouns_newwords.lexc.md

This is where new words are added as lexc entries before they are added to the xml source files.

Mättʼtõshalltõs:Mättʼtõshalltõs PROP_SAJOS ;

ADD PROPER NOUNS BELOW LACKING SPECIFIC DECLENSION TYPE

WITH SPECIFIC DECLENSION TYPE

First names

SURNAMES

ORGANIZATIONS

Perintökieli


This (part of) documentation was generated from src/fst/morphology/stems/propernouns_newwords.lexc


src-fst-morphology-stems-sms-propernouns.lexc.md

Propernoun lexicon, Skolt Sámi specific names

The lexicon ProperNoun lists the proper nouns

First part of complex names

Ordinary person names

Ordinary place names

Ordinary misc names


This (part of) documentation was generated from src/fst/morphology/stems/sms-propernouns.lexc


src-fst-morphology-stems-toponyms_newwords.lexc.md

This is where new words are added as lexc entries before they are added to the xml source files.

PLACE NAMES

MORE Toponyms


This (part of) documentation was generated from src/fst/morphology/stems/toponyms_newwords.lexc


src-fst-morphology-stems-verbs_newwords.lexc.md

This is where new words are added as lexc entries before they are added to the xml source files. The conversion is done as follows:

  1. Turn the entries into tab-separated format
  2. Run them through src/scripts/newwords-to-xml.pl
  3. Add the output to V_sms2x.xml
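The first step above can be sketched in Python. This is only an illustration: the entry shape `lemma+TAG:stem CONTLEX "gloss" ;` is taken from the example entries in the newwords files, while the function name `lexc_to_tsv` and the column order (lemma, stem, continuation lexicon, gloss) are assumptions. The column layout that `newwords-to-xml.pl` actually expects is not documented here and should be checked against the script.

```python
import re

# Hypothetical pattern for one lexc entry of the form
#   lemma+TAGS:stem CONTLEX "gloss" ;
# The gloss is optional. Escaped colons (%:) inside lemmas are not handled.
ENTRY = re.compile(
    r'^\s*(?P<upper>[^:\s]+):(?P<lower>\S+)\s+(?P<contlex>\S+)'
    r'(?:\s+"(?P<gloss>[^"]*)")?\s*;\s*$'
)


def lexc_to_tsv(line):
    """Return the entry as one tab-separated line, or None if it is not an entry."""
    match = ENTRY.match(line)
    if match is None:
        return None
    # Drop tags such as +N or +V from the upper side to get the bare lemma.
    lemma = match.group("upper").split("+", 1)[0]
    fields = [
        lemma,
        match.group("lower"),
        match.group("contlex"),
        match.group("gloss") or "",  # empty column when no gloss is given
    ]
    return "\t".join(fields)


if __name__ == "__main__":
    print(lexc_to_tsv('ǩiõtt+N:ǩiõtt N_MUORR "(eng) /(fin) /(rus)" ;'))
```

Lines that are not entries, such as `LEXICON` headers or comments, make the function return `None`, so they can be filtered out before the result is passed on to the next step.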

Lexicon V_NEWWORDS is for ad-hoc adding of new entries


This (part of) documentation was generated from src/fst/morphology/stems/verbs_newwords.lexc


src-fst-phonetics-txt2ipa.xfscript.md

| Description | X-SAMPA | IPA | Unicode (hex) | Unicode (decimal) |
|---|---|---|---|---|
| retroflex plosive, voiceless | t` | ʈ | 0288 | 648 |
| retroflex plosive, voiced | d` | ɖ | 0256 | 598 |
| labiodental nasal | F | ɱ | 0271 | 625 |
| retroflex nasal | n` | ɳ | 0273 | 627 |
| palatal nasal | J | ɲ | 0272 | 626 |
| velar nasal | N | ŋ | 014B | 331 |
| uvular nasal | N\ | ɴ | 0274 | 628 |
| bilabial trill | B\ | ʙ | 0299 | 665 |
| uvular trill | R\ | ʀ | 0280 | 640 |
| alveolar tap | 4 | ɾ | 027E | 638 |
| retroflex flap | r` | ɽ | 027D | 637 |
| bilabial fricative, voiceless | p\ | ɸ | 0278 | 632 |
| bilabial fricative, voiced | B | β | 03B2 | 946 |
| dental fricative, voiceless | T | θ | 03B8 | 952 |
| dental fricative, voiced | D | ð | 00F0 | 240 |
| postalveolar fricative, voiceless | S | ʃ | 0283 | 643 |
| postalveolar fricative, voiced | Z | ʒ | 0292 | 658 |
| retroflex fricative, voiceless | s` | ʂ | 0282 | 642 |
| retroflex fricative, voiced | z` | ʐ | 0290 | 656 |
| palatal fricative, voiceless | C | ç | 00E7 | 231 |
| palatal fricative, voiced | j\ | ʝ | 029D | 669 |
| velar fricative, voiced | G | ɣ | 0263 | 611 |
| uvular fricative, voiceless | X | χ | 03C7 | 967 |
| uvular fricative, voiced | R | ʁ | 0281 | 641 |
| pharyngeal fricative, voiceless | X\ | ħ | 0127 | 295 |
| pharyngeal fricative, voiced | ?\ | ʕ | 0295 | 661 |
| glottal fricative, voiced | h\ | ɦ | 0266 | 614 |

(` = ASCII 096)

* alveolar lateral fricative, vl.: K
* alveolar lateral fricative, vd.: K\
* labiodental approximant: P (or v\)
* alveolar approximant: r\
* retroflex approximant: r\`
* velar approximant: M\
* retroflex lateral approximant: l`
* palatal lateral approximant: L
* velar lateral approximant: L\

Clicks

* bilabial: O\ (O = capital letter)
* dental: |\
* (post)alveolar: !\
* palatoalveolar: =\
* alveolar lateral: |\|\

Ejectives, implosives

* ejective: _> (e.g. ejective p: p_>)
* implosive: _< (e.g. implosive b: b_<)

Vowels

* close back unrounded: M
* close central unrounded: 1
* close central rounded: }
* lax i: I
* lax y: Y
* lax u: U
* close-mid front rounded: 2
* close-mid central unrounded: @\
* close-mid central rounded: 8
* close-mid back unrounded: 7
* schwa: @
* open-mid front unrounded: E
* open-mid front rounded: 9
* open-mid central unrounded: 3
* open-mid central rounded: 3\
* open-mid back unrounded: V
* open-mid back rounded: O
* ash (ae digraph): {
* open schwa (turned a): 6
* open front rounded: &
* open back unrounded: A
* open back rounded: Q

Other symbols

* voiceless labial-velar fricative: W
* voiced labial-palatal approx.: H
* voiceless epiglottal fricative: H\
* voiced epiglottal fricative: <\
* epiglottal plosive: >\
* alveolo-palatal fricative, vl.: s\
* alveolo-palatal fricative, voiced: z\
* alveolar lateral flap: l\
* simultaneous S and x: x\
* tie bar: _

Suprasegmentals

* primary stress: "
* secondary stress: %
* long: :
* half-long: :\
* extra-short: _X
* linking mark: -\

Tones and word accents

* level extra high: _T
* level high: _H
* level mid: _M
* level low: _L
* level extra low: _B
* downstep: !
* upstep: ^ (caret, circumflex)
* contour, rising: _R
* contour, falling: _F
* contour, high rising: _H_T
* contour, low rising: _B_L
* contour, rising-falling: _R_F

(NB Instead of being written as diacritics with _, all prosodic marks can alternatively be placed in a separate tier, set off by < >, as recommended for the next two symbols.)

* global rise: <R>
* global fall: <F>

Diacritics

* voiceless: _0 (0 = figure), e.g. n_0
* voiced: _v
* aspirated: _h
* more rounded: _O (O = letter)
* less rounded: _c
* advanced: _+
* retracted: _-
* centralized: _"
* syllabic: = (or _=), e.g. n= (or n_=)
* non-syllabic: _^
* rhoticity: `
* breathy voiced: _t
* creaky voiced: _k
* linguolabial: _N
* labialized: _w
* palatalized: ' (or _j), e.g. t' (or t_j)
* velarized: _G
* pharyngealized: _?\
* dental: _d
* apical: _a
* laminal: _m
* nasalized: ~ (or _~), e.g. A~ (or A_~)
* nasal release: _n
* lateral release: _l
* no audible release: _}
* velarized or pharyngealized: _e
* velarized l, alternatively: 5
* raised: _r
* lowered: _o
* advanced tongue root: _A
* retracted tongue root: _q
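As an illustration of how such a symbol table can be applied, here is a minimal Python sketch that converts a string of X-SAMPA symbols to IPA by longest match. It is not how `txt2ipa.xfscript` itself is implemented. The dictionary covers only the consonants that the table lists together with their IPA characters, uses the standard X-SAMPA spellings (retroflexes written with a trailing backtick), and the names `XSAMPA_TO_IPA` and `xsampa_to_ipa` are made up for this example.

```python
# Subset of the X-SAMPA table: X-SAMPA symbol -> IPA character.
XSAMPA_TO_IPA = {
    "t`": "ʈ", "d`": "ɖ", "F": "ɱ", "n`": "ɳ", "J": "ɲ", "N": "ŋ", "N\\": "ɴ",
    "B\\": "ʙ", "R\\": "ʀ", "4": "ɾ", "r`": "ɽ", "p\\": "ɸ", "B": "β",
    "T": "θ", "D": "ð", "S": "ʃ", "Z": "ʒ", "s`": "ʂ", "z`": "ʐ",
    "C": "ç", "j\\": "ʝ", "G": "ɣ", "X": "χ", "R": "ʁ", "X\\": "ħ",
    "?\\": "ʕ", "h\\": "ɦ",
}

# Longer symbols first, so that "N\" is tried before "N".
_KEYS = sorted(XSAMPA_TO_IPA, key=len, reverse=True)


def xsampa_to_ipa(text):
    """Replace known X-SAMPA symbols with IPA; copy everything else unchanged."""
    out = []
    i = 0
    while i < len(text):
        for key in _KEYS:
            if text.startswith(key, i):
                out.append(XSAMPA_TO_IPA[key])
                i += len(key)
                break
        else:
            # No symbol of the table starts here: keep the character as is.
            out.append(text[i])
            i += 1
    return "".join(out)


if __name__ == "__main__":
    print(xsampa_to_ipa("aN\\a"))
    print(xsampa_to_ipa("at`a"))
```

Trying the longest symbols first matters because several two-character symbols (`N\`, `B\`, `R\`, `X\`) begin with a character that is itself a valid one-character symbol.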


This (part of) documentation was generated from src/fst/phonetics/txt2ipa.xfscript


src-fst-syllabification-hyphenation.xfscript.md

Copy from smn starts here

Copy from smn ends here


This (part of) documentation was generated from src/fst/syllabification/hyphenation.xfscript


src-fst-transcriptions-transcriptor-abbrevs2text.lexc.md

We describe here how abbreviations in Skolt Sami are read out, e.g. for text-to-speech systems.

For example:


This (part of) documentation was generated from src/fst/transcriptions/transcriptor-abbrevs2text.lexc


src-fst-transcriptions-transcriptor-clock-digit2text.lexc.md

This is still a dummy version, containing Skolt Saami.



This (part of) documentation was generated from src/fst/transcriptions/transcriptor-clock-digit2text.lexc


src-fst-transcriptions-transcriptor-date-digit2text.lexc.md

The Skolt Sámi dates

This is still a dummy version, containing South Saami.



This (part of) documentation was generated from src/fst/transcriptions/transcriptor-date-digit2text.lexc


src-fst-transcriptions-transcriptor-numbers-digit2text.lexc.md

Skolt Saami number <-> letter transducer


This (part of) documentation was generated from src/fst/transcriptions/transcriptor-numbers-digit2text.lexc


tools-grammarcheckers-grammarchecker.cg3.md

S K O L T S A A M I G R A M M A R C H E C K E R

DELIMITERS

TAGS AND SETS

Tags

This section lists all the tags inherited from the fst, and used as tags in the syntactic analysis. The next section, Sets, contains sets defined on the basis of the tags listed here; those set names are not visible in the output.

Beginning and end of sentence

BOS EOS

Parts of speech tags

N A Adv V Pron CS CC CC-CS Po Pr Pcle Num Interj ABBR ACR CLB LEFT RIGHT WEB PPUNCT PUNCT

COMMA ¶

Tags for POS sub-categories

Pers Dem Interr Indef Recipr Refl Rel Coll NomAg Prop Allegro Arab Romertall

Tags for morphosyntactic properties

Nom Acc Gen Ill Loc Com Ess Ess Sg Du Pl Cmp/SplitR Cmp/SgNom Cmp/SgGen Cmp/SgGen PxSg1 PxSg2 PxSg3 PxDu1 PxDu2 PxDu3 PxPl1 PxPl2 PxPl3 Px

Comp Superl Attr Ord Qst IV TV Prt Prs Ind Pot Cond Imprt ImprtII Sg1 Sg2 Sg3 Du1 Du2 Du3 Pl1 Pl2 Pl3 Inf ConNeg Neg PrfPrc VGen PrsPrc Ger Sup Actio VAbess

Err/Orth

Semantic tags

Sem/Act Sem/Ani Sem/Atr Sem/Body Sem/Clth Sem/Domain Sem/Feat-phys Sem/Fem Sem/Group Sem/Lang Sem/Mal Sem/Measr Sem/Money Sem/Obj Sem/Obj-el Sem/Org Sem/Perc-emo Sem/Plc Sem/Sign Sem/State-sick Sem/Sur Sem/Time Sem/Txt

HUMAN

PROP-ATTR PROP-SUR

TIME-N-SET

Syntactic tags

@+FAUXV @+FMAINV @-FAUXV @-FMAINV @-FSUBJ> @-F<OBJ @-FOBJ> @-FSPRED<OBJ @-F<ADVL @-FADVL> @-F<SPRED @-F<OPRED @-FSPRED> @-FOPRED> @>ADVL @ADVL< @<ADVL @ADVL> @ADVL @HAB> @<HAB @>N @Interj @N< @>A @P< @>P @HNOUN @INTERJ @>Num @Pron< @>Pron @Num< @OBJ @<OBJ @OBJ> @OPRED @<OPRED @OPRED> @PCLE @COMP-CS< @SPRED @<SPRED @SPRED> @SUBJ @<SUBJ @SUBJ> SUBJ SPRED OPRED @PPRED @APP @APP-N< @APP-Pron< @APP>Pron @APP-Num< @APP-ADVL< @VOC @CVP @CNP OBJ

-OTHERS SYN-V @X

## Sets containing sets of lists and tags

This part of the file lists a large number of sets based partly upon the tags defined above, and partly upon lexemes drawn from the lexicon. See the source file itself to inspect the sets; what follows here is an overview of the set types.

### Sets for Single-word sets

INITIAL

### Sets for word or not

WORD NOT-COMMA

### Case sets

ADLVCASE CASE-AGREEMENT CASE NOT-NOM NOT-GEN NOT-ACC

### Verb sets

NOT-V

### Sets for finiteness and mood

REAL-NEG MOOD-V NOT-PRFPRC

### Sets for person

SG1-V SG2-V SG3-V DU1-V DU2-V DU3-V PL1-V PL2-V PL3-V

### Pronoun sets

### Adjectival sets and their complements

### Adverbial sets and their complements

### Sets of elements with common syntactic behaviour

### NP sets defined according to their morphosyntactic features

### The PRE-NP-HEAD family of sets

These sets model noun phrases (NPs). The idea is to first define whatever can occur in front of the head of the NP, and thereafter negate that with the expression **WORD - premodifiers**.

### Border sets and their complements

### Grammarchecker sets

* * *

This (part of) documentation was generated from [tools/grammarcheckers/grammarchecker.cg3](https://github.com/giellalt/lang-sms/blob/main/tools/grammarcheckers/grammarchecker.cg3)

---

# tools-tokenisers-tokeniser-disamb-gt-desc.pmscript.md

# Tokeniser for sms

Usage:

```
$ make
$ echo "ja, ja" | hfst-tokenise --giella-cg tokeniser-disamb-gt-desc.pmhfst
$ echo "Juos gorreválggain lea (dárbbašlaš) deavdit gáibádusa boasttu olmmoš, man mielde lahtuid." | hfst-tokenise --giella-cg tokeniser-disamb-gt-desc.pmhfst
$ echo "(gáfe) 'ja' ja 3. ja? ц jaja ukjend \"ukjend\"" | hfst-tokenise --giella-cg tokeniser-disamb-gt-desc.pmhfst
$ echo "márffibiillagáffe" | hfst-tokenise --giella-cg tokeniser-disamb-gt-desc.pmhfst
```

Pmatch documentation: <https://github.com/hfst/hfst/wiki/HfstPmatch>

Characters which have analyses in the lexicon, but can appear without spaces before/after, that is, with no context conditions, and adjacent to words:

* Punct contains ASCII punctuation marks
* The symbol after m-dash is soft-hyphen `U+00AD`
* The symbol following {•} is byte-order-mark / zero-width no-break space `U+FEFF`.

Whitespace contains ASCII white space and the List contains some unicode white space characters:

* En Quad U+2000 to Zero-Width Joiner U+200D
* Narrow No-Break Space U+202F
* Medium Mathematical Space U+205F
* Word joiner U+2060

Apart from what's in our morphology, there are

1. unknown word-like forms, and
2. unmatched strings

We want to give 1) a match, but let 2) be treated specially by `hfst-tokenise -a`.

Unknowns are made of:

* lower-case ASCII
* upper-case ASCII
* select extended latin symbols
* Skolt-specific characters
* ASCII digits
* select symbols
* Combining diacritics as individual symbols
* various symbols from Private area (probably Microsoft), so far:
  * U+F0B7 for "x in box"

## Unknown handling

Unknowns are tagged ?? and treated specially with `hfst-tokenise`.

`hfst-tokenise --giella-cg` will treat such empty analyses as unknowns, and remove empty analyses from other readings. Empty readings are also legal in CG; they get a default baseform equal to the wordform, but no tag to check, so it's safer to let hfst-tokenise handle them.
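The treatment of empty analyses described above can be modelled in a few lines of Python. This is only a sketch of the stated behaviour, not code from `hfst-tokenise`: the function name `resolve_readings`, the use of an empty string to stand for an empty analysis, and the tag strings in the usage example are all assumptions made for illustration.

```python
def resolve_readings(readings):
    """Model the described handling of empty analyses for one token.

    `readings` is a list of analysis strings; an empty string stands for
    an empty analysis. Returns a pair (kept_readings, is_unknown).
    """
    non_empty = [reading for reading in readings if reading]
    if non_empty:
        # Some real analysis exists: empty analyses are dropped.
        return non_empty, False
    # Only empty analyses (or none at all): the token counts as unknown.
    return [], True


if __name__ == "__main__":
    # Hypothetical tag strings, for illustration only.
    print(resolve_readings(["ja+CC", ""]))
    print(resolve_readings([""]))
```

A token therefore ends up either with real analyses only or flagged as unknown, never with a mix of tagged and tagless readings that later CG rules would have to work around.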
Finally we mark as a token any sequence making up a:

* known word in context
* unknown (OOV) token in context
* sequence of word and punctuation
* URL in context

* * *

This (part of) documentation was generated from [tools/tokenisers/tokeniser-disamb-gt-desc.pmscript](https://github.com/giellalt/lang-sms/blob/main/tools/tokenisers/tokeniser-disamb-gt-desc.pmscript)

---

# tools-tokenisers-tokeniser-gramcheck-gt-desc.pmscript.md

# Grammar checker tokenisation for sms

Requires a recent version of HFST (3.10.0 / git revision>=3aecdbc)

Then just:

```
$ make
$ echo "ja, ja" | hfst-tokenise --giella-cg tokeniser-disamb-gt-desc.pmhfst
```

More usage examples:

```
$ echo "Juos gorreválggain lea (dárbbašlaš) deavdit gáibádusa boasttu olmmoš, man mielde lahtuid." | hfst-tokenise --giella-cg tokeniser-disamb-gt-desc.pmhfst
$ echo "(gáfe) 'ja' ja 3. ja? ц jaja ukjend \"ukjend\"" | hfst-tokenise --giella-cg tokeniser-disamb-gt-desc.pmhfst
$ echo "márffibiillagáffe" | hfst-tokenise --giella-cg tokeniser-disamb-gt-desc.pmhfst
```

Pmatch documentation: <https://github.com/hfst/hfst/wiki/HfstPmatch>

Characters which have analyses in the lexicon, but can appear without spaces before/after, that is, with no context conditions, and adjacent to words:

* Punct contains ASCII punctuation marks
* The symbol after m-dash is soft-hyphen `U+00AD`
* The symbol following {•} is byte-order-mark / zero-width no-break space `U+FEFF`.

Whitespace contains ASCII white space and the List contains some unicode white space characters:

* En Quad U+2000 to Zero-Width Joiner U+200D
* Narrow No-Break Space U+202F
* Medium Mathematical Space U+205F
* Word joiner U+2060

Apart from what's in our morphology, there are 1) unknown word-like forms, and 2) unmatched strings. We want to give 1) a match, but let 2) be treated specially by hfst-tokenise -a

* select extended latin symbols
* select symbols
* various symbols from Private area (probably Microsoft), so far:
  * U+F0B7 for "x in box"

TODO: Could use something like this, but built-ins don't include šžđčŋ:

Simply give an empty reading when something is unknown: hfst-tokenise --giella-cg will treat such empty analyses as unknowns, and remove empty analyses from other readings. Empty readings are also legal in CG; they get a default baseform equal to the wordform, but no tag to check, so it's safer to let hfst-tokenise handle them.

Finally we mark as a token any sequence making up a:

* known word in context
* unknown (OOV) token in context
* sequence of word and punctuation
* URL in context

* * *

This (part of) documentation was generated from [tools/tokenisers/tokeniser-gramcheck-gt-desc.pmscript](https://github.com/giellalt/lang-sms/blob/main/tools/tokenisers/tokeniser-gramcheck-gt-desc.pmscript)

---

# tools-tokenisers-tokeniser-tts-cggt-desc.pmscript.md

# TTS tokenisation for sms

Requires a recent version of HFST (3.10.0 / git revision>=3aecdbc)

Then just:

```sh
make
echo "ja, ja" \
| hfst-tokenise --giella-cg tokeniser-disamb-gt-desc.pmhfst
```

More usage examples:

```sh
echo "Juos gorreválggain lea (dárbbašlaš) deavdit gáibádusa \
boasttu olmmoš, man mielde lahtuid." \
| hfst-tokenise --giella-cg tokeniser-disamb-gt-desc.pmhfst
echo "(gáfe) 'ja' ja 3. ja? ц jaja ukjend \"ukjend\"" \
| hfst-tokenise --giella-cg tokeniser-disamb-gt-desc.pmhfst
echo "márffibiillagáffe" \
| hfst-tokenise --giella-cg tokeniser-disamb-gt-desc.pmhfst
```

Pmatch documentation: <https://kitwiki.csc.fi/twiki/bin/view/KitWiki/HfstPmatch>

Characters which have analyses in the lexicon, but can appear without spaces before/after, that is, with no context conditions, and adjacent to words:

* Punct contains ASCII punctuation marks
* The symbol after m-dash is soft-hyphen `U+00AD`
* The symbol following {•} is byte-order-mark / zero-width no-break space `U+FEFF`.

Whitespace contains ASCII white space and the List contains some unicode white space characters:

* En Quad U+2000 to Zero-Width Joiner U+200D
* Narrow No-Break Space U+202F
* Medium Mathematical Space U+205F
* Word joiner U+2060

Apart from what's in our morphology, there are 1) unknown word-like forms, and 2) unmatched strings. We want to give 1) a match, but let 2) be treated specially by hfst-tokenise -a

* select extended latin symbols
* select symbols
* various symbols from Private area (probably Microsoft), so far:
  * U+F0B7 for "x in box"

TODO: Could use something like this, but built-ins don't include šžđčŋ:

Simply give an empty reading when something is unknown: hfst-tokenise --giella-cg will treat such empty analyses as unknowns, and remove empty analyses from other readings. Empty readings are also legal in CG; they get a default baseform equal to the wordform, but no tag to check, so it's safer to let hfst-tokenise handle them.

Needs hfst-tokenise to output things differently depending on the tag they get

* * *

This (part of) documentation was generated from [tools/tokenisers/tokeniser-tts-cggt-desc.pmscript](https://github.com/giellalt/lang-sms/blob/main/tools/tokenisers/tokeniser-tts-cggt-desc.pmscript)