Liv NLP Grammar

Finite state and Constraint Grammar based analysers, proofing tools and other resources

View the project on GitHub giellalt/lang-liv

Liv language model documentation

All doc-comment documentation in one large file.


src-cg3-functions.cg3.md

These sets model noun phrases (NPs). The idea is to first define whatever can occur in front of the head of the NP, and thereafter negate that with the expression WORD - premodifiers.

The set NOT-NPMOD is used to find barriers between NPs. Typical usage: … (*1 N BARRIER NOT-NPMOD) …, meaning: scan to the first noun, ignoring anything that can be part of the noun phrase of that noun (i.e., “scan to the next NP head”).
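The barrier scan can be pictured with a small Python sketch. It is only a conceptual model, not how a Constraint Grammar engine is implemented: the token representation (one tag per token), the tag names other than N, and the contents of `NPMOD` are assumptions made for illustration, and the function name `scan_to_np_head` is hypothetical.

```python
from typing import Optional, Sequence, Set

# Hypothetical premodifier tags; the real sets are defined in functions.cg3.
NPMOD: Set[str] = {"A", "Det", "Num"}


def scan_to_np_head(tags: Sequence[str], start: int) -> Optional[int]:
    """Return the index of the first noun right of `start`.

    Returns None when a token that cannot be an NP premodifier
    (i.e. a NOT-NPMOD token) is met before any noun is found.
    """
    for i in range(start + 1, len(tags)):
        if tags[i] == "N":           # target found: the NP head
            return i
        if tags[i] not in NPMOD:     # barrier: not part of this NP
            return None
    return None


print(scan_to_np_head(["V", "Det", "A", "N"], 0))   # 3
print(scan_to_np_head(["V", "Adv", "V", "N"], 0))   # None
```

The target is tested before the barrier here, since a noun is itself not a premodifier and would otherwise block its own match.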

These were the set types.

HABITIVE MAPPING

sma object

SUBJ MAPPING - leftovers

OBJ MAPPING - leftovers

HNOUN MAPPING


This (part of) documentation was generated from src/cg3/functions.cg3


src-fst-morphology-affixes-adjectives.lexc.md

Adjective inflection

This file documents affixes/adjectives.lexc, the file for Livonian adjective inflection.

Indeclinables

**LEXICON A_-ZERO = modifiers that do not decline, goes to #

**LEXICON A_ = gives Pos tag.

Stem lexica

LEXICON A_PŪ contains pū: 12

LEXICON A_BRĪ contains brī:brī 16

LEXICON A_KALĀ contains kalā:kaʼlā 18

LEXICON A_TUBĀ tubā:tuʼbā 19

LEXICON A_AMĀ amā:aʼm 19a

LEXICON A_AIGĀ aigā:aʼig 20

LEXICON A_KŪJA kūja:??lēba 21

LEXICON A_IZĀ izā:izā 25

LEXICON A_OKSĀ oksā:oksā 30

LEXICON A_ĀIGA āiga:āiga 33

LEXICON A_SĪLMA sīlma:sīlma 34

LEXICON A_PADĀ padā:padā 39

LEXICON A_KÄPĀ käpā:käpā 41

LEXICON A_MAKSĀ maksā:maksā 42

LEXICON A_KĒRA kēra:kēra 43

LEXICON A_JǬRA jǭra:jǭra 44

LEXICON A_ĀITA āita:āita 46

LEXICON A_ŪŠKA ūška:ūška 47

LEXICON A_MȬKA mȭka:mȭka 48

LEXICON A_DADŽĀ dadžā:dadžā 49

LEXICON A_TĪERA tīera:tīera 54

LEXICON A_LILLA kuțā:kuțā 57

LEXICON A_KIʼV kiʼv:kiv 59

LEXICON A_PIʼŅ piʼņ:piņ 64

LEXICON A_OKŠ : 68

LEXICON A_KAŠ : 69

LEXICON A_TORĪ torī: 71

LEXICON A_KÕʼL kõʼl:kõl 73

LEXICON A_NIʼM niʼm:nim 76

LEXICON A_KAND kand: 94

LEXICON A_UL ul: 99

LEXICON A_NIŖȚ niŗț: 102

LEXICON A_DAŅTŠ daņtš: 105

LEXICON A_TÄUŽ täuž: adres 112

LEXICON A_SIELDÕ sieldõ: 118

LEXICON A_NǬʼGÕ nǭʼgõ:nǭgõ 119

LEXICON A_AŠŠÕ : 120

LEXICON A_DRŪʼOŠÕ drūʼošõ:drūošõ 121

LEXICON A_IRM : 125

LEXICON A_KIM : 126

LEXICON A_VAʼIT vaʼit:vait 128

LEXICON A_AMĀT : 129

LEXICON A_SAʼGDIT saʼgdit:sagdit 131

LEXICON A_VIĻȚ : 132

LEXICON A_EĻ eļ: 133

LEXICON A_BLĒʼḐ blēʼḑ:blēḑ 134

LEXICON A_FAKT : 135

LEXICON A_SĪEND sīend: 138

LEXICON A_LǞʼND lǟʼnd:lǟnd 139

LEXICON A_ĀIGAST āigast: 140

LEXICON A_ANALĪZ analīz: 141

LEXICON A_NĪʼEM nīʼem:nīem 142

LEXICON A_VIŠ : 144

LEXICON A_SIDĀM : 157

LEXICON A_TŪOITÕG : 158

LEXICON A_KǬRAND kǭrand: 159

LEXICON A_ȬʼDÕG ȭʼdõg:ȭdõg 160

LEXICON A_TAPTÕD taptõd: 161

LEXICON A_TĪʼEDÕD tīʼedõd:tīedõd 162

LEXICON A_VĪDÕZ vīdõz: 163

LEXICON A_TUOISTÕNZ : 164

LEXICON A_ĪʼDÕKSMÕZ ī’dõksmõz:īdõksmõz 165

LEXICON A_KÄBRĀZ : 168

LEXICON A_MAIGĀZ : 169

LEXICON A_NÕTKĀZ : 170

LEXICON A_RIKĀZ : 171

LEXICON A_ĀMBAZ āmbaz:āmba 173

LEXICON A_PŪŖAZ : 174

LEXICON A_PǬĻAZ : 175

LEXICON A_MÕTKÕZ mõtkõz: 179

LEXICON A_VȬRÕZ vȭrõz: 180

LEXICON A_ARĀGÕZ : 181

LEXICON A_ÄʼGGÕZ ä’ggõz:äggõz 182

LEXICON A_PŪʼDÕZ pūʼdõz:pūdõz 183

LEXICON A_SĒJI : 186 āndaji:āndaji sēji:sēji

LEXICON A_AKKIJI akkiji:akkiji 187

LEXICON A_LĒʼJI lēʼji:lēʼji 188

LEXICON A_AʼIGI aʼigi:aigi 192

LEXICON A_PUʼNNI pu’nni:punni 193

LEXICON A_KAȚKI : 194

LEXICON A_KUKKI : 195

LEXICON A_AIGI aigi:aigi 196

LEXICON A_OUKI : 197

LEXICON A_PAŖĪ : 198

LEXICON A_TŪĻI : 199

LEXICON A_AʼBLI aʼbli:abli 200

LEXICON A_SĒMI : 201

LEXICON A_LĒʼMI lē’mi:lēʼmi 202

LEXICON A_ALĪZ : 203

LEXICON A_KĒRATÕKS : 207

LEXICON A_VARĪKŠ varīkš: 209

LEXICON A_ŪŽ : 219 ūž:ūd

LEXICON A_JŪŖ jūŗ:jūr 221

LEXICON A_SŪR sūr:sūr 222

LEXICON A_DULLÕNZ dullõnz:dullõn 227

LEXICON A_AŅGÕRZ : aņgõrz:aņgõr 229

LEXICON A_TIDĀR tidār:tidār 233

LEXICON A_APPÕN appõn:appõn 235

LEXICON A_ǬʼRÕN ǭʼrõn:ǭrõn 236

LEXICON A_KĪNDÕR kīndõr:kīndõr 237

LEXICON A_BÄʼZMÕR bäʼzmõr:bäzmõr 238

LEXICON A_TARĪĻ tarīļ:tarīļ 239

LEXICON A_ĀNKAŖ ānkaŗ:ānkaŗ 240

LEXICON A_ǬʼBIĻ ǭʼbiļ:ǭbiļ 242
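Most entries in the list above follow the shape `LEXICON NAME [contains] lemma:stem NUMBER`. A minimal Python sketch for reading such lines into records is given below. The names `StemLexicon` and `parse_entry` are hypothetical, and reading the trailing number as an inflection-type identifier is an assumption; lines that deviate from the shape (for example those with several pairs after the number) are simply skipped.

```python
import re
from typing import NamedTuple, Optional


class StemLexicon(NamedTuple):
    name: str                 # e.g. "A_SŪR"
    lemma: str                # left side of the pair, may be empty
    stem: str                 # right side of the pair, may be empty
    type_id: Optional[str]    # trailing number such as "19a" (assumed: inflection type)


ENTRY = re.compile(
    r"^LEXICON\s+(?P<name>\S+)\s+(?:contains\s+)?"
    r"(?P<lemma>[^\s:]*):(?P<stem>\S*)\s*(?P<type>\d+[a-z]?)?$"
)


def parse_entry(line: str) -> Optional[StemLexicon]:
    """Parse one documentation line; return None if it has another shape."""
    m = ENTRY.match(line.strip())
    if m is None:
        return None
    return StemLexicon(m["name"], m["lemma"], m["stem"], m["type"])


print(parse_entry("LEXICON A_SŪR sūr:sūr 222"))
print(parse_entry("LEXICON A_OKŠ : 68"))
```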


This (part of) documentation was generated from src/fst/morphology/affixes/adjectives.lexc


src-fst-morphology-affixes-adpositions.lexc.md

Adposition inflection

This file documents affixes/adpositions.lexc

**LEXICON POSTP_ = points to #

**LEXICON POSTP_ = points to #


This (part of) documentation was generated from src/fst/morphology/affixes/adpositions.lexc


src-fst-morphology-affixes-conjunctors.lexc.md

Conjunctions

This file documents affixes/conjunctors.lexc

**LEXICON CONJ_ = These need to be corrected, it points to #.

**LEXICON CC_ = Livonian conjunctors, points to #

**LEXICON CS_ = Livonian subjunctors, points to #


This (part of) documentation was generated from src/fst/morphology/affixes/conjunctors.lexc


src-fst-morphology-affixes-determiners.lexc.md

Determiner inflection

This file documents affixes/determiners.lexc, the language model for Livonian determiner inflection.

Stem lexica

LEXICON DET_NAI nai: 191

LEXICON DET_TŪĻI tūļi: 199

LEXICON DET_SĒMI sēmi: 201


This (part of) documentation was generated from src/fst/morphology/affixes/determiners.lexc


src-fst-morphology-affixes-nouns.lexc.md

Livonian noun inflection

This file documents affixes/nouns.lexc, the Livonian noun inflection file.

Ad hoc lexica

PROBLEMS with dictionary lexica

Stem lexica

Nominal inflection

Inflection lexica

13

14 Stem change: Yes Vowel raising ǟ:ē +Pl +Ela/+Ill/+Par Stød: Yes

tiēšti

16 Stem change: None

SG-INE ;

18

18a

19

19a

20

21

22

23 Stem change: Yes Vowel change in 1st syllable ǭ:a Consonant change ij:j Stød: None

24

25

Stem change: Yes

Stem change: Yes (Vowel)

33

33b LĀNGA Stem change: Yes (Vowel) Stød: None

34

35

37

38

39, 40, 41, 42

40

41

42

43

44

45

46

59 kiv:kiʼvv

60

76

102

125, 126, 128

126

126b

129, 130, 131

132

135

140, 141, 142 241

241 was ĀIGAST

141 87

142

142

143, 144, 145

145

158

159

160 72

179

181

182

183

184

192

aʼigi:aʼigi

199

211

212

225

226, 227, 228

233

SG-DAT ; SG-ELA ; SG-ILL ; SG-INS ; SG-PAR ;

NUMBER AND CASE

above as pair in SG-ELA/INE_st; 2014 jaska

A trigger for z:ž will be required


This (part of) documentation was generated from src/fst/morphology/affixes/nouns.lexc


src-fst-morphology-affixes-pronouns.lexc.md

Pronoun inflection

This file documents affixes/pronouns.lexc, the file for Livonian pronoun inflection.

**LEXICON PRON_ = goes to # only.

LEXICON PRON_MIS mis:mi 1

LEXICON PRON_JEGĀ jegā:jeʼgā 2

LEXICON PRON_MŪ mū:m 3

LEXICON PRON_SE se:s 4

LEXICON PRON_TÄMĀ tämā: 5

LEXICON PRON_NE ne:n 4 & 5

LEXICON PRON_MINĀ 6 ma:m

LEXICON PRON_MĒG minā:m 6

LEXICON PRON_SINĀ sinā:0 7

LEXICON PRON_TĒG tēg:t 7

LEXICON PRON_KIS kis:kī 8

LEXICON PRON_ĪʼŽ 9 īʼž:0

LEXICON PRON_MIDĀGÕD midāgõd:midāg 10

LEXICON PRON_MITS 11 mits:mit

LEXICON PRON_SET 11b set:set

Stem lexica

LEXICON PRON_TUBĀ tubā:tubā 19

LEXICON PRON_TUBĀ-PL tubā:tubā 19

LEXICON PRON_ĀITA āita:āita 46

LEXICON PRON_ĀIGAST āigast: 140

LEXICON PRON_AZŪM-PL azūm: 153

LEXICON PRON_VĪDÕZ vīdõz: 163

LEXICON PRON_ĪKŠ : 217


This (part of) documentation was generated from src/fst/morphology/affixes/pronouns.lexc


src-fst-morphology-affixes-propernouns.lexc.md

Proper noun inflection

This file documents affixes/propernouns.lexc, the file for inflection of proper nouns.

Livonian proper nouns inflect in the same cases as regular nouns, but with a colon (‘:’) as separator.

**LEXICON PROP_ = this lexicon goes to K only

Stem lexica

LEXICON PROP_TOP_PŪ contains pū: 12

LEXICON PROP_PŪ contains pū: 12

LEXICON PROP_PŪ-SG contains pū: 12

LEXICON PROP_KALĀ contains kalā:kalā 18

LEXICON PROP_KALĀ-SG contains kalā:kalā 18

LEXICON PROP_IRĒ-SG contains irē:iʼr 18a

LEXICON PROP_TUBĀ tubā:tubā 19

LEXICON PROP_VĒNA vēna:vēna 37

LEXICON PROP_PADĀ padā:padā 39

LEXICON PROP_JǬRA jǭra:jǭra 44

LEXICON PROP_JǬRA-PL jǭra:jǭra 44

LEXICON PROP_ĀITA āita:āita 46

LEXICON PROP_ŪŠKA ūška:ūška 47

LEXICON PROP_DADŽĀ dadžā:dadžā 49

LEXICON PROP_KRǬIPA krǭipa:krǭipa 55

LEXICON PROP_DUŅTŠ : 70

LEXICON PROP_NIʼM niʼm:niʼm 76

LEXICON PROP_NIʼM-PL niʼm:niʼm 76

LEXICON PROP_TUP tup:tup 79

LEXICON PROP_NǬʼGÕ nǭʼgõ:nǭgõ 119

LEXICON PROP_KǬJ : 123

LEXICON PROP_KIM : 126

LEXICON PROP_KIM-SG : 126

LEXICON PROP_VAʼIT vaʼit:vait 128

LEXICON PROP_AMĀT : 129

LEXICON PROP_KULTŪR : 130

LEXICON PROP_VIĻȚ : 132

LEXICON PROP_FAKT fakt:fakt 135

LEXICON PROP_FAKT-SG fakt:fakt 135

LEXICON PROP_ĀIGAST : 140

LEXICON PROP_ANALĪZ : 141

LEXICON PROP_NĪʼEM-SG nīʼem:nīʼem 142

LEXICON PROP_JAĻKŠ : 143

LEXICON PROP_RŪʼTŠ rūʼtš:rūʼtš 145

LEXICON PROP_SIDĀM : 157

LEXICON PROP_TŪOITÕG : 158

LEXICON PROP_TŪOITÕG-SG : 158

LEXICON PROP_KǬRAND : 159

LEXICON PROP_KǬRAND-SG : 159

LEXICON PROP_ȬʼDÕG ȭʼdõg:ȭʼdõg 160

LEXICON PROP_ĀNDÕKS : 206

LEXICON PROP_PŪOL : 216

LEXICON PROP_SŪR : 222

LEXICON PROP_BIRKOV : 224

LEXICON PROP_SALĀJ-SG : 225

LEXICON PROP_TIDĀR tidār:tidār 233

LEXICON PROP_TIDĀR-PL tidār:tidār 233

LEXICON PROP_PĒGAL pēgal:pēgal 234

LEXICON PROP_APPÕN appõn:appõn 235

LEXICON PROP_KĪNDÕR kīndõr:kīndõr 237


This (part of) documentation was generated from src/fst/morphology/affixes/propernouns.lexc


src-fst-morphology-affixes-quantifiers.lexc.md

Quantifier inflection

This file documents affixes/quantifiers.lexc, the file for Livonian quantifier morphology.

LEXICON QNT_APPÕN : 216

LEXICON QNT_PŪOL : 216

Stem lexica

LEXICON NUM_PADĀ padā:padā 39

LEXICON NUM_KĒRA kēra:kēra 43

LEXICON NUM_OKŠ : 68

LEXICON NUM_NǬʼGÕ nǭʼgõ:nǭgõ 119

LEXICON NUM_IRM irm: 125

LEXICON NUM_KIM : 126 kim:kim

LEXICON NUM_FAKT fakt: 135

LEXICON NUM_ĀIGAST āigast: 140

LEXICON NUM_NAI nai: 191

LEXICON NUM_ÄʼBȚÕKS ä’bțõks:äbțõks 208

LEXICON NUM_TŪĻ : 214

LEXICON NUM_ĪKŠ : 217

LEXICON NUM_KAKŠ : 218

LEXICON NUM_ŪŽ : 219

LEXICON NUM_APPÕN appõn:appõn 235


This (part of) documentation was generated from src/fst/morphology/affixes/quantifiers.lexc


src-fst-morphology-affixes-symbols.lexc.md

Symbol affixes

**LEXICON Noun_symbols_possibly_inflected =

**LEXICON Noun_symbols_never_inflected =

**LEXICON SYMBOL_connector =

**LEXICON SYMBOL_NO_suff =

**LEXICON SYMBOL_suff =


This (part of) documentation was generated from src/fst/morphology/affixes/symbols.lexc


src-fst-morphology-affixes-verbs.lexc.md

Livonian Verb inflection

This file documents the verb inflection of Livonian.

Verb stem classes

**LEXICON V_ = CONJUGATION TYPE MISSING

**LEXICON TV_ = CONJUGATION TYPE MISSING

**LEXICON V-AUX_VȰLDA = 10 vȱlda:ZERO

LEXICON IV_VȰLDA = 10 vȱlda: goes to **K

**LEXICON V-AUX_LǞʼDÕ = 1 lǟʼdõ:lǟʼ

**LEXICON IV_LǞʼDÕ = 1 lǟʼdõ:lǟʼ

**LEXICON TV_TǬʼDÕ = 2 tǭʼdõ:tǭʼ

**LEXICON V-AUX_VĪDÕ = 3 vīdõ:vī

**LEXICON IV_VĪDÕ = 3 vīdõ:vī

**LEXICON TV_VĪDÕ = 3 vīdõ:vī

**LEXICON TV_NǞʼDÕ = 4 nǟʼdõ:nǟʼ

**LEXICON IV_KǞʼDÕ = 5 kǟʼdõ:kǟʼ

**LEXICON TV_TĪʼEDÕ = 6 tīʼedõ:tīʼe

**LEXICON V-AUX_SĪEDÕ = 7 sīedõ:sīe

**LEXICON IV_SĪEDÕ = 7 sīedõ:sīe

**LEXICON TV_SĪEDÕ = 7 sīedõ:sīe

**LEXICON IV_SǬDÕ = 8 sǭdõ:s

**LEXICON TV_SǬDÕ = 8 sǭdõ:s

**LEXICON V-AUX_SǬDÕ = 8 sǭdõ:s

**LEXICON TV_JŪODÕ = 9 jūodõ:jūo

**LEXICON IV_TŪLDA = 11 tūlda:tūʼl

**LEXICON V-AUX_PĀNDA = 12 pānda:pāʼn

**LEXICON IV_PĀNDA = 12 pānda:pāʼn

**LEXICON TV_PĀNDA = 12 pānda:pāʼn

**LEXICON IV_JEʼLLÕ = 13 jeʼllõ:jeʼlā

**LEXICON TV_JEʼLLÕ = 13 jeʼllõ:jeʼllõ

**LEXICON IV_ASTÕ = 18 astõ:astõ

**LEXICON TV_ASTÕ = 18 astõ:astõ

**LEXICON TV_VÕTTÕ = 19 võttõ:võttõ

**LEXICON IV_VIEʼDDÕ = 24 vieʼddõ:vieʼddõ

**LEXICON TV_VIEʼDDÕ = 24 vieʼddõ:vieʼddõ

**LEXICON IV_MAKSÕ = 25 maksõ:maksõ

**LEXICON TV_MAKSÕ = 25 maksõ:maksõ

**LEXICON TV_TAPPÕ = 26 tappõ:tappõ

**LEXICON IV_MÄNGÕ = 14 mängõ:mǟnga

**LEXICON TV_KILLÕ = 15 killõ:kīla

**LEXICON TV_PALLÕ = 16 pallõ:pǭla

**LEXICON TV_LOULÕ = 17 loulõ:lōla

**LEXICON IV_LAITÕ = 20 laittõ:lāita

**LEXICON TV_LAITÕ = 20 laittõ:lāita

**LEXICON IV_TÄUTÕ = 21 täutõ:tǟta

**LEXICON TV_TÄUTÕ = 21 täutõ:tǟuta

**LEXICON TV_PȮĻTÕ = 22 pȯļtõ:pūoļta

**LEXICON TV_MȮISTÕ = 23 mȯistõ:mūošta

**LEXICON IV_ANDÕ = 27 andõ:ānda

**LEXICON TV_ANDÕ = 27 andõ:ānda

**LEXICON IV_TIEUDÕ = 28 tieudõ:tīeda

**LEXICON TV_TIEUDÕ = 28 tieudõ:tīeda

29-48 follow same pattern

**LEXICON IV_LUʼGGÕ = luʼggõ:luʼggõ 29

**LEXICON TV_LUʼGGÕ = luʼggõ:lugū 29

**LEXICON IV_MUʼDŽÕ = muʼdžõ:mudžū 30

**LEXICON TV_MUʼDŽÕ = muʼdžõ:mudžū 30

**LEXICON IV_VAKȚÕ = vakțõ:vakțū 31

**LEXICON TV_VAKȚÕ = vakțõ:vakțū 31

**LEXICON IV_KITTÕ = kittõ:kittõ 32

**LEXICON TV_KITTÕ = kittõ:kittõ 32

**LEXICON V-AUX_RIʼDDÕ = riʼddõ:ridū 33

**LEXICON IV_RIʼDDÕ = riʼddõ:ridū 33

**LEXICON TV_RIʼDDÕ = riʼddõ:ridū 33

**LEXICON IV_KUTSÕ = kutsõ:kutsū 34

**LEXICON TV_KUTSÕ = kutsõ:kutsū 34

**LEXICON V-AUX_LASKÕ = laskõ:laskū 35

**LEXICON IV_LASKÕ = laskõ:laskū 35

**LEXICON TV_LASKÕ = laskõ:laskū 35

**LEXICON TV_KÄSKÕ = laskõ:laskū 35b

**LEXICON IV_AKKÕ = akkõ:akū 36 Should ss be s and šš be š? 2013-02-19

**LEXICON TV_AKKÕ = akkõ:akū 36

**LEXICON V-AUX_AIGÕ = aigõ:āigõ 37

**LEXICON IV_AIGÕ = aigõ:āigõ 37

**LEXICON TV_AIGÕ = aigõ:āigõ 37

**LEXICON TV_KUOŖŖÕ = kuoŗŗõ:kūoŗõ 38

**LEXICON TV_VANNÕ = vannõ:vǭnõ 39

**LEXICON IV_PȮĻĻÕ = pȯļļõ:pūoļõ 40

**LEXICON IV_PȮIMÕ = pȯimõ:pūoimõ 41

**LEXICON TV_PȮIMÕ = pȯimõ:pūoimõ 41

**LEXICON IV_OUŖÕ = ouŗõ:ōŗõ 42

**LEXICON IV_KEIJÕ = keijõ:kējõ 43

**LEXICON TV_KEIJÕ = keijõ:kējõ 43

**LEXICON IV_AŖŠTÕ = aŗštõ:āŗštõ 44

**LEXICON TV_AŖŠTÕ = aŗštõ:āŗštõ 44

**LEXICON TV_PȮRTÕ = pȯrtõ:pūortõ 45

**LEXICON TV_OUTÕ = outõ:ōtõ 46

**LEXICON V-AUX_TUNDÕ = tundõ:tūndõ 47

**LEXICON IV_TUNDÕ = tundõ:tūndõ 47

**LEXICON TV_TUNDÕ = tundõ:tūndõ 47

**LEXICON TV_OUDÕ = oudõ:ōdõ 48

**LEXICON IV_KŪLÕ = kūlõ:kūlõ 49

**LEXICON TV_KŪLÕ = kūlõ:kūlõ 49

**LEXICON IV_ARRÕ = arrõ:arrõ 50

**LEXICON TV_ARRÕ = arrõ:arrõ 50

**LEXICON IV_AʼILÕ = aʼilõ:aʼilõ 51

**LEXICON TV_AʼILÕ = aʼilõ:aʼilõ 51

**LEXICON TV_SVAʼRRÕ = svaʼrrõ:svaʼrrõ 52

**LEXICON V-AUX_KĪTÕ = kītõ:kīt 53

**LEXICON IV_KĪTÕ = kītõ:kīt 53 ~701

**LEXICON TV_KĪTÕ = kītõ:kīt 53

**LEXICON IV_ÄʼBȚÕ = äʼbțõ:äʼbț 54

**LEXICON TV_ÄʼBȚÕ = äʼbțõ:äʼbț 54

**LEXICON V-AUX_KŪLDÕ = kūldõ:kūld 55

**LEXICON IV_KŪLDÕ = kūldõ:kūld 55

**LEXICON TV_KŪLDÕ = kūldõ:kūld 55

**LEXICON TV_KĪSKÕ = kīskõ:kīsk 56

**LEXICON V-AUX_ĪʼEDÕ = īʼedõ:īed 57

**LEXICON IV_ĪʼEDÕ = īʼedõ:īed 57

**LEXICON TV_ĪʼEDÕ = īʼedõ:īed 57

**LEXICON IV_UMBLÕ = umblõ: 58

**LEXICON TV_UMBLÕ = umblõ: 58

**LEXICON IV_ERȚĻÕ = erțļõ:erțõlõ 58b

**LEXICON TV_ERȚĻÕ = erțļõ:erțõlõ 58b

**LEXICON V-AUX_MÕTLÕ = mõtlõ: 59

**LEXICON IV_MÕTLÕ = mõtlõ: 59

**LEXICON TV_MÕTLÕ = mõtlõ: 59

**LEXICON IV_MǞʼDLÕ = mǟʼdlõ: 60

**LEXICON TV_MǞʼDLÕ = mǟʼdlõ: 60

**LEXICON IV_NAʼGRÕ = naʼgrõ: 60

**LEXICON TV_NAʼGRÕ = naʼgrõ: 60

**LEXICON V-AUX_ÄʼB = 62 äʼb:ä

**LEXICON TV_SÄ = 63 sä:sä

**LEXICON V-AUX_PIḐĪKS = 64 piḑīks:piḑī

After transitive, intransitive, auxiliary and such tags have been added

1

2 tǭʼdõ:tǭʼ

Prt Imprt

Jus Qvo

participles

3 **LEXICON V-01_VĪDÕ = This is mutual for 3: vīdõ:vī Prt Imprt

Jus Qvo

participles

**LEXICON V-01_NǞʼDÕ = This is mutual for ??: 4 nǟʼdõ:n Prt Imprt

Jus Qvo

participles

**LEXICON V-01_KǞʼDÕ = This is mutual for ??: 4 kǟʼdõ:kǟʼ Prt Imprt

Jus Qvo

participles

**LEXICON V-01_TĪʼEDÕ = : 6 tīedõ:tīʼe

Jus Qvo participles

**LEXICON V-01_SĪEDÕ = : 7 sīedõ:sīe

Jus Qvo

participles

8 sǭdõ:s Prt Imprt

Jus Qvo

participles 9 9 jūodõ:jūo Prt Imprt

Jus Qvo

participles 10

11 tūlda:tūʼl Prt Imprt

Jus Qvo participles 11

12 12 pānda:pāʼn Prt Imprt

Jus Qvo participles

**LEXICON V-01_JEʼLLÕ = 13 jeʼllõ, 18 astõ, 19 võttõ, 24 vieʼddõ, 25 maksõ, 26 tappõ

Cond Imprt Jus Qvo

participles

14 mängõ:mǟngõ **LEXICON V-01_MÄNGÕ = 14 mängõ, 15 killõ, 16 pallõ, 17 loulõ, 20 laitõ, 21 täutõ, 22 pȯļtõ, 23 mȯistõ, 27 āndõ, 28 tīeudõ

Imprt Jus Qvo

participles

15 killõ:kīllõ **LEXICON V-01_KILLÕ = 14 mängõ, 15 killõ, 16 pallõ, 17 loulõ, 20 laitõ, 21 täutõ, 22 pȯļtõ, 23 mȯistõ, 27 āndõ, 28 tīeudõ

Imprt Jus Qvo

participles

16 pallõ:pǭllõ **LEXICON V-01_PALLÕ = 14 mängõ, 15 killõ, 16 pallõ, 17 loulõ, 20 laitõ, 21 täutõ, 22 pȯļtõ, 23 mȯistõ, 27 āndõ, 28 tīeudõ

Imprt Jus Qvo

participles

17 loulõ:lōulõ **LEXICON V-01_LOULÕ = 14 mängõ, 15 killõ, 16 pallõ, 17 loulõ

18 astõ:astõ **LEXICON V-01_ASTÕ = 14 mängõ, 15 killõ, 16 pallõ, 17 loulõ

19 võttõ:võttõ **LEXICON V-01_VÕTTÕ = 14 mängõ, 15 killõ, 16 pallõ, 17 loulõ

20 laitõ: **LEXICON V-01_LAITÕ = 14 mängõ, 15 killõ, 16 pallõ, 17 loulõ, 20 laitõ

**LEXICON V-01_TÄUTÕ = 21 täutõ:tǟutõ

22 pȯļțõ:p **LEXICON V-01_PȮĻTÕ = 22 pȯļtõ, 23 mȯistõ, 27 āndõ, 28 tīeudõ Cond Imprt

Jus Qvo

participles

23 mȯistõ:m **LEXICON V-01_MȮISTÕ = 23 mȯistõ, 27 āndõ, 28 tīeudõ

Imprt

Jus Qvo

participles

**LEXICON V-01_VIEʼDDÕ = 13 jeʼllõ, 18 astõ, 19 võttõ, 24 vieʼddõ, 25 maksõ, 26 tappõ

Cond Imprt Jus Qvo

participles

**LEXICON V-01_MAKSÕ = 13 jeʼllõ, 18 astõ, 19 võttõ, 24 vieʼddõ, 25 maksõ, 26 tappõ

Cond Imprt Jus Qvo

participles

**LEXICON V-01_TAPPÕ = 13 jeʼllõ, 18 astõ, 19 võttõ, 24 vieʼddõ, 25 maksõ, 26 tappõ

Cond Imprt Jus Qvo

participles

27 andõ:āndõ **LEXICON V-01_ANDÕ = 14 mängõ, 15 killõ, 16 pallõ, 17 loulõ, 20 laitõ, 21 täutõ, 22 pȯļtõ, 23 mȯistõ, 27 āndõ, 28 tīeudõ

Imprt

Jus Qvo

participles

28 tieudõ:tīeudõ **LEXICON V-01_TIEUDÕ = 14 mängõ, 15 killõ, 16 pallõ, 17 loulõ, 20 laitõ, 21 täutõ, 22 pȯļtõ, 23 mȯistõ, 27 āndõ, 28 tīeudõ

Imprt

Jus Qvo

participles

29 LEXICON V-01_LUʼGGÕ luʼggõ:luʼggõ 29 This is mutual for 29-36: luʼggõ, muʼdžõ, vakțõ, kittõ, riʼddõ, kutsõ, laskõ, akkõ Prt ImprtI

Jus Kvo

participles

30 LEXICON V-01_MUʼDŽÕ luʼggõ:luʼggõ 29 This is mutual for 29-36: luʼggõ, muʼdžõ, vakțõ, kittõ, riʼddõ, kutsõ, laskõ, akkõ Prt ImprtI

Jus Kvo

participles

31 LEXICON V-01_VAKȚÕ luʼggõ:luʼggõ 29 This is mutual for 29-36: luʼggõ, muʼdžõ, vakțõ, kittõ, riʼddõ, kutsõ, laskõ, akkõ Prt ImprtI

Jus Kvo

participles

32 LEXICON V-01_KITTÕ kittõ:kittõ 32 This is mutual for 29-36: luʼggõ, muʼdžõ, vakțõ, kittõ, riʼddõ, kutsõ, laskõ, akkõ

Prt ImprtI

Jus Kvo

participles

33 LEXICON V-01_RIʼDDÕ riʼddõ:riʼddõ 33 This is mutual for 29-36: luʼggõ, muʼdžõ, vakțõ, kittõ, riʼddõ, kutsõ, laskõ, akkõ Prt ImprtI

Jus Kvo

participles

34 LEXICON V-01_KUTSÕ luʼggõ:luʼggõ 29 This is mutual for 29-36: luʼggõ, muʼdžõ, vakțõ, kittõ, riʼddõ, kutsõ, laskõ, akkõ Prt ImprtI

Jus Kvo

participles

35 LEXICON V-01_LASKÕ luʼggõ:luʼggõ 29 This is mutual for 29-36: luʼggõ, muʼdžõ, vakțõ, kittõ, riʼddõ, kutsõ, laskõ, akkõ Prt ImprtI

Jus Kvo

participles

35b LEXICON V-01_KÄSKÕ luʼggõ:luʼggõ 29 This is mutual for 29-36: luʼggõ, muʼdžõ, vakțõ, kittõ, riʼddõ, kutsõ, laskõ, akkõ Prt ImprtI

Jus Kvo

participles

36 LEXICON V-01_AKKÕ luʼggõ:luʼggõ 29 This is mutual for 29-36: luʼggõ, muʼdžõ, vakțõ, kittõ, riʼddõ, kutsõ, laskõ, akkõ Prt ImprtI

Jus Kvo

participles

37 This is mutual for 37-48

Prt

participles

**LEXICON V-01_KUOŖŖÕ = kuoŗŗõ:kūoŗŗõ 38 This is mutual for 37-48

Prt

participles

**LEXICON V-01_VANNÕ = vannõ:vǭnnõ 39 This is mutual for 37-48

Prt

participles

**LEXICON V-01_PȮĻĻÕ = pȯļļõ:pūoļļõ 40 This is mutual for 37-48

Prt

participles

**LEXICON V-01_PȮIMÕ = pȯimõ:pūoimõ 41 This is mutual for 37-48

Prt

participles

**LEXICON V-01_OUŖÕ = ouŗõ:ōuŗõ 42 This is mutual for 37-48

Prt

participles

**LEXICON V-01_KEIJÕ = keijõ:kēijõ 43 This is mutual for 37-48

Prt

participles

**LEXICON V-01_AŖŠTÕ = aŗštõ:āŗštõ 44 This is mutual for 37-48

Prt

participles

**LEXICON V-01_PȮRTÕ = pȯrtõ:pūortõ 45 This is mutual for 37-48

Prt

participles

**LEXICON V-01_OUTÕ = outõ:ōutõ 46 This is mutual for 37-48

Prt

participles

**LEXICON V-01_TUNDÕ = tundõ:tūndõ 47 This is mutual for 37-48

Prt

participles

**LEXICON V-01_OUDÕ = oudõ:ōdõ 48 This is mutual for 37-48

Prt

participles

49 **LEXICON V-01_KŪLÕ = This is mutual for 49-50, 52-57 Prt +Act+PrfPrc Cond

50 arrõ:arrõ **LEXICON V-01_ARRÕ = This is mutual for 49-50, 52-57 Prt +Act+PrfPrc Cond

51 **LEXICON V-01_AʼILÕ = This is mutual for 51 Ger, Ger_Ine

Imprt+Pl1, Imprt+Pl2, Imprt+ConNeg

Jus+Sg3, Jus+Pl3

Quo+Sg3, Quo+Pl3, +NomAct -mi

52 **LEXICON V-01_SVAʼRRÕ = This is mutual for 49-50, 52-57 Prt +Act+PrfPrc Cond

**LEXICON V-01_KĪTÕ = This is mutual for 49-50, 52-57 Prt +Act+PrfPrc Cond

**LEXICON V-01_ÄʼBȚÕ = This is mutual for 49-50, 52-57

Prt +Act+PrfPrc Cond

55 **LEXICON V-01_KŪLDÕ = This is mutual for 49-50, 52-57 Prt +Act+PrfPrc Cond

56 **LEXICON V-01_KĪSKÕ = This is mutual for 49-50, 52-57 Prt +Act+PrfPrc Cond

57 īedõ:īʼedõ **LEXICON V-01_ĪʼEDÕ = This is mutual for 49-50, 52-57 Prt +Act+PrfPrc Cond

58 umblõ:umbõlõ **LEXICON V-01_UMBLÕ = This is mutual for 58-61: umblõ, mõtlõ, mǟʼdlõ, naʼgrõ Prt Imprt

Jus Qvo

participles

58 erțļõ:erțõlõ **LEXICON V-01_ERȚĻÕ = This is mutual for 58-61: umblõ, mõtlõ, mǟʼdlõ, naʼgrõ Prt Imprt

Jus Qvo

participles

59 mõtlõ:umbõlõ **LEXICON V-01_MÕTLÕ = This is mutual for 58-61: umblõ, mõtlõ, mǟʼdlõ, naʼgrõ Prt Imprt

Jus Qvo

participles

60 mǟʼdlõ:mǟʼdõlõ **LEXICON V-01_MǞʼDLÕ = This is mutual for 58-61: umblõ, mõtlõ, mǟʼdlõ, naʼgrõ Prt Imprt

Jus Qvo

participles

61 naʼgrõ:naʼgõrõ **LEXICON V-01_NAʼGRÕ = This is mutual for 58-61: umblõ, mõtlõ, mǟʼdlõ, naʼgrõ Prt Imprt

Jus Qvo

participles

Nonfinites

**LEXICON GER_s =

**LEXICON GER_sõ =

**LEXICON INF_ZERO =

**LEXICON INF_dõ =

**LEXICON INF_da =

**LEXICON SUP-STEM_m =

**LEXICON SUP_m =

**LEXICON SUP_m =

**LEXICON SUP_mõ =

**LEXICON ACTPRSPRC =

**LEXICON ACTPRSPRC =

**LEXICON ACTPRFPRC_nd = Are the singular and plural homonyms?

**LEXICON ACTPRFPRC_SG-d/PL-nõd =

**LEXICON ACTPRFPRC_SG-nd/PL-nõd = Are the singular and plural homonyms?

**LEXICON ACTPRFPRC_SG-nd/PL-nnõd = Are the singular and plural homonyms?

**LEXICON PSSPRSPRC =

**LEXICON PSSPRSPRC_b =

**LEXICON PSSPRSPRC_tõb =

**LEXICON PSSPRFPRCSG = 2014-08-21

**LEXICON PSSPRFPRCSG_d = 2014-08-21

**LEXICON PSSPRFPRCSG_tõd = 2014-08-21

Finites

**LEXICON INDPRS_tõ = Indicative present

**LEXICON INDPRS_mõ/tõ/bõd = Indicative present

**LEXICON INDPRS_m/t/bõd = Indicative present

**LEXICON INDPRT_ZERO = Indicative preterite in i

**LEXICON INDPRT_i = Indicative preterite in i

**LEXICON INDPRT_ita = Indicative preterite in i

**LEXICON INDPRT_z = Indicative preterite in z

**LEXICON INDPRT_ztõ = Indicative preterite in z

**LEXICON INDPRT_zt/ztõ = Indicative preterite in z

**LEXICON INDPRT_ž = Indicative preterite in ž

**LEXICON INDPRT_žtõ = Indicative preterite in ž

**LEXICON INDPRTSG3-STEM_tõ =

**LEXICON COND = Conditional present

Indicative present

**LEXICON INDPRSSG1-STEM =

Conditional

Imperative

Jussative

Quotative


This (part of) documentation was generated from src/fst/morphology/affixes/verbs.lexc


src-fst-morphology-phonology.twolc.md

Livonian morphophonology

This file documents the phonology.twolc file

We first show alphabet and sets, thereafter rules.

Alphabet

Literal quotes and angles

They must be escaped (cf morpheme boundaries further down):

»7 «7

%[%>%] - Literal >

%[%<%] - Literal <

Archiphonemes for consonant lengthening

Triggers

Vowel raising

Vowel metathesis

VOWEL SHORTENING

Sets

Rule section

Vowel rules

Shortening in first syllable

Rule: ǟ:ä in first syllable

Rule: ā:a in first syllable

Rule: ȱ:ȯ

Rule: ā:ī in second syllable plural

Rule: ū:ī in second syllable plural

Rule: ǭ:a in first syllable

Rule: ē:e in first syllable rēnaz+N+Sg+Gen:

Rule: ū:u in first syllable

Rule: ū:ȯ in first syllable

Rule: ī:i in first syllable

Rule: ȭ:õ in first syllable

Rule: ō:o in first syllable rōda+N+Pl+Par
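The shortening rules above all rewrite a long vowel in the first syllable as a different, mostly short, vowel. The Python sketch below shows only that rewriting step. It assumes that the first vowel of the word is the nucleus of the first syllable, and it ignores the triggers and contexts that condition the real twolc rules; the function name `apply_first_vowel_rule` is hypothetical. The results are intermediate strings, not attested surface forms, because other rules apply to the same words as well.

```python
import unicodedata

# Vowel letters used for locating the first vowel (precomposed forms).
VOWELS = set(unicodedata.normalize("NFC", "aāäǟeēiīoōȯȱõȭuūǭ"))


def apply_first_vowel_rule(word: str, upper: str, lower: str) -> str:
    """Rewrite the first vowel of `word` from `upper` to `lower` if it matches."""
    word, upper, lower = (unicodedata.normalize("NFC", s) for s in (word, upper, lower))
    for i, ch in enumerate(word):
        if ch in VOWELS:
            if ch == upper:
                return word[:i] + lower + word[i + 1:]
            break  # the first vowel is a different one: rule does not apply
    return word


print(apply_first_vowel_rule("rēnaz", "ē", "e"))  # renaz
print(apply_first_vowel_rule("rōda", "ō", "o"))   # roda
```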

lengthen vowels

Rule: e:ē in first syllable

Rule: u:ū in first syllable

Rule: ȯ:ū in first syllable

Rule: ä:ǟ in first syllable

VOWEL LENGTHENING

Rule: a:ǭ in first syllable

Rule: a:ā in first syllable

Rule: o:ō in first syllable

LOWER VOWELS

Rule: ī:ē in tīe 15

Destressing in second syllable

Rule: ā:õ

Rule: a:õ

Rule: ū:õ

Rule: õ:i

VOWEL LOSS

Rule: ā:0

Rule: ō:0

Rule: ū:0

Rule: ī:0

Rule: a:0 rēnaz+N+Sg+Gen:

rōda+N+Pl+Par

Rule: e:0

Rule: {õØ}:0

Rule: õ:0

Rule: i:0 in first syllable

Rule: u:0 in second position of first-syllable diphthong

Rule: o:0 in second position of first-syllable diphthong


* pūol0a%^Stress1to2%^ConsL examples:*

* pȯ0llõ00 examples:*
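The two example strings above have the same number of symbols when each `%^…` trigger is counted as one symbol, so they can be read as an aligned lexical/surface pair in which `0` stands for zero. That reading is an assumption made here, as are the helper names `align`, `changed_pairs` and `surface_form`. A minimal Python sketch that lists the symbol pairs that differ:

```python
import re
import unicodedata
from typing import List, Tuple

# A trigger such as %^ConsL is one symbol; anything else is one character.
SYMBOL = re.compile(r"%\^[A-Za-z0-9]+|.", re.DOTALL)


def symbols(text: str) -> List[str]:
    return SYMBOL.findall(unicodedata.normalize("NFC", text))


def align(lexical: str, surface: str) -> List[Tuple[str, str]]:
    """Pair up lexical and surface symbols position by position."""
    upper, lower = symbols(lexical), symbols(surface)
    if len(upper) != len(lower):
        raise ValueError("strings are not aligned symbol by symbol")
    return list(zip(upper, lower))


def changed_pairs(lexical: str, surface: str) -> List[str]:
    """Return the pairs in which the lexical and surface symbols differ."""
    return [f"{u}:{l}" for u, l in align(lexical, surface) if u != l]


def surface_form(surface: str) -> str:
    """Drop the zero symbols to obtain the written form."""
    return unicodedata.normalize("NFC", surface).replace("0", "")


print(changed_pairs("pūol0a%^Stress1to2%^ConsL", "pȯ0llõ00"))
print(surface_form("pȯ0llõ00"))
```

Under this reading the pair exercises several rules from this file at once, among them ū:ȯ in the first syllable, o:0 in the second position of a first-syllable diphthong, and a:õ in the second syllable.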

Zero to vowel

Rule: 0:õ in vowel metathesis

Consonant rules

Consonant loss

Rule: shorten consonantism between 1st and 2nd vowel center jeʼllõ:jelāb

Rule: g:0

Rule: l:0

Rule: m:0

Rule: z:0 rēnaz+N+Sg+Gen:

Consonant lengthening

Lengthening consonantism between first and second vowel center simultaneous to reducing vowel of second syllable

Rule: %{XC%}:Cx

%{XC%}:p 2014-02-27

%{XC%}:s 2020-10-21 tas+N+Sg+Ill

%{XC%}:ž 2014-02-27

%{XC%}:k 2014-02-27

Rule: Stod removal left

Rule: z:ž

Rule: d:ḑ lēʼḑ:līʼed 147 rōda+N+Pl+Par

Rule: ļ:l

Rule: l:ļ This rule should not require the %^ConsRM:0 trigger, but for now this makes it work. kēļ:kēl 215

Rule: n:ņ palatalization

Rule: r:ŗ jūŗ:jūr 221

Rule: d:t

Rule: d:ț

Rule: d:ž

Rules for consonant loss

Rule: d:0 Vow: (ʼ:) (Cns:+) _ (%^Pen: %^VowsRM:|%^VOWRaise:) (%^PreI: %^StodRM:|%^VowsLI1:|%^StodRM:) [%^D2ZERO:0|%^ConsRM:] ;

P loss before subsequent morpheme with underlying

T loss before subsequent morpheme with underlying initial d

Rule: k:0

Rule: š:0

Rule: ț:0

Rule: s:š palatalization

Rule: ǟ:ē palatalization

Rule: ǟ:e short and palatalization

Rule: ä:e short and palatalization

Rule: ǭ:ä palatalization

Rule: a:ä palatalization


This (part of) documentation was generated from src/fst/morphology/phonology.twolc


src-fst-morphology-root.lexc.md

Morphology

INTRODUCTION TO THE MORPHOLOGICAL ANALYSER OF LIVONIAN.

List of the multichar symbols

The morphological analyses of wordforms in Livonian are presented in this system in terms of the symbols declared below.

(It is strongly recommended to follow existing GiellaLT standards when adding new tags.)

The parts-of-speech are:

Parts of speech are further split up into:

Nouns

Pronouns

Nominals are inflected for Number and Case

Number

Case

Possession is marked as such:

The comparative forms are:

Numerals are classified under:

Verb moods are:

Tenses

Voice

Verb personal forms are:

Other verb forms are

Verbs are syntactically split according to transitivity:

Usage extents are marked using the following tags:

Abbreviated words are classified with:

Special symbols are classified with:

Special multiword units are analysed with:

Normative/prescriptive compounding tags

(to govern compound behaviour for the speller, ie what a compound SHOULD BE):

The first part of the component may be ..

This entry / word can …

Non-dictionary words can be recognised with:

Question and Focus particles:

Semantics are classified with

Homonymy

Derivations are classified under the morphophonetic form of the suffix, the source and target part-of-speech.

Symbols that need to be escaped on the lower side (towards twolc):

Morphophonology

To represent phonologic variations in word forms we use the following symbols in the lexicon files:

And the following triggers to control variation:

Flag diacritics

We have manually optimised the structure of our lexicon using the following flag diacritics to restrict morphological combinatorics: only allow compounds with verbs if the verb is further derived into a noun again.

| Flag | Explanation |
| --- | --- |
| @P.NeedNoun.ON@ | (Dis)allow compounds with verbs unless nominalised |
| @D.NeedNoun.ON@ | (Dis)allow compounds with verbs unless nominalised |
| @C.NeedNoun@ | (Dis)allow compounds with verbs unless nominalised |

For languages that allow compounding, the following flag diacritics are needed to control position-based compounding restrictions for nominals. Their use is handled automatically if combined with +CmpN/xxx tags. If not used, they will do no harm.

| Flag | Explanation |
| --- | --- |
| @P.CmpFrst.FALSE@ | Require that words tagged as such only appear first |
| @D.CmpPref.TRUE@ | Block such words from entering ENDLEX |
| @P.CmpPref.FALSE@ | Block these words from making further compounds |
| @D.CmpLast.TRUE@ | Block such words from entering R |
| @D.CmpNone.TRUE@ | Combines with the next tag to prohibit compounding |
| @U.CmpNone.FALSE@ | Combines with the prev tag to prohibit compounding |
| @P.CmpOnly.TRUE@ | Sets a flag to indicate that the word has passed R |
| @D.CmpOnly.FALSE@ | Disallow words coming directly from root. |

Use the following flag diacritics to control downcasing of derived proper nouns (e.g. Finnish Pariisi -> pariisilainen). See e.g. North Sámi for how to use these flags. There exists a ready-made regex that will do the actual down-casing given the proper use of these flags.

| Flag | Explanation |
| --- | --- |
| @U.Cap.Obl@ | Allowing downcasing of derived names: deatnulasj. |
| @U.Cap.Opt@ | Allowing downcasing of derived names: deatnulasj. |
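The flags listed above use four operators: P (set a feature to a value), D (disallow the path if the feature has that value), C (clear the feature) and U (unify, i.e. accept only if the feature is unset or already has that value). The Python sketch below models these operators in simplified form to show how a path through the lexicon is accepted or rejected. It is an illustration, not the implementation used by the FST tools; the function name `path_is_valid` is hypothetical, and other operators are deliberately left out.

```python
from typing import Dict, Iterable, Optional


def path_is_valid(flags: Iterable[str]) -> bool:
    """Check a sequence of flag diacritics such as '@P.NeedNoun.ON@'."""
    state: Dict[str, Optional[str]] = {}
    for flag in flags:
        op, feature, *rest = flag.strip("@").split(".")
        value: Optional[str] = rest[0] if rest else None
        current = state.get(feature)
        if op == "P":        # positive set: feature := value
            state[feature] = value
        elif op == "C":      # clear: forget the feature
            state.pop(feature, None)
        elif op == "D":      # disallow: fail if the feature has this value
            if current is not None and (value is None or current == value):
                return False
        elif op == "U":      # unify: fail on a conflicting value, else set
            if current is not None and current != value:
                return False
            state[feature] = value
        else:
            raise ValueError(f"operator not covered by this sketch: {flag}")
    return True


# A verb sets NeedNoun; compounding is blocked until a nominaliser clears it.
print(path_is_valid(["@P.NeedNoun.ON@", "@D.NeedNoun.ON@"]))                  # False
print(path_is_valid(["@P.NeedNoun.ON@", "@C.NeedNoun@", "@D.NeedNoun.ON@"]))  # True
```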

Root lexicon

The word forms in Livonian start from the lexeme roots of basic word classes

Lexica for words that are not inflected

These are put here for the time being

adverb lexicon

Interjections lexicon

pcle-mod lexicon

pcle-lexicon

This is used in compounding, e.g. äʼb-:äʼb


This (part of) documentation was generated from src/fst/morphology/root.lexc


src-fst-morphology-stems-acronyms.lexc.md

Acronyms

Livonian acronyms …


This (part of) documentation was generated from src/fst/morphology/stems/acronyms.lexc


src-fst-morphology-stems-adjectives_newwords.lexc.md

This is where new words are added as lexc entries before they are added to the XML source files.

A_ “(eng) /(est) /(fin) /(lav)” ;

ADD NEW ADJECTIVES BELOW


This (part of) documentation was generated from src/fst/morphology/stems/adjectives_newwords.lexc


src-fst-morphology-stems-adverbs_newwords.lexc.md

This is where new words are added as lexc entries before they are added to the XML source files.

ADV_ “(eng) /(est) /(fin) /(lav)” ;

ADD NEW ADVERBS BELOW


This (part of) documentation was generated from src/fst/morphology/stems/adverbs_newwords.lexc


src-fst-morphology-stems-exceptions.lexc.md

Exceptions are irregular word forms that do not fit anywhere else. This file contains all enumerated word forms that cannot reasonably be generated from lexical data by regular inflection. There should usually be next to no exceptions: it is always better to have a paradigm that covers only one or a few words than an exception, since exceptions do not work well with e.g. the compounding scheme and possibly many end applications.

The verbs of negation have partial inflection:

Some verbs have only a few word forms left:

The verb lǟdõ has irregular forms:

The verb vȱlda has irregular forms:

PRONOUNS

PROPER NOUNS

NOUNS partitive for morfa demo

NUMERALS testing

testing what is this


This (part of) documentation was generated from src/fst/morphology/stems/exceptions.lexc


src-fst-morphology-stems-nouns_newwords.lexc.md

This is where new words are added as lexc entries before they are added to the XML source files.

N_ “(eng) ear/(est) /(fin) /(lav)” ;

ADD NEW NOUNS BELOW


This (part of) documentation was generated from src/fst/morphology/stems/nouns_newwords.lexc


src-fst-morphology-stems-propernouns_newwords.lexc.md

This is where new words are added as lexc entries before they are added to the XML source files.

PROP_ “(eng) ear/(est) /(fin) /(lav)” ;


This (part of) documentation was generated from src/fst/morphology/stems/propernouns_newwords.lexc


src-fst-morphology-stems-questionablemisc_newwords.lexc.md

This is where new words are added as lexc entries before they are added to the XML source files.

V_ “(eng) ear/(est) /(fin) /(lav)” ;


This (part of) documentation was generated from src/fst/morphology/stems/questionablemisc_newwords.lexc


src-fst-morphology-stems-verbs_newwords.lexc.md

This is where new words are added as lexc entries before they are added to the XML source files.

V_ “(eng) ear/(est) /(fin) /(lav)” ;

Add new verbs below


This (part of) documentation was generated from src/fst/morphology/stems/verbs_newwords.lexc


src-fst-phonetics-txt2ipa.xfscript.md

| Description | X-SAMPA | IPA | Unicode (hex, decimal) |
| --- | --- | --- | --- |
| retroflex plosive, voiceless | t` (` = ASCII 096) | ʈ | 0288, 648 |
| retroflex plosive, voiced | d` | ɖ | 0256, 598 |
| labiodental nasal | F | ɱ | 0271, 625 |
| retroflex nasal | n` | ɳ | 0273, 627 |
| palatal nasal | J | ɲ | 0272, 626 |
| velar nasal | N | ŋ | 014B, 331 |
| uvular nasal | N\ | ɴ | 0274, 628 |
| bilabial trill | B\ | ʙ | 0299, 665 |
| uvular trill | R\ | ʀ | 0280, 640 |
| alveolar tap | 4 | ɾ | 027E, 638 |
| retroflex flap | r` | ɽ | 027D, 637 |
| bilabial fricative, voiceless | p\ | ɸ | 0278, 632 |
| bilabial fricative, voiced | B | β | 03B2, 946 |
| dental fricative, voiceless | T | θ | 03B8, 952 |
| dental fricative, voiced | D | ð | 00F0, 240 |
| postalveolar fricative, voiceless | S | ʃ | 0283, 643 |
| postalveolar fricative, voiced | Z | ʒ | 0292, 658 |
| retroflex fricative, voiceless | s` | ʂ | 0282, 642 |
| retroflex fricative, voiced | z` | ʐ | 0290, 656 |
| palatal fricative, voiceless | C | ç | 00E7, 231 |
| palatal fricative, voiced | j\ | ʝ | 029D, 669 |
| velar fricative, voiced | G | ɣ | 0263, 611 |
| uvular fricative, voiceless | X | χ | 03C7, 967 |
| uvular fricative, voiced | R | ʁ | 0281, 641 |
| pharyngeal fricative, voiceless | X\ | ħ | 0127, 295 |
| pharyngeal fricative, voiced | ?\ | ʕ | 0295, 661 |
| glottal fricative, voiced | h\ | ɦ | 0266, 614 |

alveolar lateral fricative, vl. K alveolar lateral fricative, vd. K\

labiodental approximant P (or v) alveolar approximant r\ retroflex approximant r` velar approximant M\

retroflex lateral approximant l` palatal lateral approximant L velar lateral approximant L
Clicks

bilabial O\ (O = capital letter) dental |
(post)alveolar !\ palatoalveolar =\ alveolar lateral ||
Ejectives, implosives

ejective > e.g. ejective p p> implosive < e.g. implosive b b< Vowels

close back unrounded M close central unrounded 1 close central rounded } lax i I lax y Y lax u U

close-mid front rounded 2 close-mid central unrounded @\ close-mid central rounded 8 close-mid back unrounded 7

schwa ə @

open-mid front unrounded E open-mid front rounded 9 open-mid central unrounded 3 open-mid central rounded 3\ open-mid back unrounded V open-mid back rounded O

ash (ae digraph) { open schwa (turned a) 6

open front rounded & open back unrounded A open back rounded Q Other symbols

voiceless labial-velar fricative W voiced labial-palatal approx. H voiceless epiglottal fricative H\ voiced epiglottal fricative <\ epiglottal plosive >\

alveolo-palatal fricative, vl. s\ alveolo-palatal fricative, voiced z\ alveolar lateral flap l\ simultaneous S and x x\ tie bar _ Suprasegmentals

primary stress “ secondary stress % long : half-long :\ extra-short _X linking mark -
Tones and word accents

level extra high _T level high _H level mid _M level low _L level extra low _B downstep ! upstep ^ (caret, circumflex)

contour, rising contour, falling _F contour, high rising _H_T contour, low rising _B_L

contour, rising-falling _R_F (NB Instead of being written as diacritics with _, all prosodic marks can alternatively be placed in a separate tier, set off by < >, as recommended for the next two symbols.) global rise global fall Diacritics

voiceless 0 (0 = figure), e.g. n_0 voiced _v aspirated _h more rounded _O (O = letter) less rounded _c advanced _+ retracted _- centralized _” syllabic = (or _=) e.g. n= (or n=) non-syllabic _^ rhoticity `

breathy voiced _t creaky voiced _k linguolabial _N labialized _w palatalized ‘ (or _j) e.g. t’ (or t_j) velarized _G pharyngealized _?\

dental d apical _a laminal _m nasalized ~ (or _~) e.g. A~ (or A~) nasal release _n lateral release _l no audible release _}

velarized or pharyngealized _e velarized l, alternatively 5 raised _r lowered _o advanced tongue root _A retracted tongue root _q
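The table above pairs an ASCII notation (which appears to follow X-SAMPA conventions) with IPA characters and their Unicode code points. As a minimal sketch of how such a mapping can be applied, the Python snippet below converts a string by longest match, so that two-character symbols such as `N\` win over their one-character prefixes such as `N`. The names `XSAMPA_TO_IPA` and `xsampa_to_ipa` are hypothetical and the dictionary covers only a handful of rows; the snippet does not reproduce what `txt2ipa.xfscript` actually does.

```python
# Hypothetical illustration; not taken from txt2ipa.xfscript.
# Only a small subset of the rows in the table is included.
XSAMPA_TO_IPA = {
    "F": "\u0271",    # labiodental nasal
    "J": "\u0272",    # palatal nasal
    "N": "\u014b",    # velar nasal
    "N\\": "\u0274",  # uvular nasal
    "B": "\u03b2",    # bilabial fricative, voiced
    "B\\": "\u0299",  # bilabial trill
    "R": "\u0281",    # uvular fricative, voiced
    "R\\": "\u0280",  # uvular trill
    "T": "\u03b8",    # dental fricative, voiceless
    "D": "\u00f0",    # dental fricative, voiced
    "S": "\u0283",    # postalveolar fricative, voiceless
    "Z": "\u0292",    # postalveolar fricative, voiced
}

# Longer symbols are tried first so that "N\" is not read as "N" + "\".
_KEYS_BY_LENGTH = sorted(XSAMPA_TO_IPA, key=len, reverse=True)


def xsampa_to_ipa(text):
    """Replace known ASCII symbols by IPA; copy everything else unchanged."""
    out = []
    i = 0
    while i < len(text):
        for key in _KEYS_BY_LENGTH:
            if text.startswith(key, i):
                out.append(XSAMPA_TO_IPA[key])
                i += len(key)
                break
        else:
            # No symbol starts here: keep the character as it is.
            out.append(text[i])
            i += 1
    return "".join(out)
```

A finite-state transducer handles the same ambiguity through rule ordering or longest-match replacement operators; the loop above only mimics that behaviour for illustration.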


This (part of) documentation was generated from src/fst/phonetics/txt2ipa.xfscript


src-fst-transcriptions-transcriptor-abbrevs2text.lexc.md

We describe here how abbreviations in Liv are read out, e.g. for text-to-speech systems.

For example:


This (part of) documentation was generated from src/fst/transcriptions/transcriptor-abbrevs2text.lexc


src-fst-transcriptions-transcriptor-numbers-digit2text.lexc.md

Starting work with ordinals


This (part of) documentation was generated from src/fst/transcriptions/transcriptor-numbers-digit2text.lexc


tools-grammarcheckers-grammarchecker.cg3.md

[ L A N G U A G E ] G R A M M A R C H E C K E R

DELIMITERS

TAGS AND SETS

Tags

This section lists all the tags inherited from the fst, and used as tags in the syntactic analysis. The next section, Sets, contains sets defined on the basis of the tags listed here, those set names are not visible in the output.

Beginning and end of sentence

BOS EOS

Parts of speech tags

N A Adv V Pron CS CC CC-CS Po Pr Pcle Num Interj ABBR ACR CLB LEFT RIGHT WEB PPUNCT PUNCT

COMMA ¶

Tags for POS sub-categories

Pers Dem Interr Indef Recipr Refl Rel Coll NomAg Prop Allegro Arab Romertall

Tags for morphosyntactic properties

Nom Acc Gen Ill Loc Com Ess Ess Sg Du Pl Cmp/SplitR Cmp/SgNom Cmp/SgGen Cmp/SgGen PxSg1 PxSg2 PxSg3 PxDu1 PxDu2 PxDu3 PxPl1 PxPl2 PxPl3 Px

Comp Superl Attr Ord Qst IV TV Prt Prs Ind Pot Cond Imprt ImprtII Sg1 Sg2 Sg3 Du1 Du2 Du3 Pl1 Pl2 Pl3 Inf ConNeg Neg PrfPrc VGen PrsPrc Ger Sup Actio VAbess

Err/Orth

Semantic tags

Sem/Act Sem/Ani Sem/Atr Sem/Body Sem/Clth Sem/Domain Sem/Feat-phys Sem/Fem Sem/Group Sem/Lang Sem/Mal Sem/Measr Sem/Money Sem/Obj Sem/Obj-el Sem/Org Sem/Perc-emo Sem/Plc Sem/Sign Sem/State-sick Sem/Sur Sem/Time Sem/Txt

HUMAN

PROP-ATTR PROP-SUR

TIME-N-SET

Syntactic tags

@+FAUXV @+FMAINV @-FAUXV @-FMAINV @-FSUBJ> @-F<OBJ @-FOBJ> @-FSPRED<OBJ @-F<ADVL @-FADVL> @-F<SPRED @-F<OPRED @-FSPRED> @-FOPRED> @>ADVL @ADVL< @<ADVL @ADVL> @ADVL @HAB> @<HAB @>N @Interj @N< @>A @P< @>P @HNOUN @INTERJ @>Num @Pron< @>Pron @Num< @OBJ @<OBJ @OBJ> @OPRED @<OPRED @OPRED> @PCLE @COMP-CS< @SPRED @<SPRED @SPRED> @SUBJ @<SUBJ @SUBJ> SUBJ SPRED OPRED @PPRED @APP @APP-N< @APP-Pron< @APP>Pron @APP-Num< @APP-ADVL< @VOC @CVP @CNP OBJ

-OTHERS SYN-V @X

## Sets containing sets of lists and tags

This part of the file lists a large number of sets based partly upon the tags defined above, and partly upon lexemes drawn from the lexicon. See the sourcefile itself to inspect the sets, what follows here is an overview of the set types.

### Sets for Single-word sets

INITIAL

### Sets for word or not

WORD NOT-COMMA

### Case sets

ADLVCASE

CASE-AGREEMENT CASE

NOT-NOM NOT-GEN NOT-ACC

### Verb sets

NOT-V

### Sets for finiteness and mood

REAL-NEG

MOOD-V

NOT-PRFPRC

### Sets for person

SG1-V SG2-V SG3-V DU1-V DU2-V DU3-V PL1-V PL2-V PL3-V

### Pronoun sets

### Adjectival sets and their complements

### Adverbial sets and their complements

### Sets of elements with common syntactic behaviour

### NP sets defined according to their morphosyntactic features

### The PRE-NP-HEAD family of sets

These sets model noun phrases (NPs). The idea is to first define whatever can occur in front of the head of the NP, and thereafter negate that with the expression **WORD - premodifiers**.

### Border sets and their complements

### Grammarchecker sets

* * *

This (part of) documentation was generated from [tools/grammarcheckers/grammarchecker.cg3](https://github.com/giellalt/lang-liv/blob/main/tools/grammarcheckers/grammarchecker.cg3)

---

# tools-tokenisers-tokeniser-disamb-gt-desc.pmscript.md

# Tokeniser for liv

Usage:

```
$ make
$ echo "ja, ja" | hfst-tokenise --giella-cg tokeniser-disamb-gt-desc.pmhfst
$ echo "Juos gorreválggain lea (dárbbašlaš) deavdit gáibádusa boasttu olmmoš, man mielde lahtuid." | hfst-tokenise --giella-cg tokeniser-disamb-gt-desc.pmhfst
$ echo "(gáfe) 'ja' ja 3. ja? ц jaja ukjend \"ukjend\"" | hfst-tokenise --giella-cg tokeniser-disamb-gt-desc.pmhfst
$ echo "márffibiillagáffe" | hfst-tokenise --giella-cg tokeniser-disamb-gt-desc.pmhfst
```

Pmatch documentation: <https://github.com/hfst/hfst/wiki/HfstPmatch>

Characters which have analyses in the lexicon, but can appear without spaces before/after, that is, with no context conditions, and adjacent to words:

* Punct contains ASCII punctuation marks
* The symbol after m-dash is soft-hyphen `U+00AD`
* The symbol following {•} is byte-order-mark / zero-width no-break space `U+FEFF`.

Whitespace contains ASCII white space and the List contains some unicode white space characters

* En Quad U+2000 to Zero-Width Joiner U+200D
* Narrow No-Break Space U+202F
* Medium Mathematical Space U+205F
* Word joiner U+2060

Apart from what's in our morphology, there are

1. unknown word-like forms, and
2. unmatched strings

We want to give 1) a match, but let 2) be treated specially by `hfst-tokenise -a`

Unknowns are made of:

* lower-case ASCII
* upper-case ASCII
* select extended latin symbols
* liv specific latin extension
* ASCII digits
* select symbols
* Combining diacritics as individual symbols,
* various symbols from Private area (probably Microsoft), so far:
  * U+F0B7 for "x in box"

## Unknown handling

Unknowns are tagged ?? and treated specially with `hfst-tokenise`

hfst-tokenise --giella-cg will treat such empty analyses as unknowns, and remove empty analyses from other readings. Empty readings are also legal in CG, they get a default baseform equal to the wordform, but no tag to check, so it's safer to let hfst-tokenise handle them.

Finally we mark as a token any sequence making up a:

* known word in context
* unknown (OOV) token in context
* sequence of word and punctuation
* URL in context

* * *

This (part of) documentation was generated from [tools/tokenisers/tokeniser-disamb-gt-desc.pmscript](https://github.com/giellalt/lang-liv/blob/main/tools/tokenisers/tokeniser-disamb-gt-desc.pmscript)

---

# tools-tokenisers-tokeniser-gramcheck-gt-desc.pmscript.md

# Grammar checker tokenisation for liv

Requires a recent version of HFST (3.10.0 / git revision>=3aecdbc)

Then just:

```
$ make
$ echo "ja, ja" | hfst-tokenise --giella-cg tokeniser-disamb-gt-desc.pmhfst
```

More usage examples:

```
$ echo "Juos gorreválggain lea (dárbbašlaš) deavdit gáibádusa boasttu olmmoš, man mielde lahtuid." | hfst-tokenise --giella-cg tokeniser-disamb-gt-desc.pmhfst
$ echo "(gáfe) 'ja' ja 3. ja? ц jaja ukjend \"ukjend\"" | hfst-tokenise --giella-cg tokeniser-disamb-gt-desc.pmhfst
$ echo "márffibiillagáffe" | hfst-tokenise --giella-cg tokeniser-disamb-gt-desc.pmhfst
```

Pmatch documentation: <https://github.com/hfst/hfst/wiki/HfstPmatch>

Characters which have analyses in the lexicon, but can appear without spaces before/after, that is, with no context conditions, and adjacent to words:

* Punct contains ASCII punctuation marks
* The symbol after m-dash is soft-hyphen `U+00AD`
* The symbol following {•} is byte-order-mark / zero-width no-break space `U+FEFF`.

Whitespace contains ASCII white space and the List contains some unicode white space characters

* En Quad U+2000 to Zero-Width Joiner U+200D
* Narrow No-Break Space U+202F
* Medium Mathematical Space U+205F
* Word joiner U+2060

Apart from what's in our morphology, there are 1) unknown word-like forms, and 2) unmatched strings

We want to give 1) a match, but let 2) be treated specially by hfst-tokenise -a

* select extended latin symbols
* select symbols
* various symbols from Private area (probably Microsoft), so far:
  * U+F0B7 for "x in box"

TODO: Could use something like this, but built-in's don't include šžđčŋ:

Simply give an empty reading when something is unknown: hfst-tokenise --giella-cg will treat such empty analyses as unknowns, and remove empty analyses from other readings. Empty readings are also legal in CG, they get a default baseform equal to the wordform, but no tag to check, so it's safer to let hfst-tokenise handle them.

Finally we mark as a token any sequence making up a:

* known word in context
* unknown (OOV) token in context
* sequence of word and punctuation
* URL in context

* * *

This (part of) documentation was generated from [tools/tokenisers/tokeniser-gramcheck-gt-desc.pmscript](https://github.com/giellalt/lang-liv/blob/main/tools/tokenisers/tokeniser-gramcheck-gt-desc.pmscript)

---

# tools-tokenisers-tokeniser-tts-cggt-desc.pmscript.md

# TTS tokenisation for smj

Requires a recent version of HFST (3.10.0 / git revision>=3aecdbc)

Then just:

```sh
make
echo "ja, ja" \
  | hfst-tokenise --giella-cg tokeniser-disamb-gt-desc.pmhfst
```

More usage examples:

```sh
echo "Juos gorreválggain lea (dárbbašlaš) deavdit gáibádusa \
boasttu olmmoš, man mielde lahtuid." \
  | hfst-tokenise --giella-cg tokeniser-disamb-gt-desc.pmhfst
echo "(gáfe) 'ja' ja 3. ja? ц jaja ukjend \"ukjend\"" \
  | hfst-tokenise --giella-cg tokeniser-disamb-gt-desc.pmhfst
echo "márffibiillagáffe" \
  | hfst-tokenise --giella-cg tokeniser-disamb-gt-desc.pmhfst
```

Pmatch documentation: <https://kitwiki.csc.fi/twiki/bin/view/KitWiki/HfstPmatch>

Characters which have analyses in the lexicon, but can appear without spaces before/after, that is, with no context conditions, and adjacent to words:

* Punct contains ASCII punctuation marks
* The symbol after m-dash is soft-hyphen `U+00AD`
* The symbol following {•} is byte-order-mark / zero-width no-break space `U+FEFF`.

Whitespace contains ASCII white space and the List contains some unicode white space characters

* En Quad U+2000 to Zero-Width Joiner U+200D
* Narrow No-Break Space U+202F
* Medium Mathematical Space U+205F
* Word joiner U+2060

Apart from what's in our morphology, there are 1) unknown word-like forms, and 2) unmatched strings

We want to give 1) a match, but let 2) be treated specially by hfst-tokenise -a

* select extended latin symbols
* select symbols
* various symbols from Private area (probably Microsoft), so far:
  * U+F0B7 for "x in box"

TODO: Could use something like this, but built-in's don't include šžđčŋ:

Simply give an empty reading when something is unknown: hfst-tokenise --giella-cg will treat such empty analyses as unknowns, and remove empty analyses from other readings. Empty readings are also legal in CG, they get a default baseform equal to the wordform, but no tag to check, so it's safer to let hfst-tokenise handle them.

Needs hfst-tokenise to output things differently depending on the tag they get

* * *

This (part of) documentation was generated from [tools/tokenisers/tokeniser-tts-cggt-desc.pmscript](https://github.com/giellalt/lang-liv/blob/main/tools/tokenisers/tokeniser-tts-cggt-desc.pmscript)
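
The tokeniser descriptions above all list the same Unicode whitespace inventory: U+2000 to U+200D, U+202F, U+205F and U+2060, in addition to ASCII white space. The following Python sketch shows what that inventory amounts to when used as a separator class. The function names are hypothetical, and the exact set of ASCII whitespace characters is an assumption; the real tokenisers are pmatch scripts run through `hfst-tokenise` and do considerably more than split on whitespace.

```python
# Hypothetical sketch; names are illustrative and not taken from the pmscript files.

# Assumption: "ASCII white space" means space, tab, LF, CR, VT and FF.
ASCII_WHITESPACE = {" ", "\t", "\n", "\r", "\x0b", "\x0c"}

# Code points listed in the documentation above.
UNICODE_WHITESPACE = (
    {chr(cp) for cp in range(0x2000, 0x200D + 1)}  # En Quad .. Zero-Width Joiner
    | {"\u202f"}  # Narrow No-Break Space
    | {"\u205f"}  # Medium Mathematical Space
    | {"\u2060"}  # Word joiner
)


def is_tokeniser_whitespace(ch):
    """Return True if ch belongs to the whitespace inventory described above."""
    return ch in ASCII_WHITESPACE or ch in UNICODE_WHITESPACE


def split_on_whitespace(text):
    """Split text into chunks separated by any character of the inventory."""
    tokens = []
    current = []
    for ch in text:
        if is_tokeniser_whitespace(ch):
            if current:
                tokens.append("".join(current))
                current = []
        else:
            current.append(ch)
    if current:
        tokens.append("".join(current))
    return tokens
```

Note that Python's own `str.split()` does not treat U+200B to U+200D or U+2060 as whitespace, which is why the inventory is spelled out explicitly here.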