Liv NLP Grammar

Finite state and Constraint Grammar based analysers, proofing tools and other resources

View the project on GitHub giellalt/lang-liv

  • Liv language model documentation

    All doc-comment documentation in one large file.


    src-cg3-functions.cg3.md

    These sets model noun phrases (NPs). The idea is to first define whatever can occur in front of the head of the NP, and thereafter negate that with the expression WORD - premodifiers.

    The set NOT-NPMOD is used to find barriers between NPs. Typical usage: … (*1 N BARRIER NOT-NPMOD) …, meaning: scan to the first noun, ignoring anything that can be part of the noun phrase of that noun (i.e., “scan to the next NP head”).
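    The barrier scan described above can be modelled in a few lines of Python. This is only a sketch of the idea, not of the CG3 engine: each cohort is reduced to a hypothetical set of set names, and the sketch assumes that the target is tested before the barrier, so that a noun which is itself a member of NOT-NPMOD can still be found.

```python
from typing import List, Optional, Set


def scan_right(cohorts: List[Set[str]], start: int,
               target: str, barrier: str) -> Optional[int]:
    """Return the index of the first cohort at or after `start` that carries
    `target`, or None if a cohort carrying `barrier` is reached first."""
    for i in range(start, len(cohorts)):
        tags = cohorts[i]
        if target in tags:       # assumption: target is tested before the barrier
            return i
        if barrier in tags:      # something that cannot be part of the NP
            return None
    return None


# Hypothetical cohorts; the set memberships are invented for illustration.
open_np = [
    {"V", "NOT-NPMOD"},    # position 0: the word we scan from
    {"Det"},               # premodifier, may occur inside the NP
    {"A"},                 # premodifier
    {"N", "NOT-NPMOD"},    # NP head
]
blocked_np = [
    {"V", "NOT-NPMOD"},
    {"Det"},
    {"V", "NOT-NPMOD"},    # a verb intervenes before any noun
    {"N", "NOT-NPMOD"},
]

head = scan_right(open_np, 1, "N", "NOT-NPMOD")
blocked = scan_right(blocked_np, 1, "N", "NOT-NPMOD")
```

    In the first list the scan passes over the two premodifiers and stops at the noun; in the second it gives up at the intervening verb.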

    These were the set types.

    HABITIVE MAPPING

    sma object

    SUBJ MAPPING - leftovers

    OBJ MAPPING - leftovers

    HNOUN MAPPING


    This (part of) documentation was generated from src/cg3/functions.cg3


    src-fst-morphology-affixes-adjectives.lexc.md

    Adjective inflection

    This file documents affixes/adjectives.lexc, the file for Livonian adjective inflection.

    Indeclinables

    **LEXICON A_-ZERO = modifiers that do not decline, goes to #

    **LEXICON A_ = gives Pos tag.

    Stem lexica

    LEXICON A_PŪ contains pū: 12

    LEXICON A_BRĪ contains brī:brī 16

    LEXICON A_KALĀ contains kalā:kaʼlā 18

    LEXICON A_TUBĀ tubā:tuʼbā 19

    LEXICON A_AMĀ amā:aʼm 19a

    LEXICON A_AIGĀ aigā:aʼig 20

    LEXICON A_KŪJA kūja:??lēba 21

    LEXICON A_IZĀ izā:izā 25

    LEXICON A_OKSĀ oksā:oksā 30

    LEXICON A_ĀIGA āiga:āiga 33

    LEXICON A_SĪLMA sīlma:sīlma 34

    LEXICON A_PADĀ padā:padā 39

    LEXICON A_KÄPĀ käpā:käpā 41

    LEXICON A_MAKSĀ maksā:maksā 42

    LEXICON A_KĒRA kēra:kēra 43

    LEXICON A_JǬRA jǭra:jǭra 44

    LEXICON A_ĀITA āita:āita 46

    LEXICON A_ŪŠKA ūška:ūška 47

    LEXICON A_MȬKA mȭka:mȭka 48

    LEXICON A_DADŽĀ dadžā:dadžā 49

    LEXICON A_TĪERA tīera:tīera 54

    LEXICON A_LILLA kuțā:kuțā 57

    LEXICON A_KIʼV kiʼv:kiv 59

    LEXICON A_PIʼŅ piʼņ:piņ 64

    LEXICON A_OKŠ : 68

    LEXICON A_KAŠ : 69

    LEXICON A_TORĪ torī: 71

    LEXICON A_KÕʼL kõʼl:kõl 73

    LEXICON A_NIʼM niʼm:nim 76

    LEXICON A_KAND kand: 94

    LEXICON A_UL ul: 99

    LEXICON A_NIŖȚ niŗț: 102

    LEXICON A_DAŅTŠ daņtš: 105

    LEXICON A_TÄUŽ täuž: adres 112

    LEXICON A_SIELDÕ sieldõ: 118

    LEXICON A_NǬʼGÕ nǭʼgõ:nǭgõ 119

    LEXICON A_AŠŠÕ : 120

    LEXICON A_DRŪʼOŠÕ drūʼošõ:drūošõ 121

    LEXICON A_IRM : 125

    LEXICON A_KIM : 126

    LEXICON A_VAʼIT vaʼit:vait 128

    LEXICON A_AMĀT : 129

    LEXICON A_SAʼGDIT saʼgdit:sagdit 131

    LEXICON A_VIĻȚ : 132

    LEXICON A_EĻ eļ: 133

    LEXICON A_BLĒʼḐ blēʼḑ:blēḑ 134

    LEXICON A_FAKT : 135

    LEXICON A_SĪEND sīend: 138

    LEXICON A_LǞʼND lǟʼnd:lǟnd 139

    LEXICON A_ĀIGAST āigast: 140

    LEXICON A_ANALĪZ analīz: 141

    LEXICON A_NĪʼEM nīʼem:nīem 142

    LEXICON A_VIŠ : 144

    LEXICON A_SIDĀM : 157

    LEXICON A_TŪOITÕG : 158

    LEXICON A_KǬRAND kǭrand: 159

    LEXICON A_ȬʼDÕG ȭʼdõg:ȭdõg 160

    LEXICON A_TAPTÕD taptõd: 161

    LEXICON A_TĪʼEDÕD tīʼedõd:tīedõd 162

    LEXICON A_VĪDÕZ vīdõz: 163

    LEXICON A_TUOISTÕNZ : 164

    LEXICON A_ĪʼDÕKSMÕZ īʼdõksmõz:īdõksmõz 165

    LEXICON A_KÄBRĀZ : 168

    LEXICON A_MAIGĀZ : 169

    LEXICON A_NÕTKĀZ : 170

    LEXICON A_RIKĀZ : 171

    LEXICON A_ĀMBAZ āmbaz:āmba 173

    LEXICON A_PŪŖAZ : 174

    LEXICON A_PǬĻAZ : 175

    LEXICON A_MÕTKÕZ mõtkõz: 179

    LEXICON A_VȬRÕZ vȭrõz: 180

    LEXICON A_ARĀGÕZ : 181

    LEXICON A_ÄʼGGÕZ äʼggõz:äggõz 182

    LEXICON A_PŪʼDÕZ pūʼdõz:pūdõz 183

    LEXICON A_SĒJI : 186 āndaji:āndaji sēji:sēji

    LEXICON A_AKKIJI akkiji:akkiji 187

    LEXICON A_LĒʼJI lēʼji:lēʼji 188

    LEXICON A_AʼIGI aʼigi:aigi 192

    LEXICON A_PUʼNNI puʼnni:punni 193

    LEXICON A_KAȚKI : 194

    LEXICON A_KUKKI : 195

    LEXICON A_AIGI aigi:aigi 196

    LEXICON A_OUKI : 197

    LEXICON A_PAŖĪ : 198

    LEXICON A_TŪĻI : 199

    LEXICON A_AʼBLI aʼbli:abli 200

    LEXICON A_SĒMI : 201

    LEXICON A_LĒʼMI lēʼmi:lēʼmi 202

    LEXICON A_ALĪZ : 203

    LEXICON A_KĒRATÕKS : 207

    LEXICON A_VARĪKŠ varīkš: 209

    LEXICON A_ŪŽ : 219 ūž:ūd

    LEXICON A_JŪŖ jūŗ:jūr 221

    LEXICON A_SŪR sūr:sūr 222

    LEXICON A_DULLÕNZ dullõnz:dullõn 227

    LEXICON A_AŅGÕRZ : aņgõrz:aņgõr 229

    LEXICON A_TIDĀR tidār:tidār 233

    LEXICON A_APPÕN appõn:appõn 235

    LEXICON A_ǬʼRÕN ǭʼrõn:ǭrõn 236

    LEXICON A_KĪNDÕR kīndõr:kīndõr 237

    LEXICON A_BÄʼZMÕR bäʼzmõr:bäzmõr 238

    LEXICON A_TARĪĻ tarīļ:tarīļ 239

    LEXICON A_ĀNKAŖ ānkaŗ:ānkaŗ 240

    LEXICON A_ǬʼBIĻ ǭʼbiļ:ǭbiļ 242
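    Most of the stem lexica listed above follow a loose pattern: the lexicon name, an optional lemma:stem pair, and an inflection type number. A minimal sketch for reading such lines into records, assuming exactly that shape (the `StemLexicon` record and `parse_lexicon_line` are names made up for this illustration, and extra tokens are simply ignored):

```python
from typing import NamedTuple, Optional


class StemLexicon(NamedTuple):
    name: str                 # e.g. A_KALĀ
    lemma: Optional[str]      # left side of lemma:stem, if given
    stem: Optional[str]       # right side of lemma:stem, if given
    type_no: Optional[str]    # inflection type number, e.g. "18" or "19a"


def parse_lexicon_line(line: str) -> StemLexicon:
    """Parse one non-empty 'LEXICON NAME [contains] lemma:stem number' line."""
    tokens = [t for t in line.split() if t not in ("LEXICON", "contains")]
    name, rest = tokens[0], tokens[1:]
    lemma = stem = type_no = None
    for tok in rest:
        if ":" in tok and lemma is None and stem is None:
            left, _, right = tok.partition(":")
            lemma, stem = left or None, right or None
        elif tok[0].isdigit() and type_no is None:
            type_no = tok
    return StemLexicon(name, lemma, stem, type_no)


kala = parse_lexicon_line("LEXICON A_KALĀ contains kalā:kaʼlā 18")
oks = parse_lexicon_line("LEXICON A_OKŠ : 68")
```

    Lines that give only a bare colon, such as the A_OKŠ entry, come back with no lemma and no stem, which makes the undocumented lexica easy to list.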


    This (part of) documentation was generated from src/fst/morphology/affixes/adjectives.lexc


    src-fst-morphology-affixes-adpositions.lexc.md

    Adposition inflection

    This file documents affixes/adpositions.lexc

    **LEXICON POSTP_ = points to #

    **LEXICON POSTP_ = points to #


    This (part of) documentation was generated from src/fst/morphology/affixes/adpositions.lexc


    src-fst-morphology-affixes-conjunctors.lexc.md

    Conjunctions

    This file documents affixes/conjunctors.lexc

    **LEXICON CONJ_ = These need to be corrected, it points to #.

    **LEXICON CC_ = Livonian conjunctors, points to #

    **LEXICON CS_ = Livonian subjunctors, points to #


    This (part of) documentation was generated from src/fst/morphology/affixes/conjunctors.lexc


    src-fst-morphology-affixes-determiners.lexc.md

    Determiner inflection

    This file documents affixes/determiners.lexc, the language model for Livonian determiner inflection.

    Stem lexica

    LEXICON DET_NAI nai: 191

    LEXICON DET_TŪĻI tūļi: 199

    LEXICON DET_SĒMI sēmi: 201


    This (part of) documentation was generated from src/fst/morphology/affixes/determiners.lexc


    src-fst-morphology-affixes-nouns.lexc.md

    Livonian noun inflection

    This file documents affixes/nouns.lexc, the Livonian noun inflection file.

    Ad hoc lexica

    PROBLEMS with dictionary lexica

    Stem lexica

    Nominal inflection

    Inflection lexica

    13

    14 Stem change: Yes Vowel raising ǟ:ē +Pl +Ela/+Ill/+Par Stød: Yes

    tiēšti

    16 Stem change: None

    SG-INE ;

    18

    18a

    19

    19a

    20

    21

    22

    23 Stem change: Yes Vowel change in 1st syllable ǭ:a Consonant change ij:j Stød: None

    24

    25

    Stem change: Yes

    Stem change: Yes (Vowel)

    33

    33b LĀNGA Stem change: Yes (Vowel) Stød: None

    34

    35

    37

    38

    39, 40, 41, 42

    40

    41

    42

    43

    44

    45

    46

    59 kiv:kiʼvv

    60

    76

    102

    125, 126, 128

    126

    126b

    129, 130, 131

    132

    135

    140, 141, 142 241

    241 was ĀIGAST

    141 87

    142

    142

    143, 144, 145

    145

    158

    159

    160 72

    179

    181

    182

    183

    184

    192

    aʼigi:aʼigi

    199

    211

    212

    225

    226, 227, 228

    233

    SG-DAT ; SG-ELA ; SG-ILL ; SG-INS ; SG-PAR ;

    NUMBER AND CASE

    above as pair in SG-ELA/INE_st; 2014 jaska

    A trigger for z:ž will be required


    This (part of) documentation was generated from src/fst/morphology/affixes/nouns.lexc


    src-fst-morphology-affixes-pronouns.lexc.md

    Pronoun inflection

    This file documents affixes/pronouns.lexc, the file on Livonian pronoun inflection.

    **LEXICON PRON_ = goes to # only.

    LEXICON PRON_MIS mis:mi 1

    LEXICON PRON_JEGĀ jegā:jeʼgā 2

    LEXICON PRON_MŪ mū:m 3

    LEXICON PRON_SE se:s 4

    LEXICON PRON_TÄMĀ tämā: 5

    LEXICON PRON_NE ne:n 4 & 5

    LEXICON PRON_MINĀ 6 ma:m

    LEXICON PRON_MĒG minā:m 6

    LEXICON PRON_SINĀ sinā:0 7

    LEXICON PRON_TĒG tēg:t 7

    LEXICON PRON_KIS kis:kī 8

    LEXICON PRON_ĪʼŽ 9 īʼž:0

    LEXICON PRON_MIDĀGÕD midāgõd:midāg 10

    LEXICON PRON_MITS 11 mits:mit

    LEXICON PRON_SET 11b set:set

    Stem lexica

    LEXICON PRON_TUBĀ tubā:tubā 19

    LEXICON PRON_TUBĀ-PL tubā:tubā 19

    LEXICON PRON_ĀITA āita:āita 46

    LEXICON PRON_ĀIGAST āigast: 140

    LEXICON PRON_AZŪM-PL azūm: 153

    LEXICON PRON_VĪDÕZ vīdõz: 163

    LEXICON PRON_ĪKŠ : 217


    This (part of) documentation was generated from src/fst/morphology/affixes/pronouns.lexc


    src-fst-morphology-affixes-propernouns.lexc.md

    Proper noun inflection

    This file documents affixes/propernouns.lexc, the file for inflection of proper nouns.

    Livonian proper nouns inflect in the same cases as regular nouns, but with a colon (‘:’) as separator.

    **LEXICON PROP_ = this lexicon goes to K only

    Stem lexica

    LEXICON PROP_TOP_PŪ contains pū: 12

    LEXICON PROP_PŪ contains pū: 12

    LEXICON PROP_PŪ-SG contains pū: 12

    LEXICON PROP_KALĀ contains kalā:kalā 18

    LEXICON PROP_IRĒ-SG contains irē:iʼr 18a

    LEXICON PROP_TUBĀ tubā:tubā 19

    LEXICON PROP_VĒNA vēna:vēna 37

    LEXICON PROP_PADĀ padā:padā 39

    LEXICON PROP_JǬRA jǭra:jǭra 44

    LEXICON PROP_JǬRA-PL jǭra:jǭra 44

    LEXICON PROP_ĀITA āita:āita 46

    LEXICON PROP_DADŽĀ dadžā:dadžā 49

    LEXICON PROP_KRǬIPA krǭipa:krǭipa 55

    LEXICON PROP_DUŅTŠ : 70

    LEXICON PROP_NIʼM niʼm:niʼm 76

    LEXICON PROP_NIʼM-PL niʼm:niʼm 76

    LEXICON PROP_TUP tup:tup 79

    LEXICON PROP_NǬʼGÕ nǭʼgõ:nǭgõ 119

    LEXICON PROP_KǬJ : 123

    LEXICON PROP_KIM : 126

    LEXICON PROP_KIM-SG : 126

    LEXICON PROP_VAʼIT vaʼit:vait 128

    LEXICON PROP_AMĀT : 129

    LEXICON PROP_KULTŪR : 130

    LEXICON PROP_VIĻȚ : 132

    LEXICON PROP_FAKT fakt:fakt 135

    LEXICON PROP_FAKT-SG fakt:fakt 135

    LEXICON PROP_ĀIGAST : 140

    LEXICON PROP_ANALĪZ : 141

    LEXICON PROP_NĪʼEM-SG nīʼem:nīʼem 142

    LEXICON PROP_JAĻKŠ : 143

    LEXICON PROP_RŪʼTŠ rūʼtš:rūʼtš 145

    LEXICON PROP_SIDĀM : 157

    LEXICON PROP_TŪOITÕG : 158

    LEXICON PROP_TŪOITÕG-SG : 158

    LEXICON PROP_KǬRAND : 159

    LEXICON PROP_KǬRAND-SG : 159

    LEXICON PROP_ȬʼDÕG ȭʼdõg:ȭʼdõg 160

    LEXICON PROP_ĀNDÕKS : 206

    LEXICON PROP_PŪOL : 216

    LEXICON PROP_SŪR : 222

    LEXICON PROP_BIRKOV : 224

    LEXICON PROP_SALĀJ-SG : 225

    LEXICON PROP_TIDĀR tidār:tidār 233

    LEXICON PROP_TIDĀR-PL tidār:tidār 233

    LEXICON PROP_PĒGAL pēgal:pēgal 234

    LEXICON PROP_APPÕN appõn:appõn 235

    LEXICON PROP_KĪNDÕR kīndõr:kīndõr 237


    This (part of) documentation was generated from src/fst/morphology/affixes/propernouns.lexc


    src-fst-morphology-affixes-quantifiers.lexc.md

    Quantifier inflection

    This file documents the file on Livonian quantifier morphology.

    LEXICON QNT_APPÕN : 216

    LEXICON QNT_PŪOL : 216

    Stem lexica

    LEXICON NUM_PADĀ padā:padā 39

    LEXICON NUM_KĒRA kēra:kēra 43

    LEXICON NUM_OKŠ : 68

    LEXICON NUM_NǬʼGÕ nǭʼgõ:nǭgõ 119

    LEXICON NUM_IRM irm: 125

    LEXICON NUM_KIM : 126 kim:kim

    LEXICON NUM_FAKT fakt: 135

    LEXICON NUM_ĀIGAST āigast: 140

    LEXICON NUM_NAI nai: 191

    LEXICON NUM_ÄʼBȚÕKS äʼbțõks:äbțõks 208

    LEXICON NUM_TŪĻ : 214

    LEXICON NUM_ĪKŠ : 217

    LEXICON NUM_KAKŠ : 218

    LEXICON NUM_ŪŽ : 219

    LEXICON NUM_APPÕN appõn:appõn 235


    This (part of) documentation was generated from src/fst/morphology/affixes/quantifiers.lexc


    src-fst-morphology-affixes-symbols.lexc.md

    Symbol affixes

    **LEXICON Noun_symbols_possibly_inflected =

    **LEXICON Noun_symbols_never_inflected =

    **LEXICON SYMBOL_connector =

    **LEXICON SYMBOL_NO_suff =

    **LEXICON SYMBOL_suff =


    This (part of) documentation was generated from src/fst/morphology/affixes/symbols.lexc


    src-fst-morphology-affixes-verbs.lexc.md

    Livonian Verb inflection

    This file documents the verb inflection of Livonian.

    Verb stem classes

    **LEXICON V_ = CONJUGATION TYPE MISSING

    **LEXICON TV_ = CONJUGATION TYPE MISSING

    **LEXICON V-AUX_VȰLDA = 10 vȱlda:ZERO

    **LEXICON V-AUX_LǞʼDÕ = 1 lǟʼdõ:lǟʼ

    **LEXICON IV_LǞʼDÕ = 1 lǟʼdõ:lǟʼ

    **LEXICON TV_TǬʼDÕ = 2 tǭʼdõ:tǭʼ

    **LEXICON V-AUX_VĪDÕ = 3 vīdõ:vī

    **LEXICON IV_VĪDÕ = 3 vīdõ:vī

    **LEXICON TV_VĪDÕ = 3 vīdõ:vī

    **LEXICON TV_NǞʼDÕ = 4 nǟʼdõ:nǟʼ

    **LEXICON IV_KǞʼDÕ = 5 kǟʼdõ:kǟʼ

    **LEXICON TV_TĪʼEDÕ = 6 tīʼedõ:tīʼe

    **LEXICON V-AUX_SĪEDÕ = 7 sīedõ:sīe

    **LEXICON IV_SĪEDÕ = 7 sīedõ:sīe

    **LEXICON TV_SĪEDÕ = 7 sīedõ:sīe

    **LEXICON IV_SǬDÕ = 8 sǭdõ:s

    **LEXICON TV_SǬDÕ = 8 sǭdõ:s

    **LEXICON V-AUX_SǬDÕ = 8 sǭdõ:s

    **LEXICON TV_JŪODÕ = 9 jūodõ:jūo

    **LEXICON IV_TŪLDA = 11 tūlda:tūʼl

    **LEXICON V-AUX_PĀNDA = 12 pānda:pāʼn

    **LEXICON IV_PĀNDA = 12 pānda:pāʼn

    **LEXICON TV_PĀNDA = 12 pānda:pāʼn

    **LEXICON IV_JEʼLLÕ = 13 jeʼllõ:jeʼlā

    **LEXICON TV_JEʼLLÕ = 13 jeʼllõ:jeʼllõ

    **LEXICON IV_ASTÕ = 18 astõ:astõ

    **LEXICON TV_ASTÕ = 18 astõ:astõ

    **LEXICON TV_VÕTTÕ = 19 võttõ:võttõ

    **LEXICON IV_VIEʼDDÕ = 24 vieʼddõ:vieʼddõ

    **LEXICON TV_VIEʼDDÕ = 24 vieʼddõ:vieʼddõ

    **LEXICON IV_MAKSÕ = 25 maksõ:maksõ

    **LEXICON TV_MAKSÕ = 25 maksõ:maksõ

    **LEXICON TV_TAPPÕ = 26 tappõ:tappõ

    **LEXICON IV_MÄNGÕ = 14 mängõ:mǟnga

    **LEXICON TV_KILLÕ = 15 killõ:kīla

    **LEXICON TV_PALLÕ = 16 pallõ:pǭla

    **LEXICON TV_LOULÕ = 17 loulõ:lōla

    **LEXICON IV_LAITÕ = 20 laittõ:lāita

    **LEXICON TV_LAITÕ = 20 laittõ:lāita

    **LEXICON IV_TÄUTÕ = 21 täutõ:tǟta

    **LEXICON TV_TÄUTÕ = 21 täutõ:tǟuta

    **LEXICON TV_PȮĻTÕ = 22 pȯļtõ:pūoļta

    **LEXICON TV_MȮISTÕ = 23 mȯistõ:mūošta

    **LEXICON IV_ANDÕ = 27 andõ:ānda

    **LEXICON TV_ANDÕ = 27 andõ:ānda

    **LEXICON IV_TIEUDÕ = 28 tieudõ:tīeda

    **LEXICON TV_TIEUDÕ = 28 tieudõ:tīeda

    29-48 follow same pattern

    **LEXICON IV_LUʼGGÕ = luʼggõ:luʼggõ 29

    **LEXICON TV_LUʼGGÕ = luʼggõ:lugū 29

    **LEXICON IV_MUʼDŽÕ = muʼdžõ:mudžū 30

    **LEXICON TV_MUʼDŽÕ = muʼdžõ:mudžū 30

    **LEXICON IV_VAKȚÕ = vakțõ:vakțū 31

    **LEXICON TV_VAKȚÕ = vakțõ:vakțū 31

    **LEXICON IV_KITTÕ = kittõ:kittõ 32

    **LEXICON TV_KITTÕ = kittõ:kittõ 32

    **LEXICON V-AUX_RIʼDDÕ = riʼddõ:ridū 33

    **LEXICON IV_RIʼDDÕ = riʼddõ:ridū 33

    **LEXICON TV_RIʼDDÕ = riʼddõ:ridū 33

    **LEXICON IV_KUTSÕ = kutsõ:kutsū 34

    **LEXICON TV_KUTSÕ = kutsõ:kutsū 34

    **LEXICON V-AUX_LASKÕ = laskõ:laskū 35

    **LEXICON IV_LASKÕ = laskõ:laskū 35

    **LEXICON TV_LASKÕ = laskõ:laskū 35

    **LEXICON TV_KÄSKÕ = laskõ:laskū 35b

    **LEXICON IV_AKKÕ = akkõ:akū 36 Should ss be s and šš be š? 2013-02-19

    **LEXICON TV_AKKÕ = akkõ:akū 36

    **LEXICON V-AUX_AIGÕ = aigõ:āigõ 37

    **LEXICON IV_AIGÕ = aigõ:āigõ 37

    **LEXICON TV_AIGÕ = aigõ:āigõ 37

    **LEXICON TV_KUOŖŖÕ = kuoŗŗõ:kūoŗõ 38

    **LEXICON TV_VANNÕ = vannõ:vǭnõ 39

    **LEXICON IV_PȮĻĻÕ = pȯļļõ:pūoļõ 40

    **LEXICON IV_PȮIMÕ = pȯimõ:pūoimõ 41

    **LEXICON TV_PȮIMÕ = pȯimõ:pūoimõ 41

    **LEXICON IV_OUŖÕ = ouŗõ:ōŗõ 42

    **LEXICON IV_KEIJÕ = keijõ:kējõ 43

    **LEXICON TV_KEIJÕ = keijõ:kējõ 43

    **LEXICON IV_AŖŠTÕ = aŗštõ:āŗštõ 44

    **LEXICON TV_AŖŠTÕ = aŗštõ:āŗštõ 44

    **LEXICON TV_PȮRTÕ = pȯrtõ:pūortõ 45

    **LEXICON TV_OUTÕ = outõ:ōtõ 46

    **LEXICON V-AUX_TUNDÕ = tundõ:tūndõ 47

    **LEXICON IV_TUNDÕ = tundõ:tūndõ 47

    **LEXICON TV_TUNDÕ = tundõ:tūndõ 47

    **LEXICON TV_OUDÕ = oudõ:ōdõ 48

    **LEXICON IV_KŪLÕ = kūlõ:kūlõ 49

    **LEXICON TV_KŪLÕ = kūlõ:kūlõ 49

    **LEXICON IV_ARRÕ = arrõ:arrõ 50

    **LEXICON TV_ARRÕ = arrõ:arrõ 50

    **LEXICON IV_AʼILÕ = aʼilõ:aʼilõ 51

    **LEXICON TV_AʼILÕ = aʼilõ:aʼilõ 51

    **LEXICON TV_SVAʼRRÕ = svaʼrrõ:svaʼrrõ 52

    **LEXICON V-AUX_KĪTÕ = kītõ:kīt 53

    **LEXICON IV_KĪTÕ = kītõ:kīt 53 ~701

    **LEXICON TV_KĪTÕ = kītõ:kīt 53

    **LEXICON IV_ÄʼBȚÕ = äʼbțõ:äʼbț 54

    **LEXICON TV_ÄʼBȚÕ = äʼbțõ:äʼbț 54

    **LEXICON V-AUX_KŪLDÕ = kūldõ:kūld 55

    **LEXICON IV_KŪLDÕ = kūldõ:kūld 55

    **LEXICON TV_KŪLDÕ = kūldõ:kūld 55

    **LEXICON TV_KĪSKÕ = kīskõ:kīsk 56

    **LEXICON V-AUX_ĪʼEDÕ = īʼedõ:īed 57

    **LEXICON IV_ĪʼEDÕ = īʼedõ:īed 57

    **LEXICON TV_ĪʼEDÕ = īʼedõ:īed 57

    **LEXICON IV_UMBLÕ = umblõ: 58

    **LEXICON TV_UMBLÕ = umblõ: 58

    **LEXICON IV_ERȚĻÕ = erțļõ:erțõlõ 58b

    **LEXICON TV_ERȚĻÕ = erțļõ:erțõlõ 58b

    **LEXICON V-AUX_MÕTLÕ = mõtlõ: 59

    **LEXICON IV_MÕTLÕ = mõtlõ: 59

    **LEXICON TV_MÕTLÕ = mõtlõ: 59

    **LEXICON IV_MǞʼDLÕ = mǟʼdlõ: 60

    **LEXICON TV_MǞʼDLÕ = mǟʼdlõ: 60

    **LEXICON IV_NAʼGRÕ = naʼgrõ: 60

    **LEXICON TV_NAʼGRÕ = naʼgrõ: 60

    **LEXICON V-AUX_ÄʼB = 62 äʼb:ä

    **LEXICON TV_SÄ = 63 sä:sä

    **LEXICON V-AUX_PIḐĪKS = 64 piḑīks:piḑī

    After transitive, intransitive, auxiliary and such tags have been added

    1

    2 tǭʼdõ:tǭʼ

    Prt Imprt

    Jus Qvo

    participles

    3 **LEXICON V-01_VĪDÕ = This is mutual for 3: vīdõ:vī Prt Imprt

    Jus Qvo

    participles

    **LEXICON V-01_NǞʼDÕ = This is mutual for ??: 4 nǟʼdõ:n Prt Imprt

    Jus Qvo

    participles

    **LEXICON V-01_KǞʼDÕ = This is mutual for ??: 4 kǟʼdõ:kǟʼ Prt Imprt

    Jus Qvo

    participles

    **LEXICON V-01_TĪʼEDÕ = : 6 tīedõ:tīʼe

    Jus Qvo participles

    **LEXICON V-01_SĪEDÕ = : 7 sīedõ:sīe

    Jus Qvo

    participles

    8 sǭdõ:s Prt Imprt

    Jus Qvo

    participles 9 9 jūodõ:jūo Prt Imprt

    Jus Qvo

    participles 10

    11 tūlda:tūʼl Prt Imprt

    Jus Qvo participles 11

    12 12 pānda:pāʼn Prt Imprt

    Jus Qvo participles

    **LEXICON V-01_JEʼLLÕ = 13 jeʼllõ, 18 astõ, 19 võttõ, 24 vieʼddõ, 25 maksõ, 26 tappõ

    Cond Imprt Jus Qvo

    participles

    14 mängõ:mǟngõ **LEXICON V-01_MÄNGÕ = 14 mängõ, 15 killõ, 16 pallõ, 17 loulõ, 20 laitõ, 21 täutõ, 22 pȯļtõ, 23 mȯistõ, 27 āndõ, 28 tīeudõ

    Imprt Jus Qvo

    participles

    15 killõ:kīllõ **LEXICON V-01_KILLÕ = 14 mängõ, 15 killõ, 16 pallõ, 17 loulõ, 20 laitõ, 21 täutõ, 22 pȯļtõ, 23 mȯistõ, 27 āndõ, 28 tīeudõ

    Imprt Jus Qvo

    participles

    16 pallõ:pǭllõ **LEXICON V-01_PALLÕ = 14 mängõ, 15 killõ, 16 pallõ, 17 loulõ, 20 laitõ, 21 täutõ, 22 pȯļtõ, 23 mȯistõ, 27 āndõ, 28 tīeudõ

    Imprt Jus Qvo

    participles

    17 loulõ:lōulõ **LEXICON V-01_LOULÕ = 14 mängõ, 15 killõ, 16 pallõ, 17 loulõ

    18 astõ:astõ **LEXICON V-01_ASTÕ = 14 mängõ, 15 killõ, 16 pallõ, 17 loulõ

    19 võttõ:võttõ **LEXICON V-01_VÕTTÕ = 14 mängõ, 15 killõ, 16 pallõ, 17 loulõ

    20 laitõ: **LEXICON V-01_LAITÕ = 14 mängõ, 15 killõ, 16 pallõ, 17 loulõ, 20 laitõ

    **LEXICON V-01_TÄUTÕ = 21 täutõ:tǟutõ

    22 pȯļțõ:p **LEXICON V-01_PȮĻTÕ = 22 pȯļtõ, 23 mȯistõ, 27 āndõ, 28 tīeudõ Cond Imprt

    Jus Qvo

    participles

    23 mȯistõ:m **LEXICON V-01_MȮISTÕ = 23 mȯistõ, 27 āndõ, 28 tīeudõ

    Imprt

    Jus Qvo

    participles

    **LEXICON V-01_VIEʼDDÕ = 13 jeʼllõ, 18 astõ, 19 võttõ, 24 vieʼddõ, 25 maksõ, 26 tappõ

    Cond Imprt Jus Qvo

    participles

    **LEXICON V-01_MAKSÕ = 13 jeʼllõ, 18 astõ, 19 võttõ, 24 vieʼddõ, 25 maksõ, 26 tappõ

    Cond Imprt Jus Qvo

    participles

    **LEXICON V-01_TAPPÕ = 13 jeʼllõ, 18 astõ, 19 võttõ, 24 vieʼddõ, 25 maksõ, 26 tappõ

    Cond Imprt Jus Qvo

    participles

    27 andõ:āndõ **LEXICON V-01_ANDÕ = 14 mängõ, 15 killõ, 16 pallõ, 17 loulõ, 20 laitõ, 21 täutõ, 22 pȯļtõ, 23 mȯistõ, 27 āndõ, 28 tīeudõ

    Imprt

    Jus Qvo

    participles

    28 tieudõ:tīeudõ **LEXICON V-01_TIEUDÕ = 14 mängõ, 15 killõ, 16 pallõ, 17 loulõ, 20 laitõ, 21 täutõ, 22 pȯļtõ, 23 mȯistõ, 27 āndõ, 28 tīeudõ

    Imprt

    Jus Qvo

    participles

    29 LEXICON V-01_LUʼGGÕ luʼggõ:luʼggõ 29 This is mutual for 29-36: luʼggõ, muʼdžõ, vakțõ, kittõ, riʼddõ, kutsõ, laskõ, akkõ Prt ImprtI

    Jus Kvo

    participles

    30 LEXICON V-01_MUʼDŽÕ luʼggõ:luʼggõ 29 This is mutual for 29-36: luʼggõ, muʼdžõ, vakțõ, kittõ, riʼddõ, kutsõ, laskõ, akkõ Prt ImprtI

    Jus Kvo

    participles

    31 LEXICON V-01_VAKȚÕ luʼggõ:luʼggõ 29 This is mutual for 29-36: luʼggõ, muʼdžõ, vakțõ, kittõ, riʼddõ, kutsõ, laskõ, akkõ Prt ImprtI

    Jus Kvo

    participles

    32 LEXICON V-01_KITTÕ kittõ:kittõ 32 This is mutual for 29-36: luʼggõ, muʼdžõ, vakțõ, kittõ, riʼddõ, kutsõ, laskõ, akkõ

    Prt ImprtI

    Jus Kvo

    participles

    33 LEXICON V-01_RIʼDDÕ riʼddõ:riʼddõ 33 This is mutual for 29-36: luʼggõ, muʼdžõ, vakțõ, kittõ, riʼddõ, kutsõ, laskõ, akkõ Prt ImprtI

    Jus Kvo

    participles

    34 LEXICON V-01_KUTSÕ luʼggõ:luʼggõ 29 This is mutual for 29-36: luʼggõ, muʼdžõ, vakțõ, kittõ, riʼddõ, kutsõ, laskõ, akkõ Prt ImprtI

    Jus Kvo

    participles

    35 LEXICON V-01_LASKÕ luʼggõ:luʼggõ 29 This is mutual for 29-36: luʼggõ, muʼdžõ, vakțõ, kittõ, riʼddõ, kutsõ, laskõ, akkõ Prt ImprtI

    Jus Kvo

    participles

    35b LEXICON V-01_KÄSKÕ luʼggõ:luʼggõ 29 This is mutual for 29-36: luʼggõ, muʼdžõ, vakțõ, kittõ, riʼddõ, kutsõ, laskõ, akkõ Prt ImprtI

    Jus Kvo

    participles

    36 LEXICON V-01_AKKÕ luʼggõ:luʼggõ 29 This is mutual for 29-36: luʼggõ, muʼdžõ, vakțõ, kittõ, riʼddõ, kutsõ, laskõ, akkõ Prt ImprtI

    Jus Kvo

    participles

    37 This is mutual for 37-48

    Prt

    participles

    **LEXICON V-01_KUOŖŖÕ = kuoŗŗõ:kūoŗŗõ 38 This is mutual for 37-48

    Prt

    participles

    **LEXICON V-01_VANNÕ = vannõ:vǭnnõ 39 This is mutual for 37-48

    Prt

    participles

    **LEXICON V-01_PȮĻĻÕ = pȯļļõ:pūoļļõ 40 This is mutual for 37-48

    Prt

    participles

    **LEXICON V-01_PȮIMÕ = pȯimõ:pūoimõ 41 This is mutual for 37-48

    Prt

    participles

    **LEXICON V-01_OUŖÕ = ouŗõ:ōuŗõ 42 This is mutual for 37-48

    Prt

    participles

    **LEXICON V-01_KEIJÕ = keijõ:kēijõ 43 This is mutual for 37-48

    Prt

    participles

    **LEXICON V-01_AŖŠTÕ = aŗštõ:āŗštõ 44 This is mutual for 37-48

    Prt

    participles

    **LEXICON V-01_PȮRTÕ = pȯrtõ:pūortõ 45 This is mutual for 37-48

    Prt

    participles

    **LEXICON V-01_OUTÕ = outõ:ōutõ 46 This is mutual for 37-48

    Prt

    participles

    **LEXICON V-01_TUNDÕ = tundõ:tūndõ 47 This is mutual for 37-48

    Prt

    participles

    **LEXICON V-01_OUDÕ = oudõ:ōdõ 48 This is mutual for 37-48

    Prt

    participles

    49 **LEXICON V-01_KŪLÕ = This is mutual for 49-50, 52-57 Prt +Act+PrfPrc Cond

    50 arrõ:arrõ **LEXICON V-01_ARRÕ = This is mutual for 49-50, 52-57 Prt +Act+PrfPrc Cond

    51

    52 **LEXICON V-01_SVAʼRRÕ = This is mutual for 49-50, 52-57 Prt +Act+PrfPrc Cond

    **LEXICON V-01_KĪTÕ = This is mutual for 49-50, 52-57 Prt +Act+PrfPrc Cond

    **LEXICON V-01_ÄʼBȚÕ = This is mutual for 49-50, 52-57

    Prt +Act+PrfPrc Cond

    55 **LEXICON V-01_KŪLDÕ = This is mutual for 49-50, 52-57 Prt +Act+PrfPrc Cond

    56 **LEXICON V-01_KĪSKÕ = This is mutual for 49-50, 52-57 Prt +Act+PrfPrc Cond

    57 īedõ:īʼedõ **LEXICON V-01_ĪʼEDÕ = This is mutual for 49-50, 52-57 Prt +Act+PrfPrc Cond

    58 umblõ:umbõlõ **LEXICON V-01_UMBLÕ = This is mutual for 58-61: umblõ, mõtlõ, mǟʼdlõ, naʼgrõ Prt Imprt

    Jus Qvo

    participles

    58 erțļõ:erțõlõ **LEXICON V-01_ERȚĻÕ = This is mutual for 58-61: umblõ, mõtlõ, mǟʼdlõ, naʼgrõ Prt Imprt

    Jus Qvo

    participles

    59 mõtlõ:umbõlõ **LEXICON V-01_MÕTLÕ = This is mutual for 58-61: umblõ, mõtlõ, mǟʼdlõ, naʼgrõ Prt Imprt

    Jus Qvo

    participles

    60 mǟʼdlõ:mǟʼdõlõ **LEXICON V-01_MǞʼDLÕ = This is mutual for 58-61: umblõ, mõtlõ, mǟʼdlõ, naʼgrõ Prt Imprt

    Jus Qvo

    participles

    61 naʼgrõ:naʼgõrõ **LEXICON V-01_NAʼGRÕ = This is mutual for 58-61: umblõ, mõtlõ, mǟʼdlõ, naʼgrõ Prt Imprt

    Jus Qvo

    participles

    Nonfinites

    **LEXICON GER_s =

    **LEXICON GER_sõ =

    **LEXICON INF_ZERO =

    **LEXICON INF_dõ =

    **LEXICON INF_da =

    **LEXICON SUP-STEM_m =

    **LEXICON SUP_m =

    **LEXICON SUP_m =

    **LEXICON SUP_mõ =

    **LEXICON ACTPRSPRC =

    **LEXICON ACTPRSPRC =

    **LEXICON ACTPRFPRC_SG-d/PL-nõd =

    **LEXICON ACTPRFPRC_SG-nd/PL-nõd = Are the singular and plural homonyms?

    **LEXICON ACTPRFPRC_SG-nd/PL-nnõd = Are the singular and plural homonyms?

    **LEXICON PSSPRSPRC =

    **LEXICON PSSPRSPRC_b =

    **LEXICON PSSPRSPRC_tõb =

    **LEXICON PSSPRFPRCSG = 2014-08-21

    **LEXICON PSSPRFPRCSG_d = 2014-08-21

    **LEXICON PSSPRFPRCSG_tõd = 2014-08-21

    Finites

    **LEXICON INDPRS_tõ = Indicative present

    **LEXICON INDPRT_ZERO = Indicative preterite in i

    **LEXICON INDPRT_i = Indicative preterite in i

    **LEXICON INDPRT_ztõ = Indicative preterite in z

    **LEXICON INDPRT_zt/ztõ = Indicative preterite in z

    **LEXICON INDPRT_žtõ = Indicative preterite in ž

    **LEXICON INDPRTSG3-STEM_tõ =

    **LEXICON COND = Conditional present

    Indicative present

    **LEXICON INDPRSSG1-STEM =

    Conditional

    Imperative

    Jussative

    Quotative


    This (part of) documentation was generated from src/fst/morphology/affixes/verbs.lexc


    src-fst-morphology-phonology.twolc.md

    Livonian morphophonology

    This file documents the phonology.twolc file

    We first show alphabet and sets, thereafter rules.

    Alphabet

    Literal quotes and angles

    They must be escaped (cf morpheme boundaries further down):

    »7
    «7
    %[%>%] - Literal >
    %[%<%] - Literal <

    Archiphonemes for consonant lengthening

    Triggers

    Vowel raising

    Vowel metathesis

    VOWEL SHORTENING

    Sets

    Rule section

    Vowel rules

    Shortening in first syllable

    Rule: ǟ:ä in first syllable

    Rule: ā:a in first syllable

    Rule: ȱ:ȯ

    Rule: ā:ī in second syllable plural

    Rule: ū:ī in second syllable plural

    Rule: ǭ:a in first syllable

    Rule: ē:e in first syllable rēnaz+N+Sg+Gen:

    Rule: ū:u in first syllable

    Rule: ū:ȯ in first syllable

    Rule: ī:i in first syllable

    Rule: ȭ:õ in first syllable

    Rule: ō:o in first syllable rōda+N+Pl+Par

    lengthen vowels

    Rule: e:ē in first syllable

    Rule: u:ū in first syllable

    Rule: ȯ:ū in first syllable

    Rule: ä:ǟ in first syllable

    VOWEL LENGTHENING

    Rule: a:ǭ in first syllable

    Rule: a:ā in first syllable

    Rule: o:ō in first syllable

    LOWER VOWELS

    Rule: ī:ē in tīe 15

    Destressing in second syllable

    Rule: ā:õ

    Rule: a:õ

    Rule: ū:õ

    Rule: õ:i

    VOWEL LOSS

    Rule: ā:0

    Rule: ō:0

    Rule: ū:0

    Rule: ī:0

    Rule: a:0 rēnaz+N+Sg+Gen:

    rōda+N+Pl+Par

    Rule: e:0

    Rule: {õØ}:0

    Rule: õ:0

    Rule: i:0 in first syllable

    Rule: u:0 in second position of first-syllable diphthong

    Rule: o:0 in second position of first-syllable diphthong

    
    Pair-test example (lexical side first, surface side second; 0 pads an empty position):

    pūol0a%^Stress1to2%^ConsL

    pȯ0llõ00
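    The two strings above appear to be the lexical and surface sides of one pair test: they contain the same number of symbols once each %^ trigger is counted as a single symbol and 0 is read as an empty position. The following sketch, which assumes that reading and uses made-up helper names, lines the two sides up symbol by symbol:

```python
import re
import unicodedata
from typing import List, Tuple

# A symbol is either a multichar trigger such as %^ConsL or a single character.
SYMBOL = re.compile(r"%\^[A-Za-z0-9]+|.", re.DOTALL)


def symbols(side: str) -> List[str]:
    """Split one side of a pair test into symbols (NFC-normalised first)."""
    return SYMBOL.findall(unicodedata.normalize("NFC", side))


def align(lexical: str, surface: str) -> List[Tuple[str, str]]:
    """Pair up lexical and surface symbols position by position."""
    lex, srf = symbols(lexical), symbols(surface)
    if len(lex) != len(srf):
        raise ValueError("both sides must have the same number of symbols")
    return list(zip(lex, srf))


def spell_out(side: str) -> str:
    """Drop the 0 padding and the trigger symbols from one side."""
    return "".join(s for s in symbols(side)
                   if s != "0" and not s.startswith("%^"))


pairs = align("pūol0a%^Stress1to2%^ConsL", "pȯ0llõ00")
changed = [p for p in pairs if p[0] != p[1]]
```

    Under this reading the aligned pairs include ū:ȯ and o:0, which match rules listed in the vowel section, and both triggers are realised as 0 on the surface.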
    

    Zero to vowel

    Rule: 0:õ in vowel metathesis

    Consonant rules

    Consonant loss

    Rule: shorten consonantism between 1st and 2nd vowel center jeʼllõ:jelāb

    Rule: g:0

    Rule: l:0

    Rule: m:0

    Rule: z:0 rēnaz+N+Sg+Gen:

    Consonant lengthening

    Lengthening consonantism between first and second vowel center simultaneous to reducing vowel of second syllable

    Rule: %{XC%}:Cx

    %{XC%}:p 2014-02-27

    %{XC%}:s 2020-10-21 tas+N+Sg+Ill

    %{XC%}:ž 2014-02-27

    %{XC%}:k 2014-02-27

    Rule: Stod removal left

    Rule: z:ž

    Rule: d:ḑ lēʼḑ:līʼed 147 rōda+N+Pl+Par

    Rule: ļ:l

    Rule: l:ļ This rule should not require the %^ConsRM:0 trigger, but for now this makes it work. kēļ:kēl 215

    Rule: n:ņ palatalization

    Rule: r:ŗ jūŗ:jūr 221

    Rule: d:t

    Rule: d:ț

    Rule: d:ž

    Rules for consonant loss

    Rule: d:0 Vow: (ʼ:) (Cns:+) _ (%^Pen: %^VowsRM:|%^VOWRaise:) (%^PreI: %^StodRM:|%^VowsLI1:|%^StodRM:) [%^D2ZERO:0|%^ConsRM:] ;

    P loss before subsequent morpheme with underlying

    T loss before subsequent morpheme with underlying initial d

    Rule: k:0

    Rule: š:0

    Rule: ț:0

    Rule: s:š palatalization

    Rule: ǟ:ē palatalization

    Rule: ǟ:e short and palatalization

    Rule: ä:e short and palatalization

    Rule: ǭ:ä palatalization

    Rule: a:ä palatalization


    This (part of) documentation was generated from src/fst/morphology/phonology.twolc


    src-fst-morphology-root.lexc.md

    Morphology

    INTRODUCTION TO THE MORPHOLOGICAL ANALYSER OF LIVONIAN.

    List of the multichar symbols

    The morphological analyses of wordforms in Livonian are presented in this system in terms of the symbols declared below.

    (It is strongly recommended to follow existing GiellaLT standards when adding new tags.)

    The parts-of-speech are:

    Parts of speech are further split up into:

    Nouns

    Pronouns

    Nominals are inflected for Number and Case

    Number

    Case

    Possession is marked as such:

    The comparative forms are:

    Numerals are classified under:

    Verb moods are:

    Tenses

    Voice

    Verb personal forms are:

    Other verb forms are

    Verbs are syntactically split according to transitivity:

    Usage extents are marked using the following tags:

    Abbreviated words are classified with:

    Special symbols are classified with:

    Special multiword units are analysed with:

    Normative/prescriptive compounding tags

    (to govern compound behaviour for the speller, i.e. what a compound SHOULD BE):

    The first part of the component may be ..

    This entry / word can …

    Non-dictionary words can be recognised with:

    Question and Focus particles:

    Semantics are classified with

    Homonymy

    Derivations are classified under the morphophonetic form of the suffix, the source and target part-of-speech.

    Symbols that need to be escaped on the lower side (towards twolc):

    Morphophonology

    To represent phonologic variations in word forms we use the following symbols in the lexicon files:

    And following triggers to control variation

    Flag diacritics

    We have manually optimised the structure of our lexicon using the following flag diacritics to restrict morphological combinatorics. They only allow compounds with verbs if the verb is further derived into a noun again:

    | @P.NeedNoun.ON@ | (Dis)allow compounds with verbs unless nominalised |
    | @D.NeedNoun.ON@ | (Dis)allow compounds with verbs unless nominalised |
    | @C.NeedNoun@ | (Dis)allow compounds with verbs unless nominalised |

    For languages that allow compounding, the following flag diacritics are needed to control position-based compounding restrictions for nominals. Their use is handled automatically if combined with +CmpN/xxx tags. If not used, they will do no harm.

    | @P.CmpFrst.FALSE@ | Require that words tagged as such only appear first |
    | @D.CmpPref.TRUE@ | Block such words from entering ENDLEX |
    | @P.CmpPref.FALSE@ | Block these words from making further compounds |
    | @D.CmpLast.TRUE@ | Block such words from entering R |
    | @D.CmpNone.TRUE@ | Combines with the next tag to prohibit compounding |
    | @U.CmpNone.FALSE@ | Combines with the prev tag to prohibit compounding |
    | @P.CmpOnly.TRUE@ | Sets a flag to indicate that the word has passed R |
    | @D.CmpOnly.FALSE@ | Disallow words coming directly from root. |

    Use the following flag diacritics to control downcasing of derived proper nouns (e.g. Finnish Pariisi -> pariisilainen). See e.g. North Sámi for how to use these flags. There exists a ready-made regex that will do the actual down-casing given the proper use of these flags.

    | @U.Cap.Obl@ | Allowing downcasing of derived names: deatnulasj. |
    | @U.Cap.Opt@ | Allowing downcasing of derived names: deatnulasj. |
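    To make the effect of the flag diacritics more concrete, here is a small Python sketch of how the P (set), D (disallow), C (clear), U (unify) and R (require) operators are conventionally interpreted while a path through the transducer is followed. It is a simplified model, not the HFST or Foma implementation, and the placement of the NeedNoun flags in the example paths is an assumption made for illustration.

```python
import re
from typing import Dict, Iterable

FLAG = re.compile(r"@([PDCUR])\.([^.@]+)(?:\.([^.@]+))?@")


def accepts(path: Iterable[str]) -> bool:
    """Return True if the flag diacritics met along `path` are compatible."""
    state: Dict[str, str] = {}
    for symbol in path:
        m = FLAG.fullmatch(symbol)
        if m is None:
            continue                      # ordinary symbol: no effect on flags
        op, feature, value = m.groups()
        current = state.get(feature)
        if op == "P":                     # set the feature (assumes a value is given)
            state[feature] = value
        elif op == "C":                   # clear the feature
            state.pop(feature, None)
        elif op == "D":                   # fail if the feature has this value
            if current is not None and (value is None or current == value):
                return False
        elif op == "R":                   # fail unless the feature has this value
            if current is None or (value is not None and current != value):
                return False
        elif op == "U":                   # set if unset, otherwise values must agree
            if current is None:
                state[feature] = value
            elif current != value:
                return False
    return True


# Hypothetical paths; "verb-stem", "nominaliser" and "noun-stem" stand in for real symbols.
bare_verb = ["verb-stem", "@P.NeedNoun.ON@", "@D.NeedNoun.ON@"]
nominalised = ["verb-stem", "@P.NeedNoun.ON@", "nominaliser",
               "@C.NeedNoun@", "@D.NeedNoun.ON@"]
plain_noun = ["noun-stem", "@D.NeedNoun.ON@"]
```

    In this model the bare verb path is rejected because the flag set by the verb is still ON when the D check is reached, while the nominalised path clears the flag first and is accepted.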

    Root lexicon

    The word forms in Livonian start from the lexeme roots of basic word classes

    Lexica for words that are not inflected

    These are put here for the time being

    adverb lexicon

    Interjections lexicon

    pcle-mod lexicon

    pcle-lexicon

    This is used in compounding, e.g. äʼb-:äʼb


    This (part of) documentation was generated from src/fst/morphology/root.lexc


    src-fst-morphology-stems-acronyms.lexc.md

    Acronyms

    Livonian acronyms …


    This (part of) documentation was generated from src/fst/morphology/stems/acronyms.lexc


    src-fst-morphology-stems-adjectives_newwords.lexc.md

    This is where new words are added as lexc entries before they are added to the xml source files.

    A_ "(eng) /(est) /(fin) /(lav)" ;

    LEXICON A_NEWWORDS is commented out in root.lexc; comment it in as needed.

    ADD NEW ADJECTIVES BELOW


    This (part of) documentation was generated from src/fst/morphology/stems/adjectives_newwords.lexc


    src-fst-morphology-stems-adverbs_newwords.lexc.md

    This is where new words are added as lexc entries before they are added to the xml source files.

    ADV_ "(eng) /(est) /(fin) /(lav)" ;

    LEXICON ADV_NEWWORDS is commented out in root.lexc; comment it in as needed.

    ADD NEW ADVERBS BELOW


    This (part of) documentation was generated from src/fst/morphology/stems/adverbs_newwords.lexc


    src-fst-morphology-stems-exceptions.lexc.md

    Exceptions are quite strange word forms, the ones that do not fit anywhere else. This file contains all enumerated word forms that cannot reasonably be created from lexical data by regular inflection. Usually there should be next to no exceptions; it is always better to have a paradigm that covers only one or a few words than an exception, since exceptions will not work nicely with e.g. the compounding scheme or possibly many end applications.

    The verbs of negation have partial inflection:

    Some verbs only have a few word forms left:

    The verb lǟdõ has irregular forms:

    The verb vȱlda has irregular forms:

    PRONOUNS

    PROPER NOUNS

    NOUNS partitive for morfa demo

    NUMERALS testing

    testing what is this


    This (part of) documentation was generated from src/fst/morphology/stems/exceptions.lexc


    src-fst-morphology-stems-nouns_newwords.lexc.md

    This is where new words are added as lexc entries before they are added to the xml source files.

    N_ "(eng) ear/(est) /(fin) /(lav)" ;

    LEXICON N_NEWWORDS is commented out in root.lexc; comment it in as needed.

    ADD NEW NOUNS BELOW


    This (part of) documentation was generated from src/fst/morphology/stems/nouns_newwords.lexc


    src-fst-morphology-stems-propernouns_newwords.lexc.md

    This is where new words are added as lexc entries before they are added to the xml source files.

    PROP_ "(eng) ear/(est) /(fin) /(lav)" ;

    LEXICON PROP_NEWWORDS is commented out in root.lexc; comment it in as needed.


    This (part of) documentation was generated from src/fst/morphology/stems/propernouns_newwords.lexc


    src-fst-morphology-stems-questionablemisc_newwords.lexc.md

    This is where new words are added as lexc entries before they are added to the xml source files.

    V_ "(eng) ear/(est) /(fin) /(lav)" ;

    LEXICON QUESTIONABLEMISC_NEWWORDS is commented out in root.lexc; comment it in as needed.


    This (part of) documentation was generated from src/fst/morphology/stems/questionablemisc_newwords.lexc


    src-fst-morphology-stems-verbs_newwords.lexc.md

    This is where new words are added as lexc entries before they are added to the xml source files.

    V_ "(eng) ear/(est) /(fin) /(lav)" ;

    LEXICON V_NEWWORDS is commented out in root.lexc; comment it in as needed.

    Add new verbs below


    This (part of) documentation was generated from src/fst/morphology/stems/verbs_newwords.lexc


    src-fst-phonetics-txt2ipa.xfscript.md

    Description                          X-SAMPA  IPA  Unicode (hex, decimal)

    retroflex plosive, voiceless         t`       ʈ    0288, 648   (` = ASCII 096)
    retroflex plosive, voiced            d`       ɖ    0256, 598
    labiodental nasal                    F        ɱ    0271, 625
    retroflex nasal                      n`       ɳ    0273, 627
    palatal nasal                        J        ɲ    0272, 626
    velar nasal                          N        ŋ    014B, 331
    uvular nasal                         N\       ɴ    0274, 628

    bilabial trill                       B\       ʙ    0299, 665
    uvular trill                         R\       ʀ    0280, 640
    alveolar tap                         4        ɾ    027E, 638
    retroflex flap                       r`       ɽ    027D, 637
    bilabial fricative, voiceless        p\       ɸ    0278, 632
    bilabial fricative, voiced           B        β    03B2, 946
    dental fricative, voiceless          T        θ    03B8, 952
    dental fricative, voiced             D        ð    00F0, 240
    postalveolar fricative, voiceless    S        ʃ    0283, 643
    postalveolar fricative, voiced       Z        ʒ    0292, 658
    retroflex fricative, voiceless       s`       ʂ    0282, 642
    retroflex fricative, voiced          z`       ʐ    0290, 656
    palatal fricative, voiceless         C        ç    00E7, 231
    palatal fricative, voiced            j\       ʝ    029D, 669
    velar fricative, voiced              G        ɣ    0263, 611
    uvular fricative, voiceless          X        χ    03C7, 967
    uvular fricative, voiced             R        ʁ    0281, 641
    pharyngeal fricative, voiceless      X\       ħ    0127, 295
    pharyngeal fricative, voiced         ?\       ʕ    0295, 661
    glottal fricative, voiced            h\       ɦ    0266, 614

    alveolar lateral fricative, vl.      K
    alveolar lateral fricative, vd.      K\

    labiodental approximant              P (or v\)
    alveolar approximant                 r\
    retroflex approximant                r\`
    velar approximant                    M\

    retroflex lateral approximant        l`
    palatal lateral approximant          L
    velar lateral approximant            L\

    Clicks

    bilabial                             O\ (O = capital letter)
    dental                               |\
    (post)alveolar                       !\
    palatoalveolar                       =\
    alveolar lateral                     |\|\

    Ejectives, implosives

    ejective                             _>   e.g. ejective p   p_>
    implosive                            _<   e.g. implosive b  b_<

    Vowels

    close back unrounded                 M
    close central unrounded              1
    close central rounded                }
    lax i                                I
    lax y                                Y
    lax u                                U

    close-mid front rounded              2
    close-mid central unrounded          @\
    close-mid central rounded            8
    close-mid back unrounded             7

    schwa ə                              @

    open-mid front unrounded             E
    open-mid front rounded               9
    open-mid central unrounded           3
    open-mid central rounded             3\
    open-mid back unrounded              V
    open-mid back rounded                O

    ash (ae digraph)                     {
    open schwa (turned a)                6

    open front rounded                   &
    open back unrounded                  A
    open back rounded                    Q

    Other symbols

    voiceless labial-velar fricative     W
    voiced labial-palatal approx.        H
    voiceless epiglottal fricative       H\
    voiced epiglottal fricative          <\
    epiglottal plosive                   >\

    alveolo-palatal fricative, vl.       s\
    alveolo-palatal fricative, voiced    z\
    alveolar lateral flap                l\
    simultaneous S and x                 x\
    tie bar                              _

    Suprasegmentals

    primary stress                       "
    secondary stress                     %
    long                                 :
    half-long                            :\
    extra-short                          _X
    linking mark                         -\

    Tones and word accents

    level extra high                     _T
    level high                           _H
    level mid                            _M
    level low                            _L
    level extra low                      _B
    downstep                             !
    upstep                               ^ (caret, circumflex)

    contour, rising                      _R
    contour, falling                     _F
    contour, high rising                 _H_T
    contour, low rising                  _B_L
    contour, rising-falling              _R_F

    (NB Instead of being written as diacritics with _, all prosodic marks can alternatively be placed in a separate tier, set off by < >, as recommended for the next two symbols.)

    global rise                          <R>
    global fall                          <F>

    Diacritics

    voiceless                            _0 (0 = figure), e.g. n_0
    voiced                               _v
    aspirated                            _h
    more rounded                         _O (O = letter)
    less rounded                         _c
    advanced                             _+
    retracted                            _-
    centralized                          _"
    syllabic                             = (or _=) e.g. n= (or n_=)
    non-syllabic                         _^
    rhoticity                            `

    breathy voiced                       _t
    creaky voiced                        _k
    linguolabial                         _N
    labialized                           _w
    palatalized                          ' (or _j) e.g. t' (or t_j)
    velarized                            _G
    pharyngealized                       _?\

    dental                               _d
    apical                               _a
    laminal                              _m
    nasalized                            ~ (or _~) e.g. A~ (or A_~)
    nasal release                        _n
    lateral release                      _l
    no audible release                   _}

    velarized or pharyngealized          _e
    velarized l, alternatively           5
    raised                               _r
    lowered                              _o
    advanced tongue root                 _A
    retracted tongue root                _q
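
As an illustration of how a symbol table like the one above can drive a conversion, the following Python sketch maps ASCII transcription symbols to IPA with a greedy longest-match lookup. It is not the project's txt2ipa transducer: the function name is hypothetical, the mapping covers only the rows for which the table gives an IPA character, and the retroflex symbols are assumed to be written with a trailing backtick as in standard X-SAMPA.

```python
# Subset of the symbol table: only rows that list an IPA character.
XSAMPA_TO_IPA = {
    "t`": "ʈ", "d`": "ɖ", "F": "ɱ", "n`": "ɳ", "J": "ɲ", "N": "ŋ", "N\\": "ɴ",
    "B\\": "ʙ", "R\\": "ʀ", "4": "ɾ", "r`": "ɽ", "p\\": "ɸ", "B": "β",
    "T": "θ", "D": "ð", "S": "ʃ", "Z": "ʒ", "s`": "ʂ", "z`": "ʐ", "C": "ç",
    "j\\": "ʝ", "G": "ɣ", "X": "χ", "R": "ʁ", "X\\": "ħ", "?\\": "ʕ",
    "h\\": "ɦ", "@": "ə",
}

# Try longer symbols first so that "N\" wins over "N", "t`" over "t", etc.
_KEYS_LONGEST_FIRST = sorted(XSAMPA_TO_IPA, key=len, reverse=True)


def xsampa_to_ipa(text):
    """Convert the symbols covered by XSAMPA_TO_IPA; copy everything else as is."""
    out = []
    i = 0
    while i < len(text):
        for key in _KEYS_LONGEST_FIRST:
            if text.startswith(key, i):
                out.append(XSAMPA_TO_IPA[key])
                i += len(key)
                break
        else:
            # No table entry starts here: keep the character unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


print(xsampa_to_ipa("SaN\\"))
print(xsampa_to_ipa("t`@"))
```

Sorting the keys by length matters because several symbols are prefixes of others (for example `N` and `N\`); a plain character-by-character replacement would pick the shorter one.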


    This (part of) documentation was generated from src/fst/phonetics/txt2ipa.xfscript


    src-fst-transcriptions-transcriptor-abbrevs2text.lexc.md

    We describe here how abbreviations in Liv are read out, e.g. for text-to-speech systems.

    For example:


    This (part of) documentation was generated from src/fst/transcriptions/transcriptor-abbrevs2text.lexc


    src-fst-transcriptions-transcriptor-numbers-digit2text.lexc.md

    Starting work with ordinals


    This (part of) documentation was generated from src/fst/transcriptions/transcriptor-numbers-digit2text.lexc


    tools-grammarcheckers-grammarchecker.cg3.md

    [ L A N G U A G E ] G R A M M A R C H E C K E R

    DELIMITERS

    TAGS AND SETS

    Tags

    This section lists all the tags inherited from the fst and used as tags in the syntactic analysis. The next section, Sets, contains sets defined on the basis of the tags listed here; those set names are not visible in the output.

    Beginning and end of sentence

    BOS EOS

    Parts of speech tags

    N A Adv V Pron CS CC CC-CS Po Pr Pcle Num Interj ABBR ACR CLB LEFT RIGHT WEB PPUNCT PUNCT

    COMMA ¶

    Tags for POS sub-categories

    Pers Dem Interr Indef Recipr Refl Rel Coll NomAg Prop Allegro Arab Romertall

    Tags for morphosyntactic properties

    Nom Acc Gen Ill Loc Com Ess Ess Sg Du Pl Cmp/SplitR Cmp/SgNom Cmp/SgGen Cmp/SgGen PxSg1 PxSg2 PxSg3 PxDu1 PxDu2 PxDu3 PxPl1 PxPl2 PxPl3 Px

    Comp Superl Attr Ord Qst IV TV Prt Prs Ind Pot Cond Imprt ImprtII Sg1 Sg2 Sg3 Du1 Du2 Du3 Pl1 Pl2 Pl3 Inf ConNeg Neg PrfPrc VGen PrsPrc Ger Sup Actio VAbess

    Err/Orth

    Semantic tags

    Sem/Act Sem/Ani Sem/Atr Sem/Body Sem/Clth Sem/Domain Sem/Feat-phys Sem/Fem Sem/Group Sem/Lang Sem/Mal Sem/Measr Sem/Money Sem/Obj Sem/Obj-el Sem/Org Sem/Perc-emo Sem/Plc Sem/Sign Sem/State-sick Sem/Sur Sem/Time Sem/Txt

    HUMAN

    PROP-ATTR PROP-SUR

    TIME-N-SET

    Syntactic tags

    @+FAUXV @+FMAINV @-FAUXV @-FMAINV @-FSUBJ> @-F<OBJ @-FOBJ> @-FSPRED<OBJ @-F<ADVL @-FADVL> @-F<SPRED @-F<OPRED @-FSPRED> @-FOPRED> @>ADVL @ADVL< @<ADVL @ADVL> @ADVL @HAB> @<HAB @>N @Interj @N< @>A @P< @>P @HNOUN @INTERJ @>Num @Pron< @>Pron @Num< @OBJ @<OBJ @OBJ> @OPRED @<OPRED @OPRED> @PCLE @COMP-CS< @SPRED @<SPRED @SPRED> @SUBJ @<SUBJ @SUBJ> SUBJ SPRED OPRED @PPRED @APP @APP-N< @APP-Pron< @APP>Pron @APP-Num< @APP-ADVL< @VOC @CVP @CNP OBJ

    -OTHERS SYN-V @X

    Sets containing sets of lists and tags

    This part of the file lists a large number of sets based partly upon the tags defined above, and partly upon lexemes drawn from the lexicon. See the sourcefile itself to inspect the sets, what follows here is an overview of the set types.

    Sets for Single-word sets

    INITIAL

    Sets for word or not

    WORD NOT-COMMA

    Case sets

    ADLVCASE CASE-AGREEMENT CASE NOT-NOM NOT-GEN NOT-ACC

    Verb sets

    NOT-V

    Sets for finiteness and mood

    REAL-NEG MOOD-V NOT-PRFPRC

    Sets for person

    SG1-V SG2-V SG3-V DU1-V DU2-V DU3-V PL1-V PL2-V PL3-V

    Pronoun sets

    Adjectival sets and their complements

    Adverbial sets and their complements

    Sets of elements with common syntactic behaviour

    NP sets defined according to their morphosyntactic features

    The PRE-NP-HEAD family of sets

    These sets model noun phrases (NPs). The idea is to first define whatever can occur in front of the head of the NP, and thereafter negate that with the expression WORD - premodifiers.

    Border sets and their complements

    Grammarchecker sets


    This (part of) documentation was generated from [tools/grammarcheckers/grammarchecker.cg3](https://github.com/giellalt/lang-liv/blob/main/tools/grammarcheckers/grammarchecker.cg3)


    tools-tokenisers-tokeniser-disamb-gt-desc.pmscript.md

    Tokeniser for liv

    Usage:

    ```
    $ make
    $ echo "ja, ja" | hfst-tokenise --giella-cg tokeniser-disamb-gt-desc.pmhfst
    $ echo "Juos gorreválggain lea (dárbbašlaš) deavdit gáibádusa boasttu olmmoš, man mielde lahtuid." | hfst-tokenise --giella-cg tokeniser-disamb-gt-desc.pmhfst
    $ echo "(gáfe) 'ja' ja 3. ja? ц jaja ukjend \"ukjend\"" | hfst-tokenise --giella-cg tokeniser-disamb-gt-desc.pmhfst
    $ echo "márffibiillagáffe" | hfst-tokenise --giella-cg tokeniser-disamb-gt-desc.pmhfst
    ```

    Pmatch documentation: <https://github.com/hfst/hfst/wiki/HfstPmatch>

    Characters which have analyses in the lexicon, but can appear without spaces before/after, that is, with no context conditions, and adjacent to words:

    * Punct contains ASCII punctuation marks
    * The symbol after m-dash is soft-hyphen `U+00AD`
    * The symbol following {•} is byte-order-mark / zero-width no-break space `U+FEFF`.

    Whitespace contains ASCII white space and the List contains some unicode white space characters

    * En Quad U+2000 to Zero-Width Joiner U+200D
    * Narrow No-Break Space U+202F
    * Medium Mathematical Space U+205F
    * Word joiner U+2060

    Apart from what's in our morphology, there are

    1. unknown word-like forms, and
    2. unmatched strings

    We want to give 1) a match, but let 2) be treated specially by `hfst-tokenise -a`

    Unknowns are made of:

    * lower-case ASCII
    * upper-case ASCII
    * select extended latin symbols
    * liv specific latin extension
    * ASCII digits
    * select symbols
    * Combining diacritics as individual symbols,
    * various symbols from Private area (probably Microsoft), so far:
      * U+F0B7 for "x in box"

    Unknown handling

    Unknowns are tagged ?? and treated specially with `hfst-tokenise`. hfst-tokenise --giella-cg will treat such empty analyses as unknowns, and remove empty analyses from other readings. Empty readings are also legal in CG, they get a default baseform equal to the wordform, but no tag to check, so it's safer to let hfst-tokenise handle them.

    Finally we mark as a token any sequence making up a:

    * known word in context
    * unknown (OOV) token in context
    * sequence of word and punctuation
    * URL in context


    This (part of) documentation was generated from [tools/tokenisers/tokeniser-disamb-gt-desc.pmscript](https://github.com/giellalt/lang-liv/blob/main/tools/tokenisers/tokeniser-disamb-gt-desc.pmscript)


    tools-tokenisers-tokeniser-gramcheck-gt-desc.pmscript.md

    Grammar checker tokenisation for liv

    Requires a recent version of HFST (3.10.0 / git revision>=3aecdbc)

    Then just:

    ```
    $ make
    $ echo "ja, ja" | hfst-tokenise --giella-cg tokeniser-disamb-gt-desc.pmhfst
    ```

    More usage examples:

    ```
    $ echo "Juos gorreválggain lea (dárbbašlaš) deavdit gáibádusa boasttu olmmoš, man mielde lahtuid." | hfst-tokenise --giella-cg tokeniser-disamb-gt-desc.pmhfst
    $ echo "(gáfe) 'ja' ja 3. ja? ц jaja ukjend \"ukjend\"" | hfst-tokenise --giella-cg tokeniser-disamb-gt-desc.pmhfst
    $ echo "márffibiillagáffe" | hfst-tokenise --giella-cg tokeniser-disamb-gt-desc.pmhfst
    ```

    Pmatch documentation: <https://github.com/hfst/hfst/wiki/HfstPmatch>

    Characters which have analyses in the lexicon, but can appear without spaces before/after, that is, with no context conditions, and adjacent to words:

    * Punct contains ASCII punctuation marks
    * The symbol after m-dash is soft-hyphen `U+00AD`
    * The symbol following {•} is byte-order-mark / zero-width no-break space `U+FEFF`.

    Whitespace contains ASCII white space and the List contains some unicode white space characters

    * En Quad U+2000 to Zero-Width Joiner U+200D
    * Narrow No-Break Space U+202F
    * Medium Mathematical Space U+205F
    * Word joiner U+2060

    Apart from what's in our morphology, there are 1) unknown word-like forms, and 2) unmatched strings

    We want to give 1) a match, but let 2) be treated specially by hfst-tokenise -a

    * select extended latin symbols
    * select symbols
    * various symbols from Private area (probably Microsoft), so far:
      * U+F0B7 for "x in box"

    TODO: Could use something like this, but built-in's don't include šžđčŋ:

    Simply give an empty reading when something is unknown: hfst-tokenise --giella-cg will treat such empty analyses as unknowns, and remove empty analyses from other readings. Empty readings are also legal in CG, they get a default baseform equal to the wordform, but no tag to check, so it's safer to let hfst-tokenise handle them.

    Finally we mark as a token any sequence making up a:

    * known word in context
    * unknown (OOV) token in context
    * sequence of word and punctuation
    * URL in context


    This (part of) documentation was generated from [tools/tokenisers/tokeniser-gramcheck-gt-desc.pmscript](https://github.com/giellalt/lang-liv/blob/main/tools/tokenisers/tokeniser-gramcheck-gt-desc.pmscript)


    tools-tokenisers-tokeniser-tts-cggt-desc.pmscript.md

    TTS tokenisation for smj

    Requires a recent version of HFST (3.10.0 / git revision>=3aecdbc)

    Then just:

    ```sh
    make
    echo "ja, ja" \
    | hfst-tokenise --giella-cg tokeniser-disamb-gt-desc.pmhfst
    ```

    More usage examples:

    ```sh
    echo "Juos gorreválggain lea (dárbbašlaš) deavdit gáibádusa boasttu olmmoš, man mielde lahtuid." \
    | hfst-tokenise --giella-cg tokeniser-disamb-gt-desc.pmhfst
    echo "(gáfe) 'ja' ja 3. ja? ц jaja ukjend \"ukjend\"" \
    | hfst-tokenise --giella-cg tokeniser-disamb-gt-desc.pmhfst
    echo "márffibiillagáffe" \
    | hfst-tokenise --giella-cg tokeniser-disamb-gt-desc.pmhfst
    ```

    Pmatch documentation: <https://kitwiki.csc.fi/twiki/bin/view/KitWiki/HfstPmatch>

    Characters which have analyses in the lexicon, but can appear without spaces before/after, that is, with no context conditions, and adjacent to words:

    * Punct contains ASCII punctuation marks
    * The symbol after m-dash is soft-hyphen `U+00AD`
    * The symbol following {•} is byte-order-mark / zero-width no-break space `U+FEFF`.

    Whitespace contains ASCII white space and the List contains some unicode white space characters

    * En Quad U+2000 to Zero-Width Joiner U+200D
    * Narrow No-Break Space U+202F
    * Medium Mathematical Space U+205F
    * Word joiner U+2060

    Apart from what's in our morphology, there are 1) unknown word-like forms, and 2) unmatched strings

    We want to give 1) a match, but let 2) be treated specially by hfst-tokenise -a

    * select extended latin symbols
    * select symbols
    * various symbols from Private area (probably Microsoft), so far:
      * U+F0B7 for "x in box"

    TODO: Could use something like this, but built-in's don't include šžđčŋ:

    Simply give an empty reading when something is unknown: hfst-tokenise --giella-cg will treat such empty analyses as unknowns, and remove empty analyses from other readings. Empty readings are also legal in CG, they get a default baseform equal to the wordform, but no tag to check, so it's safer to let hfst-tokenise handle them.

    Needs hfst-tokenise to output things differently depending on the tag they get


    This (part of) documentation was generated from [tools/tokenisers/tokeniser-tts-cggt-desc.pmscript](https://github.com/giellalt/lang-liv/blob/main/tools/tokenisers/tokeniser-tts-cggt-desc.pmscript)
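
The unknown handling described for the tokenisers above can be inspected downstream with a few lines of Python. The sketch below reads a CG-style stream (one `"<wordform>"` line per cohort, followed by indented reading lines) and reports the wordforms that have no readings or that carry an unknown tag. Everything in it is an assumption for illustration: the function names, the sample stream, and the choice of `?` and `??` as unknown markers are hypothetical and should be checked against real `hfst-tokenise --giella-cg` output.

```python
def parse_cohorts(cg_stream):
    """Split a CG-style stream into (wordform, readings) pairs.

    Each reading is a list of whitespace-separated fields: lemma first, then tags.
    Assumes lemmas contain no spaces; other lines are ignored.
    """
    cohorts = []
    for line in cg_stream.splitlines():
        stripped = line.strip()
        if line.startswith('"<') and stripped.endswith('>"'):
            # New cohort: drop the surrounding "< and >" markers.
            cohorts.append((stripped[2:-2], []))
        elif line.startswith(("\t", " ")) and stripped.startswith('"') and cohorts:
            cohorts[-1][1].append(stripped.split())
    return cohorts


def unknown_wordforms(cg_stream, unknown_tags=("?", "??")):
    """Return wordforms with no readings, or with a reading tagged as unknown."""
    markers = set(unknown_tags)
    result = []
    for wordform, readings in parse_cohorts(cg_stream):
        tags = {tag for reading in readings for tag in reading[1:]}
        if not readings or tags & markers:
            result.append(wordform)
    return result


# Hypothetical sample stream, not captured from a real tokeniser run.
SAMPLE = '"<ja>"\n\t"ja" CC\n"<,>"\n\t"," CLB\n"<ukjend>"\n\t"ukjend" ?\n'
print(unknown_wordforms(SAMPLE))
```

A filter like this is only a convenience for checking tokeniser coverage; as the text notes, it is safer to let `hfst-tokenise` itself decide what counts as unknown.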
