Estonian NLP Grammar

Finite state and Constraint Grammar based analysers, proofing tools and other resources

View the project on GitHub giellalt/lang-est-x-utee

Phonology

Placeholders

Sami GT convention

Triggers

Usage tag. It may relate to an individual word in the lexicon, or to a set of inflectional forms of some inflectional type, i.e. its sub-paradigm. It never surfaces. It is used to pair the surface form with the usage tag of the lexical representation.

Special surface side symbols, used in rule contexts

Apostrophe is used for separating inflectional affix from a foreign word lemma

Morphophonemes

If the sound change is unproductive and difficult to relate to its immediate context, we use capital letters with numbers to denote the changing phonemes. In stems, they typically result from diachronic processes. In affixes, they are typically related to the declension or conjugation and the form of the stem they attach to.

Short stops

Orthographic convention: after a voiceless consonant (e.g. s or h, or k, p, t), g, b, d are written as k, p, t, e.g. õhk-õhu, vask-vase

K1 also in: uks, jooksma

P1 also in: laps

T1 also in: jätma, katma, kütma, matma, võtma, mõtlema, ütlema
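The orthographic convention above can be illustrated with a small Python sketch. It is not the twolc rule from this grammar: the function name `spell_short_stop`, the unspelled input strings such as `õhg`, and the restriction of the voiceless context to exactly the letters named in the text are all assumptions made for illustration.

```python
# Illustrative sketch only; not taken from phonology.twolc.
# Assumption: the voiceless context is limited to the letters named in the text.
VOICELESS = set("shkpt")
STOP_SPELLING = {"g": "k", "b": "p", "d": "t"}


def spell_short_stop(unspelled: str) -> str:
    """Spell g/b/d as k/p/t when the preceding letter is voiceless."""
    out = []
    for ch in unspelled:
        if ch in STOP_SPELLING and out and out[-1] in VOICELESS:
            out.append(STOP_SPELLING[ch])
        else:
            out.append(ch)
    return "".join(out)


# Hypothetical unspelled inputs; the outputs are the written forms.
print(spell_short_stop("õhg"))   # õhk
print(spell_short_stop("vasg"))  # vask
print(spell_short_stop("kurg"))  # kurg (r is not in the voiceless set, so g stays)
```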

Short stops in stem illatives for words that do not have grade alternation. They surface (or appear as extra long) in strong grade, expressed by stem illative only.

Unstressed syllable vowels disappear…

A stem vowel in inflectional forms of ne/s words, to make them formally similar in inflection

Ad hoc stem vowels for ne/s words

Few words…

j surfacing and changing

4 words have h-illative: sohu, suhu, öhe, pähe

only hea and pea

6 words have õ in indicative past

A handful of words…

Verb affix lexicons are simpler if we introduce these:

Stem vowels for verbs of some inflectional types

Verb affixes have k-g and t-d-0 alternations:

Imperative mood affixes gu/ku, ge/kem etc

Infinitive affixes ta/da/a, and gerund affixes tes/des/es

Impersonal voice affixes tud/dud, takse/dakse etc

To form past indicative forms and make them pronounceable

Sometimes the choice of an allomorph or allophone is related to the frequency of the word.

For plural partitive, the form is generated either with sg vowel + sid or with plural vowel + 0. So we must allow stem vowels for singular and plural to appear and disappear in certain conditions.
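The two formations can be sketched in Python as follows. This is only an illustration of the description above, not how the lexicon or the twolc rules encode it: the function name, the split into stem, singular vowel and plural vowel, and the example word maja are assumptions chosen for the sketch.

```python
from typing import List


def plural_partitive_candidates(stem: str, sg_vowel: str, pl_vowel: str) -> List[str]:
    """Return both plural partitive formations described in the text.

    1. singular stem vowel + "sid"
    2. plural stem vowel + empty ending (the "0" of the text)
    """
    with_sid = stem + sg_vowel + "sid"
    with_zero = stem + pl_vowel  # the zero ending adds nothing
    return [with_sid, with_zero]


# Illustrative example word (not from the source): maja
print(plural_partitive_candidates("maj", "a", "u"))  # ['majasid', 'maju']
```

In each candidate only one of the two stem vowels is visible, which is why both vowels have to be allowed to appear and disappear.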

Singular stem vowel tag in lexicon

Plural stem vowel tag in lexicon

Inflectional affixes having the same grammatical meaning: Pl Par endings sid/0, id/sid, Sg Ill endings sse/0. Their choice depends on triggers in the lexicon, and they have to be defined unnaturally, letter by letter.

If the sound change is productive and/or very regularly determined by context (e.g. by morpheme border), we do not use special symbols to denote the changing phonemes

ne, s ending words have similar paradigms; only sg nom is different

-le/-el stem alternations also use e:0, in addition to 0:e (sip0lema-sipel0da)
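The notation sip0lema-sipel0da writes 0 where a form has no letter, so that both forms have the same length and can be compared position by position. A minimal Python sketch of that reading follows; the helper names `surface` and `differing_positions` are hypothetical, and the sketch does not reproduce the twolc rules themselves.

```python
ZERO = "0"  # placeholder for "no surface letter", as in the notation above


def surface(aligned: str) -> str:
    """Drop zero placeholders to obtain the written form."""
    return aligned.replace(ZERO, "")


def differing_positions(a: str, b: str) -> list:
    """List (index, a_char, b_char) where two equally long aligned strings differ."""
    if len(a) != len(b):
        raise ValueError("aligned strings must have equal length")
    return [(i, x, y) for i, (x, y) in enumerate(zip(a, b)) if x != y]


print(surface("sip0lema"))  # siplema
print(surface("sipel0da"))  # sipelda

# Compare only the stem parts (without the affixes ma / da).
print(differing_positions("sip0le", "sipel0"))  # [(3, '0', 'e'), (5, 'e', '0')]
```

The two differing positions are the places where an e alternates with zero in the stem.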

high vowel lowering in certain contexts

1.1. plural partitive: -sid vs stem vowel change


This (part of) documentation was generated from src/fst/morphology/phonology.twolc