Morphology

INTRODUCTION TO THE MORPHOLOGICAL ANALYSER OF ERZYA

Analysis symbols

The morphological analyses of Erzya word forms are presented in this system in terms of the following symbols. (Following existing standards is strongly recommended when adding new tags.)

+TYÄ WORK HAS TO BE DONE
%

The parts-of-speech are:

+A adjective
+Adp adposition
+Adv adverb
+CS subordinating conjunction
+CC coordinating conjunction
+Det determiner
+Descr descriptive
+Interj interjection
+N noun
+Num numeral
+Pcle particle
+Po postposition
+Pr preposition (in Russian loans)
+Pron pronoun
+Qnt quantifier
+V verb
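
As a rough illustration of how these tags are consumed, the sketch below splits an analysis string into a lemma and its tags and picks out the part-of-speech tag. It assumes that an analysis is a lemma followed by `+`-prefixed tags. The sample string `кудо+N+Sg+Nom` and the helper names `split_analysis` and `find_pos` are hypothetical and are not taken from the analyser itself.

```python
# Part-of-speech tags as listed above, without the leading "+".
POS_TAGS = {
    "A", "Adp", "Adv", "CS", "CC", "Det", "Descr", "Interj",
    "N", "Num", "Pcle", "Po", "Pr", "Pron", "Qnt", "V",
}


def split_analysis(analysis):
    """Split 'lemma+Tag1+Tag2' into (lemma, [Tag1, Tag2]).

    Assumption: the lemma itself contains no literal '+'.
    """
    lemma, *tags = analysis.split("+")
    return lemma, tags


def find_pos(tags):
    """Return the first tag that is a known part of speech, or None."""
    for tag in tags:
        if tag in POS_TAGS:
            return tag
    return None


lemma, tags = split_analysis("кудо+N+Sg+Nom")  # hypothetical analysis string
print(lemma, tags, find_pos(tags))
```

Because some tags (for example the semantic tags) are placed before the part of speech, `find_pos` scans the whole tag list instead of assuming a fixed position.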

Parts of speech are further split up into:

Adjectives

+Adn Adnominal (modifier) !! This is not an NP head like +Pron
+Bahuvrihi This is a nominative-case NP used as an adjective
+bahuvrihi get rid of these for upper-case

Adverbs

+Ideoph Ideophonic descriptors used to modify the verb, e.g. вырк ливтясь “flit and it flew off”. “Ideophone: A vivid representation of an idea in sound. A word, often onomatopoeic, which describes a predicate, qualificative or adverb in respect to manner, colour, sound, smell, action, state or intensity.” (Doke 1935:118)
+Manner with reference to type of adverb
+Parenthetic parenthetic
+Spat spatial
+Iter Iterative form expressing number of times; myv: кавксть, kpv: кыкысь
+Mult Multiplicative, two-ply; myv: кавонькирда
+Deg Ad-adjective. This is degree; deprecate +AdA
+Epist epistemic modality marker (speaker’s evaluation/judgment, degree of confidence)
+EvidNfh not first-hand келя
+EvidFh first-hand
+PerifMod peripheral modifier ськамонзо

Interjections

+Formulaic

Nouns

+Prop proper

Particles

Postpositions + Spat, + Temp

Pronouns

+Dem demonstrative
+Indef indefinite
+Dep dependent word requiring the presence of another, e.g. мень
+Exclusive: ськамонза
+Intensive: intensive pronoun
+Interr interrogative
+PerifMod: peripheral modifier ськамонза, кавонест
+Pers personal
+Recipr reciprocal
+Refl reflexive
+Rel relative
+Relat relator noun
+Sel selective, when selecting from a set of definites
+Short тень, теть; эстень
+Long монень, тонеть; монстень
+Sg1 first person singular
+Sg2 second person singular
+Sg3 third person singular
+Pl1 first person plural
+Pl2 second person plural
+Pl3 third person plural

Quantifiers (numerals)

Quantifiers and Numerals are classified under:

+Appr Approximative numeral: кавто-колмо, колмошка “two or three”. NB! Do not confuse with the Komi case +Apr
+AssocColl -ne- ; avide-
+Assoc +мезть
+Card cardinal + NCard
+Coll collective
+Distr Distributive
+Ord ordinal + NOrd
+Exclusive: ськамонзо

Nominals are inflected for Number and Case

Number

+Sg singular
+Pl plural
+SP ambiguous for number, general number

Case

+Abe abessive
+Abl ablative case
+Com Comitative “-нек/-нэк”
+Cmpr Comparative case form -шка
+Dat dative
+Ela elative case
+Gen genitive case
+Ill illative
+Ine inessive
+Lat lative
+Loc Locative “вить ён : вить ёно”
+Nom nominative case
+Prl prolative “га/ка/ва”
+Tra translative: used in similative and depictive constructions to mark what would be a secondary subject: –вармакс оргодсь тосто–
+Temp Temporalis case form “-не/-нэ” previously TempCx
+Voc Vocative

Possession and other declension types are marked with:

+PxSg1 first person singular
+PxSg2 second person singular
+PxSg3 third person singular
+PxSP3 third person singular or plural with dative only
+PxPl1 first person plural
+PxPl2 second person plural
+PxPl3 third person plural
+Def Definite

The comparative forms are:

+Comp comparative as opposed to superlative
+Superl superlative
+Attr Attribute

Verb moods are:

+Cond conditional Ындеря- (Derivational)
+Conj conjunctional “вОль”
+Des desiderative Ыксэль “was about to; wanted to”
+Ind indicative
+Imprt imperative
+Opt optative
+Prec precative
+Proh prohibitive; distinct from the negation of the imperative: Иля аварде! ‘Don’t cry!’ (Proh); Аволь мелявтт, кецяк! ‘Don’t worry, be happy!’ (Neg + Imprt)

Infinitive moods

+Oblig modality: deontic/directive/obligative андомс: андома , якамс: якама
+Delib +Sugg modality: deontic/directive/deliberative (the right term for this is still needed) андомс: андомсат

Tenses in the indicative and infrequently in the conditional

+Prs In Erzya there is no morphological distinction between present and future
+Prt1 Preterite 1
+Prt2 Preterite 2 (This is also used in predicate forms not involving a finite verb.)

Verb personal forms are:

+ScSg1 * subject conjugation first person singular
+ScSg2 * subject conjugation second person singular
+ScSg3 * subject conjugation third person singular
+ScPl1 * subject conjugation first person plural
+ScPl2 * subject conjugation second person plural
+ScPl3 * subject conjugation third person plural

Object conjugation

+OcSg1 * object conjugation first person singular
+OcSg2 * object conjugation second person singular
+OcSg3 * object conjugation third person singular
+OcPl1 * object conjugation first person plural
+OcPl2 * object conjugation second person plural
+OcPl3 * object conjugation third person plural
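
The subject and object conjugation tags above follow a regular naming pattern (conjugation type, number, person). A minimal sketch of decoding them, assuming only that pattern; the function name `decode_person_tag` is hypothetical:

```python
import re

# Sc/Oc + Sg/Pl + person digit, with an optional leading "+".
PERSON_TAG = re.compile(r"\+?(Sc|Oc)(Sg|Pl)([123])")
ROLE = {"Sc": "subject", "Oc": "object"}
NUMBER = {"Sg": "singular", "Pl": "plural"}


def decode_person_tag(tag):
    """Return (conjugation, number, person) or None for other tags."""
    match = PERSON_TAG.fullmatch(tag)
    if match is None:
        return None
    role, number, person = match.groups()
    return ROLE[role], NUMBER[number], int(person)


print(decode_person_tag("+ScSg1"))
print(decode_person_tag("+OcPl3"))
```

Tags that do not match the pattern, such as the possessive `+PxSg1`, are returned as `None` and are not guessed at.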

Other verb forms are:

+Act * active voice (exo-tradition)
+PrsPrc * present participle (only non-contrastive usage)
+DemPrc * present participle (both contrastive and non-contrastive)
+ActPrcLong {иы}й (This is dealt with elsewhere as an active present participle)
+ActPrcShort {иы} (This is dealt with elsewhere as an active present participle)
+ActDemPrc {иы}ця (This is dealt with elsewhere as an active present participle)
+ConNeg * connegative, main verb complement to Neg, vow-stem
+ConNegII * connegative, main verb complement to Neg, cons-stem
+Ger * gerund This is used with Der/Ozj and VAbl
+Inf * infinitive
+Neg * verb of negation эзь, аволь, иля, апак
+ConvPrc * converb OR participle апак
+Prc * participle
+VGen * Verb Genitive, genitive form participle
+VAbl * Verb Ablative “озадо”
+Prc/Telic * Telic participle “саевть”
+Der/Abe * ВтОмО
+Der/Cmpr * шка
+Der/A * adjective derived from N or V
+Der/N2A * adjective derived from N
+Der/V2A * adjective derived from V
+Subst * deverbal nouns retaining verb arguments/gov
+PrfPrc

Usage extents are marked using the following tags:

+Err/Orth * Substandard
+Err/Sub * Substandard
+Err/Orth-no-hyphen * тетятават should be тетят-ават
+Err/Orth-back-should-be-hard-front * back should be hard front
+Err/Orth-cons-stem * пачт емс 2012 пачтямс
+Err/Orth-freq-le * пачтнемс:пачле
+Err/Orth-cons-stem * эзь эряв
+Err/Orth-front-linking-vowel * linking vowel is front уряжень
+Err/Orth-high-linking-vowel * linking vowel is high
+Err/Orth-mid-linking-vowel-should-be-high * linking vowel is mid вечкелизь should be вечкилизь
+Err/Orth-mid-onset-default-missing * should be скаломок, but this is скалмок, мелезэнек: мельзэнек
+Err/Orth-no-linking-vowel * linking vowel is missing
+Err/Orth-shib-hard * Иважнэнь
+Err/Orth-stem-a-should-be-o0 * чачтомс+V:чачта
+Err/Orth-stem-hard-e-should-be-je * Nekshnems
+Err/Orth-stem-ja-should-be-je0 * лемдемс+V:лемдя
+Err/Orth-stem-je-should-be-ja * мелямс:меле
+Err/Orth-stem-je-should-be-je0 * чудемс+V:чуде чуд емс (->)чуде мс
+Err/Orth-je-for-jo * should be ё
+Err/Orth-vowel-stem-je * пачтякшномс:пачтекшне
+Err/Orth-stem-soft-should-be-0 * кирпець:кирпецьтне
+Err/Orth-stem-nodent-hard-should-be-tnje * потоктнэсэ
+Err/Orth-missing-soft-in-stem * видме
+Err/Orth-missing-t-in-def-pl * область: областне
+Err/Orth-s-to-j * кайсь Modern: кассь
+Err/Orth-z-to-j * кардайсэ Modern: кардазсо
+Err/Orth-v-loss-before-lab * ольной
+Err/Orth-split-tween * гемень, кавтово
+Err/Orth-0-not-pal * no soft sign but should take soft sign
+Err/Orth-f * not v but instead f
+Err/Orth-s * not v but instead s
+Err/Orth-d * not t but instead d
+Err/Orth-colloq * colloquial, e.g. Минорыч
+Err/Orth-old1 * old1 like озимь, морковь
+Err/Orth-pre1880 * orthography preceding 1880
+Err/Orth-pre1978 * orthography preceding 1978
+Err/Orth-pre2012 * previous orthography
+Use/Marg * Marginal
+Use/-Spell * Exclude from speller
+Use/SpellNoSugg * recognized but not suggested in speller
+Use/Circ * Circular path
+Use/CircN * Circular number path
+Use/-Ped * Remove from pedagogical speller
+Use/NG * Do not generate, for isme-ped.fst and apertium
+Use/GC – only retained in the HFST Grammar Checker disambiguation analyser
+Use/-GC – never retained in the HFST Grammar Checker disambiguation analyser
+Use/TTS – only retained in the HFST Text-To-Speech disambiguation tokeniser
+Use/-TTS – never retained in the HFST Text-To-Speech disambiguation tokeniser
+Err/Lex * The lemma is not an Erzya word (deprecating +Src/F)
+URL * For tagging URLs
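
To illustrate the intent of the usage tags, the sketch below filters plain analysis strings: it drops analyses tagged `+Use/-Spell` and lists any `+Err/` tags found in an analysis. This is not how the analyser's own build applies these tags; it only mimics their effect on strings. The analysis strings, the tag order inside them, and the function names are assumptions made for the example.

```python
def tags_of(analysis):
    """Return the '+'-separated tags of an analysis, without the lemma."""
    return analysis.split("+")[1:]


def speller_candidates(analyses):
    """Keep only analyses that are not excluded from the speller."""
    return [a for a in analyses if "Use/-Spell" not in tags_of(a)]


def error_tags(analysis):
    """List the error tags (Err/...) present in one analysis."""
    return [t for t in tags_of(analysis) if t.startswith("Err/")]


# Hypothetical analyses; the tag order is an assumption.
analyses = [
    "кудо+N+Sg+Nom",
    "кудо+N+Use/-Spell+Sg+Nom",
    "тетятават+N+Err/Orth-no-hyphen+Sg+Nom",
]
print(speller_candidates(analyses))
print(error_tags(analyses[2]))
```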

Dialect tags

+Dial/SH * Short forms
+Dial/L * Long forms
+Dial * No specification; specific to some dialects (Rueter 2010: 8)
+Dial/-C * Not central standard
+Dial/C * 1 Central or Kozlovka-Mokshlei
+Dial/W * 2 Western or Insar
+Dial/W-NW * 2 Western or Insar, subgroup NW
+Dial/W-SW * 2 Western or Insar, subgroup SW
+Dial/NW * 3 North-Western or Alatyr
+Dial/SE * 4 South-Eastern or Sura
+Dial/M * 5 Mixed or Drakino-Shoksha

Orthography tags

+Orth/PhonDeriv * Derivation is phonetic but declension and conjugation morphologic
+Orth/PhonInfl * Entire inflection is phonetic 1821, 1920-30
+Orth/standard * described in 2008, dictionary 2012
+Orth/thirties * 1939–1955 phonetic, morphological
+Orth/fifties * 1955–1978 phonetic, morphological
+Orth/seventies * 1978–1993 phonetic, morphological
+Orth/nineties * 1993-2008 morphological, but phonetic compounding
+Orth/wiki * Regular-semantic deriving from 1993 and 2008
+Orth/-wiki * e.g. compound words written with white space
+Orth/standard_wiki * e.g. вайгельпе
+Orth/-thirties * e.g. таргсемс, студенттнэ
+Orth/Colloq Colloquial speech reflected in spelling
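
The dated orthography tags above can be read as a lookup from publication year to tag. The sketch below encodes only the four periods that are given with explicit year ranges. Treating each range as half-open (start year included, end year excluded) is an assumption, since the listed ranges share their boundary years; `orth_tag_for_year` is a hypothetical helper.

```python
# (start, end, tag) with start inclusive and end exclusive (assumption).
ORTH_PERIODS = [
    (1939, 1955, "+Orth/thirties"),
    (1955, 1978, "+Orth/fifties"),
    (1978, 1993, "+Orth/seventies"),
    (1993, 2008, "+Orth/nineties"),
]


def orth_tag_for_year(year):
    """Return the orthography tag whose period contains the year, else None."""
    for start, end, tag in ORTH_PERIODS:
        if start <= year < end:
            return tag
    return None


print(orth_tag_for_year(1960))
print(orth_tag_for_year(1930))
```

Years outside the listed periods return `None` and are not mapped to `+Orth/standard` or `+Orth/PhonInfl`, whose date coverage is less precisely stated.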

Abbreviated words are classified with:

+ABBR * Abbreviation
+Symbol = independent symbols in the text stream, like £, €, ©
+ACR * Acronym

Special symbols

Delimiter marks are classified with:

+CLB +PUNCT +LEFT +RIGHT +MIDDLE *
%^excl *

The verbs are syntactically split according to transitivity:

+TV * transitive verb
+IV * intransitive verb
+NomAg Actor Noun From Verb - Nomen Agentis (ready)
+NomAct Action Noun From Verb - Nomen Actio (ready)
+Dimin Diminutive

Auxiliary verbs

+Aux *

Special multiword units are analysed with:

+Multi

Non-dictionary words can be recognised with:

+Guess

Question and Focus particles:

+Qst +Foc
+Acc for Russian
+All for Russian
+AnIn for Russian animate
+Anim for Russian
+Cmpar for Russian
+Count for Russian
+Epenth for Russian
+Imp for Russian imperative
+Impf for Russian
+Inan for Russian inanimate
+Ins for Russian
+Fac for Russian
+Fem for Russian feminine
+MFN for Russian
+Msc for Russian masculine
+Neu for Russian neuter
+Perf for Russian
+PObj for Russian
+Pos for Russian
+Prb for Russian
+Pred for Russian predicate
+PrsAct for Russian
+Pst for Russian

Semantic tags

Semantic tags to help disambiguation and syntactic analysis (placed before POS). Borrowed from main/langs/sme/src/morphology/root.lexc.

Simplex tags

+Sem/Act Activity
+Sem/Amount Amount
+Sem/Ani Animate
+Sem/Aniprod Animal Product
+Sem/Body Bodypart
+Sem/Body-abstr siellu, vuoigŋa, jierbmi
+Sem/Build Building
+Sem/Build-part Part of building, like the closet
+Sem/Cat Category
+Sem/Clth Clothes
+Sem/Clth-jewl Jewellery
+Sem/Clth-part part of clothes, boallu, sávdnji…
+Sem/Ctain Container
+Sem/Ctain-abstr Abstract container like bank account
+Sem/Ctain-clth
+Sem/Curr Currency like dollár, Not Money
+Sem/Dance Dance
+Sem/Dir Direction like GPS-kursa
+Sem/Domain Domain like politics, reindeerherding (a system of actions)
+Sem/Drink Drink
+Sem/Dummytag Dummytag
+Sem/Edu Educational event
+Sem/Event Event
+Sem/Feat Feature, like Árvu
+Sem/Feat-phys Physiological feature, ivdni, fárda
+Sem/Feat-psych Psychological feature
+Sem/Feat-measr Psychological feature
+Sem/Fem Female name
+Sem/Food Food
+Sem/Food-med Medicine
+Sem/Furn Furniture
+Sem/Game Game
+Sem/Geom Geometrical object
+Sem/Group Animal or Human Group
+Sem/Hum Human
+Sem/Hum-abstr Human abstract
+Sem/Ideol Ideology
+Sem/Kin Kinship term (special PxSg2 forms),
+Sem/Kin_Fem Kinship term (special PxSg2 forms), female
+Sem/Kin_Mal Kinship term (special PxSg2 forms), male
+Sem/Lang Language
+Sem/Mal Male name
+Sem/Mat Material for producing things
+Sem/Measr Measure
+Sem/Money Has to do with money, like wages, not Curr(ency)
+Sem/Obj Object
+Sem/Obj-clo Cloth
+Sem/Obj-cogn Cloth
+Sem/Obj-el (Electrical) machine or apparatus
+Sem/Obj-ling Object with something written on it
+Sem/Obj-rope flexible ropelike object
+Sem/Obj-surfc Surface object
+Sem/Org Organisation
+Sem/Part Feature, oassi, bealli
+Sem/Perc-cogn Cognitive perception
+Sem/Perc-emo Emotional perception
+Sem/Perc-phys Physical perception
+Sem/Perc-psych Physical perception
+Sem/Plant Plant
+Sem/Plant-part Plant part
+Sem/Plc Place
+Sem/Plc-abstr Abstract place
+Sem/Plc-elevate Place
+Sem/Plc-line Place
+Sem/Plc-water Place
+Sem/Pos Position (as in social position job)
+Sem/Process Process
+Sem/Prod Product
+Sem/Prod-audio Audio product
+Sem/Prod-cogn Cognition product
+Sem/Prod-ling Linguistic product
+Sem/Prod-vis Visual product
+Sem/Rel Relation
+Sem/Route Name of a Route
+Sem/Rule Rule or convention
+Sem/Semcon Semantic concept
+Sem/Sign Sign (e.g. numbers, punctuation)
+Sem/Sport Sport
+Sem/State
+Sem/State-sick Illness
+Sem/Substnc Substance, like Air and Water
+Sem/Sur Surname
+Sem/Fem-Sur Surname female
+Sem/Mal-Sur Surname male
+Sem/Symbol Symbol
+Sem/Time Time
+Sem/Tool Prototypical tool for repairing things
+Sem/Tool-catch Tool used for catching (e.g. fish)
+Sem/Tool-clean Tool used for cleaning
+Sem/Tool-it Tool used in IT
+Sem/Tool-measr Tool used for measuring
+Sem/Tool-music Music instrument
+Sem/Tool-write Writing tool
+Sem/Txt Text (girji, lávlla…)
+Sem/Veh Vehicle
+Sem/Wpn Weapon
+Sem/Wthr The Weather or the state of ground

Multiple Semantic tags:

+Sem/Act_Group
+Sem/Act_Plc
+Sem/Act_Route
+Sem/Amount_Build
+Sem/Amount_Semcon
+Sem/Ani_Body-abstr_Hum
+Sem/Ani_Build
+Sem/Ani_Build-part
+Sem/Ani_Build_Hum_Txt
+Sem/Ani_Group
+Sem/Ani_Group_Hum
+Sem/Ani_Hum
+Sem/Ani_Hum_Plc
+Sem/Ani_Hum_Time
+Sem/Ani_Plc
+Sem/Ani_Plc_Txt
+Sem/Ani_Time
+Sem/Ani_Veh
+Sem/Aniprod_Hum
+Sem/Aniprod_Obj-clo
+Sem/Aniprod_Perc-phys
+Sem/Aniprod_Plc
+Sem/Body-abstr_Prod-audio_Semcon
+Sem/Body_Body-abstr
+Sem/Body_Clth
+Sem/Body_Food
+Sem/Body_Group_Hum
+Sem/Body_Hum
+Sem/Body_Mat
+Sem/Body_Measr
+Sem/Body_Obj_Tool-catch
+Sem/Body_Plc
+Sem/Body_Time
+Sem/Build-part_Plc
+Sem/Build_Build-part
+Sem/Build_Clth-part
+Sem/Build_Edu_Org
+Sem/Build_Event_Org
+Sem/Build_Org
+Sem/Build_Route
+Sem/Clth-jewl_Curr
+Sem/Clth-jewl_Money
+Sem/Clth-jewl_Plant
+Sem/Clth_Hum
+Sem/Ctain-abstr_Org
+Sem/Ctain-clth_Plant
+Sem/Ctain-clth_Veh
+Sem/Ctain_Feat-phys
+Sem/Ctain_Furn
+Sem/Ctain_Tool
+Sem/Ctain_Tool-measr
+Sem/Curr_Org
+Sem/Dance_Org
+Sem/Dance_Prod-audio
+Sem/Domain_Food-med
+Sem/Domain_Prod-audio
+Sem/Edu_Event
+Sem/Edu_Group_Hum
+Sem/Edu_Mat
+Sem/Edu_Org
+Sem/Event_Food
+Sem/Event_Hum
+Sem/Event_Plc
+Sem/Event_Time
+Sem/Feat-phys_Tool-write
+Sem/Feat-phys_Veh
+Sem/Feat-phys_Wthr
+Sem/Feat-psych_Hum
+Sem/Feat_Plant
+Sem/Food_Perc-phys
+Sem/Food_Plant
+Sem/Game_Obj-play
+Sem/Geom_Obj
+Sem/Group_Hum
+Sem/Group_Hum_Org
+Sem/Group_Hum_Plc
+Sem/Group_Hum_Prod-vis
+Sem/Group_Org
+Sem/Group_Sign
+Sem/Group_Txt
+Sem/Hum_Lang
+Sem/Hum_Lang_Plc
+Sem/Hum_Lang_Time
+Sem/Hum_Obj
+Sem/Hum_Org
+Sem/Hum_Plant
+Sem/Hum_Plc
+Sem/Hum_Tool
+Sem/Hum_Veh
+Sem/Hum_Wthr
+Sem/Lang_Tool
+Sem/Mat_Plant
+Sem/Mat_Txt
+Sem/Measr_Time
+Sem/Money_Obj
+Sem/Money_Txt
+Sem/Obj-play
+Sem/Obj-play_Sport
+Sem/Obj_Semcon
+Sem/Clth-jewl_Org
+Sem/Org_Rule
+Sem/Org_Txt
+Sem/Org_Veh
+Sem/Part_Prod-cogn
+Sem/Perc-emo_Wthr
+Sem/Plant_Plant-part
+Sem/Plant_Tool
+Sem/Plant_Tool-measr
+Sem/Plc-abstr_Rel_State
+Sem/Plc-abstr_Route
+Sem/Plc_Pos
+Sem/Plc_Route
+Sem/Plc_Substnc
+Sem/Plc_Substnc_Wthr
+Sem/Plc_Time
+Sem/Plc_Tool-catch
+Sem/Plc_Wthr
+Sem/Prod-audio_Txt
+Sem/Prod-cogn_Txt
+Sem/Semcon_Txt
+Sem/Obj_State
+Sem/Substnc_Wthr
+Sem/Time_Wthr
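
The multiple semantic tags above appear to be built by joining simplex class names with underscores. Under that assumption, a combined tag can be expanded into its component tags as sketched below; `split_sem_tag` is a hypothetical helper. Note that a few simplex tags, such as `+Sem/Kin_Fem`, also contain an underscore, so this expansion would split them as well.

```python
SEM_PREFIX = "+Sem/"


def split_sem_tag(tag):
    """Expand '+Sem/A_B_C' into ['+Sem/A', '+Sem/B', '+Sem/C']."""
    if not tag.startswith(SEM_PREFIX):
        raise ValueError(f"not a semantic tag: {tag!r}")
    classes = tag[len(SEM_PREFIX):].split("_")
    return [SEM_PREFIX + name for name in classes]


print(split_sem_tag("+Sem/Ani_Build_Hum_Txt"))
print(split_sem_tag("+Sem/Plant"))
```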

Semantics are classified with

+Sem/Divinity Divinity (god personified),
+Sem/Constellation Constellation,
+Sem/Ant Anthroponym
+Sem/Fem Anthroponym female
+Sem/Mal Anthroponym male
+Sem/Patr Patronym
+Sem/Fem-Patr Patronym female
+Sem/Mal-Patr Patronym male
+Sem/Rvr name of river or water way, media of transportation,
+Sem/Mnth name of month
+Sem/Inanim Inanimate,

Semantic Fields

+Field/Agr agricultural
+Field/Anat anatomical
+Field/Bio biological
+Field/Bot botanical
+Field/Chem chemical
+Field/Geol geological
+Field/Gram grammatical
+Field/Hist historical
+Field/Law law
+Field/Mar maritime
+Field/Math mathematical
+Field/Med medical
+Field/Mus musical
+Field/Relig church
+Field/Tech technical
+Field/Zool zoological

Other tags

Verbal arguments

+Subj/Zero This is used to mark verbs without a semantic subject

Derivations are classified under the morphophonetic form of the suffix, the source and target part-of-speech.

+V→N +V→V +V→A

Homonymy

Der begin

+Der Placed in front of every derivation to make it possible to target derivations as a class, e.g. in regular expressions
+Der/VtOmO
+Der/AbeAttr
+Der/stO Deriving adverbs from adjectives A2Adv
+Der/ms эрзямс эрзя, истямс истя, вадрямс вадря
+Der/shka
+Der/GenAttr +Der/Onj genitive attribute derivation of non-nouns
+Der/aj vocative
+Der/kaj vocative
+Der/PatrMal Male patronymic
+Der/PatrFem Female patronymic
+Der/Ovt * telic deverbal noun also attr, resultative participle
+Der/Oms * infinitive illative
+Der/OmO * infinitive locative/nominative
+Der/OmstO * infinitive elative
+Der/OmsO * infinitive inessive
+Der/OmdO * infinitive ablative
+Der/Omga * infinitive prolative
+Der/Oma * modality: deontic/directive/obligative андомс: андома , якамс: якама
+Der/Omka * modality: deontic/directive/obligative андомс: андомка , якамс: якамка
+Der/Ycja * active (demonstrative) present participle takes copula person
+Der/Yj * active long present participle takes copula person
+Der/Y * active short present participle
+Der/Yks * active short present participle with ks derivation
+Der/Ozj * Gerund
+Der/Cond * conditional derivation +Der/Ynderja
+Der/NomAg Actor Noun From Verb - Nomen Agentis (derivation) default in Ыця
+Der/NomAct Action Noun From Verb - Nomen Actio (derivation)

Declaring noun derivations

+Der/pelj

Modifier without noun

+Der/MWN Modifier without Noun
+Der/Dem Speaker-Oriented Demonstrative

Conjugation of words other than finite verbs

+Der/Pr derivation to predicate head, e.g. nominal conjugation
+Der/Cop This is not a derivation
+Clt/Cop This will replace the nominal conjugation Der/Pr+V
+Clt/Cond

Declaring Indefinite Pronoun derivations

+Der/koj prefix +Indef in indefinite pronouns
+Der/ta prefix +Indef in indefinite pronouns
+Der/tago prefix +Indef in indefinite pronouns
+Der/Gak suffix +Indef in indefinite pronouns
+Der/buti suffix +Indef in indefinite pronouns
+Der/Yja suffix +Indef in indefinite pronouns ковия, зярыя

DECLARING NOUN DERIVATIONS

+Der/chi adjective-to-noun
the combinatory –Event– preceding the NP-final noun
+Der/OmA verb-to-noun

DECLARING NUMERAL DERIVATIONS

+Der/cje +A+Ord
+Der/tjks +A+Ord (non-contrastive)

DECLARING DEVERBAL DERIVATIONS OF VERBS

+Der/kshnO verb2verb derivation
+Der/OkshnOms verb2verb derivation
+Der/OvOms verb2verb derivation
+Der/OvkshnOms verb2verb derivation
+Der/OvtOms verb2verb derivation
+Der/Ovtnjems verb2verb derivation
+Der/Ozevems verb2verb derivation
+Der/Ozevtems verb2verb derivation
+Der/Ozevtnjems verb2verb derivation
+Der/Ozevkshnems verb2verb derivation
+Der/sje this in verb2verb derivation and also in denominal demonstrative –Der/Dem–
+Der/nje verb2verb derivation
+Der/njems verb2verb derivation
+Der/Oncje old orth кудонцесь
+Der/Dimin
+Der/ka diminutive
+Der/NJE This is used in ошке, калнэ and кудыне
+Der/nJE diminutive
+Der/Ynje diminutive
+Der/Ynjka diminutive
+Der/Ynjkinje diminutive
+Der/ke diminutive in –ке–
+Der/kinje diminutive
+Der/ks Adv›N
+OLang/SME - North Sámi
+OLang/SMJ - Lule Sámi
+OLang/SMA - South Sámi
+OLang/FIN - Finnish
+OLang/SWE - Swedish
+OLang/NOB - Norw. bokmål
+OLang/NNO - Norw. nynorsk
+OLang/ENG - English
+OLang/MYV - Erzya
+OLang/MDF - Moksha
+OLang/RUS - Russian
+OLang/TAT - Tatar
+OLang/UND - Undefined
+F - Foreign

Morphophonology

To represent phonologic variations in word forms we use the following symbols in the lexicon files:

And the following triggers to control variation:

{frontHard} — front harmony hard
{frontSoft} — front harmony soft
{back} — back harmony
{backHard} — back harmony
{dialM} — for Shoksha and Drakino Dial/M morphology
{ichPat} — for triggering colloquial patronymic forms
%^CnsRM — Remove consonant
Е3 testing тне тнэ
%^H used with stems in ч, ш, ж for hard plurals

Special letters in the root that might be useful in dialect research and etymology later

Ь3 арсемс:арсе arśems vs арсемс:арЬ3се aŕśems
Ӓ3 эрямс:Ӓ3ря
Ӓ4 пелемс:пӒ4ль
%^Ӓ3 ^Ӓ3 :Э
%^Ӓ4 ^Ӓ4 :Е
%^ӓ3 эрямс:^ӓ3ря
%^ӓ4 пелемс:п^ӓ4ль
%^Ь2ZERO removes stem-final soft sign
{дт} in ablative
{ое} inflectional suffix protovowel аволь аволинь
{оеэØ} Suffix-initial archiphoneme
{уиыØ} Suffix-initial archiphoneme in dialect
%^RegrRaise идиса, идима ! raising e:i, o:u before a in NW
%^Break ашоян disallow о:

вт{оеэ}мО1 suffix-internal archivowel

{оэØ} inessive, elative; this is the hard/broad s
{ОØ} Stem-final archiphoneme панго
{ЕØ} Stem-final archiphoneme тинге

%^OldAE — This allows Ӓ4 and Ӓ3 to be realized as я

%^NoLinkVow — No linking vowel is used only after consonants for error
%^SoftRetain — The soft sign is not lost when adding -тне
%^HardNoDent — Hard non-dent followed by -тнэ потоктнэсэ

MISC

+Cmp/Hyph A tag to indicate that a hyphen was used when compounding

Development tag

+WORK
+NoVowX
ZERO
%0
%-
+Dig1
+Dig2
+Dig3
+Dig4
+Rom Roman numerals

Compounding

+Cmp Dynamic compound - this tag should always be part of a dynamic compound. It is important for Apertium, and useful in other cases as well.
+Cmp/Hyph-Coll with nouns
+Cmp/Hyph-Redup with verbs
+Cmp/Hyph-Synonym with verbs
+Cmp/Hyph-Serial with verbs
+Cmp/Hyph-tejems with verbs

Imperative clitics

+Clt/Ga редяка Precative +Prec
+Clt/Gaja редякая
+Clt/Gajatj редякаять
+Clt/Gajatja редякаятя
+Clt/Gatja редякатя
+Clt/Gaka редякака ARE these real?
+Clt/Gakaja редякакая ARE these real?
+Pred2 secondary predicate. Examples: “Joe came in with his hat on.” “Joe came in Joe had his hat on.”

Tags distinguishing different versions of the same lemma (before POS)

+v1
+v2
+v3
+v4
+v5
+v6
+v7
+v8
+v9
+v10
+v11
+v12
+v13
+v14
+v15
+v16
+v17
+v18
+v19
+v20
+v21
+v22
+v23
+v24
+ACC +DAT +COM This marks a function not a morpheme
+NoPoss used with personal pronouns in oblique cases, where a possessor index is expected

Symbols that need to be escaped on the lower side (towards twolc):

»
«
> (written with square brackets, see the root.lexc file)
< (written with square brackets, see the root.lexc file)

Flag diacritics

We have manually optimised the structure of our lexicon using the following flag diacritics to restrict morphological combinatorics. Compounds with verbs are only allowed if the verb is further derived into a noun again:

| Flag diacritic | Explanation |
| --- | --- |
| @P.NeedNoun.ON@ | (Dis)allow compounds with verbs unless nominalised |
| @D.NeedNoun.ON@ | (Dis)allow compounds with verbs unless nominalised |
| @C.NeedNoun@ | (Dis)allow compounds with verbs unless nominalised |

For languages that allow compounding, the following flag diacritics are needed to control position-based compounding restrictions for nominals. Their use is handled automatically if combined with +CmpN/xxx tags. If not used, they will do no harm.

| Flag diacritic | Explanation |
| --- | --- |
| @P.CmpFrst.FALSE@ | Require that words tagged as such only appear first |
| @D.CmpPref.TRUE@ | Block such words from entering ENDLEX |
| @P.CmpPref.FALSE@ | Block these words from making further compounds |
| @D.CmpLast.TRUE@ | Block such words from entering R |
| @D.CmpNone.TRUE@ | Combines with the next tag to prohibit compounding |
| @U.CmpNone.FALSE@ | Combines with the prev tag to prohibit compounding |
| @P.CmpOnly.TRUE@ | Sets a flag to indicate that the word has passed R |
| @D.CmpOnly.FALSE@ | Disallow words coming directly from root. |

Use the following flag diacritics to control downcasing of derived proper nouns (e.g. Finnish Pariisi -> pariisilainen). See e.g. North Sámi for how to use these flags. There exists a ready-made regex that will do the actual down-casing given the proper use of these flags.

| Flag diacritic | Explanation |
| --- | --- |
| @U.Cap.Obl@ | Allowing downcasing of derived names: deatnulasj. |
| @U.Cap.Opt@ | Allowing downcasing of derived names: deatnulasj. |

Flags used to identify parts of speech

@P.POS.PRON@
@U.POS.N@
@U.POS.NUM@
@U.POS.A@
@P.POS.N@
@R.POS.N@
@P.POS.NUM@
@R.POS.NUM@
@P.POS.A@
@R.POS.A@
@P.POS.V@
@R.POS.V@
@C.POS@
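
The flag names on this page follow the usual `@OPERATOR.FEATURE.VALUE@` shape of flag diacritics, where the operator is one of P (set), N (set negatively), R (require), D (disallow), C (clear) and U (unify). The sketch below is a simplified model of those conventional semantics, written only to show how a sequence of flags can block or permit a path. It is not the runtime used by the analyser, and its handling of negatively set values under R and D is a simplification that may differ between toolkits. The function names and the flag sequences in the usage lines are hypothetical.

```python
import re

FLAG_RE = re.compile(r"@([PNRDCU])\.([^.@]+)(?:\.([^.@]+))?@")


def parse_flag(flag):
    """Split '@P.NeedNoun.ON@' into ('P', 'NeedNoun', 'ON')."""
    match = FLAG_RE.fullmatch(flag)
    if match is None:
        raise ValueError(f"not a flag diacritic: {flag!r}")
    return match.group(1), match.group(2), match.group(3)


def apply_flag(state, flag):
    """Return the new feature state, or None if the flag blocks the path.

    state maps feature -> (value, is_positive).
    """
    op, feat, val = parse_flag(flag)
    current = state.get(feat)
    new = dict(state)
    if op == "P":
        new[feat] = (val, True)
    elif op == "N":
        new[feat] = (val, False)
    elif op == "C":
        new.pop(feat, None)
    elif op == "R":
        # Without a value: the feature must be set at all.
        ok = current is not None if val is None else current == (val, True)
        if not ok:
            return None
    elif op == "D":
        blocked = current is not None if val is None else current == (val, True)
        if blocked:
            return None
    elif op == "U":
        if current is not None:
            cur_val, positive = current
            compatible = (cur_val == val) if positive else (cur_val != val)
            if not compatible:
                return None
        new[feat] = (val, True)
    return new


def path_allowed(flags):
    """Apply flags left to right; False as soon as one of them blocks."""
    state = {}
    for flag in flags:
        state = apply_flag(state, flag)
        if state is None:
            return False
    return True


print(path_allowed(["@P.NeedNoun.ON@", "@D.NeedNoun.ON@"]))
print(path_allowed(["@P.NeedNoun.ON@", "@C.NeedNoun@", "@D.NeedNoun.ON@"]))
```

Under this model, one plausible reading of the NeedNoun flags is that a verb entering a compound sets the feature, a nominalising derivation clears it, and a final disallow flag rejects paths where it is still set.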

Flags used with +Clt/Cop nonverbal predication

@U.PRED.NO@
@U.PRED.YES@
@C.PRED@

Flags used with transitivity

@U.TRANS.TV@
@U.TRANS.IV@
@P.TRANS.TV@
@P.TRANS.IV@

Flags used with serial verbs

@U.CONJ-INF.YES@
@U.CONJ-INF.NO@
@U.CONJ-TX.NONPAST@
@U.CONJ-TX.PRT1@
@U.CONJ-TX.PRT2@
@U.CONJ-MX.IND@
@D.CONJ-MX.IND@ 2012-11-04 should this be –D– or –N–
@U.CONJ-MX.IMP@
@U.CONJ-MX.OPT@
@U.CONJ-MX.PREC@
@U.CONJ-MX.DES@
@U.CONJ-MX.CONJ@
@U.CONJ-MX.COND@
@U.CONJ-CONNEG.YES@
@U.CONJ-CONNEG.NO@
@U.CONJ-NX.PL@
@U.CONJ-NX.SG@
@U.CONJ-POSS.1@
@U.CONJ-POSS.2@
@U.CONJ-POSS.3@
@U.CONJ-POSS.2ACC@
@U.CONJ-POSS.3ACC@
@U.CONJ-PX.10@
@U.CONJ-PX.12@
@U.CONJ-PX.13@
@U.CONJ-PX.15@
@U.CONJ-PX.16@
@U.CONJ-PX.20@
@U.CONJ-PX.21@
@U.CONJ-PX.23@
@U.CONJ-PX.24@
@U.CONJ-PX.26@
@U.CONJ-PX.30@
@U.CONJ-PX.31@
@U.CONJ-PX.32@
@U.CONJ-PX.33@
@U.CONJ-PX.34@
@U.CONJ-PX.35@
@U.CONJ-PX.36@
@U.CONJ-PX.40@
@U.CONJ-PX.42@
@U.CONJ-PX.43@
@U.CONJ-PX.45@
@U.CONJ-PX.46@
@U.CONJ-PX.50@
@U.CONJ-PX.51@
@U.CONJ-PX.53@
@U.CONJ-PX.54@
@U.CONJ-PX.56@
@U.CONJ-PX.60@
@U.CONJ-PX.61@
@U.CONJ-PX.62@
@U.CONJ-PX.63@
@U.CONJ-PX.64@
@U.CONJ-PX.65@
@U.CONJ-PX.66@
@R.CONJ-PX.13@
@R.CONJ-PX.16@
@R.CONJ-PX.23@
@R.CONJ-PX.26@
@R.CONJ-PX.33@
@R.CONJ-PX.36@
@R.CONJ-PX.43@
@R.CONJ-PX.46@
@R.CONJ-PX.53@
@R.CONJ-PX.56@
@R.CONJ-PX.63@
@R.CONJ-PX.66@
@P.CONJ.ObjAll@
@R.CONJ.ObjAll@
@C.CONJ@
@P.TLOSS.ON@
@R.TLOSS.ON@
@P.PossPx.Sg1@
@P.PossPx.Sg2@
@P.PossPx.Sg3@
@P.PossPx.Pl1@
@P.PossPx.Pl2@
@P.PossPx.Pl3@
@U.PossPx.S3@
@U.PossPx.SP3@
@U.PossPx.Sg1@
@U.PossPx.Sg2@
@U.PossPx.Sg3@
@U.PossPx.Pl1@
@U.PossPx.Pl2@
@U.PossPx.Pl3@
@D.PossPx@
@C.PossPx@
@P.TNUM.SG@
@P.TNUM.PL@
@D.TNUM.SG@
@D.TNUM.PL@
@C.TNUM@

problematic

@P.TPERS.1@
@P.TPERS.2@
@P.TPERS.3@
@N.TPERS.1@
@N.TPERS.2@
@N.TPERS.3@
@U.TPERS.1@
@U.TPERS.2@
@U.TPERS.3@
@C.TPERS@
@U.CX.ABE@
@U.CX.ABL@
@U.CX.CMP@
@U.CX.COM@
@U.CX.DAT@
@U.CX.ELA@
@U.CX.GEN@
@R.CX.ILL@
@D.CX.ILL@
@U.CX.ILL@
@U.CX.INE@
@U.CX.LAT@
@U.CX.LOC@
@U.CX.NOM@
@U.CX.PRL@
@U.CX.TRA@
@U.CX.PRL@
@U.CX.TEMP@
@N.CX.ILL@
@N.CX.INE@
@N.CX.LAT@
@N.CX.ELA@
@C.CX@
@P.DNUM.PL@
@P.DNUM.SG@
@U.DNUM.PL@
@U.DNUM.SG@
@C.DNUM@
@P.NUM.SG@
@P.NUM.PL@
@D.NUM.SG@
@D.NUM.PL@
@C.NUM@
@U.INDEF.KOI@
@U.INDEF.TA@
@U.INDEF.TAGO@
@U.INDEF.BUTI@
@U.INDEF.GAK@
@C.INDEF-PRON@
@P.INDEF.PREF@
@D.INDEF.PREF@
@R.INDEF.PREF@
@C.INDEF@

This allows or disallows combining with hyphen through loop especially for acronyms 2012-11-04

@U.HYPH-COMBO.ACRO@
@D.HYPH-COMBO.ACRO@
@C.HYPH-COMBO@

This disallows secondary compounding

@U.COMPOUND.YES@
@D.COMPOUND.YES@
@U.COMPOUND.NO@

Linking vowel for use with Translative

@P.LV.ON@
@P.LV.OFF@
@R.LV.ON@
@U.LV.ON@
@D.LV.ON@
@C.LV@
@C.CONJ-INF@
@C.CONJ-TX@
@C.CONJ-MX@
@C.CONJ-CONNEG@
@C.CONJ-NX@
@C.CONJ-PX@
@C.CONJ-POSS@
@C.KLOSS@
@C.TLOSS@

FLAGS USED WITH COLLECTIVE NOUNS

number

@U.DECL-NX.SG@
@U.DECL-NX.SP@
@U.DECL-NX.PL@
@R.DECL-NX.SG@
@R.DECL-NX.SP@
@R.DECL-NX.PL@

case

@U.DECL-CX.NOM@
@U.DECL-CX.ACC@
@U.DECL-CX.GEN@
@U.DECL-CX.DAT@
@U.DECL-CX.ABL@
@U.DECL-CX.ILL@
@U.DECL-CX.INE@
@U.DECL-CX.ELA@
@U.DECL-CX.LAT@
@U.DECL-CX.LOC@
@U.DECL-CX.TRA@
@U.DECL-CX.PRL@
@U.DECL-CX.COM@
@U.DECL-CX.TEMP@
@U.DECL-CX.ABE@
@U.DECL-CX.CMP@
@U.DECL-DX.DEF@
@U.DECL-DX.INDEF@
@U.DECL-DX.PX@

Removal

@C.DECL-NX@
@C.DECL-DX@
@C.DECL-CX@

Flag diacritic	Explanation
@U.number.one@	Flag used to give Arabic numerals in smj different cases
@U.number.two@	Flag used to give Arabic numerals in smj different cases
@U.number.three@	Flag used to give Arabic numerals in smj different cases
@U.number.four@	Flag used to give Arabic numerals in smj different cases
@U.number.five@	Flag used to give Arabic numerals in smj different cases
@U.number.six@	Flag used to give Arabic numerals in smj different cases
@U.number.seven@	Flag used to give Arabic numerals in smj different cases
@U.number.eight@	Flag used to give Arabic numerals in smj different cases
@U.number.nine@	Flag used to give Arabic numerals in smj different cases
@U.number.zero@	Flag used to give Arabic numerals in smj different cases

Russian letters from shared-urj_Cyrl а́ и́ о́ е́ у́

The word forms in Erzya start from the lexeme roots of the basic word classes, or optionally from prefixes. All continuation lexicons (approximately 20) follow here.

Hyphenated-nouns ; entire serial nouns
Hyphenated-verbs ; entire serial verbs

CyrillicFemaleName ; Emptied 2026-06-08, all moved to urj-Cyrl-propernouns.lexc HUNSPELL Type name derivation RussianMalenamesDerive ; ! RussianSurnamesDerive ;

увол-авол

alo-SPAT-1Arg ; >PO_KAL-LOC

This (part of) documentation was generated from src/fst/morphology/root.lexc