Multichar_Symbols and Root lexicon for Komi
Check these:
Analysis symbols
- +WORK to mark intermediate solutions
The morphological analyses of wordforms for the Komi-Zyrian
language are presented in this system in terms of the following symbols.
(It is highly suggested to follow existing standards when adding new tags.)
- +A: adjective кывберд прилагательное
- +Adp: adposition (preposition, postposition)
- +Adv: adverb урчитан наречие
- +CS: subordinating conjunction XX подчинительный союз
- +CC: coordinating conjunction XX сочинительный союз
- +Det: determiner XX XX
- +Interj: interjection междометтьӧ междометие
- +N: noun эмакыв - существительное
- +Num: numeral лыдакыв числительное
- +Pcle: particle кывтор частица
- +Po: postposition кывбӧр послелог
- +Pr: preposition XX предлог
- +Pron: pronoun нимвежтас местоимение
- +Qnt: Quantifier ХХ XX
- +V: verb кадакыв глагол
- +Ideoph These are ideophonic descriptors used to modify the verb – вырк ливтясь “flit and it flew off”
- +Deg Degree; deprecate AdA
- +Manner with reference to type of adverb
- +Mult multiplicative, i.e. iterations
- +Spat spatial
- +Temp temporal
- +Parenthetic parenthetical phrase
- +Presentational
Interjections
+Formulaic = expressions such as аттьӧ, ало, …
+Conative Used for calling animals, for example брысь, баль-баль, …
Nouns
- +Prop proper
- +CollN collective nouns, used with paired nouns
- +Relat relational noun: выв, ув
Pronouns
- +Dem: demonstrative
- +Indef: indefinite
- +Interr: interrogative
- +Pers: personal
- +Recipr: reciprocal
- +Refl: reflexive
- +Rel: relative
- +Poss: possessive
Nominals are inflected for Number and Case
Number
- +Sg singular
- +Pl plural
- +Du dual, for pronouns
Case
The following case categories are identified in Komi:
- +Acc accusative ZERO керан
- +Acc1 accusative -ӧс керан
- +Acc3 accusative -сӧ керан
- +Abl ablative case -лысь босьтан
- +Apr approximative -лань матыстчан
- +AprEgr approximative egressive -ланьсянь матысь ылыстчан
- +AprEla approximative elative -ланьысь матысь петан
- +AprIll approximative illative -ланьӧ матӧ матыстчан
- +AprIne approximative inessive -ланьын матыс ина
- +AprPrl approximative prolative -ланьӧд маті вуджан
- +AprTer approximative terminative -ланьӧдз матіӧдз воан
- +AprTra approximative translative -ланьті маті вуджан
- +Car caritive -тӧг торйӧдан
- +Cns consecutive -ла могман
- +Com Comitative -кӧд ӧтвывтан
- +Cmpr Comparative case form -ся ӧткодялан
- +Cmpl Postposition complement
- +Dat dative case -лы сетан
- +Egr egressive -сянь ылыстчан
- +Ela elative -ысь петан
- +Gen genitive case -лӧн асалан
- +Ill illative -ӧ пыран
- +Ine inessive -ын ина
- +Ins instrumental -ӧн керанторъя
- +Nom nominative case нимтан
- +Prl prolative -ӧд вуджан
- +Tra translative -ті вуджан
- +Ter Terminative -ӧдз матыстчан
- +Voc Vocative ??
- +Abs Absolute = +Sg+Nom
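The tags above are combined into analysis strings. As a minimal sketch, assuming an analysis is written as a lemma followed by tags that each start with `+` (the strings `керка+N+Sg+Ine` and `керка+N+Abs` are hypothetical examples assembled from tags in this list, and the helper names `parse_analysis` and `expand_abs` are illustrative, not part of the toolchain), such a string can be split and the `+Abs` shorthand expanded like this:

```python
from typing import List, NamedTuple


class Analysis(NamedTuple):
    lemma: str
    tags: List[str]


def parse_analysis(analysis: str) -> Analysis:
    """Split 'lemma+Tag1+Tag2' into the lemma and its '+Tag' symbols."""
    lemma, *tags = analysis.split("+")
    return Analysis(lemma, ["+" + tag for tag in tags])


def expand_abs(tags: List[str]) -> List[str]:
    """Rewrite +Abs as +Sg +Nom, following the note '+Abs Absolute = +Sg+Nom'."""
    expanded = []
    for tag in tags:
        expanded.extend(["+Sg", "+Nom"] if tag == "+Abs" else [tag])
    return expanded


# Hypothetical analysis strings, built only from tags documented above.
inessive = parse_analysis("керка+N+Sg+Ine")
absolute = parse_analysis("керка+N+Abs")
print(inessive.lemma, inessive.tags)
print(expand_abs(absolute.tags))
```

Tags containing a slash, such as `+Sem/Hum` or `+Der/лун`, pass through unchanged because only `+` is treated as a separator.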
Possessive suffixes
- +PxSg1 +PxSg2 +PxSg3 +PxPl1 +PxPl2 +PxPl3
- +Px1 +Px2 +Px3
- +So/CP segment ordering: case, person
- +So/PC segment ordering: person, case
- +Attr +Card
- +Ord
- +Iter Iterative form expressing number of times
- +Tot
- +Arab +Rom
- +Coll
Quantifiers (numerals)
- +Appr: Approximative numeral кавто-колмо, колмошка two or three NB! do not confuse with Komi case +Apr
- +AssocColl: -ne- ; avide-
- +Assoc: +мезть
- +Card: cardinal + NCard
- +ZeroColl: Zero collective кодныс
- +Distr: Distributive
- +Iter: Iterative form expressing number of consecutive times; kpv: кыкысь
- +Mult: Multiplicative adverbs number of times; kpv: кык пӧв
- +Coord: Coordinates, i.e. 65˚36′8,30″ in numerals.lexc
- +Cop: this is for copula complement predicate position with pl in -ӧсь (deprecated: Pred)
- +Ind +Prs +Prt1 +Prt2 +Fut +Imprt tense
- +Sg1 +Sg2 +Sg3 +Pl1 +Pl2 +Pl3 +Du1 person тэа-меа
- +1 +2 +3 Final мед ог so that I/we won’t 2019-04-06
- +Inf
- +Ger Gerund This is used with derivations
- +ConNeg +Neg
- +VAbess тӧм Participle verbal adjective, see also Der/Abe
- +VCar тӧг Gerund
- +VTer тӧдз Gerund
- +Final мог, мон, моз ‘so that I won’t’
- +TV
- +IV
- +Aux
- +ABBR +ACR
- +Acron
- +Symbol = independent symbols in the text stream, like £, €, ©
Special symbols are classified with:
- +CLB +PUNCT +LEFT +RIGHT
Special multiword units are analysed with:
- +Multi
- +Guess
Question and Focus particles:
- +v1
- +v2
- +v3
- +v4
- +v5
- +v6
- +v7
- +v8
- +v9
- +v10
- +v11
- +v12
- +v13
- +v14
- +v15
- +v16
- +v17
- +v18
- +v19
- +v20
- +v21
- +v22
- +v23
- +v24
Dialect features
Check these Where do these come from source
- +Src/F foreign source apparently 2015-09-08
- +Dim diminutive for verbs -ышт- (there might be a better term)
- +Dimin diminutive for nouns -тор-
- +NonHum look at this and place somewhere
Semantic tags to help disambiguation & synt. analysis: (before POS)
Borrowed from main/langs/sme/src/morphology/root.lexc
- +Sem/Act Activity
- +Sem/Amount Amount
- +Sem/Ani Animate
- +Sem/Aniprod Animal Product
- +Sem/Body Bodypart
- +Sem/Body-abstr siellu, vuoigŋa, jierbmi
- +Sem/Build Building
- +Sem/Build-part Part of Building, like the closet
- +Sem/Cat Category
- +Sem/Clth Clothes
- +Sem/Clth-jewl Jewellery
- +Sem/Clth-part part of clothes, boallu, sávdnji…
- +Sem/Ctain Container
- +Sem/Ctain-abstr Abstract container like bank account
- +Sem/Ctain-clth
- +Sem/Curr Currency like dollár, Not Money
- +Sem/Dance Dance
- +Sem/Dir Direction like GPS-kursa
- +Sem/Domain Domain like politics, reindeerherding (a system of actions)
- +Sem/Drink Drink
- +Sem/Dummytag Dummytag
- +Sem/Edu Educational event
- +Sem/Event Event
- +Sem/Feat Feature, like Árvu
- +Sem/Feat-phys Physiological feature, ivdni, fárda
- +Sem/Feat-psych Psychological feature
- +Sem/Feat-measr Psychological feature
- +Sem/Fem Female name
- +Sem/Food Food
- +Sem/Food-med Medicine
- +Sem/Furn Furniture
- +Sem/Game Game
- +Sem/Geom Geometrical object
- +Sem/Group Animal or Human Group
- +Sem/Hum Human
- +Sem/Hum-abstr Human abstract
- +Sem/Ideol Ideology
- +Sem/Lang Language
- +Sem/Mal Male name
- +Sem/Mat Material for producing things
- +Sem/Measr Measure
- +Sem/Money Has to do with money, like wages, not Curr(ency)
- +Sem/Obj Object
- +Sem/Obj-clo Cloth
- +Sem/Obj-cogn Cloth
- +Sem/Obj-el (Electrical) machine or apparatus
- +Sem/Obj-ling Object with something written on it
- +Sem/Obj-rope flexible ropelike object
- +Sem/Obj-surfc Surface object
- +Sem/Org Organisation
- +Sem/Part Feature, oassi, bealli
- +Sem/Perc-cogn Cognitive perception
- +Sem/Perc-emo Emotional perception
- +Sem/Perc-phys Physical perception
- +Sem/Perc-psych Physical perception
- +Sem/Plant Plant
- +Sem/Plant-part Plant part
- +Sem/Plc Place
- +Sem/Plc-abstr Abstract place
- +Sem/Plc-elevate Place
- +Sem/Plc-line Place
- +Sem/Plc-water Place
- +Sem/Pos Position (as in social position job)
- +Sem/Process Process
- +Sem/Prod Product
- +Sem/Prod-audio Audio product
- +Sem/Prod-cogn Cognition product
- +Sem/Prod-ling Linguistic product
- +Sem/Prod-vis Visual product
- +Sem/Rel Relation
- +Sem/Route Name of a Route
- +Sem/Rule Rule or convention
- +Sem/Semcon Semantic concept
- +Sem/Sign Sign (e.g. numbers, punctuation)
- +Sem/Sport Sport
- +Sem/State
- +Sem/State-sick Illness
- +Sem/Substnc Substance, like Air and Water
- +Sem/Sur Surname
- +Sem/Symbol Symbol
- +Sem/Time Time
- +Sem/Tool Prototypical tool for repairing things
- +Sem/Tool-catch Tool used for catching (e.g. fish)
- +Sem/Tool-clean Tool used for cleaning
- +Sem/Tool-it Tool used in IT
- +Sem/Tool-measr Tool used for measuring
- +Sem/Tool-music Music instrument
- +Sem/Tool-write Writing tool
- +Sem/Txt Text (girji, lávlla…)
- +Sem/Veh Vehicle
- +Sem/Wpn Weapon
- +Sem/Wthr The Weather or the state of ground
- +Sem/Year
- +Sem/Sur-Fem Surname female
- +Sem/Sur-Mal Surname male
- +Sem/Ant Anthroponym
- +Sem/Ant-Fem Anthroponym female
- +Sem/Ant-Mal Anthroponym male
- +Sem/Patr Patronym
- +Sem/Patr-Fem Patronym female
- +Sem/Patr-Mal Patronym male
- +Sem/Ant_Fem
- +Sem/Ant_Mal
- +Sem/Patr-Маl
- +Sem/Event_Plc сёянін
- +Sem/Hum_Prof profession, capacity doctor, tractor driver
Derivation
Derivations are classified under the morphophonetic form of the suffix, the
source and target part-of-speech.
- +Der/xxx
- +Der In front of every derivation to make it possible to target derivations as a class e.g. in regular expressions etc
- +Der/La
- +Der/Ан Process Participle +AN
- +Der/Ана Process Participle +ANA, Gerund or participle according to context (with…)
- +Der/Анаа adverb derived from participle (+ANA) +ANAA
- +Der/чӧж +CHOZH
- +Der/тӧг
- +Der/Abe тӧм should take +A, see also +VAbess
- +Der/Patr patronymics in Russian
- +Instr
- +NomAct
- +Der/NomAct +Event
- +Der/NomAg
- +Duration
- +Der/иг
- +Der/Ig
- +Der/IgKezhlo
- +Der/IgKosta
- +Der/IgKosti
- +Der/IgMoz %{иі%}гмоз
- +Der/IgonMoz %{иі%}гӧнмоз
- +Der/IgSor %{иі%}гсор
- +Der/IgTyr %{иі%}гтыр
- +Der/IgTyrji %{иі%}гтырйи
- +Der/IgTyrja %{иі%}гтыръя
- +Der/IgChozh %{иі%}гчӧж
- +Der/ысь
- +ActPrsPtc
- +PrsPrc
- +PrsPtc
- +PastPtc
- +Der/кості +KOSTI
- +Der/коста +KOSTA
- +Der/кежлӧ +KEZHLO
- +Der/мысь +MYS
- +Der/мысьт +MYST
- +Der/сор = +SOR
- +Der/тыр = +TYR
- +Der/тырйи = +TYRJI
- +Der/тыръя = +TYRJA
- +Der/мӧн = +MON
- +Der/мӧнъя = Ӧнія коми кыв. 2000: 399-403
- +Der/ӧмӧн = +OMON !Ӧнія коми кыв. 2000: 425
Declaring adjectival derivations
Noun phrase modifiers are generally considered derivational
- +MAbe abessive modifier -тӧм
- +MLoc locative modifier са -
- +MHab habeo modifier а -
- +MTmp temporal modifier ся -
- +Der/ProprietiveMod = +Der/APrior Denominal proprietive adjective Der/а
- +Der/PrivMod = тӧм
- +Der/а
- +Der/са
- +Der/ся
- +Der/Иник
- +Der/Ин
- +Der/увса
- +Der/сайса
- +Der/пӧвстса
- +Der/костса
- +Der/бердса
- +Der/бӧрса
- +Der/весьтса
- +Der/водзса
- +Der/вывса
- +Der/гӧгӧрса
- +Der/дорса
- +Loc LocMod, IneMod Быд во шедӧдӧны бур успеваемость Воркута да Инта каръясса, Прилузскӧй да Княжпогостскӧй районъясса школаяс.
- +LocMod move to Loc
- +CompMod
- +Der/тӧм used with nouns and followed by +AbeMod
- +Abe PrivMod, AbeMod джуджыд анализъястӧм да обобщениеястӧм статьяяс.
- +PrivMod move to Abe
- +Prp ProprietiveMod, HabObjMod Весиг киясыс тӧдсаӧсь, найӧ мугов рӧмаӧсь, кузь чорыд чуньясаӧсь.
- +ProprietiveMod move to Prp
- +Der/TempMod TempMod Der/ся но и Ф. В. Плесовскийлысь квайтымынӧд вояссяяссӧ * позьӧ аддзыны сӧмын библиотекаясысь.
Declaring spatial adverb derivations; see also spatial postpositions
- +Der/ла
- +Der/ладор
- +Der/дор
- +Der/выв
- +Der/тор
- +MWN check! used once, should it be +Der/MWN?, Well, yes.
- +Der/MWN
- +Der/мед Superlative
Declaring Indefinite Pronoun derivations
- +Der/сюрӧ +Der/кӧ
- +Der/моз +MOZ diminishing, kind of, sort of
- +Der/кодь diminishing, kind of, sort of
- +Der/лун adjective-to-noun
- +Der/ӧм verb-to-noun; the combinatory +Event preceding the NP-final noun
Declaring Deverbal derivations of verbs
- +Der/л
- +Der/лы
- +Der/ывлы
- +Der/ышт
- +Der/лывлы
- +Der/сь This only occurs following a vowel in an yny-stem 2017-09-19
- +Der/сьы 2017-09-19
- +Der/ч This appears to be a variant of +Der/сьы; it follows plosives
- +Der/чы This appears to be a variant of +Der/сьы; it follows plosives
- +Der/ал
- +Der/овт
- +Der/ась
- +Der/N Noun derived with conversion from noun, conversion but not ZERO
- +Der/A Adjective derived from Noun or Verb
- +Der/Adv Adverb derived from Adjective
- +EOLang/BXR
- +EOLang/CHM
- +EOLang/KOI
- +EOLang/KOM
- +EOLang/KPV
- +EOLang/MHR
- +EOLang/MRJ
- +EOLang/MDF
- +EOLang/MYV
- +EOLang/RUS
- +EOLang/YRK
Morphophonology
To represent phonological variation in word forms we use the following
symbols in the lexicon files:
Archiphonemes
- {aä}: Vowel alternating symbol
- {oö}: Vowel alternating symbol
- {uü}: Vowel alternating symbol
- %^к2 %^л2 %^м2 %^т2 %^ь2 %^К2 %^Л2 %^М2 %^Т2 %^Ь2 %^И2
- %^V1 for reduplicated vowel унаӧн > унаан
- %> suffix border
- %{иі%}: for soft and hard
- %{ая%}: for soft and hard
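To illustrate what an archiphoneme such as `%{иі%}` stands for, the sketch below enumerates the candidate surface strings obtained by picking each letter inside the braces in turn. It is only an illustration under stated assumptions: the function name `candidate_forms` is hypothetical, the input `%{иі%}гмоз` is taken from the derivation list earlier in this document, and the enumeration does not model the rules that actually select between the soft and hard variants.

```python
import itertools
import re

# An escaped archiphoneme looks like %{иі%}; the group captures its letters.
ARCHI = re.compile(r"%\{(.+?)%\}")


def candidate_forms(template: str) -> list:
    """List every string obtained by choosing one letter per archiphoneme."""
    parts = ARCHI.split(template)
    # With one capture group, re.split alternates literal text (even indexes)
    # and the captured archiphoneme letters (odd indexes).
    choices = [list(part) if index % 2 else [part]
               for index, part in enumerate(parts)]
    return ["".join(combo) for combo in itertools.product(*choices)]


for form in candidate_forms("%{иі%}гмоз"):
    print(form)
```

A template with two archiphonemes of two letters each would yield four candidates; in the real system only the contextually correct one survives.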
Triggers to control variation
- {front}: Vowel change triggers
- {back}: Vowel change triggers
- %^Close Close syllable, this triggers final consonant drop, seen in word-final position and before consonant
- %^C2V Consonant v to vowel, Izhva ныы, ооны
- +%<acc%> accusative
- +%<ela%> elative -ысь
- +%<ins%> instrumental -ӧн
- +%<inf_ны%> infinitive in -ны
- +%<po_вылӧ%> postposition вылӧ
- +%<sub_мый%> subordinate clause in мый/that
Symbols that need to be escaped on the lower side (towards twolc):
- »
- «
- > (written with square brackets, see the root.lexc file)
- < (written with square brackets, see the root.lexc file)
Flag diacritics
We have manually optimised the structure of our lexicon using the following
flag diacritics to restrict morphological combinatorics: compounds with verbs
are only allowed if the verb is further derived into a noun again:
| Flags | Explanation |
| --- | --- |
| @P.NeedNoun.ON@ | (Dis)allow compounds with verbs unless nominalised |
| @D.NeedNoun.ON@ | (Dis)allow compounds with verbs unless nominalised |
| @C.NeedNoun@ | (Dis)allow compounds with verbs unless nominalised |
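The three NeedNoun flags use the standard flag-diacritic operators: `P` sets a feature to a value, `D` blocks a path on which the feature currently holds that value, and `C` clears the feature. The following sketch simulates just these three operators on a flat string. The function name `path_is_allowed` and the example paths (with the placeholder words `verb` and `noun`) are hypothetical; a real transducer evaluates flags while traversing the network, not by scanning text.

```python
import re

# Matches @P.Feature.Value@, @D.Feature.Value@, @D.Feature@ and @C.Feature@.
FLAG = re.compile(r"@([PDC])\.([^.@]+)(?:\.([^.@]+))?@")


def path_is_allowed(path: str) -> bool:
    """Return True if no D flag on the path is violated."""
    features = {}
    for op, feature, value in FLAG.findall(path):
        if op == "P":        # positive set: feature := value
            features[feature] = value
        elif op == "C":      # clear: forget the feature
            features.pop(feature, None)
        elif op == "D":      # disallow: fail if the feature holds this value
            current = features.get(feature)   # (or any value, if none is given)
            if current is not None and (value == "" or current == value):
                return False
    return True


# A verb sets NeedNoun and reaches the final check without being nominalised.
print(path_is_allowed("@P.NeedNoun.ON@verb@D.NeedNoun.ON@"))
# The same verb is nominalised, which clears the flag before the check.
print(path_is_allowed("@P.NeedNoun.ON@verb@C.NeedNoun@noun@D.NeedNoun.ON@"))
```

The first path is rejected and the second accepted, which is the "only if further derived into a noun" restriction described above.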
Two flags copied from sme
| Flags | Explanation |
| --- | --- |
| @P.Pmatch.Loc@ | Used on multi-token analyses; tells hfst-tokenise/pmatch where in the form/analysis the token should be split. |
| @P.Pmatch.Backtrack@ | Used on single-token analyses; tells hfst-tokenise/pmatch to backtrack by reanalysing the substrings before and after this point in the form (to find combinations of shorter analyses that would otherwise be missed). |
Compounding
- +Cmp
- +Cmp/Serial used with serial verbs
- +Cmp/SplitR
Flags
For languages that allow compounding, the following flag diacritics are needed
to control position-based compounding restrictions for nominals. Their use is
handled automatically if combined with +CmpN/xxx tags. If not used, they will
do no harm.
| Flags | Explanation |
| --- | --- |
| @P.CmpFrst.FALSE@ | Require that words tagged as such only appear first |
| @D.CmpPref.TRUE@ | Block such words from entering ENDLEX |
| @P.CmpPref.FALSE@ | Block these words from making further compounds |
| @D.CmpLast.TRUE@ | Block such words from entering R |
| @D.CmpNone.TRUE@ | Combines with the next tag to prohibit compounding |
| @U.CmpNone.FALSE@ | Combines with the prev tag to prohibit compounding |
| @P.CmpOnly.TRUE@ | Sets a flag to indicate that the word has passed R |
| @D.CmpOnly.FALSE@ | Disallow words coming directly from root. |
Use the following flag diacritics to control downcasing of derived proper
nouns (e.g. Finnish Pariisi -> pariisilainen). See e.g. North Sámi for how to use
these flags. There exists a ready-made regex that will do the actual down-casing
given the proper use of these flags.
| Flags | Explanation |
| --- | --- |
| @U.Cap.Obl@ | Always capital letter for names: Deatnu. |
| @U.Cap.Opt@ | Allowing downcasing of derived names: deatnulasj. |
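The effect of the two Cap flags can be pictured with a small sketch. This is not the ready-made regex mentioned above, only an illustration of the intended outcome; the function name `allowed_casings` is hypothetical, and it assumes that `Opt` permits both the capitalised form and a variant with a lower-case initial.

```python
def allowed_casings(form: str, cap_flag: str) -> list:
    """Return the surface casings a capitalised form may take under a Cap flag."""
    if cap_flag == "@U.Cap.Obl@":      # obligatory capital: keep the form as is
        return [form]
    if cap_flag == "@U.Cap.Opt@":      # optional capital: also offer a downcased initial
        lowered = form[:1].lower() + form[1:]
        return [form] if lowered == form else [form, lowered]
    raise ValueError(f"not a Cap flag: {cap_flag}")


print(allowed_casings("Deatnu", "@U.Cap.Obl@"))
print(allowed_casings("Deatnulasj", "@U.Cap.Opt@"))
```

In the root lexicon the same toponym lexicon is entered twice, once under each flag, so that names stay capitalised while their adjectival derivations may be downcased.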
| Flags | Explanation |
| --- | --- |
| @U.CONJ-VAL.TV@ | Flags used with serial verbs: VAL = Valence |
| @U.CONJ-VAL.IV@ | Flags used with serial verbs: VAL = Valence |
| @U.CONJ-INF.YES@ | INF = Infinitive |
| @U.CONJ-INF.NO@ | INF = Infinitive |
| @U.CONJ-TX.FUT@ | TX = tense |
| @U.CONJ-TX.PRES@ | TX = tense |
| @U.CONJ-TX.PRET1@ | TX = tense |
| @U.CONJ-TX.PRET2@ | TX = tense |
| @U.CONJ-GER.IG@ | GER = gerund |
| @U.CONJ-GER.VCAR@ | GER = VCar тӧг |
| @U.CONJ-GER.VCARMoz@ | GER = VCar тӧгмоз |
| @U.CONJ-GER.VMON@ | GER = VMon мӧн |
| @U.CONJ-GER.VTER@ | GER = VTer тӧдз |
| @U.CONJ-MX.IND@ | MX = mood |
| @U.CONJ-MX.IMP@ | MX = mood |
| @U.CONJ-CONNEG.YES@ | CONNEG = negation |
| @U.CONJ-CONNEG.NO@ | CONNEG = negation |
| @U.CONJ-NX.PL@ | NX = number |
| @U.CONJ-NX.SG@ | NX = number |
| @U.CONJ-POSS.1@ | POSS = possessive, person 1 |
| @U.CONJ-POSS.2@ | POSS = possessive 2 |
| @U.CONJ-POSS.3@ | POSS = possessive 3 |
| @U.CONJ-POSS.2ACC@ | POSS = possessive etc. |
| @U.CONJ-POSS.3ACC@ | POSS = possessive |
| @U.CONJ-PX.1@ | PX = person |
| @U.CONJ-PX.2@ | PX = person |
| @U.CONJ-PX.3@ | PX = person |
| @C.CONJ-VAL@ | Removal |
| @C.CONJ-INF@ | Removal |
| @C.CONJ-TX@ | Removal |
| @C.CONJ-MX@ | Removal |
| @C.CONJ-GER@ | Removal |
| @C.CONJ-CONNEG@ | Removal |
| @C.CONJ-NX@ | Removal |
| @C.CONJ-PX@ | Removal |
| @C.CONJ-POSS@ | Removal |
| @P.PossPx.Sg1@ | Flags used with collective nouns |
| @P.PossPx.Sg2@ | Flags used with collective nouns |
| @P.PossPx.Sg3@ | Flags used with collective nouns |
| @P.PossPx.Pl1@ | Flags used with collective nouns |
| @P.PossPx.Pl2@ | Flags used with collective nouns |
| @P.PossPx.Pl3@ | Flags used with collective nouns |
| @U.PossPx.Sg1@ | Flags used with collective nouns |
| @U.PossPx.Sg2@ | Flags used with collective nouns |
| @U.PossPx.Sg3@ | Flags used with collective nouns |
| @U.PossPx.Pl1@ | Flags used with collective nouns |
| @U.PossPx.Pl2@ | Flags used with collective nouns |
| @U.PossPx.Pl3@ | Flags used with collective nouns |
| @D.PossPx@ | Flags used with collective nouns |
| @C.PossPx@ | Flags used with collective nouns |
| @U.DECL-NX.SG@ | number |
| @U.DECL-NX.PL@ | number |
| @R.DECL-NX.PL@ | number |
| @U.DECL-CX.ABE@ | unify case |
| @U.DECL-CX.ABL@ | unify case |
| @U.DECL-CX.ACC@ | unify case |
| @U.DECL-CX.APR@ | unify case |
| @U.DECL-CX.APRINE@ | unify case |
| @U.DECL-CX.APRILL@ | unify case |
| @U.DECL-CX.APRELA@ | unify case |
| @U.DECL-CX.APREGR@ | unify case |
| @U.DECL-CX.APRPRL@ | unify case |
| @U.DECL-CX.APRTRA@ | unify case |
| @U.DECL-CX.APRTER@ | unify case |
| @U.DECL-CX.CAR@ | unify case |
| @U.DECL-CX.CMP@ | unify case |
| @U.DECL-CX.CNS@ | unify case |
| @U.DECL-CX.COM@ | unify case |
| @U.DECL-CX.DAT@ | unify case |
| @U.DECL-CX.EGR@ | unify case |
| @U.DECL-CX.ELA@ | unify case |
| @U.DECL-CX.GEN@ | unify case |
| @U.DECL-CX.ILL@ | unify case |
| @U.DECL-CX.INE@ | unify case |
| @U.DECL-CX.INS@ | unify case |
| @U.DECL-CX.NOM@ | unify case |
| @U.DECL-CX.PRL@ | unify case |
| @U.DECL-CX.TRA@ | unify case |
| @U.DECL-CX.TER@ | unify case |
| @U.DECL-DX.INDEF@ | declension type |
| @U.DECL-DX.PX@ | declension type |
| @C.DECL-NX@ | Removal |
| @C.DECL-DX@ | Removal |
| @C.DECL-CX@ | Removal |
| @U.Cap.Obl@ | Always capital letter for names: Deatnu |
| @U.Cap.Opt@ | Allowing downcasing of derived names: deatnulasj |
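Most flags in this table use the `U` (unify) operator, a few use `R` (require), and each feature has a `C` (clear, listed as "Removal") counterpart. In standard flag-diacritic semantics, `U` succeeds when the feature is unset or already holds the same value, `R` succeeds only when the feature already holds the given value, and `C` unsets it. The sketch below simulates that behaviour on a flat string; the function name `unifies` and the flag sequences are hypothetical illustrations, not paths taken from the lexicon.

```python
import re

# Matches @U.Feature.Value@, @R.Feature.Value@, @R.Feature@ and @C.Feature@.
FLAG = re.compile(r"@([URC])\.([^.@]+)(?:\.([^.@]+))?@")


def unifies(path: str) -> bool:
    """Return True if all U and R flags on the path are compatible."""
    features = {}
    for op, feature, value in FLAG.findall(path):
        current = features.get(feature)
        if op == "U":        # unify: set if unset, otherwise values must agree
            if current is not None and current != value:
                return False
            features[feature] = value
        elif op == "R":      # require: feature must be set (to this value, if given)
            if current is None or (value != "" and current != value):
                return False
        elif op == "C":      # clear ("Removal"): forget the feature
            features.pop(feature, None)
    return True


print(unifies("@U.DECL-CX.INE@@U.DECL-CX.INE@"))               # same case twice
print(unifies("@U.DECL-CX.INE@@U.DECL-CX.ILL@"))               # conflicting cases
print(unifies("@U.DECL-CX.INE@@C.DECL-CX@@U.DECL-CX.ILL@"))    # cleared in between
print(unifies("@U.DECL-NX.SG@@R.DECL-NX.PL@"))                 # plural required, singular set
```

This is why every unification feature is paired with a Removal flag: without clearing, a value set early in a word would constrain everything that follows.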
Lexicon Root
The word forms in the Komi (Zyrian) language start from the lexeme roots of basic
word classes, or optionally from prefixes:
- SUBSTANDARDS ; temporary solution
- adjectives ;
- adjectives-russian-like ;
- kom-adjectives-russian-like ;
- adpositions ;
- adverbs ;
- conjunctors ;
- descriptives ;
- determiners ;
- gerunds ;
- interjections ;
- nouns ;
- numerals ;
- particles ;
- pronouns ;
- propernouns-malenames-cyrillic ;
- propernouns-malesurnames-cyrillic ;
- propernouns-toponyms-Russian ; 2019-10-30 Cyrillic
- @U.Cap.Obl@ propernouns-toponyms-Komi ; toponyms - always uppercase
- @U.Cap.Opt@ propernouns-toponyms-Komi ; toponyms - allow downcasing for adj derivation
- propernouns ;
- quantifiers ;
- subjunctors ;
- verbs-A2M ;
- verbs-N2END ;
- VERBNEGATIVE ; affixes/verbs.lexc
- PRONOUN-TYPES ; in affixes/pronouns.lexc 2019-04-06
- Abbreviation ;
- Acronym ;
- kpv-Acronym ;
- Punctuation ;
- Symbols ;
- EXCEPTIONS ;
- dialect_lexicon ;
- urj-Cyrl-ProperNouns ; ! Testing 2015-09-06
- A_NEWWORDS ;
- A-Russian-like_NEWWORDS ;
- ADV_NEWWORDS ;
- N_NEWWORDS ;
- @U.Cap.Obl@ PROP_NEWWORDS ;
- @U.Cap.Opt@ PROP_NEWWORDS ;
- V_NEWWORDS ;
Lexica without morphology !
Absolute forms
ABS_
пу керка
выль керка
Compounding
R
Serial-Verbs
Lexica called End, whatever they are
ABBR-IS_ADV
ABBR-IS_N
Clitics
K
WordEnd
WordEnd-2
SPAT-COMPARATIVE
COMPARATIVE
SUBSTANDARDS
Endlex
Lexicon ENDLEX
And this is the ENDLEX of everything:
@D.CmpOnly.FALSE@@D.CmpPref.TRUE@@D.NeedNoun.ON@ # ;
The @D.CmpOnly.FALSE@ flag diacritic is used to disallow words tagged
with +CmpNP/Only to end here.
The @D.NeedNoun.ON@ flag diacritic is used to block illegal compounds.
This (part of) documentation was generated from src/fst/morphology/root.lexc