Wangkajunga language model documentation
All doc-comment documentation in one large file.
src-cg3-dependency.cg3.md
COMMON SÁMI DEPENDENCY GRAMMAR
This dependency file is shared by sma, sme, smj and sje.
DELIMITERS
Sentence delimiters are the following: <.> <!> <?> <…> <¶>
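As a minimal sketch of what the delimiter list implies, the following Python snippet groups a flat token list into sentences, closing each one at a delimiter. The function name `split_sentences` and the list-of-strings input are assumptions made for illustration; the grammar itself declares delimiters in CG3, not in Python.

```python
# Sentence delimiters as listed in the grammar: <.> <!> <?> <…> <¶>
DELIMITERS = {".", "!", "?", "…", "¶"}


def split_sentences(tokens):
    """Group a flat token list into sentences, closing each at a delimiter."""
    sentences, current = [], []
    for token in tokens:
        current.append(token)
        if token in DELIMITERS:
            sentences.append(current)
            current = []
    if current:  # trailing tokens without a final delimiter
        sentences.append(current)
    return sentences
```

Each returned sentence corresponds to one window in which the rules of the grammar would apply.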
TAGS AND SETS
N V A Adv CC CS Inf Sup Neg Num Po Pr
Pcle Prop
Pron IV TV COMMA DASH CITATION to keep colouring we add a “ HYPHEN QMARK PUNCT LEFT RIGHT CLB Ind Pot Impr ImprtII Cond ConNeg Caus causative eus VGen Interj ABBR ACR Prs Prt Cmpnd RCmpnd PrfPrc PrsPrc Actor Actio Ger Indef Nom Acc Ill Com Gen Ess
IM For fao
POS sub-categories
Syntactic tags and sets
Syntactic tags in input to this file
Syntactic tags added in this file
- @FMV : finite main verb
- oaidná: Son oaidná ollislaš gova. - She sees the whole picture
- infinite main verb
- @FAUX : finite auxiliary verb
- ferte: Son ferte oaidnit ollislaš gova. - She must see the whole picture.
- @FMVdic : finite main verb introducing direct speech
- @IMVdic : infinite main verb introducing direct speech
- @FS-IMV : infinite main verb of subclause
- @FS-IAUX : infinite auxiliary verb in subclause
- @FS-N<IAUX : infinite auxiliary verb of a relative subclause
- @FS-N<IMV : infinite main verb of a relative subclause
- @FS-OBJ : finite verb in subclause functioning as object
- @FS-OBJ> : finite verb in subclause functioning as object
- @FS-<OBJ : finite verb in subclause functioning as object
- @FS-SUBJ : finite verb in subclause functioning as subject
- @FS-SUBJ> : finite verb in subclause functioning as subject
- @FS-<SUBJ : finite verb in subclause functioning as subject
- @FS-ADVL> : finite verb in subclause functioning as adverbial to the left of the main clause
- @FS-<ADVL : finite verb in subclause functioning as adverbial to the right of the main clause
- @S< : a clause modifying a sentence to the right of it
- @FS-ADVL : finite verb in subclause …
- @-FS-<ADVL : infinite subclause - eus
- @-FS-ADVL> : infinite subclause - eus
- @FS-N< : relative clause to N
- @FS->N : relative clause to N to the left side of it - eus
- @FS-VFIN< : finite verb in sentence, statement
- eai: Idja ii leat šat, eai ge sii dárbbaš lámppá dahje beaivváža čuovgga, dasgo Hearrá Ipmil lea sin čuovga. - The night is not anymore, they do not need the lamp- or day- light either, because God the Lord is their light.
- @FS-<APP : finite subclause functioning as an apposition
- @ICL-ADVL : non-finite subclause …
- @ICL-AUX< : “right” argument of auxiliary (?)
- @ICL-OBJ : infinitival clause object
- @ICL-SUBJ : infinitival clause subject
- @ICL-P< : infinitival clause complement of preposition
- @IAUX : non-finite auxiliary
- : main verb. A temporary tag omitted at the end of the file.
- : auxiliary verb. A temporary tag omitted at the end of the file.
fao syntags
- @>V
kal syntags
- @INS :
- @<INS :
- @INS> :
eus syntags
- @FS-SPRED : finite verb in subclause functioning as a subject predicate - eus, but not sure if in use
Syntactic set definitions
Dep grammar
Correction rules
- muitalit
- XX
- XX
- XX
- faoSumId=Rel
The finite verb
Mapping rules
lgRemove removes the language tags
This (part of) documentation was generated from src/cg3/dependency.cg3
src-cg3-disambiguator.cg3.md
Start making a syntactic disambiguator
Sets
Sentence delimiters are the following: “<.>” “<…>” “<!>” “<?>” “<¶>”
Part-of-Speech
- N = noun
- A = adjective
- Num = numeral
- V = verb
- CC = conjunction
- CS = subjunction
- Adv = adverb
- Pr = preposition
- Po = postposition
- Pron = pronoun
- Interj = interjection
Number
- Sg = Singular
- Pl = Plural
- Sg1 = Singular 1.p.
- Sg2 = Singular 2.p.
- Sg3 = Singular 3.p.
- Pl1 = Plural 1.p.
- Pl2 = Plural 2.p.
- Pl3 = Plural 3.p.
Cases
- Nom
- Gen
- Acc
- Par
- Ine
- Ill
- Ela
- Ade
- Abe
- All
- Abl
- Ess
- Tra
- Ins
- Com
- SUBJ-CASE = Nom Par
Types
- Prop = Proper noun
- Interr = Interrogative
- Dem = demonstrative pron
- Rel = Relative pron
- Relpronpl = “mikkä” and “jokka”
- Relpronsg = “mikä” and “joka”
- Interrpronpl = “kuka” and “mikä”
- Pers = Personal pron
- Indef = Indef pron
- Inf = Infinitive
- ConNeg = Conjugated as Negative form
- PrfPrc = Perfect participle
- Imprt = Imperative
- Act = Active
- Neg = Negation verb
- COMMA = comma
- Foc/kaan = focus clitic -kaan
- Sem/Fem = feminine proper noun
Sets with more members
- WORD = all PoS
- NPMOD = these can modify a noun
- NPMODADV = NPMOD plus adverb
- NOT-NPMOD = these cannot modify a noun
- NOT-NPMODADV = these cannot modify a noun and are not adverbs
- QVANT-ADV = e.g. paljon, vähän
- KUNKA = e.g. kunka missä (adverbs that start a sentence)
- S-BOUNDARY = words that start a sentence
- VFIN = finite verb
- COPULAS = olla
- AUX = verbs which can be auxiliary
- SV-BOUNDARY = words that start a sentence and finite verb
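The relationship between these sets can be sketched with ordinary set algebra. In the Python snippet below, `WORD` is built from the part-of-speech tags listed earlier, while the membership of `NPMOD` is a hypothetical placeholder: the documentation only states that its members "can modify a noun", and the real definition lives in disambiguator.cg3.

```python
# Part-of-speech tags listed above; WORD is documented as "all PoS".
WORD = {"N", "A", "Num", "V", "CC", "CS", "Adv", "Pr", "Po", "Pron", "Interj"}

# Hypothetical membership, for illustration only.
NPMOD = {"A", "Num", "Pron"}

NPMODADV = NPMOD | {"Adv"}      # NPMOD plus adverb
NOT_NPMOD = WORD - NPMOD        # these cannot modify a noun
NOT_NPMODADV = WORD - NPMODADV  # cannot modify a noun and are not adverbs
```

Defining the complements by subtraction from `WORD` keeps them consistent whenever the base set changes.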
This (part of) documentation was generated from src/cg3/disambiguator.cg3
src-cg3-functions.cg3.md
S Y N T A C T I C F U N C T I O N S F O R S Á M I
Sámi language technology project 2003-2018, University of Tromsø
This file adds syntactic functions. It is common for all the Saami
LEFT RIGHT because of apertium
- Sets for POS sub-categories
- Sets for Semantic tags
- Sets for Morphosyntactic properties
Syntactic tags
- @+FAUXV : finite auxiliary verb
- ferte: Son ferte oaidnit ollislaš gova. - She must see the whole picture.
- @+FMAINV : finite main verb
- oaidná: Son oaidná ollislaš gova. - She sees the whole picture
- @-FAUXV : infinite auxiliary verb
- sáhte: In sáhte gáhku borrat. - I cannot eat cake.
- @-FMAINV : infinite main verb
- oaidnit: Son ferte oaidnit ollislaš gova. - She must see the whole picture.
- @-FSUBJ> : Subject of infinite verb outside the verbal.
- mu: Diet dáhpáhuvai mu dieđikeahttá. - It happened without me knowing about it.
- @-F<OBJ : Subject of infinite verb outside the verbal.
- nuppi: Ulbmil lea oažžut nuppi boagustit. - The goal is to get the other one to laugh.
- @-FOBJ> : Object of infinite verb outside the verbal.
- váldovuoittuid: Sii vurde váldovuoittuid fasket. - They waited to grab the main prizes.
- @SPRED<OBJ : Object of a subject predicative. (Some adjectives are transitive.)
- guliid: Mánát leat oažžulat guliid.
- @-FADVL : Adverbial complement of infinite verb outside the verbal.
- várrogasat: Dihkkadeaddji rávve skohtervuddjiid várrogasat mátkkoštit. - The roadman warns snowscooter drivers to drive carefully.
- @-F<PRED : Predicative complement of infinite verb outside the verbal.
- ággan: Jáhkken kulturmáhtu leat oktan ággan.
- @>ADVL : Modifier of an adverbial to the right.
- vaikko: doppe leat vaikko man ollu studeanttat.
- @ADVL< : Complement for adverbial.
- vahkus: Son málesta guktii vahkus.
- @<ADVL : Adverbial after the main verb.
- dás: Eanet dieđuid gávnnat dás.
- @ADVL> : Adverbial to the left of the main verb
- viimmat: Dál de viimmat asttan lohkat reivve.
- @ADVL>CS : Adverbial modifying subjunction.
- ‘beare’ pointing at ‘danin go’: Muhto dus ii leat riekti dearpat su beare danin go sáhtát.
- : Habitive, specifying an adverbial, e.g. @ADVL>
- Máhtes: Máhtes lea beana.
- : Existential, specifying a subject, e.g. @<SUBJ
- beana: Máhtes lea beana.
- : logophoric pronouns, e.g. @>N (for MT)
- :
- @>N : Modifier of a noun to the right.
- geavatlaš: Ráđđehussii lea geavatlaš politihkka deaŧalaš. - For the government, practical politics is important.
- @N< : Complement of noun to the left.
- vihtta: Mun boađán diibmu vihtta.
- @>A : Modifier of an adjective to the right.
- juohke: Seminára lágiduvvo juohke nuppi jagi.
- @P< : Complement of preposition.
- soađi: Dat dáhpáhuvai maŋŋel soađi.
- @>P : Complement of postposition.
- riegádeami: Seta riegádeami maŋŋel Áttán elii vel 800 jagi.
- @HNOUN : Stray noun in sentence fragment.
- muittut: Fidnokurssa muittut.
- @INTERJ : Interjection.
- Hei: Hei, boađe!
- @>Num : Attribute of numeral to the right.
- dušše: Mun ledjen dušše guokte mánu doppe.
- @Pron< : Complement of pronoun to the left.
- Birehiin: Moai Birehiin leimme doppe.
- @>Pron : Modifier of pronoun to the right.
- vaikko: Olmmoš sáhttá bargat vaikko maid.
- @Num< : Complement of numeral to the left.
- girjjiin: Dat lea okta min buoremus girjjiin.
- @OBJ : Object, the verb is not in the sentence (ellipse)
- @<OBJ : Object, the verb is to the left.
- gávtti: Son goarru gávtti.
- @OBJ> : Object, the verb is to the right.
- filmma: Dán filmma leat Kárášjoga nuorat oaidnán.
- @OPRED : Object predicative, the verb is not in the sentence (ellipse).
- @<OPRED : Object predicative, the verb is to the left.
- buriid: Son ráhkada gáhkuid hui buriid.
- @OPRED> : Object predicative, the verb is to the right.
- dohkkemeahttumin: Son oinnii dohkkemeahttumin bargat ášši nu.
- @PCLE : Particle.
- Amma: Amma mii eat leat máksán? - We have not paid, have we?
- @COMP-CS< : Complement of subjunction.
- vejolaš: Dat šaddá nu buorre go vejolaš.
- @SPRED : Subject predicative, the verb is not in the sentence (ellipse).
- @<SPRED : Subject predicative, the verb is to the left.
- árgabivttas: Ovdal lei gákti árgabivttas.
- @SPRED> : Subject predicative, the verb is to the right.
- álbmogin: Sápmelaččaid historjá álbmogin lea duháhiid jagiid boaris.
- @SUBJ : Subject, the finite verb is not in the sentence (ellipse).
- @<SUBJ : Subject, the finite verb is to the left.
- gákti: Ovdal lei gákti árgabivttas.
- @SUBJ> : Subject, the finite verb is to the right.
- Son: Son lea mu oabbá. - She is my sister.
- @PPRED : Predicative for predicative.
- @APP : Apposition
- @APP-N< : Apposition to noun to the left.
- oahpaheaddji: Oidnen Ánne, min oahpaheaddji.
- @APP-Pron< : Apposition to pronoun to the left.
- boazodoalloáirasat: Ja moai boazodoalloáirasat áigguime vaikko guovttá joatkit barggu.
- @APP>Pron : Apposition to pronoun to the right.
- @APP-Num< : Apposition to numeral to the left.
- @APP-ADVL< : Apposition to adverbial to the left.
- bearjadaga: Mun vuolggán ihttin, bearjadaga.
- @VOC : Vocative
- Miss Turner : Bures boahtin deike, Miss Turner! - Welcome here, Miss Turner!
- @CVP : Conjunction or subjunction that conjoins finite verb phrases.
- go : Leago guhkes áigi dassá go Máreha oidnet? - Is it a long time since you saw Máret?
- @CNP : Local conjunction or subjunction.
- vai : Leago nieida vai bárdni? - Is it a girl or a boy?
- @CMPND
- @X : The function is unknown, e.g. because the word is unknown
Tag sets
- Sets for verbs
- V is all readings with a V tag in them, REAL-V should be the ones without an N tag following the V. The REAL-V set thus awaits a fix to the preprocess V … N bug.
- The set COPULAS is for predicative constructions
- NP sets defined according to their morphosyntactic features
- The PRE-NP-HEAD family of sets
These sets model noun phrases (NPs). The idea is to first define whatever can occur in front of the head of the NP, and thereafter negate that with the expression WORD - premodifiers.
The set NOT-NPMOD is used to find barriers between NPs. Typical usage: … (*1 N BARRIER NOT-NPMOD) … meaning: Scan to the first noun, ignoring anything that can be part of the noun phrase of that noun (i.e., “scan to the next NP head”).
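The barrier scan described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the CG3 implementation: cohorts are simplified to plain sets of tags, the function name `scan_right` is invented here, the tag sets in the example are hypothetical, and the sketch assumes that the target is tested before the barrier on each cohort.

```python
def scan_right(cohorts, start, target, barrier):
    """Return the index of the first cohort at or after `start` that carries
    a tag from `target`, or None if a barrier cohort is met first.

    Assumption: on each cohort the target is tested before the barrier.
    """
    for i in range(start, len(cohorts)):
        tags = cohorts[i]
        if tags & target:
            return i
        if tags & barrier:
            return None
    return None


# Simplified, hypothetical tagging of
# "Ráđđehussii lea geavatlaš politihkka deaŧalaš."
cohorts = [{"N", "Ill"}, {"V"}, {"A"}, {"N", "Nom"}, {"A"}]
NOT_NPMOD = {"V", "CC", "CS"}  # hypothetical members

# Scanning from the cohort after "lea" skips the adjective and finds the noun.
np_head = scan_right(cohorts, 2, {"N"}, NOT_NPMOD)
```

Starting the same scan at the verb itself returns `None`, because the verb is a barrier and is reached before any noun.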
- Miscellaneous sets
- Border sets and their complements
ADLVCASE
- Syntactic sets
These were the set types.
Numeral outside the sentence
HABITIVE MAPPING
- hab1 hab aux leat
- hab_numo1 hab copula comma comma N+Nom
- hab_numo2 copula nu mo/go hab
- leahab copula nu mo/go hab
- hab2 hab auxv adv leat
- hab3 (@ADVL>) for asdf hab-actor and hab-case; if leat to the right, and Nom to the right of leat. Lots of restrictions.
- hab3 (@ADVL>) for asdf hab-actor and hab-case; if leat to the right, and Nom to the right of leat. Lots of restrictions.
- hab3 (@ADVL>) for asdf hab-actor and hab-case; if leat to the right, and Nom to the right of leat. Lots of restrictions.
- hab3 (@ADVL>) for hab-actor and hab-case; if leat to the right, and Nom to the right of leat. Lots of restrictions.
- hab_main (@ADVL>) for hab-actor and hab-case; if leat to the right, and Nom to the right of leat. Lots of restrictions.
- habInf hab lea inf
- habNomLeft Nom or Num + gen hab lea
- habAdvl Ii han ovttasge du sogas leat dat namma.
- hab4 hab cc hab leat
- hab6 lea go hab – leago hab
- hab7 lea go hab
- hab8 This is not HAB Ellii šattai hoahppu.
- hab5 This is not HAB Mánás gollot gieđat.
- hab9 prop ord-hab leat
- hab10 prop ord-hab leat
- habDain (@ADVL>) for (Pron Dem Pl Loc) if leat followed by Nom to the right
- habDain2
- habRel # before relative clause
- habEllipse Buot gánddain lea dreassa, nieiddain fas gákti.
- habGen (@<ADVL) hab for Gen; if Gen is located at the end of the sentence and Nom is sentence initial
- habGenQst (@<ADVL) hab for Gen; in a question sentence. Gen is located sentence initially and SUBJ is found to the right. To the right of SUBJ is copulas
- n<titel1 (@N<) for (“jr”) or (“sr”); if first one to the left is Prop
- n<titel2 (@N<) for INITIAL; if first one to the left is a noun, or if to the left of you is a single letter which is part of a noun conjunction bustávas e ja f gáibiduvvo
- n<:com (@N<) for (Sg Com); if first one to the left is Coll
- >nAttr (@>N) for Attr; if there is a noun to your right
- n>Indef (Pron Indef Attr); if eará is to the right
- n>Indef (Pron Indef Com); if eará is to the right
- >nNum (@>N) for numerals if; there is a noun to your right. You are not allowed to be (Sg Nom), (Sg Acc) or (Sem/Date)
- noun>n (@>N) for Gen; if there is a noun to your right. Restrictions: Not if you are: a time related word. Not if you are OKTA with Pl Loc to your right. Not if CC is to your right followed by another Gen and then Po. Not if you are HUMAN and to your right is Actio Nom followed by a noun.
- >nTime (@>N) for Gen TIME-N; if timenoun to your right. Restrictions: Not if you are a OKTA Nom with Pl Loc to your right. Not if CC followed by Gen, followed by Po to your right. Not if COMMA to your right
- >ntittel (@>N) for (Sg Nom TIME-N) or (Nom Der/NomAg); if to your right is Sem/Mal, Sem/Fem, Sem/Sur
- >nplc (@>N) for (Sg Nom Prop Sem/Plc), if to your right is Sem/Plc
- >nALU (@>N) for Sg Acc numerals; when a measure-noun to the right
- >NTime (@>N) for Gen; if you are TIME-N with BOC to your left, and PREGEN to your right
- n<:Refl (@N<) for (Refl Nom); if to the left is (N Nom), or if first one to the left is a finite mainverb with a (N Nom) to the left
- >pron1 (@>Pron) for GRADE-ADV, DUSSE, BUOT if; first one to the right is Pron
- >pron2 (@>Pron) for (Refl Nom) if; first one to the right is Refl
- >pron3 (@>Pron) for (Pron Recipr) if; first one to the right is (Pron Recipr)
- vaikko (@>Pron) for vaikko if; first one to the right is Indef
- vaikkoman (@>ADVL) for vaikko if; first one to the right is man
- dasmaŋŋel (@>ADVL) for vaikko if; first one to the right is man
- adv>advl (@>ADVL)
- adv>advl (@>ADVL)
- BOSvoc (@VOC) for HUMAN Nom; if sentence initial. To the right is comma. No nom-cased HUMAN followed by comma or CC is allowed to the right. There should not be a relative clause to the right, because then you are likely to be SUBJ
- voc (@VOC) for Nom HUMAN; if comma to the left and a second person verb or pronoun to the left. To the right is the end of the sentence
- Particle<subj (@PCLE)
- spred<obj (@SPRED<OBJ) for Acc; the object of an SPRED. Not to be mistaken with OPRED. If SPRED is to the left, and copulas is to the left of it. Nom or Hab are found sentence initially.
- Hab<subj (@<SUBJ) for Nom; if copulas, goallut or jápmit is FMAINV and habitive or human Loc is found to the left. OR: if Ill or @Pron< followed by HAB are found to the left.
- Hab<subj (@<SUBJ) with relative clause in between
- Hab>Advlcase<subj (@<SUBJ) for Nom; it allows adverbials with Ill/Loc/Com/Ess to be found inbetween HAB and .
- Nom>Advlcase<subj (@<SUBJ) for Nom; it allows adverbials with Ill/Loc/Com/Ess to be found inbetween Nom and @<SUBJ.
- <extSubj (@<SUBJ) for Nom; if copulas to the left, and some kind of adverb, N Loc, time related word or Po to the left of it. OR: if Ill or @Pron< to the left, followed by copulas and the before mentioned to the left of copulas.
- <extSubj (@<SUBJ) for sma Nom; if some kind of adverb to the left, N Loc, time related word or Po to the left of it.
- <extSubjA (@<SUBJ) for A - TEST WITHOUT THIS ONE
- <extSubj (@<SUBJ) for Nom; if leat to the left and sentenceboundary
- <extSubj (@<SUBJ) for Nom, but not for Pers. To the left boahtit or heaŋgát as MAINV, and further to the left is some kind of place related word, or time related word
- loc<extSubj (@<SUBJ) for Nom
- <spred (@<SPRED) for Nom; if Nom to the left, copulas to the left of Nom, and a time related word to the left of it.
- <extQst1 (@<SUBJ) for Nom; in an existential sentence. To your left is hab, some kind of place or time-word or Po. This is a Qst-sentence so the qst-pcle is attached to leat or following leat
- <extQst2 (@<SUBJ) for Nom; in an existential sentence. To your left is leat and it is sentence initial. No attributes or other words are allowed inbetween (because then you are SPRED), except the attribute muhtun, muhtin
- extQst3> (@SUBJ>) for Nom; if habitive first one to the left, followed by copulas.
- extQst3> (@SUBJ>) for Nom; if habitive first one to the left, followed by copulas.
- <extsubjcoor (@<SUBJ) for Nom. Coordination
- Sem/Year
- <spredQst (@<SPRED) for Nom; in a typical question sentence; You are not allowed to be Pers or human. The special part is that Nom is not allowed to your right
- <spredQst2 (@<SPRED) for (A Nom); in a typical question sentence; You are SPRED if (N Nom) is to your left and leat + qst is to the left
- <spredQst3 (@<SPRED) for (A Nom); you are SPRED when you are (A Nom) and to your right is (N Nom). This is a Qst-sentence, so copulas is found to your left
- <spredQst4 (@<SPRED) for Nom; but only in a qst-sentence where there is no chance of you being the subj
- <NomBeforeSpred (@<SPRED) for (A Nom) if; Nom to the left, and copulas is to the left of Nom. There is no Nom allowed to the right of copulas! To avoid messing with coordination: ja, dahje and comma are not allowed to your left. Comma is not allowed to your right; if so then you are likely to be coordinated
- <spred (@<SPRED) for A Nom or N Nom if; the subject Nom is on the same side of copulas as you: on the right side of copulas
- <spredVeara (@<SPRED) for veara + Nom; if genitive immediately to the right, and intransitive mainverb to the right of genitive
- leftCop<spred (@<SPRED) for Nom; if copulas is the main verb to the left, and there is no Ess found to the left of cop (note that Loc is allowed between target and cop). OR: if you are Coll or Sem/Group with copulas to your left.
- <spredLocEXPERIMENT (@<SPRED) for material Loc; if you are to the right of copulas, and the Nom to the left of copulas is not a hab-actor
- NumTime (@<SPRED) for A Nom
- <spredSg (@<SPRED) for Sg Nom
- <spredPg (@<SPRED) for Pl Nom
- <spred (@<SPRED) for Nom; if copulas to the left, and Nom or sentence boundary to the left of copulas. First one to the right is EOS.
- COP<spredEss (@<SPRED) for N Ess
- spredEss> (@SPRED>) for N Ess; if copulas to the right of you, and if an NP with nom-case first one to your left.
- GalleSpred> (@SPRED>) for Num Nom; if sentence initial
- spredSgMII> (@SPRED>)
- spredšaddat> (@SPRED>)
- r492> (@SPRED>) for Interr Gen; consisting only of negations. You are not allowed to be MII. You are not allowed to have an adjective or noun to your right. You are not allowed to have a verb to your right; the exception being an aux.
- AdjSpredSg> (@SPRED>) for A Sg Nom; if copulas to the right, but not if A or @<SPRED are found to the right of copulas
- Spred>SubjInf (@SPRED>) for Nom; if copulas to the right, and the subject of copulas is an Inf to the right
- spredCoord (@<SPRED) coordination for Nom; only if there already is a SPRED to the left of CNP. Not if there is some kind of comparison involved.
- subj>Sgnr1 (@SUBJ>) for Nom Sg, including Indef Nom if; VFIN + Sg3 or Pl3 to the right (VFIN not allowed to the left)
- subj>Du (@SUBJ>) for dual nominatives, including Coll Nom. VFIN + Du3 to the right.
- subj>Pl (@SUBJ>) for plural nominatives, including Coll and Sem/Group. VFIN + Pl3 to the right.
- subj>Pl (@SUBJ>) for plural nominatives
- subj>Sg (@SUBJ>) for Nom Sg; if VFIN + Sg3 to the right.
- Sg<subj (@<SUBJ) for Nom Sg; if VFIN Sg3 or Du2 to the left (no HAB allowed to the left).
- Du<subj (@<SUBJ) for Nom Coll if; a dual third person verb is found to the left
- PlDu<subj (@<SUBJ) for (N Nom Pl), (Sem/Group Nom), (Coll Nom), (Pron Nom Pl) if; a verb is Pl3 or Du3 to your left. The verb is not allowed to be copulas with a place, Loc or time noun to its left
- copPl3<subj (@<SUBJ) for Nom Pl; you don’t need to be a noun, only Nom Pl. To the left is copulas and first one to the right is @<SPRED
- -fsubj> (@-FSUBJ>) for HUMAN Gen; in a NP-clause. To your right is Actio Nom followed by a noun
- f<advl (@-F<ADVL) for infinite adverbials
- f<advl (@-F<ADVL) for infinite adverbials
- s-boundary=advl> (@ADVL>) for ADVL that resemble s-boundaries. Mainverb to the right.
- diibmuadvl> (@ADVL>) for (diibmu Nom) if first one to the right is Num
- -fsubj (@-FSUBJ>) for HUMAN Acc after DADJAT verbs
- -fobj> (@-FOBJ>) for Acc if in front of abessive, gerundium, actio locative, perfectum participle or infinitive. First one to the right not allowed to be Acc though
- -fobj> (@-FOBJ>) for Acc if human with ADVL-case to the left and transitive infinitive OBJ to the right. First one to the right not allowed to be Acc though
- advl>mainV (@ADVL>) if; finite mainverb not found to the left, but the finite mainverb is found to the right.
- V<advl (@<ADVL) if; finite mainverb found to the left. Not if a comma is found immediately to the left and a finite mainverb is located somewhere to the right of this comma.
- advl>v (@ADVL>) if; you are ADVL, time-noun or Sem/Route and there is a finite verb to the right in the clause, or if to your right is: de followed by a finite verb. OR: if you are a time-noun and to your right is: go or sentenceboundary followed by a finite verb
- <advlPoPr (@<ADVL) for Po or Pr; if mainverb to the left.
- advlPoPr> (@<ADVL) for Po or Pr; if mainverb to the right.
- BOSPo> (@ADVL>) for Po; if trapped between BOS to the right and S-BOUNDARY OR COMMA to the left, because the main verb will then automatically be on your right side.
- <advlComIll (@<ADVL) only if; you are Com OR Ill. To your left is a mainverb, and to your right a sentenceboundary, because we don’t want there to be another mainverb you potentially could belong to
- <advlEOS (@<ADVL) for Po or Pr or Loc; if you are found at the very end of a sentence. A mainverb is needed to the left though.
- <advlGen (@<ADVL) for (N Gen) if mainverb to the left and no noun to the right
- <opredgohcodit (@<OPRED) for Ess
- advlEss> (@<ADVL) for weather and time Ess, if FMAINV to the left.
- comma<advlEOS (@<ADVL) for Adv if; mainverb is to the left. Comma to the left and mainverb to the right in the same clause is not allowed
- advl>inbetween (@ADVL>) for Adv; if inbetween two sentenceboundaries where no mainverb is present.
- comma<advlEOS (@<ADVL) for Adv if; comma found to the left and the finite mainverb to the left of comma. To the right is the end of the sentence.
- BOSadvl> (@ADVL>) if; you are N Loc or N Ill and found sentence initially and there is a main verb somewhere to the right. No barrier for the mainverb; based on the thought that first one to your right is probably a sentenceboundary.
- cleanupILL<advl (@<ADVL) for N Ill if; there are no boundarysymbols to your left, if you aren’t already @N< OR @APP-N<, and no mainverb is to your left.
- cleanupPo (@ADVL) for Po: This rule tags all Po:s as ADVL if they haven’t gotten a tag somewhere along the way.
- cleanupPr (@ADVL) for Pr: This rule tags all Pr:s as ADVL if they haven’t gotten a tag somewhere along the way.
- -fsubj>asAcc (@-FSUBJ>) for HUMAN Acc; if there is a verb @-F<OBJ to your left
- -f<obj (@-F<OBJ) for Acc if there is a transitive verb + SYN-V to your left
- -fsubj>IV (@-FSUBJ>) for Acc; if there is an IV-verb acting as a @-F<OBJ to your right
- -fsubj>IV (@-FSUBJ>) for Acc; if there is a TV-verb acting as a @-F<OBJ to your right followed by an Acc
- -fsubj>asGen (@-FSUBJ>) for Gen;
- f<subj (@-F<SUBJ) for Nom if; (V @-F<OBJ) to the left.
- <opredAAcc (@<OPRED) for A Acc; if another accusative to the left, and a transitive verb to the left of it. OR: if a transitive verb to the left, and an accusative to the left of it.
- TV<obj (@<OBJ) for Acc; if there is a transitive mainverb to the left in the clause. Not for Rel. Not if you are a numeral followed by a measure-noun
sma object
- <advlMeasr (@<ADVL) for (Num Acc); if finite IV-mainverb to the left, measure-noun to the right
- <objMeasr (@<OBJ) for Num Acc; if finite TV-mainverb to the left, measure-noun to the right
- <advlMeasr2 (@<ADVL) for MEASR-N + Acc; if (Num Pl) to the left and mainverb to the left of it
- advlMeasr> (@ADVL>) for Num Acc;
- Obj> (@OBJ>) for Acc; if there is a finite mainverb to the right in the clause. A really simple rule with no other restrictions.
- s-boun<obj (@<OBJ) for Acc; if sentenceboundary to your left and a transitive mainverb further to the left
- <objIV (@<OBJ) for Acc; if there is an intransitive mainverb in the clause. Not for Rel or Num. Not if you are a numeral followed by a measure-noun
- <advlEss (@<ADVL) for ESS-ADVL if; FMAINV to the left
- IV<spredEss (@<SPRED) for N Ess if; FMAINV to the left is intransitive or bargat
- <opredEss (@<OPRED) for (N Ess), (A Ess) if; transitive mainverb to the left in the clause. If accusative to the left or to the right, or if Inf or ahte to the right, or if there is a noun to the right followed by an Inf
- Acc<opredEss (@<OPRED) for (N Ess), (A Ess) if; transitive mainverb to the left in the clause, and an accusative cased Rel left to the verb
- onlyV<opred (@<OPRED) for (N Ess) if; there is a transitive mainverb to the left. Usually there needs to be an Acc to the left, but here it is not needed
- onlyV<opred2 (@<OPRED) for (N Ess) if;
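Many of the rules above choose between a left-pointing and a right-pointing tag depending on which side the finite main verb is found. The Python sketch below illustrates that idea for the advl>mainV and V<advl rules only. It is a deliberate simplification: the function name and the boolean-list input are assumptions, and the comma exception of V<advl, clause boundaries and all other restrictions are ignored.

```python
def advl_direction(is_finite_main_verb, position):
    """Pick an adverbial tag from the side on which a finite main verb lies.

    `is_finite_main_verb` holds one boolean per word in the sentence;
    `position` is the index of the adverbial being tagged.
    """
    left = any(is_finite_main_verb[:position])
    right = any(is_finite_main_verb[position + 1:])
    if left:
        return "@<ADVL"  # V<advl: finite main verb found to the left
    if right:
        return "@ADVL>"  # advl>mainV: none to the left, one to the right
    return None          # no finite main verb found; left to other rules


# "Eanet dieđuid gávnnat dás." with gávnnat as the finite main verb
tag_das = advl_direction([False, False, True, False], 3)

# "Dál de viimmat asttan lohkat reivve." assuming asttan is the finite main verb
tag_viimmat = advl_direction([False, False, False, True, False, False], 2)
```

The two calls reproduce the tags given for dás and viimmat in the tag list earlier in this file.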
SUBJ MAPPING - leftovers
- subj>ifV (@SUBJ>) for NP-HEAD-NOM, DUPRON or (Num Nom) if; a finite mainverb is found to the right. This is a cleanup rule for subjects
- hnoun>ifV (@SUBJ>) for NP-HEAD-NOM, DUPRON if. The counterpart of subj>ifV. You are HNOUN if there is a finite verb to your right, but NOT if there is a finite verb after a relative clause
OBJ MAPPING - leftovers
MAPPING for MT - experimental
HNOUN MAPPING
- @<ADVLcoor (@<ADVL) for ADVLCASEAdv if @CNP to the left and ADVL to the left of it
missingX adds @X to all missings
therestX adds @X to everything that is left, often erroneously disambiguated forms
For Apertium:
The analysis gives double analyses because of optional semtags. We go for the one with semtag.
This (part of) documentation was generated from src/cg3/functions.cg3
src-fst-morphology-affixes-adjectives.lexc.md
Adjective inflection
The Wangkajunga language adjectives compare.
This (part of) documentation was generated from src/fst/morphology/affixes/adjectives.lexc
src-fst-morphology-affixes-nouns.lexc.md
Noun inflection
The Wangkajunga language nouns inflect in cases.
temporal and spatial nouns - have a limited set of specific case endings, and do not have pronoun clitics
demonstrative lexicons
This (part of) documentation was generated from src/fst/morphology/affixes/nouns.lexc
src-fst-morphology-affixes-prefixes.lexc.md
Prefixes
Prefixes in the Wangkajunga language are bound to the beginning of other words.
This (part of) documentation was generated from src/fst/morphology/affixes/prefixes.lexc
src-fst-morphology-affixes-propernouns.lexc.md
Proper noun inflection
The Wangkajunga language proper nouns inflect in the same cases as regular nouns, but with a colon (‘:’) as separator.
This (part of) documentation was generated from src/fst/morphology/affixes/propernouns.lexc
src-fst-morphology-affixes-symbols.lexc.md
Symbol affixes
This (part of) documentation was generated from src/fst/morphology/affixes/symbols.lexc
src-fst-morphology-affixes-verbs.lexc.md
Verb inflection
The Wangkajunga language verbs inflect in persons.
lexicon Verb_prefixes (above) -> lexicon Verb_stems (separate file) -> following lexicons, depending on relevant conjugation:
is positioning of +V here okay? or better with separate lexicon / before prefixes?
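The lexicon chain described above (prefixes, then stems, then conjugation-specific lexicons) can be illustrated with a small Python sketch of continuation lexicons. Only the names `Verb_prefixes` and `Verb_stems` and the prefix maa come from this documentation; the lexicon names `Conj_A` and `Conj_B` and every stem and ending are hypothetical placeholders, not Wangkajunga data.

```python
# Each lexicon is a list of (form, continuation) pairs; None ends a word.
# Conj_A, Conj_B and all stems and endings are placeholders.
LEXICONS = {
    "Verb_prefixes": [("", "Verb_stems"), ("maa", "Verb_stems")],
    "Verb_stems": [("STEM1", "Conj_A"), ("STEM2", "Conj_B")],
    "Conj_A": [("-x", None)],
    "Conj_B": [("-y", None), ("-z", None)],
}


def generate(lexicon="Verb_prefixes", prefix=""):
    """Yield every word form reachable from `lexicon` via continuations."""
    for form, continuation in LEXICONS[lexicon]:
        word = prefix + form
        if continuation is None:
            yield word
        else:
            yield from generate(continuation, word)


forms = list(generate())
```

Each stem names the conjugation lexicon it continues to, which is how a stem entry selects the endings relevant to its conjugation.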
This (part of) documentation was generated from src/fst/morphology/affixes/verbs.lexc
src-fst-morphology-clitics.lexc.md
Pronoun clitics (quite long)
This (part of) documentation was generated from src/fst/morphology/clitics.lexc
src-fst-morphology-phonology.twolc.md
The Wangkajunga morphophonological/twolc rules file
- pilyurr%>^P^A
- pilyurr%>pa
This (part of) documentation was generated from src/fst/morphology/phonology.twolc
src-fst-morphology-root.lexc.md
Documenting the Wangkajunga root.lexc file
This file documents the Wangkajunga root.lexc file.
Analysis symbols
The morphological analyses of wordforms for the Wangkajunga language are presented in this system in terms of the following symbols. (It is highly suggested to follow existing standards when adding new tags).
The parts-of-speech are:
- +N
- +A
- +Adv
- +V
- +Pron
- +CS
- +CC
- +Adp
- +Po
- +Pr
- +Interj
- +Pcle
- +Num
Transitivity:
- +IV Intransitive (i.e. with Abs)
- +TV Transitive (i.e. with Erg + Abs)
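The case frames named in the two transitivity tags can be written down as a small lookup table. This Python sketch is only an illustration; the names `CASE_FRAME` and `subject_case` are invented here and are not part of the lexc source.

```python
# Core argument cases per transitivity tag, as stated in the list above.
CASE_FRAME = {
    "+IV": ("+Abs",),         # intransitive: absolutive only
    "+TV": ("+Erg", "+Abs"),  # transitive: ergative plus absolutive
}


def subject_case(transitivity_tag):
    """Return the case of the subject: the first case in the frame."""
    return CASE_FRAME[transitivity_tag][0]
```

With this table, an intransitive verb takes an absolutive subject and a transitive verb an ergative one.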
nominal cases
- +Abs
- +Erg
- +Dat
- +Abl
- +Gen
- +Loc
- +Perl
- +All
- +Avoid
Derivational tags
- +Der/Foc = derivational tags
- +Der/SpatAbl
- +Der/SpatAll
- +Der/TempLoc
Other tags
- +Inch NB from the reference book, inchoative is used as verbalisation
pronoun clitics
- +Pron/Clt
- +1Sg +2Sg +3Sg
- +1Du +2Du +3Du
- +1Pl +2Pl +3Pl
- +Incl +Excl
- +Acs +Refl Acs = Accessory = locative or allative
- +Subj Subj = abs with intransitive verb, erg with transitive verb
other cases are declared elsewhere - Dat, Abs, Abl.
Verb affixes
tense inflections
- +Prs Present Tense
- +Perf +Imprt +Pst +PstNar +Fut
imperfect tense inflections for Imperfective: Past, Past Habitual, Future, Imperative
- +Imperf +PstHbt
irrealis tense inflections ! TODO: work on tags. Irrealis/Admon? But two separate morphophonemes
- +Irr = Irrealis
- +Admon = Admonitive
- +Int = Intentive
- +Unr = Unrealised
- +Purp = Purposive
- +Oblig = Obligative
- +Hyp = Hypothetical
- +Char = Characteristic (*payi may behave differently - nominalisation?)
affixes following from irrealis inflections
- +Contr Contradictive
- +Avoid Avoidance
serial and nominalised inflections
- +Ser Serial
- +Nomz Nominaliser
verb derivation affixes
- +Act -ti nominal -> IT verb. changes position/stance meaning to action.
- +Caus/Make -ma nominal -> T verb. (particularly for attributes)
- +Caus/PutTo -ju nominal -> T verb.
temporal relative affix
- +Trel
verb directional affixes
- +Directional/towards ni (suffix / infix)
- Directional/away+ maa (prefix)
- Directional/around+ parra (prefix)
verb post-inflection affixes
- +Grp = group (different to GROUP as derivational nominal suffix)
- +Compl = completed action
- +Warn = warning
verb compounds
- +Compound/put = jurra
- +Compound/hit = puwa
- +Compound/have = kanyila
- +Compound/carry = kati
- +Compound/go = yarra
- +Compound/lie = ngarrin
- +Compound/get_up = pakala
- +Compound/mouth_action = jarra
Reduplication
- +Redpl Redpl+
Clitics
- ^P^A +Pa =
- +Clt/Foc =
- +Clt/Prob = kirli
- +Clt/contrary_to_expectation lka
- +Clt/really =
- +Clt/Cert = ngulyu
- +Clt/Rep = nyu
- +Clt/Dub = pa
- +Clt/Emph = kaja, rtuka, rtu
- +Clt/while = kaji
- +Clt/when = la
- +Clt/then = yila, lta
- +Voc = voc
Demonstrative affixes #TODO: add more meaning to tag names?
- +SentMod
- +Dem/ngula +Dem/pa only with yangka. In book +Rel +Pa
- +Dem/na +Dem/janu in book +Foc +Abl
- +Dem/janulu only with palunya
Flag diacritics for verb conjugations
- @U.CONJ.Ø@
- @U.CONJ.WA@
- @U.CONJ.RRA@
- @U.CONJ.LA@
Flag diacritics for noun cases
DCASE = derivational case
- @U.DCASE.HAV@
- @U.DCASE.THING@
- @U.DCASE.PRIV@
- @U.DCASE.INT@
- @U.DCASE.WANT@
- @U.DCASE.ASST@
- @U.DCASE.TEMP@
- @U.DCASE.DWELL@
- @U.DCASE.SIDE@
- @U.DCASE.TYPE@
- @U.DCASE.SIM@
- @U.DCASE.CONTR@
- @U.DCASE.MOD@
- @U.DCASE.BIG@
- @U.DCASE.ANOTH@
- @U.DCASE.VERY@
- @U.DCASE.NUM@
- @U.DCASE.DUAL@
- @U.DCASE.FEW@
- @U.DCASE.PL@
- @U.DCASE.GRP@
- @U.DCASE.PAIR@
- @U.DCASE.ONLY@
- @U.DCASE.FOC@
corresponding D-flags
- @D.DCASE.HAV@
- @D.DCASE.THING@
- @D.DCASE.PRIV@
- @D.DCASE.INT@
- @D.DCASE.WANT@
- @D.DCASE.ASST@
- @D.DCASE.TEMP@
- @D.DCASE.DWELL@
- @D.DCASE.SIDE@
- @D.DCASE.TYPE@
- @D.DCASE.SIM@
- @D.DCASE.CONTR@
- @D.DCASE.MOD@
- @D.DCASE.BIG@
- @D.DCASE.ANOTH@
- @D.DCASE.VERY@
- @D.DCASE.NUM@
- @D.DCASE.DUAL@
- @D.DCASE.FEW@
- @D.DCASE.PL@
- @D.DCASE.GRP@
- @D.DCASE.PAIR@
- @D.DCASE.ONLY@
- @D.DCASE.FOC@
SCASE = semantic case
- @U.SCASE.ABL@
- @U.SCASE.GEN@
- @U.SCASE.LOC@
- @U.SCASE.PERL@
- @U.SCASE.ALL@
corresponding D-flags
- @D.SCASE.ABL@
- @D.SCASE.GEN@
- @D.SCASE.LOC@
- @D.SCASE.PERL@
- @D.SCASE.ALL@
Flag diacritics for clitics (to ensure the same clitic does not appear twice on a single word)
CLT = clitic
- @U.CLT.FOC@
- @U.CLT.KIRLI@
- @U.CLT.LKA@
- @U.CLT.YILTA@
- @U.CLT.CERT@
- @U.CLT.REP@
- @U.CLT.DUB@
- @U.CLT.EMPH@
- @U.CLT.KAJI@
- @U.CLT.LA@
- @U.CLT.LTA@
- @U.CLT.YILA@
corresponding D-flags
- @D.CLT.FOC@
- @D.CLT.KIRLI@
- @D.CLT.LKA@
- @D.CLT.YILTA@
- @D.CLT.CERT@
- @D.CLT.REP@
- @D.CLT.DUB@
- @D.CLT.EMPH@
- @D.CLT.KAJI@
- @D.CLT.LA@
- @D.CLT.LTA@
- @D.CLT.YILA@
Flag diacritics for pronoun clitics (to ensure the same case is not used twice within a cluster).
CLCASE = pronoun clitic case TODO: consider changing name to PCCASE
- @U.CLCASE.S@
- @U.CLCASE.ABS@
- @U.CLCASE.DAT@
- @U.CLCASE.ACS@
- @U.CLCASE.ABL@
- @U.CLCASE.REFL@
corresponding D-flags
- @D.CLCASE.S@
- @D.CLCASE.ABS@
- @D.CLCASE.DAT@
- @D.CLCASE.ACS@
- @D.CLCASE.ABL@
- @D.CLCASE.REFL@
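The pairing of U-flags with D-flags can be illustrated with a small simulator. This is a sketch of the general flag diacritic semantics only: `@U.F.V@` sets feature `F` to `V` if it is unset and otherwise requires the same value, while `@D.F.V@` blocks the path if `F` already equals `V`. The order in which the lexicon actually places these flags is an assumption here, and the name `path_is_valid` is hypothetical.

```python
import re

# Matches @U.FEATURE.VALUE@ and @D.FEATURE.VALUE@ (the value is optional for D).
FLAG = re.compile(r"@([UD])\.([^.@]+)(?:\.([^.@]+))?@")


def path_is_valid(flags):
    """Return True if the sequence of U/D flag diacritics can be traversed."""
    state = {}  # feature name -> current value
    for flag in flags:
        match = FLAG.fullmatch(flag)
        if match is None:
            raise ValueError(f"unsupported flag diacritic: {flag!r}")
        op, feature, value = match.groups()
        current = state.get(feature)
        if op == "U":
            if value is None:
                raise ValueError(f"U flag needs a value: {flag!r}")
            if current is None:
                state[feature] = value      # unset: unify by setting
            elif current != value:
                return False                # set to something else: blocked
        else:  # op == "D"
            if value is None:
                if current is not None:
                    return False            # @D.F@: blocked if F has any value
            elif current == value:
                return False                # @D.F.V@: blocked if F == V
    return True


# Assumed ordering: each clitic checks its D-flag, then sets its U-flag.
one_clitic = ["@D.CLT.FOC@", "@U.CLT.FOC@"]
print(path_is_valid(one_clitic))        # a single focus clitic
print(path_is_valid(one_clitic * 2))    # the same clitic attached twice
```

With this ordering, the second occurrence of the same clitic fails at its D-flag, which is the "not twice on a single word" behaviour described for the clitic flags.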
integrate the things to come:
Here are the tags from the template. These and the ones above should be merged.
The parts of speech are further split up into:
- +Prop +Pers +Dem +Interr +Refl +Recipr +Rel +Indef +Temp +Spat
The usage extents are marked using the following tags:
- +Err/Orth
- +Use/-Spell
Nominals are inflected for the following number and case:
- +Sg +Du +Pl
- +Ess +Nom +Gen +Acc +Ill +Loc +Com +Com/Sh
Possession is marked as follows:
- +PxSg1 +PxSg2 +PxSg3 +PxDu1 +PxDu2 +PxDu3 +PxPl1 +PxPl2 +PxPl3
The comparative forms are:
- +Comp +Superl
Numerals are classified under:
- +Attr +Card
- +Ord
Verb moods are:
- +Ind +Prs +Prt +Cond +Imprt
Other verb forms are:
- +Inf +Ger +ConNeg +Neg +PrsPrc +PrfPrc +Sup +VGen +VAbess
Abbreviated words are classified with:
- +ABBR +ACR
- +Symbol = independent symbols in the text stream, like £, €, ©
Special symbols are classified with:
- +CLB +PUNCT +LEFT +RIGHT +MIDDLE
The verbs are syntactically split according to transitivity:
- +TV +IV
Special multiword units are analysed with:
- +Multi
Non-dictionary words can be recognised with:
- +Guess
Question and Focus particles:
- +Qst +Foc
Semantics are classified with:
- +Sem/Spat
- +Sem/Temp
- +Sem/Mal
- +Sem/Fem
- +Sem/Sur
- +Sem/Plc
- +Sem/Org
- +Sem/Obj
- +Sem/Ani
- +Sem/Hum
- +Sem/Plant
- +Sem/Group
- +Sem/Time
- +Sem/Txt
- +Sem/Route
- +Sem/Measr
- +Sem/Wthr
- +Sem/Build
- +Sem/Edu
- +Sem/Veh
- +Sem/Clth
Derivations are classified under the morphophonetic form of the suffix, the
source and target part-of-speech.
Morphophonology
Flag diacritics
We have manually optimised the structure of our lexicon using the following flag diacritics to restrict morphological combinatorics: compounds with verbs are only allowed if the verb is further derived into a noun again.
| Flag | Explanation |
| --- | --- |
| @P.NeedNoun.ON@ | (Dis)allow compounds with verbs unless nominalised |
| @D.NeedNoun.ON@ | (Dis)allow compounds with verbs unless nominalised |
| @C.NeedNoun@ | (Dis)allow compounds with verbs unless nominalised |
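How the three `NeedNoun` flags could cooperate is sketched below in Python. The placement is an assumption, not taken from the lexicon source: the verb in a compound sets `@P.NeedNoun.ON@`, a nominalising derivation clears it with `@C.NeedNoun@`, and the end of the word checks `@D.NeedNoun.ON@`. The function name `accepts` and the example paths are hypothetical.

```python
def accepts(path):
    """Return True if a path of symbols passes the NeedNoun flag checks."""
    need_noun = None  # current value of the NeedNoun feature (None = unset)
    for symbol in path:
        if symbol == "@P.NeedNoun.ON@":
            need_noun = "ON"        # P: set the feature
        elif symbol == "@C.NeedNoun@":
            need_noun = None        # C: clear the feature
        elif symbol == "@D.NeedNoun.ON@":
            if need_noun == "ON":
                return False        # D: block if the feature is still ON
        # any other symbol is an ordinary morpheme or tag and is ignored
    return True


# Hypothetical paths: noun + verb compound, without and with nominalisation.
bare_verb = ["noun", "@P.NeedNoun.ON@", "verb", "@D.NeedNoun.ON@"]
nominalised = ["noun", "@P.NeedNoun.ON@", "verb", "+Nomz", "@C.NeedNoun@", "@D.NeedNoun.ON@"]

print(accepts(bare_verb))
print(accepts(nominalised))
```

Under this reading, the compound with a bare verb is rejected and the nominalised one is accepted.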
For languages that allow compounding, the following flag diacritics are needed to control position-based compounding restrictions for nominals. Their use is handled automatically if combined with +CmpN/xxx tags. If not used, they will do no harm.
| Flag | Explanation |
| --- | --- |
| @P.CmpFrst.FALSE@ | Require that words tagged as such only appear first |
| @D.CmpPref.TRUE@ | Block such words from entering ENDLEX |
| @P.CmpPref.FALSE@ | Block these words from making further compounds |
| @D.CmpLast.TRUE@ | Block such words from entering R |
| @D.CmpNone.TRUE@ | Combines with the next tag to prohibit compounding |
| @U.CmpNone.FALSE@ | Combines with the prev tag to prohibit compounding |
| @P.CmpOnly.TRUE@ | Sets a flag to indicate that the word has passed R |
| @D.CmpOnly.FALSE@ | Disallow words coming directly from root. |
Use the following flag diacritics to control downcasing of derived proper nouns (e.g. Finnish Pariisi -> pariisilainen). See e.g. North Sámi for how to use these flags. There exists a ready-made regex that will do the actual down-casing given the proper use of these flags.
| Flag | Explanation |
| --- | --- |
| @U.Cap.Obl@ | Allowing downcasing of derived names: deatnulasj. |
| @U.Cap.Opt@ | Allowing downcasing of derived names: deatnulasj. |
LEXICON Root
The word forms in the Wangkajunga language start from the lexeme roots of basic word classes, or optionally from prefixes:
This (part of) documentation was generated from src/fst/morphology/root.lexc
src-fst-morphology-stems-adjectives.lexc.md
Adjectives
Adjectives in the Wangkajunga language describe things.
This (part of) documentation was generated from src/fst/morphology/stems/adjectives.lexc
src-fst-morphology-stems-closed.lexc.md
Closed parts of speech
This file contains closed parts of speech. It might be split later on. Each POS gets first a lexicon for the tag, then a lexicon for the words pointing to the tag lexicon.
Interjections
Particles
Conjunctions
This (part of) documentation was generated from src/fst/morphology/stems/closed.lexc
src-fst-morphology-stems-nouns.lexc.md
Nouns
Nouns in the Wangkajunga language are things.
This (part of) documentation was generated from src/fst/morphology/stems/nouns.lexc
src-fst-morphology-stems-numerals.lexc.md
Numerals
Numerals in the Wangkajunga language are numbers.
This (part of) documentation was generated from src/fst/morphology/stems/numerals.lexc
src-fst-morphology-stems-pronouns.lexc.md
Pronouns
Pronouns in the Wangkajunga language are references to things.
This (part of) documentation was generated from src/fst/morphology/stems/pronouns.lexc
src-fst-morphology-stems-verbs.lexc.md
Verbs
Verbs in the Wangkajunga language are actions.
This (part of) documentation was generated from src/fst/morphology/stems/verbs.lexc
src-fst-phonetics-txt2ipa.xfscript.md
retroflex plosive, voiceless t` ʈ 0288, 648 (` = ASCII 096)
retroflex plosive, voiced d` ɖ 0256, 598
labiodental nasal F ɱ 0271, 625
retroflex nasal n` ɳ 0273, 627
palatal nasal J ɲ 0272, 626
velar nasal N ŋ 014B, 331
uvular nasal N\ ɴ 0274, 628
bilabial trill B\ ʙ 0299, 665
uvular trill R\ ʀ 0280, 640
alveolar tap 4 ɾ 027E, 638
retroflex flap r` ɽ 027D, 637
bilabial fricative, voiceless p\ ɸ 0278, 632
bilabial fricative, voiced B β 03B2, 946
dental fricative, voiceless T θ 03B8, 952
dental fricative, voiced D ð 00F0, 240
postalveolar fricative, voiceless S ʃ 0283, 643
postalveolar fricative, voiced Z ʒ 0292, 658
retroflex fricative, voiceless s` ʂ 0282, 642
retroflex fricative, voiced z` ʐ 0290, 656
palatal fricative, voiceless C ç 00E7, 231
palatal fricative, voiced j\ ʝ 029D, 669
velar fricative, voiced G ɣ 0263, 611
uvular fricative, voiceless X χ 03C7, 967
uvular fricative, voiced R ʁ 0281, 641
pharyngeal fricative, voiceless X\ ħ 0127, 295
pharyngeal fricative, voiced ?\ ʕ 0295, 661
glottal fricative, voiced h\ ɦ 0266, 614
alveolar lateral fricative, vl. K
alveolar lateral fricative, vd. K\
labiodental approximant P (or v\)
alveolar approximant r\
retroflex approximant r\`
velar approximant M\
retroflex lateral approximant l`
palatal lateral approximant L
velar lateral approximant L\
Clicks
bilabial O\ (O = capital letter)
dental |\
(post)alveolar !\
palatoalveolar =\
alveolar lateral |\|\
Ejectives, implosives
ejective _> e.g. ejective p p_>
implosive _< e.g. implosive b b_<
Vowels
close back unrounded M
close central unrounded 1
close central rounded }
lax i I
lax y Y
lax u U
close-mid front rounded 2
close-mid central unrounded @\
close-mid central rounded 8
close-mid back unrounded 7
schwa ə @
open-mid front unrounded E
open-mid front rounded 9
open-mid central unrounded 3
open-mid central rounded 3\
open-mid back unrounded V
open-mid back rounded O
ash (ae digraph) {
open schwa (turned a) 6
open front rounded &
open back unrounded A
open back rounded Q
Other symbols
voiceless labial-velar fricative W
voiced labial-palatal approx. H
voiceless epiglottal fricative H\
voiced epiglottal fricative <\
epiglottal plosive >\
alveolo-palatal fricative, vl. s\
alveolo-palatal fricative, voiced z\
alveolar lateral flap l\
simultaneous S and x x\
tie bar _
Suprasegmentals
primary stress "
secondary stress %
long :
half-long :\
extra-short _X
linking mark -\
Tones and word accents
level extra high _T
level high _H
level mid _M
level low _L
level extra low _B
downstep !
upstep ^ (caret, circumflex)
contour, rising _R
contour, falling _F
contour, high rising _H_T
contour, low rising _B_L
contour, rising-falling _R_F
(NB Instead of being written as diacritics with _, all prosodic
marks can alternatively be placed in a separate tier, set off
by < >, as recommended for the next two symbols.)
global rise <R>
global fall <F>
voiceless _0 (0 = figure), e.g. n_0
voiced _v
aspirated _h
more rounded _O (O = letter)
less rounded _c
advanced _+
retracted _-
centralized _"
syllabic = (or _=) e.g. n= (or n_=)
non-syllabic _^
rhoticity `
breathy voiced _t
creaky voiced _k
linguolabial _N
labialized _w
palatalized ' (or _j) e.g. t' (or t_j)
velarized _G
pharyngealized _?\
dental _d
apical _a
laminal _m
nasalized ~ (or _~) e.g. A~ (or A_~)
nasal release _n
lateral release _l
no audible release _}
velarized or pharyngealized _e
velarized l, alternatively 5
raised _r
lowered _o
advanced tongue root _A
retracted tongue root _q
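To illustrate how a table like this can drive a text-to-IPA conversion, here is a minimal Python sketch using a handful of consonant rows that list a Unicode code point. It is not the implementation in `txt2ipa.xfscript`. The names `XSAMPA_CODEPOINTS` and `xsampa_to_ipa` are hypothetical, and the retroflex symbols are written with a trailing backtick as in standard X-SAMPA.

```python
# X-SAMPA symbol -> Unicode code point (hex values as listed in the table).
XSAMPA_CODEPOINTS = {
    "t`": 0x0288,   # retroflex plosive, voiceless (decimal 648)
    "d`": 0x0256,   # retroflex plosive, voiced (decimal 598)
    "n`": 0x0273,   # retroflex nasal (decimal 627)
    "J": 0x0272,    # palatal nasal (decimal 626)
    "N": 0x014B,    # velar nasal (decimal 331)
    "N\\": 0x0274,  # uvular nasal (decimal 628)
    "S": 0x0283,    # postalveolar fricative, voiceless (decimal 643)
    "Z": 0x0292,    # postalveolar fricative, voiced (decimal 658)
    "T": 0x03B8,    # dental fricative, voiceless (decimal 952)
    "D": 0x00F0,    # dental fricative, voiced (decimal 240)
}

MAX_SYMBOL_LENGTH = max(len(symbol) for symbol in XSAMPA_CODEPOINTS)


def xsampa_to_ipa(text):
    """Convert X-SAMPA to IPA by greedy longest match; unknown characters pass through."""
    output = []
    i = 0
    while i < len(text):
        # Try the longest candidate first so that "N\" wins over "N".
        for size in range(MAX_SYMBOL_LENGTH, 0, -1):
            chunk = text[i:i + size]
            if chunk in XSAMPA_CODEPOINTS:
                output.append(chr(XSAMPA_CODEPOINTS[chunk]))
                i += len(chunk)
                break
        else:
            output.append(text[i])  # not in the table: copy unchanged
            i += 1
    return "".join(output)
```

Longest match matters because several X-SAMPA symbols are prefixes of others (`N` and `N\`, `t` and `` t` ``); a character-by-character lookup would pick the wrong sound.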
This (part of) documentation was generated from src/fst/phonetics/txt2ipa.xfscript
src-fst-transcriptions-transcriptor-abbrevs2text.lexc.md
We describe here how abbreviations in Wangkajunga are read out, e.g. for text-to-speech systems.
For example:
- s.:syntynyt # ;
- os.:omaa% sukua # ;
- v.:vuosi # ;
- v.:vuonna # ;
- esim.:esimerkki # ;
- esim.:esimerkiksi # ;
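Each example above is a lexc entry of the form `upper:lower Continuation ;`, where `%` escapes the following character (so `% ` is a literal space). The Python sketch below splits such an entry into its parts. It only covers this simple entry shape, and the function name `parse_lexc_entry` is hypothetical.

```python
def parse_lexc_entry(line):
    """Split 'upper:lower Continuation ;' into (upper, lower, continuation)."""
    upper, lower = [], None
    target = upper
    i, n = 0, len(line)
    while i < n and line[i].isspace():
        i += 1                          # skip leading indentation
    while i < n:
        ch = line[i]
        if ch == "%" and i + 1 < n:
            target.append(line[i + 1])  # escaped character is taken literally
            i += 2
            continue
        if ch == ":" and lower is None:
            lower = []                  # first unescaped colon starts the lower side
            target = lower
            i += 1
            continue
        if ch.isspace():
            break                       # unescaped whitespace ends the string pair
        target.append(ch)
        i += 1
    rest = line[i:].split()
    if len(rest) < 2 or rest[-1] != ";":
        raise ValueError(f"expected 'Continuation ;' after the pair: {line!r}")
    upper_text = "".join(upper)
    # Without a colon, both sides of the entry are identical.
    lower_text = upper_text if lower is None else "".join(lower)
    return upper_text, lower_text, rest[0]


print(parse_lexc_entry("os.:omaa% sukua # ;"))
```

For the `os.` entry this yields the abbreviation, the two-word expansion with its escaped space restored, and the continuation `#`.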
This (part of) documentation was generated from src/fst/transcriptions/transcriptor-abbrevs2text.lexc
src-fst-transcriptions-transcriptor-numbers-digit2text.lexc.md
- % komma% :, Root ;
- % tjuohkkis% :%. Root ;
- % kolon% :%: Root ;
- % sárggis% :%- Root ;
- % násti% :%* Root ;
This (part of) documentation was generated from src/fst/transcriptions/transcriptor-numbers-digit2text.lexc
tools-grammarcheckers-grammarchecker.cg3.md
[ L A N G U A G E ] G R A M M A R C H E C K E R
DELIMITERS
TAGS AND SETS
Tags
This section lists all the tags inherited from the fst and used as tags in the syntactic analysis. The next section, Sets, contains sets defined on the basis of the tags listed here; those set names are not visible in the output.
Beginning and end of sentence
BOS EOS
Parts of speech tags
N A Adv V Pron CS CC CC-CS Po Pr Pcle Num Interj ABBR ACR CLB LEFT RIGHT WEB PPUNCT PUNCT
COMMA ¶
Tags for POS sub-categories
Pers Dem Interr Indef Recipr Refl Rel Coll NomAg Prop Allegro Arab Romertall
Tags for morphosyntactic properties
Nom Acc Gen Ill Loc Com Ess Ess Sg Du Pl Cmp/SplitR Cmp/SgNom Cmp/SgGen Cmp/SgGen PxSg1 PxSg2 PxSg3 PxDu1 PxDu2 PxDu3 PxPl1 PxPl2 PxPl3 Px
Comp Superl Attr Ord Qst IV TV Prt Prs Ind Pot Cond Imprt ImprtII Sg1 Sg2 Sg3 Du1 Du2 Du3 Pl1 Pl2 Pl3 Inf ConNeg Neg PrfPrc VGen PrsPrc Ger Sup Actio VAbess
Err/Orth
Semantic tags
Sem/Act Sem/Ani Sem/Atr Sem/Body Sem/Clth Sem/Domain Sem/Feat-phys Sem/Fem Sem/Group Sem/Lang Sem/Mal Sem/Measr Sem/Money Sem/Obj Sem/Obj-el Sem/Org Sem/Perc-emo Sem/Plc Sem/Sign Sem/State-sick Sem/Sur Sem/Time Sem/Txt
HUMAN
PROP-ATTR PROP-SUR
TIME-N-SET
Syntactic tags
@+FAUXV @+FMAINV @-FAUXV @-FMAINV @-FSUBJ> @-F<OBJ @-FOBJ> @-FSPRED<OBJ @-F<ADVL @-FADVL> @-F<SPRED @-F<OPRED @-FSPRED> @-FOPRED> @>ADVL @ADVL< @<ADVL @ADVL> @ADVL @HAB> @<HAB @>N @Interj @N< @>A @P< @>P @HNOUN @INTERJ @>Num @Pron< @>Pron @Num< @OBJ @<OBJ @OBJ> @OPRED @<OPRED @OPRED> @PCLE @COMP-CS< @SPRED @<SPRED @SPRED> @SUBJ @<SUBJ @SUBJ> SUBJ SPRED OPRED @PPRED @APP @APP-N< @APP-Pron< @APP>Pron @APP-Num< @APP-ADVL< @VOC @CVP @CNP OBJ