K’iche’ language model documentation
All doc-comment documentation in one large file.
src-cg3-dependency.cg3.md
!!!C O M M O N S Á M I D E P E N D E N C Y G R A M M A R
This dep file is for sma, sme, smj, sje.
!!!DELIMITERS
Sentence delimiters are the following: <.> <!> <?> <…> <¶>
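As a rough illustration of what the delimiter list means for processing, the following Python sketch groups a flat token list into sentences, closing a sentence at each delimiter token. The function name and the token-list representation are assumptions made for this example only; the grammar itself declares delimiters in CG3 syntax.

```python
# Delimiter tokens as listed in the DELIMITERS section.
DELIMITERS = {".", "!", "?", "…", "¶"}


def split_sentences(tokens):
    """Group a flat token list into sentences, closing each at a delimiter."""
    sentences, current = [], []
    for token in tokens:
        current.append(token)
        if token in DELIMITERS:
            sentences.append(current)
            current = []
    if current:  # trailing tokens without a final delimiter
        sentences.append(current)
    return sentences


print(split_sentences(["Son", "oaidná", "gova", ".", "Hei", "!"]))
```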
!!!TAGS AND SETS
N V A Adv CC CS Inf Sup Neg Num Po Pr
Pcle Prop
Pron IV TV COMMA DASH CITATION to keep colouring we add a “ HYPHEN QMARK PUNCT LEFT RIGHT CLB Ind Pot Impr ImprtII Cond ConNeg Caus causative eus VGen Interj ABBR ACR Prs Prt Cmpnd RCmpnd PrfPrc PrsPrc Actor Actio Ger Indef Nom Acc Ill Com Gen Ess
IM For fao
!!POS sub-categories
!!Syntactic tags and sets
!Syntactic tags in input to this file
!Syntactic tags added in this file
- @FMV : finite main verb ** oaidná: Son oaidná ollislaš gova. - She sees the whole picture
- infinite main verb
- @FAUX : finite auxiliary verb ** ferte: Son ferte oaidnit ollislaš gova. - She must see the whole picture.
- @FMVdic : finite main verb introducing direct speech
- @IMVdic : infinite main verb introducing direct speech
- @FS-IMV : infinite main verb of subclause
- @FS-IAUX : infinite auxiliary verb in subclause
- @FS-N<IAUX : infinite auxiliary verb of a relative subclause
- @FS-N<IMV : infinite main verb of a relative subclause
- @FS-OBJ : finite verb in subclause functioning as object
- @FS-OBJ> : finite verb in subclause functioning as object
- @FS-<OBJ : finite verb in subclause functioning as object
- @FS-SUBJ : finite verb in subclause functioning as subject
- @FS-SUBJ> : finite verb in subclause functioning as subject
- @FS-<SUBJ : finite verb in subclause functioning as subject
- @FS-ADVL> : finite verb in subclause functioning as adverbial to the left of the main clause
- @FS-<ADVL : finite verb in subclause functioning as adverbial to the right of the main clause
- @S< : a clause modifying a sentence to the right of it
- @FS-ADVL : finite verb in subclause …
- @-FS-<ADVL : infinite subclause - eus
- @-FS-ADVL> : infinite subclause - eus
- @FS-N< : relative clause to N
- @FS->N : relative clause to N to the left side of it - eus
- @FS-VFIN< : finite verb in sentence, statement ** eai: Idja ii leat šat, eai ge sii dárbbaš lámppá dahje beaivváža čuovgga, dasgo Hearrá Ipmil lea sin čuovga. - The night is not anymore, they do not need the lamp- or day- light either, because God the Lord is their light.
- @FS-<APP : finite subclause functioning as an apposition
- @ICL-ADVL : non-finite subclause …
- @ICL-AUX< : “right” argument of auxiliary (?)
- @ICL-OBJ : infinitival clause object
- @ICL-SUBJ : infinitival clause subject
- @ICL-P< : infinitival clause complement of preposition
- @IAUX : non-finite auxiliary
- : main verb. A temporary tag, removed at the end of the file.
- : auxiliary verb. A temporary tag, removed at the end of the file.
!fao syntags
- @>V
!kal syntags
- @INS :
- @<INS :
- @INS> :
!eus syntags
- @FS-SPRED : finite verb in subclause functioning as a subject predicate - eus, but not sure if in use
!Syntactic set definitions
!!!Dep grammar
Correction rules
-
muitalit
-
XX
-
XX
-
XX
-
faoSumId=Rel
!!The finite verb
!!!Mapping rules
!! lgRemove removes the language tags
This (part of) documentation was generated from src/cg3/dependency.cg3
src-cg3-disambiguator.cg3.md
Faroese disambiguator
Usage, in lang-fao:
cat text.txt|hfst-tokenize -cg tools/tokenisers/tokeniser-disamb-gt-desc.pmhfst |vislcg3 -g src/cg3/disambiguator.cg3
This file documents the Faroese disambiguator file.
Delimiters, tags and sets
- LIST NAGD = Nom Acc Gen Dat ;
- LIST AGD = Acc Gen Dat ;
- LIST GENDER = Msc Fem Neu ;
- LIST NUMBER = Sg Pl ;
Test: Go for minimal weight. This rule gives priority to lexicalised forms.
- NumRom in beginning of sentence
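The "minimal weight" idea can be pictured with a small Python sketch. It assumes each reading arrives as an (analysis, weight) pair, which is a representation chosen for this example; the analysis strings below are hypothetical placeholders, not output of the Faroese analyser.

```python
def select_minimal_weight(readings):
    """Keep only the readings that share the lowest weight."""
    lowest = min(weight for _, weight in readings)
    return [(analysis, weight) for analysis, weight in readings if weight == lowest]


# Hypothetical readings: a lexicalised form with a low weight and a
# dynamically derived form with a higher weight.
readings = [("lexicalised+N+Sg+Nom", 0.0), ("derived+V+Der/N+Sg+Nom", 10.0)]
print(select_minimal_weight(readings))
```

Ties are kept on purpose, so later rules can still choose between equally weighted readings.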
MAPPING OF CC AND CS
Mostly we map both @CNP and @CVP, then we select @CNP, after that we remove them so @CVP remains
-
CCasCNPCVP Map (@CNP @CVP) to CC
-
killAllahtenotCS All occurrences of “at” are CSs.
-
Kill Sem/ID
-
killAllCNP removes all remaining @CNP
-
XCC-CS removes CC and CS with no synttag
-
ErrOrth goes for correct forms
-
X removes readings with no syntax
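The map, select, remove strategy described for CC and CS can be sketched in Python as follows. This is a simplified model, not the grammar's rules: the `is_local` flag is a hypothetical stand-in for the contextual conditions that trigger the @CNP selection, and the sketch assumes the usual constraint-grammar behaviour that a removal never deletes the last remaining reading.

```python
def map_conjunction():
    """Step 1: map both candidate tags onto the conjunction."""
    return {"@CNP", "@CVP"}


def select_cnp(tags):
    """Step 2: keep only @CNP where the context calls for it."""
    return {"@CNP"} if "@CNP" in tags else tags


def remove_cnp(tags):
    """Step 3: remove leftover @CNP, but never the last remaining tag."""
    remaining = tags - {"@CNP"}
    return remaining if remaining else tags


def tag_conjunction(is_local):
    """Run the three steps; `is_local` is a hypothetical context test."""
    tags = map_conjunction()
    if is_local:
        tags = select_cnp(tags)
    return remove_cnp(tags)


print(tag_conjunction(True))   # conjunction selected as @CNP
print(tag_conjunction(False))  # conjunction left with @CVP
```

The point of the design is that only the @CNP contexts need to be described explicitly; everything else falls through to @CVP.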
This (part of) documentation was generated from src/cg3/disambiguator.cg3
src-cg3-functions.cg3.md
S Y N T A C T I C F U N C T I O N S F O R S Á M I
Sámi language technology project 2003-2018, University of Tromsø
This file adds syntactic functions. It is common for all the Saami
LEFT RIGHT because of apertium
-
Sets for POS sub-categories
-
Sets for Semantic tags
-
Sets for Morphosyntactic properties
!!Syntactic tags
- @+FAUXV : finite auxiliary verb ** ferte: Son ferte oaidnit ollislaš gova. - She must see the whole picture.
- @+FMAINV : finite main verb ** oaidná: Son oaidná ollislaš gova. - She sees the whole picture
- @-FAUXV : infinite auxiliary verb ** sáhte: In sáhte gáhku borrat. - I cannot eat cake.
- @-FMAINV : infinite main verb ** oaidnit: Son ferte oaidnit ollislaš gova. - She must see the whole picture.
- @-FSUBJ> : Subject of infinite verb outside the verbal. ** mu: Diet dáhpáhuvai mu dieđikeahttá. - It happened without me knowing about it.
- @-F<OBJ : Subject of infinite verb outside the verbal. ** nuppi: Ulbmil lea oažžut nuppi boagustit. - The goal is to get the other one to laugh.
- @-FOBJ> : Object of infinite verb outside the verbal. ** váldovuoittuid: Sii vurde váldovuoittuid fasket. - They waited to grab the main prizes.
- @SPRED<OBJ : Object of a subject predicative. (some adjectives are transitive) ** guliid: Mánát leat oažžulat guliid.
- @-FADVL : Adverbial complement of infinite verb outside the verbal. ** várrogasat: Dihkkadeaddji rávve skohtervuddjiid várrogasat mátkkoštit. - The roadman warns snowscooter drivers to drive carefully.
- @-F<PRED : Predicative complement of infinite verb outside the verbal. ** ággan: Jáhkken kulturmáhtu leat oktan ággan.
- @>ADVL : Modifier of an adverbial to the right. ** vaikko: doppe leat vaikko man ollu studeanttat.
- @ADVL< : Complement of adverbial. ** vahkus: Son málesta guktii vahkus.
- @<ADVL : Adverbial after the main verb. ** dás: Eanet dieđuid gávnnat dás.
- @ADVL> : Adverbial to the left of the main verb ** viimmat: Dál de viimmat asttan lohkat reivve.
- @ADVL>CS : Adverbial modifying subjunction. ** ‘beare’ pointing at ‘danin go’: Muhto dus ii leat riekti dearpat su beare danin go sáhtát.
- : Habitive, specifying an adverbial, e.g. @ADVL> ** Máhtes: Máhtes lea beana.
- : Existential, specifying a subject, e.g. @<SUBJ ** beana: Máhtes lea beana.
- : Logophoric pronouns, e.g. @>N (for MT)
- @>N : Modifier of a noun to the right. ** geavatlaš: Ráđđehussii lea geavatlaš politihkka deaŧalaš. - For the government, practical politics is important.
- @N< : Complement of noun to the left. ** vihtta: Mun boađán diibmu vihtta.
- @>A : Modifier of an adjective to the right. ** juohke: Seminára lágiduvvo juohke nuppi jagi.
- @P< : Complement of preposition. ** soađi: Dat dáhpáhuvai maŋŋel soađi.
- @>P : Complement of postposition. ** riegádeami: Seta riegádeami maŋŋel Áttán elii vel 800 jagi.
- @HNOUN : Stray noun in sentence fragment. ** muittut: Fidnokurssa muittut.
- @INTERJ : Interjection. ** Hei: Hei, boađe!
- @>Num : Attribute of numeral to the right. ** dušše: Mun ledjen dušše guokte mánu doppe.
- @Pron< : Complement of pronoun to the left. ** Birehiin: Moai Birehiin leimme doppe.
- @>Pron : Modifier of pronoun to the right. ** vaikko: Olmmoš sáhttá bargat vaikko maid.
- @Num< : Complement of numeral to the left. ** girjjiin: Dat lea okta min buoremus girjjiin.
- @OBJ : Object, the verb is not in the sentence (ellipse)
- @<OBJ : Object, the verb is to the left. ** gávtti: Son goarru gávtti.
- @OBJ> : Object, the verb is to the right. ** filmma: Dán filmma leat Kárášjoga nuorat oaidnán.
- @OPRED : Object predicative, the verb is not in the sentence (ellipse).
- @<OPRED : Object predicative, the verb is to the left. ** buriid: Son ráhkada gáhkuid hui buriid.
- @OPRED> : Object predicative, the verb is to the right. ** dohkkemeahttumin: Son oinnii dohkkemeahttumin bargat ášši nu.
- @PCLE : Particle. ** Amma: Amma mii eat leat máksán? - We have not paid, have we?
- @COMP-CS< : Complement of subjunction. ** vejolaš: Dat šaddá nu buorre go vejolaš.
- @SPRED : Subject predicative, the verb is not in the sentence (ellipse).
- @<SPRED : Subject predicative, the verb is to the left. ** árgabivttas: Ovdal lei gákti árgabivttas.
- @SPRED> : Subject predicative, the verb is to the right. ** álbmogin: Sápmelaččaid historjá álbmogin lea duháhiid jagiid boaris.
- @SUBJ : Subject, the finite verb is not in the sentence (ellipse).
- @<SUBJ : Subject, the finite verb is to the left. ** gákti: Ovdal lei gákti árgabivttas.
- @SUBJ> : Subject, the finite verb is to the right. ** Son: Son lea mu oabbá. - She is my sister.
- @PPRED : Predicative for predicative.
- @APP : Apposition
- @APP-N< : Apposition to noun to the left. ** oahpaheaddji: Oidnen Ánne, min oahpaheaddji.
- @APP-Pron< : Apposition to pronoun to the left. ** boazodoalloáirasat: Ja moai boazodoalloáirasat áigguime vaikko guovttá joatkit barggu.
- @APP>Pron : Apposition to pronoun to the right.
- @APP-Num< : Apposition to numeral to the left.
- @APP-ADVL< : Apposition to adverbial to the left. ** bearjadaga: Mun vuolggán ihttin, bearjadaga.
- @VOC : Vocative ** Miss Turner : Bures boahtin deike, Miss Turner! - Welcome here, Miss Turner!
- @CVP : Conjunction or subjunction that conjoins finite verb phrases. ** go : Leago guhkes áigi dassá go Máreha oidnet? - Is it a long time since you saw Máret?
- @CNP : Local conjunction or subjunction. ** vai : Leago nieida vai bárdni? - Is it a girl or a boy?
- @CMPND
- @X : The function is unknown, e.g. because the word is unknown
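The tag list above follows a consistent naming pattern: an angle bracket in the tag points towards the head (for example @<OBJ has its verb to the left, @OBJ> has it to the right, and plain @OBJ is used when the verb is missing from the sentence). The Python sketch below reads that direction off a tag name. The helper name is an assumption for this example and is not part of the grammar.

```python
def head_direction(tag):
    """Return 'left', 'right' or None depending on where the tag points."""
    if ">" in tag:
        return "right"
    if "<" in tag:
        return "left"
    return None  # no arrow: the head is not expressed in the tag


for tag in ["@<OBJ", "@OBJ>", "@OBJ", "@>N", "@N<", "@-FSUBJ>"]:
    print(tag, head_direction(tag))
```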
!!Tag sets
- Sets for verbs
** V is all readings with a V tag in them, REAL-V should be the ones without an N tag following the V. The REAL-V set thus awaits a fix to the preprocess V … N bug.
-
The set COPULAS is for predicative constructions
-
NP sets defined according to their morphosyntactic features
-
The PRE-NP-HEAD family of sets
These sets model noun phrases (NPs). The idea is to first define whatever can occur in front of the head of the NP, and thereafter negate that with the expression WORD - premodifiers.
The set NOT-NPMOD is used to find barriers between NPs. Typical usage: … (*1 N BARRIER NOT-NPMOD) … meaning: scan to the first noun, ignoring anything that can be part of the noun phrase of that noun (i.e., "scan to the next NP head").
-
Miscellaneous sets
-
Border sets and their complements
ADLVCASE
- Syntactic sets
These were the set types.
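To make the barrier scan described for NOT-NPMOD more concrete, here is a Python sketch of a rightward scan that stops at the first noun and gives up as soon as it meets something that cannot be an NP premodifier. The cohort representation (one set of tags per word) and the contents of `PREMODIFIER_TAGS` are assumptions for this example; the grammar defines the real PRE-NP-HEAD sets.

```python
# Hypothetical premodifier tags; the real sets are defined in the grammar.
PREMODIFIER_TAGS = {"A", "Num"}


def scan_to_np_head(cohorts, start):
    """Return the index of the next noun to the right of `start`, or None.

    Anything that is not a premodifier acts as a barrier (NOT-NPMOD),
    so the scan never leaves the noun phrase it started in.
    """
    for position in range(start + 1, len(cohorts)):
        tags = cohorts[position]
        if "N" in tags:
            return position
        if not (tags & PREMODIFIER_TAGS):
            return None  # barrier reached before any noun
    return None


# Verb, adjective, noun: the adjective is skipped and the noun is found.
print(scan_to_np_head([{"V"}, {"A"}, {"N"}], 0))
# Verb, adverb, noun: the adverb is a barrier, so no NP head is found.
print(scan_to_np_head([{"V"}, {"Adv"}, {"N"}], 0))
```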
!!Numeral outside the sentence
!!HABITIVE MAPPING
-
hab1 hab aux leat
-
hab_numo1 hab copula comma comma N+Nom
-
hab_numo2 copula nu mo/go hab
-
leahab copula nu mo/go hab
-
hab2 hab auxv adv leat
-
hab3 (@ADVL>) for asdf hab-actor and hab-case; if leat to the right, and Nom to the right of leat. Lots of restrictions.
-
hab3 (@ADVL>) for asdf hab-actor and hab-case; if leat to the right, and Nom to the right of leat. Lots of restrictions.
-
hab3 (@ADVL>) for asdf hab-actor and hab-case; if leat to the right, and Nom to the right of leat. Lots of restrictions.
-
hab3 (@ADVL>) for hab-actor and hab-case; if leat to the right, and Nom to the right of leat. Lots of restrictions.
-
hab_main (@ADVL>) for hab-actor and hab-case; if leat to the right, and Nom to the right of leat. Lots of restrictions.
-
habInf hab lea inf
-
habNomLeft Nom or Num + gen hab lea
-
habAdvl Ii han ovttasge du sogas leat dat namma.
-
hab4 hab cc hab leat
-
hab6 lea go hab – leago hab
-
hab7 lea go hab
-
hab8 This is not HAB Ellii šattai hoahppu.
-
hab5 This is not HAB Mánás gollot gieđat.
-
hab9 prop ord-hab leat
-
hab10 prop ord-hab leat
-
habDain (@ADVL>) for (Pron Dem Pl Loc) if leat followed by Nom to the right
-
habDain2
-
habRel # before relative clause
-
habEllipse Buot gánddain lea dreassa, nieiddain fas gákti.
-
habGen (@<ADVL) hab for Gen; if Gen is located at the end of the sentence and Nom is sentence-initial
-
habGenQst (@<ADVL) hab for Gen; in a question sentence. Gen is located sentence-initially and SUBJ is found to the right. To the right of SUBJ is copulas
-
n<titel1 (@N<) for (“jr”) or (“sr”); if first one to the left is Prop
-
n<titel2 (@N<) for INITIAL; if first one to the left is a noun, or if to the left of you is a single letter which is part of a noun conjunction ‘‘bustávas e ja f gáibiduvvo’’
-
n<:com (@N<) for (Sg Com); if first one to the left is Coll
-
>nAttr (@>N) for Attr; if there is a noun to your right
-
n>Indef (Pron Indef Attr); if eará is to the right
-
n>Indef (Pron Indef Com); if eará is to the right
-
>nNum (@>N) for numerals if; there is a noun to your right. You are not allowed to be (Sg Nom), (Sg Acc) or (Sem/Date)
-
noun>n (@>N) for Gen; if there is a noun to your right. Restrictions: Not if you are a time-related word. Not if you are OKTA with Pl Loc to your right. Not if CC is to your right followed by another Gen and then Po. Not if you are HUMAN and to your right is Actio Nom followed by a noun.
-
>nTime (@>N) for Gen TIME-N; if timenoun to your right. Restrictions: Not if you are a OKTA Nom with Pl Loc to your right. Not if CC followed by Gen, followed by Po to your right. Not if COMMA to your right
-
>ntittel (@>N) for (Sg Nom TIME-N) or (Nom Der/NomAg); if to your right is Sem/Mal, Sem/Fem, Sem/Sur
-
>nplc (@>N) for (Sg Nom Prop Sem/Plc), if to your right is Sem/Plc
-
>nALU (@>N) for Sg Acc numerals; when a measure-noun to the right
-
>NTime (@>N) for Gen; if you are TIME-N with BOC to your left, and PREGEN to your right
-
n<:Refl (@N<) for (Refl Nom); if to the left is (N Nom), or if first one to the left is a finite mainverb with a (N Nom) to the left
-
>pron1 (@>Pron) for GRADE-ADV, DUSSE, BUOT if; first one to the right is Pron
-
>pron2 (@>Pron) for (Refl Nom) if; first one to the right is Refl
-
>pron3 (@>Pron) for (Pron Recipr) if; first one to the right is (Pron Recipr)
-
vaikko (@>Pron) for vaikko if; first one to the right is Indef
-
vaikkoman (@>ADVL) for vaikko if; first one to the right is man
-
dasmaŋŋel (@>ADVL) for vaikko if; first one to the right is man
-
adv>advl (@>ADVL)
-
adv>advl (@>ADVL)
-
BOSvoc (@VOC) for HUMAN Nom; if sentence initial. To the right is comma. No nom-cased HUMAN followed by comma or CC is allowed to the right. There should not be a relative clause to the right, because then you are likely to be SUBJ
-
voc (@VOC) for Nom HUMAN; if comma to the left and a second person verb or pronoun to the left. To the right is the end of the sentence
-
Particle<subj (@PCLE)
-
spred<obj (@SPRED<OBJ) for Acc; the object of an SPRED. Not to be mistaken for OPRED. If SPRED is to the left, and copulas is to the left of it. Nom or Hab are found sentence-initially.
-
Hab<subj (@<SUBJ) for Nom; if copulas, goallut or jápmit is FMAINV and habitive or human Loc is found to the left. OR: if Ill or @Pron< followed by HAB are found to the left.
-
Hab<subj (@<SUBJ) with relative clause in between
-
Hab>Advlcase<subj (@<SUBJ) for Nom; it allows adverbials with Ill/Loc/Com/Ess to be found in between HAB and .
-
Nom>Advlcase<subj (@<SUBJ) for Nom; it allows adverbials with Ill/Loc/Com/Ess to be found in between Nom and @<SUBJ.
-
<extSubj (@<SUBJ) for Nom; if copulas to the left, and some kind of adverb, N Loc, time-related word or Po to the left of it. OR: if Ill or @Pron< to the left, followed by copulas and the aforementioned to the left of copulas.
-
<extSubj (@<SUBJ) for sma Nom; if some kind of adverb to the left, N Loc, time-related word or Po to the left of it.
-
<extSubjA (@<SUBJ) for A - TEST WITHOUT THIS ONE
-
<extSubj (@<SUBJ) for Nom; if leat to the left and sentence boundary
-
<extSubj (@<SUBJ) for Nom, but not for Pers. To the left boahtit or heaŋgát as MAINV, and further to the left is some kind of place-related word, or time-related word
-
loc<extSubj (@<SUBJ) for Nom
-
<spred (@<SPRED) for Nom; if Nom to the left, copulas to the left of Nom, and a time related word to the left of it.
-
<extQst1 (@<SUBJ) for Nom; in an existential sentence. To your left is hab, some kind of place or time-word or Po. This is a Qst-sentence so the qst-pcle is attached to leat or following leat
-
<extQst2 (@<SUBJ) for Nom; in an existential sentence. To your left is leat and it is sentence-initial. No attributes or other words are allowed in between (because then you are SPRED), except the attribute muhtun, muhtin
-
extQst3> (@SUBJ>) for Nom; if habitive first one to the left, followed by copulas.
-
extQst3> (@SUBJ>) for Nom; if habitive first one to the left, followed by copulas.
-
<extsubjcoor (@<SUBJ) for Nom. Coordination
-
Sem/Year
-
<spredQst (@<SPRED) for Nom; in a typical question sentence. You are not allowed to be Pers or human. The special part is that Nom is not allowed to your right
-
<spredQst2 (@<SPRED) for (A Nom); in a typical question sentence. You are SPRED if (N Nom) is to your left and leat + qst is to the left
-
<spredQst3 (@<SPRED) for (A Nom); you are SPRED when you are (A Nom) and to your right is (N Nom). This is a Qst-sentence, so copulas is found to your left
-
<spredQst4 (@<SPRED) for Nom; but only in a qst-sentence where there is no chance of you being the subj
-
<NomBeforeSpred (@<SPRED) for (A Nom) if; Nom to the left, and copulas is to the left of Nom. There is no Nom allowed to the right of copulas! To avoid messing with coordination: ja, dahje and comma are not allowed to your left. Comma is not allowed to your right; if so then you are likely to be coordinated
-
<spred (@<SPRED) for A Nom or N Nom if; the subject Nom is on the same side of copulas as you: on the right side of copulas
-
<spredVeara (@<SPRED) for veara + Nom; if genitive immediately to the right, and intransitive mainverb to the right of genitive
-
leftCop<spred (@<SPRED) for Nom; if copulas is the main verb to the left, and there is no Ess found to the left of cop (note that Loc is allowed between target and cop). OR: if you are Coll or Sem/Group with copulas to your left.
-
<spredLocEXPERIMENT (@<SPRED) for material Loc; if you are to the right of copulas, and the Nom to the left of copulas is not a hab-actor
-
NumTime (@<SPRED) for A Nom
-
<spredSg (@<SPRED) for Sg Nom
-
<spredPg (@<SPRED) for Pl Nom
-
<spred (@<SPRED) for Nom; if copulas to the left, and Nom or sentence boundary to the left of copulas. First one to the right is EOS.
-
COP<spredEss (@<SPRED) for N Ess
-
spredEss> (@SPRED>) for N Ess; if copulas to the right of you, and if an NP with nom-case first one to your left.
-
GalleSpred> (@SPRED>) for Num Nom; if sentence initial
-
spredSgMII> (@SPRED>)
-
spredšaddat> (@SPRED>)
-
r492> (@SPRED>) for Interr Gen; consisting only of negations. You are not allowed to be MII. You are not allowed to have an adjective or noun to your right. You are not allowed to have a verb to your right; the exception being an aux.
-
AdjSpredSg> (@SPRED>) for A Sg Nom; if copulas to the right, but not if A or @<SPRED are found to the right of copulas
-
Spred>SubjInf (@SPRED>) for Nom; if copulas to the right, and the subject of copulas is an Inf to the right
-
spredCoord (@<SPRED) coordination for Nom; only if there already is a SPRED to the left of CNP. Not if there is some kind of comparison involved.
-
subj>Sgnr1 (@SUBJ>) for Nom Sg, including Indef Nom if; VFIN + Sg3 or Pl3 to the right (VFIN not allowed to the left)
-
subj>Du (@SUBJ>) for dual nominatives, including Coll Nom. VFIN + Du3 to the right.
-
subj>Pl (@SUBJ>) for plural nominatives, including Coll and Sem/Group. VFIN + Pl3 to the right.
-
subj>Pl (@SUBJ>) for plural nominatives
-
subj>Sg (@SUBJ>) for Nom Sg; if VFIN + Sg3 to the right.
-
Sg<subj (@<SUBJ) for Nom Sg; if VFIN Sg3 or Du2 to the left (no HAB allowed to the left).
-
Du<subj (@<SUBJ) for Nom Coll if; a dual third person verb is found to the left
-
PlDu<subj (@<SUBJ) for (N Nom Pl), (Sem/Group Nom), (Coll Nom), (Pron Nom Pl) if; a verb is Pl3 or Du3 to your left. The verb is not allowed to be copulas with a place, Loc or time noun to its left
-
copPl3<subj (@<SUBJ) for Nom Pl; you don’t need to be a noun, only Nom Pl. To the left is copulas and first one to the right is @<SPRED
-
-fsubj> (@-FSUBJ>) for HUMAN Gen; in a NP-clause. To your right is Actio Nom followed by a noun
-
f<advl (@-F<ADVL) for infinite adverbials
-
f<advl (@-F<ADVL) for infinite adverbials
-
s-boundary=advl> (@ADVL>) for ADVL that resemble s-boundaries. Mainverb to the right.
-
diibmuadvl> (@ADVL>) for (diibmu Nom) if first one to the right is Num
-
-fsubj (@-FSUBJ>) for HUMAN Acc after DADJAT verbs
-
-fobj> (@-FOBJ>) for Acc if in front of abessive, gerundium, actio locative, perfectum participle or infinitive. First one to the right not allowed to be Acc though
-
-fobj> (@-FOBJ>) for Acc if human with ADVL-case to the left and transitive infinitive OBJ to the right. First one to the right not allowed to be Acc though
-
advl>mainV (@ADVL>) if; finite mainverb not found to the left, but the finite mainverb is found to the right.
-
V<advl (@<ADVL) if; finite mainverb found to the left. Not if a comma is found immediately to the left and a finite mainverb is located somewhere to the right of this comma.
-
advl>v (@ADVL>) if; you are ADVL, time-noun or Sem/Route and there is a finite verb to the right in the clause, or if to your right is: de followed by a finite verb. OR: if you are a time-noun and to your right is: go or sentenceboundary followed by a finite verb
-
<advlPoPr (@<ADVL) for Po or Pr; if mainverb to the left.
-
advlPoPr> (@<ADVL) for Po or Pr; if mainverb to the right.
-
BOSPo> (@ADVL>) for Po; if trapped between BOS to the right and S-BOUNDARY OR COMMA to the left, because the main verb will then automatically be on your right side.
-
<advlComIll (@<ADVL) only if; you are Com OR Ill. To your left is a mainverb, and to your right a sentenceboundary, because we don’t want there to be another mainverb you potentially could belong to
-
<advlEOS (@<ADVL) for Po or Pr or Loc; if you are found at the very end of a sentence. A mainverb is needed to the left though.
-
<advlGen (@<ADVL) for (N Gen) if mainverb to the left and no noun to the right
-
<opredgohcodit (@<OPRED) for Ess
-
advlEss> (@<ADVL) for weather and time Ess, if FMAINV to the left.
-
comma<advlEOS (@<ADVL) for Adv if; mainverb is to the left. Comma to the left and mainverb to the right in the same clause is not allowed
-
advl>inbetween (@ADVL>) for Adv; if inbetween two sentenceboundaries where no mainverb is present.
-
comma<advlEOS (@<ADVL) for Adv if; comma found to the left and the finite mainverb to the left of comma. To the right is the end of the sentence.
-
BOSadvl> (@ADVL>) if; you are N Loc or N Ill and found sentence initially and there is a main verb somewhere to the right. No barrier for the mainverb; based on the thought that first one to your right is probably a sentenceboundary.
-
cleanupILL<advl (@<ADVL) for N Ill if; there are no boundary symbols to your left, if you aren't already @N< OR @APP-N<, and no mainverb is to your left.
-
cleanupPo (@ADVL) for Po: This rule tags all Po:s as ADVL if they haven’t gotten a tag somewhere along the way.
-
cleanupPr (@ADVL) for Pr: This rule tags all Pr:s as ADVL if they haven’t gotten a tag somewhere along the way.
-
-fsubj>asAcc (@-FSUBJ>) for HUMAN Acc; if there is a verb @-F<OBJ to your left
-
-f<obj (@-F<OBJ) for Acc if there is a transitive verb + SYN-V to your left
-
-fsubj>IV (@-FSUBJ>) for Acc; if there is an IV-verb acting as a @-F<OBJ to your right
-
-fsubj>IV (@-FSUBJ>) for Acc; if there is an TV-verb acting as a @-F<OBJ to your right followed by an Acc
-
-fsubj>asGen (@-FSUBJ>) for Gen;
-
f<subj (@-F<SUBJ) for Nom if; (V @-F<OBJ) to the left.
-
<opredAAcc (@<OPRED) for A Acc; if another accusative to the left, and a transitive verb to the left of it. OR: if a transitive verb to the left, and an accusative to the left of it.
-
TV<obj (@<OBJ) for Acc; if there is a transitive mainverb to the left in the clause. Not for Rel. Not if you are a numeral followed by a measure-noun
!sma object
-
<advlMeasr (@<ADVL) for (Num Acc); if finite IV-mainverb to the left, measure-noun to the right
-
<objMeasr (@<OBJ) for Num Acc; if finite TV-mainverb to the left, measure-noun to the right
-
<advlMeasr2 (@<ADVL) for MEASR-N + Acc; if (Num Pl) to the left and mainverb to the left of it
-
advlMeasr> (@ADVL>) for Num Acc;
-
Obj> (@OBJ>) for Acc; if there is a finite mainverb to the right in the clause. A really simple rule with no other restrictions..
-
s-boun<obj (@<OBJ) for Acc; if sentenceboundary to your left and a transitive mainverb further to the left
-
<objIV (@<OBJ) for Acc; if there is an intransitive mainverb in the clause. Not for Rel or Num. Not if you are a numeral followed by a measure-noun
-
<advlEss (@<ADVL) for ESS-ADVL if; FMAINV to the left
-
IV<spredEss (@<SPRED) for N Ess if; FMAINV to the left is intransitive or bargat
-
<opredEss (@<OPRED) for (N Ess), (A Ess) if; transitive mainverb to the left in the clause. If accusative to the left or to the right, or if Inf or ahte to the right, or if there is a noun to the right followed by an Inf
-
Acc<opredEss (@<OPRED) for (N Ess), (A Ess) if; transitive mainverb to the left in the clause, and an accusative cased Rel left to the verb
-
onlyV<opred (@<OPRED) for (N Ess) if; there is a transitive mainverb to the left. Usually there needs to be an Acc to the left, but here it is not needed
-
onlyV<opred2 (@<OPRED) for (N Ess) if;
!!SUBJ MAPPING - leftovers
-
subj>ifV (@SUBJ>) for NP-HEAD-NOM, DUPRON or (Num Nom) if; a finite mainverb is found to the right. This is a cleanup rule for subjects
-
hnoun>ifV (@SUBJ>) for NP-HEAD-NOM, DUPRON if. The counterpart of subj>ifV. You are HNOUN if there is a finite verb to your right, but NOT if there is a finite verb after a relative clause
!!OBJ MAPPING - leftovers
!!
!!HNOUN MAPPING
- @<ADVLcoor (@<ADVL) for ADVLCASEAdv if @CNP to the left and ADVL to the left of it
! missingX adds @X to all missings
! therestX adds @X to all that is left, often erroneously disambiguated forms
!!For Apertium: The analyser gives double analyses because of optional semtags. We go for the one with semtag.
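The preference for the reading with a semantic tag can be sketched in Python as below. The sketch assumes each reading is a list of tag strings and that semantic tags start with `Sem/`, as in the tag lists of this documentation; the function name and the sample readings are made up for illustration.

```python
def prefer_semtag(readings):
    """Keep readings carrying a Sem/ tag; fall back to all readings if none do."""
    with_semtag = [
        reading for reading in readings
        if any(tag.startswith("Sem/") for tag in reading)
    ]
    return with_semtag or readings


# Two otherwise identical readings, one with and one without a semtag.
readings = [["N", "Sg", "Nom"], ["N", "Sem/Plc", "Sg", "Nom"]]
print(prefer_semtag(readings))
```

The fallback keeps words without any semantic tag from losing all their readings.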
This (part of) documentation was generated from src/cg3/functions.cg3
src-fst-morphology-affixes-symbols.lexc.md
Symbol affixes
This (part of) documentation was generated from src/fst/morphology/affixes/symbols.lexc
src-fst-morphology-ext-Apertium-quc-apertium-quc.quc.lexc.md
This (part of) documentation was generated from src/fst/morphology/ext-Apertium-quc/apertium-quc.quc.lexc
src-fst-phonetics-txt2ipa.xfscript.md
retroflex plosive, voiceless t ʈ 0288, 648 ( = ASCII 096)
retroflex plosive, voiced d ɖ 0256, 598
labiodental nasal F ɱ 0271, 625
retroflex nasal n ɳ 0273, 627
palatal nasal J ɲ 0272, 626
velar nasal N ŋ 014B, 331
uvular nasal N\ ɴ 0274, 628
bilabial trill B\ ʙ 0299, 665
uvular trill R\ ʀ 0280, 640
alveolar tap 4 ɾ 027E, 638
retroflex flap r ɽ 027D, 637
bilabial fricative, voiceless p\ ɸ 0278, 632
bilabial fricative, voiced B β 03B2, 946
dental fricative, voiceless T θ 03B8, 952
dental fricative, voiced D ð 00F0, 240
postalveolar fricative, voiceless S ʃ 0283, 643
postalveolar fricative, voiced Z ʒ 0292, 658
retroflex fricative, voiceless s ʂ 0282, 642
retroflex fricative, voiced z` ʐ 0290, 656
palatal fricative, voiceless C ç 00E7, 231
palatal fricative, voiced j\ ʝ 029D, 669
velar fricative, voiced G ɣ 0263, 611
uvular fricative, voiceless X χ 03C7, 967
uvular fricative, voiced R ʁ 0281, 641
pharyngeal fricative, voiceless X\ ħ 0127, 295
pharyngeal fricative, voiced ?\ ʕ 0295, 661
glottal fricative, voiced h\ ɦ 0266, 614
alveolar lateral fricative, vl. K
alveolar lateral fricative, vd. K\
labiodental approximant P (or v)
alveolar approximant r\
retroflex approximant r`
velar approximant M\
retroflex lateral approximant l`
palatal lateral approximant L
velar lateral approximant L
Clicks
bilabial O\ (O = capital letter)
dental |
(post)alveolar !\
palatoalveolar =\
alveolar lateral ||
Ejectives, implosives
ejective > e.g. ejective p p>
implosive < e.g. implosive b b<
Vowels
close back unrounded M
close central unrounded 1
close central rounded }
lax i I
lax y Y
lax u U
close-mid front rounded 2
close-mid central unrounded @\
close-mid central rounded 8
close-mid back unrounded 7
schwa ə @
open-mid front unrounded E
open-mid front rounded 9
open-mid central unrounded 3
open-mid central rounded 3\
open-mid back unrounded V
open-mid back rounded O
ash (ae digraph) {
open schwa (turned a) 6
open front rounded &
open back unrounded A
open back rounded Q
Other symbols
voiceless labial-velar fricative W
voiced labial-palatal approx. H
voiceless epiglottal fricative H\
voiced epiglottal fricative <\
epiglottal plosive >\
alveolo-palatal fricative, vl. s\
alveolo-palatal fricative, voiced z\
alveolar lateral flap l\
simultaneous S and x x\
tie bar _
Suprasegmentals
primary stress “
secondary stress %
long :
half-long :\
extra-short _X
linking mark -
Tones and word accents
level extra high _T
level high _H
level mid _M
level low _L
level extra low _B
downstep !
upstep ^ (caret, circumflex)
contour, rising
contour, falling _F
contour, high rising _H_T
contour, low rising _B_L
contour, rising-falling _R_F
(NB Instead of being written as diacritics with _, all prosodic
marks can alternatively be placed in a separate tier, set off
by < >, as recommended for the next two symbols.)
global rise
voiceless 0 (0 = figure), e.g. n_0
voiced _v
aspirated _h
more rounded _O (O = letter)
less rounded _c
advanced _+
retracted _-
centralized _”
syllabic = (or _=) e.g. n= (or n=)
non-syllabic _^
rhoticity `
breathy voiced _t
creaky voiced _k
linguolabial _N
labialized _w
palatalized ‘ (or _j) e.g. t’ (or t_j)
velarized _G
pharyngealized _?\
dental d
apical _a
laminal _m
nasalized ~ (or _~) e.g. A~ (or A~)
nasal release _n
lateral release _l
no audible release _}
velarized or pharyngealized _e
velarized l, alternatively 5
raised _r
lowered _o
advanced tongue root _A
retracted tongue root _q
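The symbol table above can be read as a lookup from ASCII transcription symbols to IPA characters. The Python sketch below converts a string using a handful of rows copied from the table, trying longer symbols first so that `N\` is not mistaken for `N`. It only illustrates the table; it makes no claim about how txt2ipa.xfscript itself is implemented, and the function name is an assumption.

```python
# A subset of the rows listed in the table (ASCII symbol -> IPA character).
ASCII_TO_IPA = {
    "F": "ɱ", "J": "ɲ", "N": "ŋ", "N\\": "ɴ",
    "B\\": "ʙ", "R\\": "ʀ", "4": "ɾ",
    "p\\": "ɸ", "B": "β", "T": "θ", "D": "ð",
    "S": "ʃ", "Z": "ʒ", "C": "ç", "j\\": "ʝ",
    "G": "ɣ", "X": "χ", "R": "ʁ", "X\\": "ħ",
    "?\\": "ʕ", "h\\": "ɦ",
}

# Longest symbols first, so two-character symbols win over their prefixes.
SYMBOLS = sorted(ASCII_TO_IPA, key=len, reverse=True)


def to_ipa(text):
    """Replace known ASCII symbols with IPA; copy everything else unchanged."""
    result = []
    index = 0
    while index < len(text):
        for symbol in SYMBOLS:
            if text.startswith(symbol, index):
                result.append(ASCII_TO_IPA[symbol])
                index += len(symbol)
                break
        else:
            result.append(text[index])
            index += 1
    return "".join(result)


print(to_ipa("SaN"))
```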
This (part of) documentation was generated from src/fst/phonetics/txt2ipa.xfscript
src-fst-transcriptions-transcriptor-abbrevs2text.lexc.md
We describe here how abbreviations in K’iche’ are read out, e.g. for text-to-speech systems.
For example:
- s.:syntynyt # ;
- os.:omaa% sukua # ;
- v.:vuosi # ;
- v.:vuonna # ;
- esim.:esimerkki # ;
- esim.:esimerkiksi # ;
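Each entry above pairs an abbreviation with one full form, and the same abbreviation may appear with several full forms. The Python sketch below parses the example entries into a lookup table, treating `% ` as the lexc escape for a literal space. The parsing helper is an assumption for this example; in the actual build the entries are compiled by the lexc compiler, not parsed like this.

```python
from collections import defaultdict

# The example entries from the list above, without the leading list marker.
ENTRIES = [
    "s.:syntynyt # ;",
    "os.:omaa% sukua # ;",
    "v.:vuosi # ;",
    "v.:vuonna # ;",
    "esim.:esimerkki # ;",
    "esim.:esimerkiksi # ;",
]


def parse_entry(entry):
    """Split 'abbr:expansion # ;' into (abbr, expansion) and unescape spaces."""
    pair = entry.split("#")[0].strip()
    abbreviation, expansion = pair.split(":", 1)
    return abbreviation, expansion.replace("% ", " ")


def build_table(entries):
    """Collect all expansions per abbreviation, keeping the listed order."""
    table = defaultdict(list)
    for entry in entries:
        abbreviation, expansion = parse_entry(entry)
        table[abbreviation].append(expansion)
    return dict(table)


print(build_table(ENTRIES))
```

Keeping every expansion reflects the fact that choosing the right reading (for example between the two forms listed for "v.") needs context that a plain lookup does not have.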
This (part of) documentation was generated from src/fst/transcriptions/transcriptor-abbrevs2text.lexc
src-fst-transcriptions-transcriptor-numbers-digit2text.lexc.md
% komma% :, Root ;
% tjuohkkis% :%. Root ;
% kolon% :%: Root ;
% sárggis% :%- Root ;
% násti% :%* Root ;
This (part of) documentation was generated from src/fst/transcriptions/transcriptor-numbers-digit2text.lexc
src-fst-transcriptions-transcriptor-symbols2text.lexc.md
This file contains mappings from abbreviations and some acronyms to full forms for text-to-speech purposes. This is a supplement to the analyser; the analyser must tag the strings as +ABBR or similar for the transcriptions to work. The resulting full form must be lemmas known to the analyser, for further processing.
We describe here how abbreviations in K’iche’ are read out, for text-to-speech systems.
The file contains:
-
miscellaneous symbols
-
smileys
-
Clause boundary symbols
-
Single punctuation marks
-
Paired punctuation marks
This (part of) documentation was generated from src/fst/transcriptions/transcriptor-symbols2text.lexc
tools-grammarcheckers-grammarchecker.cg3.md
[ L A N G U A G E ] G R A M M A R C H E C K E R
DELIMITERS
TAGS AND SETS
Tags
This section lists all the tags inherited from the fst, and used as tags in the syntactic analysis. The next section, Sets, contains sets defined on the basis of the tags listed here, those set names are not visible in the output.
Beginning and end of sentence
BOS EOS
Parts of speech tags
N A Adv V Pron CS CC CC-CS Po Pr Pcle Num Interj ABBR ACR CLB LEFT RIGHT WEB PPUNCT PUNCT
COMMA ¶
Tags for POS sub-categories
Pers Dem Interr Indef Recipr Refl Rel Coll NomAg Prop Allegro Arab Romertall
Tags for morphosyntactic properties
Nom Acc Gen Ill Loc Com Ess Ess Sg Du Pl Cmp/SplitR Cmp/SgNom Cmp/SgGen Cmp/SgGen PxSg1 PxSg2 PxSg3 PxDu1 PxDu2 PxDu3 PxPl1 PxPl2 PxPl3 Px
Comp Superl Attr Ord Qst IV TV Prt Prs Ind Pot Cond Imprt ImprtII Sg1 Sg2 Sg3 Du1 Du2 Du3 Pl1 Pl2 Pl3 Inf ConNeg Neg PrfPrc VGen PrsPrc Ger Sup Actio VAbess
Err/Orth
Semantic tags
Sem/Act Sem/Ani Sem/Atr Sem/Body Sem/Clth Sem/Domain Sem/Feat-phys Sem/Fem Sem/Group Sem/Lang Sem/Mal Sem/Measr Sem/Money Sem/Obj Sem/Obj-el Sem/Org Sem/Perc-emo Sem/Plc Sem/Sign Sem/State-sick Sem/Sur Sem/Time Sem/Txt
HUMAN
PROP-ATTR PROP-SUR
TIME-N-SET
Syntactic tags
@+FAUXV @+FMAINV @-FAUXV @-FMAINV @-FSUBJ> @-F<OBJ @-FOBJ> @-FSPRED<OBJ @-F<ADVL @-FADVL> @-F<SPRED @-F<OPRED @-FSPRED> @-FOPRED> @>ADVL @ADVL< @<ADVL @ADVL> @ADVL @HAB> @<HAB @>N @Interj @N< @>A @P< @>P @HNOUN @INTERJ @>Num @Pron< @>Pron @Num< @OBJ @<OBJ @OBJ> @OPRED @<OPRED @OPRED> @PCLE @COMP-CS< @SPRED @<SPRED @SPRED> @SUBJ @<SUBJ @SUBJ> SUBJ SPRED OPRED @PPRED @APP @APP-N< @APP-Pron< @APP>Pron @APP-Num< @APP-ADVL< @VOC @CVP @CNP OBJ