Inari Sámi NLP Grammar

Finite state and Constraint Grammar based analysers, proofing tools and other resources

View the project on GitHub giellalt/lang-smn


Guessing: Rule for adding Sem/Date as a tag to readings that look like dates

Guessing: Rule for adding Adv Sem/Adr as a tag to readings that look like addresses

*Select PlcSur (Sem/Plc) (Sem/Sur)

Some proper nouns have two parts, and the first is not a genitive. We still have problems with abbreviations when these proper nouns are inflected or are part of a compound. The copy rule adds an Attr reading to names that do not get it in the FST (Soria). The select rule selects Attr when the next word is e.g. Moria.

Rules for giving Attr to names, e.g. Ole Attr Kåven.

Remove unwanted analyses

Numerals

Lexicalised derivations

Particular verbs

Propernouns

Removing or selecting proper nouns that are lookalikes

*Removes PropPl, but this causes problems with names such as Davviriikkaid Ráđi, where we want Prop Pl

Some adjectives are never derived as Adv

Rules for Prop Attr, Sem/Sur and Plc

*Removes PropAttr if no Prop on the right side

MISC

ONE-COHORT DISAMBIGUATION - CYCLE 0

The idea behind “cycle 0” is to have safe rules without context first. These rules typically choose lexicalisations over derivations, Saami words over marginal names, etc.
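A cycle 0 rule of the kind "Removes derN if lexicalised" could be sketched in CG-3 as follows. This is a minimal illustration, not the rule from coredisambiguation.cg3: the rule name, the set names and the derivation tags Der/NomAct and Der/NomAg are assumptions.

```cg3
DELIMITERS = "<.>" "<!>" "<?>" ;

LIST N = N ;
# Hypothetical derivation tags; the real grammar may use other names.
LIST Der = Der/NomAct Der/NomAg ;

SET DerN = N + Der ;   # noun readings that carry a derivation tag
SET LexN = N - Der ;   # noun readings without one, i.e. lexicalised nouns

SECTION

# The only condition is on the cohort itself (position 0):
# drop the derived noun when a lexicalised noun reading competes with it.
REMOVE:derNSketch DerN IF (0 LexN) ;
```

The rule looks at no neighbouring words, which is what makes it safe to run before any contextual disambiguation.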

Lexicalised derivations

*Removes derN if lexicalised.

*Removes derNEss if lexicalised, and both nouns are essive.

*Removes derA or PrsPrc or VGen if lexicalised. VGen is a chance.

*Removes derAdv when Adv is lexicalised.

Fragments

Adjectives or nouns, not adverbs

Adverbs

Lexicalised adverbs

It is useful to select the adverbial reading early for potential nouns or verbs.

*aloGen removes állu Gen, álo Adv vs. N Gen

*bealisAdv

Pronouns

Nouns, not verbs

Lexical selection - nouns

mánnu vs mánus

Not noun

Adposition or not

Not Qst

Interjections

Southern Locative vs. Essive

Px-rules for special nouns

Some verb rules

Particular CS

Adpositions

Adpositions, not verbs

Section 2: LOCAL DISAMBIGUATION - CYCLE 1

FAMILY pronouns

Pron Pers 1. p.

Pron Pers 2. p.

Pron Pers 3. p.

An early rule for “eanaš”/“eanas”

Px constraints

First select Px, then remove all remaining Px

We end section 2 by removing all remaining Px

Section 3: Certain verb readings

verb or adv

All imperatives

For imperative disambiguation we need the following: first pick the imperative in imperative contexts, and thereafter remove the remaining imperatives. Such contexts are: an imperative verb sentence-initially, with an exclamation mark.
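The select-then-remove strategy for imperatives could look roughly like this in CG-3. It is a sketch under stated assumptions: the rule names and the sets BOS and EXCL are invented for the illustration, and the actual rules in the grammar have richer contexts.

```cg3
DELIMITERS = "<.>" "<!>" "<?>" ;

LIST Imprt = Imprt ;
LIST BOS = (>>>) ;      # the cohort that opens the window
LIST EXCL = "<!>" ;     # word form of the exclamation mark

SECTION

# Step 1: keep Imprt where the context is safe: the verb is
# sentence-initial and an exclamation mark follows somewhere to the right.
SELECT:imprtInitialSketch Imprt IF (-1 BOS) (*1 EXCL) ;

# Step 2: everywhere else, discard the Imprt reading.
REMOVE:imprtRestSketch Imprt ;
```

Because a plain REMOVE does not delete the last remaining reading of a cohort, the imperatives picked in step 1 survive step 2.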

Sg1 - early cycle, safe rules

Sg2 - early cycle, safe rules

Sg3 - early cycle, safe rules

Negative verb, not abbreviation or Roman numeral Ii.

Du1 - early cycle, safe rules

These Du1, Du2 rules are (almost) not in use in our corpus, but we keep them for completeness.

Du2 - early cycle, safe rules

The next two rules are not found in the corpus, but logically they belong, to cover the whole paradigm. There is no verb-internal homonymy here, but there is homonymy with e.g. Illative for certain verbs.

Du3 - early cycle, safe rules

The competitor to Du3 is -ba Foc.

Pl1 - early cycle, safe rules

The competitor here is obviously Inf, but also Pl3 and Prt Sg2.

Pl2 - early cycle, safe rules

These rules are not used when disambiguating the corpus

Pl3 - early cycle, safe rules

Select…

The following two may be joined:

Remove…

The following two may be joined:

PrsPrc

NB: this one is not quite right

Section 4: CYCLE 1B: REMOVING THE READINGS THAT WERE LEFT FROM THE 1A RULES

We don’t need more Px sections, it’s done already

Noun, adjective, PrsPrc or not?

Adjectives and adverbs

Adv or not?

maid has many readings, and as Rel it is a member of S-BOUNDARY. Therefore we need to disambiguate it early in this file. Most important is to select Adv. Because A and N can still have Vfin readings, it is difficult to make very general rules.

matPcle

The following two rules are omitted. They only affect the disambiguation of the particle mat, a Wackernagel particle, which is done in the rule above, I think.

Disambiguating abbreviations

Disambiguating particles

Disambiguating clitics

Disambiguating numerals

Disambiguating adpositions

čađa

Commented out some adp-rules we don’t need anymore:

geahčai

guovddaš

mađe

miehta

LIST LG-MATERIAL = Inf Adv Nom ;

Disambiguation Noun vs. Po or Pr:

Some particular subjunctions and Neg Sup

go as CS and Qst Pcle

First select all “go” Qst Pcle, then remove them so the rest will be “go” CS

Section 9 WORD-SPECIFIC RULES

Some particular subjunctions

Adverb rules

MAPPING OF COMP-CS< , COMPLEMENTS OF PARTICLES IN COMPARISON

First map all COMP-CS<, then remove the other readings

MAPPING OF CC AND CS

Mostly we map both @CNP and @CVP, then we select @CNP, after that we remove them so @CVP remains
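The three steps described here (map both tags, select @CNP, remove the remaining @CNP) might be sketched as below. The rule names and the deliberately simplistic noun-on-each-side context are assumptions made for the illustration, not the grammar's actual conditions.

```cg3
DELIMITERS = "<.>" "<!>" "<?>" ;

LIST CC = CC ;
LIST N = N ;

SECTION

# Step 1: map both candidate functions onto every coordinator.
# CG-3 handles a reading with several mapping tags as alternatives,
# so each tag can be selected or removed on its own.
MAP:ccBothSketch (@CNP @CVP) TARGET CC ;

# Step 2: choose @CNP where the coordination is clearly nominal
# (here: an unambiguous noun on each side).
SELECT:cnpSketch (@CNP) IF (-1C N) (1C N) ;

# Step 3: remove @CNP elsewhere, so that @CVP is what remains.
REMOVE:cnpRestSketch (@CNP) ;
```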

*CVPoppramsing Lásse, Iŋgá ja mun

*CVPCmp/SplitR Cmp/SplitR @CNP

PRONOUNS

Plural?

Interrogative and relative pronouns

Emphatic ieš

Numerals

Indefinite pronouns

The rules are not documented yet

Demonstrative pronouns - should have a look at these

Disambiguating adjectives

Attribute disambiguation

Rules for Attr between Dem and N

Other attribute rules

Special rules for ‘buorre’ (the only adjective showing case agreement)

This block of rules is there to ensure case agreement for comparatives.

alit vs. allat Comp Attr

And now some rules for adverbs that modify adjectives

Proper nouns

VERBS

Disambiguating verbs - part 1

First come the ConNeg forms, which depend on Neg verbs. Then imperatives (with their special syntax), infinitives, and other non-finite forms. Person comes later (in part 2).

ConNeg forms

Numbers following the rule headers below refer to the number of hits in a 13 053 859-word corpus.

Imperative

See also Imprt or Ind some sections down.

Infinitive

Rules that prevent later selection of Inf for a finite verb in the frame

INF-V…CC…

Verbgenitive

Supinum vs. potential – no example found in large corpus

Perfect Participle

Topicalized version

It should be possible to unify the following chapter.

Actio

Present participle

*orrut vs. orrot

Rules for “addit” (which is an adjective, but more often a verb)

Actio Loc = N Loc

Actio Nom = Ess

Imprt or Ind

Nouns or verbs

The rules are not documented yet

Demonstrative pronouns, agreement in DP - should it be moved to after verbmappings?

The rules are not documented yet

VERB MAPPINGS

Verbs as predicatives (@SPRED>) and (@<OPRED)

The tags (@SPRED>) and (@<OPRED) target PrfPrc

The rules are not documented yet

Passive verbs often have

Verbs as prenominal participles (@>N):

(@+FAUXV) and (@+FMAINV) target Neg, orrut

(@<SUBJ) target Inf

(@<SPRED) target Inf

(@<ADVL) target Inf, Actio Ess

@-F<OBJ target Inf

(@A<) target Inf

(@N<) target Inf, Actio Ess

(@<ADVL) target Inf, Actio Ess

(@<OBJ) target Inf, Actio Ess, PrfPrc

(@+FMAINV) and (@+FAUXV) and (@-FAUXV)

(@-FMAINV) and (@-FAUXV)

And then we remove the verbs which didn’t get any syntactic tag, in favour of verbs with syntactic tags.

killifVinCohort This rule removes all other readings if there is a mapped V reading in the same cohort. Every case where this goes wrong should be fixed in the mapping rules or in earlier disambiguation rules.
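The effect of killifVinCohort can be approximated with a single SELECT, as in this hedged sketch. The set MAPPED-V and the rule name are assumptions, and the set covers only the four verbal function tags listed in the headings above; the real rule may cover more mapped verb readings.

```cg3
DELIMITERS = "<.>" "<!>" "<?>" ;

# Verb readings carrying one of the verbal function tags.
LIST MAPPED-V = (V @+FMAINV) (V @+FAUXV) (V @-FMAINV) (V @-FAUXV) ;

SECTION

# If the cohort holds a mapped verb reading, keep it and drop the rest.
# SELECT does nothing in cohorts without such a reading.
SELECT:killifVinCohortSketch MAPPED-V ;
```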

NOUNS

CASE DISAMBIGUATION

Num as subject, tricky cases - the rule should be here because of the verbdisambiguation

ACCUSATIVE-GENITIVE DISAMBIGUATION

Secure rules for choosing Acc

Semantics: Choosing accusative or genitive semantically

Other genitive rules

Genlassin Selects Gen if the first word to the right is lassin: *bargostipeanddaid lassin

lassinIll Selects Ill if the first word to the left is lassin: *lassin Sarai
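Both lassin rules depend only on the immediately adjacent word, which in CG-3 is position 1 (right) or -1 (left). A minimal sketch, assuming that lassin is matched by its word form and using illustrative rule names:

```cg3
DELIMITERS = "<.>" "<!>" "<?>" ;

LIST Gen = Gen ;
LIST Ill = Ill ;
LIST LASSIN = "<lassin>" ;   # matched by word form; an assumption

SECTION

# bargostipeanddaid lassin: the word immediately before lassin is Gen.
SELECT:GenlassinSketch Gen IF (1 LASSIN) ;

# lassin Sarai: the word immediately after lassin is Ill.
SELECT:lassinIllSketch Ill IF (-1 LASSIN) ;
```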

Gen and preposition/postposition

Genitive in place adverbials ROUTE

Adjectives take object

Temporal adverbials: Choosing accusative or genitive TIME

Reflexive pronouns: acc or gen

Accusative object

*topOBJPers Removes Gen if you are Acc, and to your right is a Pron followed by a transitive verb. You have to be sentence-initial.

*AccVAbess Selects Gen if to the right is abessive

Gen modifiers inside NP

Accusative in coordination

Intransitive verbs can sometimes be transitive

Accusative or genitive in front of ALU and in front of adjectives

Exceptional accusative attributes in front of ALU nouns.

Numerals

Leftover accusatives

*COMPInfAcc Selects Acc if you are Gen and to the left is an Inf TV @COMP-CS<

Accusative before @COMP-CS<

Accusative before some A

Accusative sentence-finally

Genitive

Nominative and accusative

*NomIFInitialThenSg3 Selects Nom if -1 BOS and 1 oblique / Sg3 lookalike. Works in fragments.
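In CG-3 terms, the description of NomIFInitialThenSg3 corresponds to two context conditions. The sketch below reads "oblique / Sg3 lookalike" as "the next word has a Sg3 verb reading, possibly next to an oblique-case reading"; that reading, the set names and the rule name are assumptions.

```cg3
DELIMITERS = "<.>" "<!>" "<?>" ;

LIST Nom = Nom ;
LIST BOS = (>>>) ;        # the cohort that opens the window
LIST V-SG3 = (V Sg3) ;

SECTION

# A sentence-initial word with a Nom reading, followed by a word
# that has a Sg3 verb reading among its readings.
SELECT:NomIFInitialThenSg3Sketch Nom IF (-1 BOS) (1 V-SG3) ;
```

Since the rule needs no finite verb to be disambiguated beforehand, it can also apply in fragments, as the description notes.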

Nominative

Miscellaneous rules

Vocatives, subjects of sentence fragments

Nominative in titles and sentence fragments

Nominative after “go”, “dego”, “dugo” and “nugo”

Preverbal subjects

Postverbal subjects

Nominative predicatives

Nominative as objects in existential clauses

Nominative in coordination and apposition

Nominative in parallel constructions

Not nominative

Comitative rules

NP internal disambiguation of Com

Disambiguation based upon verb valency

Disambiguation of Com depending on Adv or certain verb or N

Animate nouns

HAB-ACTOR in habitive-constructions

váldit vára + Loc

dahkat earrodearvvuođat geainna nu

eallit mainna nu

Disambiguation based upon verb valency

COM-V

tools (concrete and abstract)

BODY as an instrument

Dynamic-verbs

Event-tool-actio

Most actio can be both tool and event.

PLACE-V

STATE-V (eallit)

Movement-verbs

The superset Dynamic-verb, used to choose between (Pl Loc) and (Sg Com)

The idea is that the superset DYNAMIC-V is not connected to TOOL, ABSTR-TOOL or CONCEPT in (Pl Loc). This is the “least common multiple”. The subsets differ: e.g. many of them (but not all) are not connected to HUMAN in (Pl Loc), and one is not connected to ABSTR-ENTITY and ACTOR in (Pl Loc). We work with negation so that the rules don’t destroy analyses because of insufficient sets.

First come the general rules for selecting (Sg Com), then the more specific rules for selecting (Sg Com), and then we select (Pl Loc) for the rest under # Another round of locative rules.

HUMAN-LOC-V

Locative and comitative - Disambiguation based upon coordination

And then we remove the remaining Sg Com analyses

Essive

Late case rules (after other case rules have worked).

VERBS PART 2, Section #22

Finite or not

Finite

Not Finite

Indicative Negative

Infinitive

Indicative or imperative

Verbs according to person and number

Sg1 - First person singular

Sg2 - Second person singular

Sg3 - Third person singular

Infinitive and clausal subject

Rules that look backwards for a subject across a relative clause:

Rules that look backwards for a subject across a subordinate clause (CP boundary):

Extension possibilities: Coordination

Son oaidná du ja mu ovdal go boahtit…

Coordinated Sg3 verbs

Not (V Sg3)

Du1 - First person dual

The previous two rules look marginal.

Du2 - Second person dual

Rules for leahppi = (“leahppi” N Sg Nom)

Du3 - Third person dual

Pl1 - First person plural

Pl2 - Second person plural

Pl3 - Third person plural

Rules for a special infinitive construction

More finite verbs

Passive

Infinitive

Present Participle

Actio/Perfect Participle

Actio

Selecting some more finite verbs

Lexical disambiguation of verbs

NOMEN

Case rules

Other rules for nouns and pronouns

Determiners

Adverbs and adjectives

NOUNS

Variant lemmas

VERBS

Removing Err/Orth


This (part of) documentation was generated from src/cg3/coredisambiguation.cg3