Finnish NLP Grammar

Finite state and Constraint Grammar based analysers, proofing tools and other resources

Noun inflection and derivation

Noun stem variation and allomorph selection

The nominal stems are classified according to their stem variation and the allomorphs they select. This section lists all the possible variations and all their allomorph selections. Each class description gives the key forms that can be used when classifying new words, examples of inflection, and negative examples for the most obvious missed allomorphs and differentiating factors. The example list should include at least: singular nominative, singular essive, singular inessive, plural essive, plural elative, singular partitives, singular illatives, plural partitives, plural genitives, plural illatives and the compound forms.

Noun stems without stem variation

The most basic noun stem does not have any stem-internal variation and uses a few of the most common allomorphs. The words in this class are either bisyllabic or have one of the common derivational suffixes: sto, …. The nouns in this class end in o, u, y or ö, which determines their illative suffix and therefore their exact classification:

The stems ending in u are also without variation, and the bisyllabic ones have the same simple allomorph pattern:

Basic gradation cases

These are the basic stems with no variation other than consonant gradation.

Between two vowels, the weak grade of k is optionally an apostrophe instead. Where k cannot optionally be ’, for example when it follows a consonant other than s, the variation is k ~ 0 instead (e.g. NOUN_UKKO).

Between three vowels where k is surrounded by the same vowels, the k obligatorily becomes ’. When the vowels are different, it optionally becomes ’ (as in NOUN_TEKO), and after a consonant other than s it is k ~ 0 (as in NOUN_UKKO).

There is a gap in the paradigms for y and ö finals with k:’ variation.

Other gradations can be more easily caught from the preceding context.

There is a gap in the paradigms for y finals with p:m variation.

There is a gap that misses all stem vowels of the r:t variation except o.

The k:v variation is unique to a handful of words of the form CUkU, such as:

The trisyllabic and longer words with stem vowels o, u, y, ö have no stem variation either, but their selection of suffix allomorphs for plural genitives and partitives is different:

The words with stem vowel o, u, y, ö preceded by vowels still have no more stem variation than the other cases, but give yet another pattern of allomorphs for plural partitives and genitives:

Similar inflection exists in limited amounts in new loan words that are written as pronounced (thus taking no stem variation but still ending with long vowels with a syllable boundary):

This class includes a set of new proper nouns that get nativised a bit:

Optional gradation with illatives

In some trisyllabic words ending with quantitative consonant gradation, the illative can attach to either the strong or the weak stem, even in standard written Finnish. Otherwise the words in this class behave like other trisyllabic stems.

I final stems

The basic variation of nominative stem-final i is that it becomes e in plural stems. In the plural genitive, the stem vowel disappears, giving the suffix allomorph -ien instead of -jen.
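
The i:e variation and the -ien plural genitive can be sketched in a few lines of Python. This is only an illustration, not the lexc implementation: the function name `i_stem_forms` and the form labels are made up for the example, and it assumes a back-vowel word without consonant gradation, such as lasi.

```python
def i_stem_forms(nominative: str) -> dict[str, str]:
    """Build a few forms of a non-gradating, back-vowel i-final noun.

    Hypothetical helper for illustration; suffixes are hard-coded in
    their back-vowel shape (-ssa, -ja).
    """
    if not nominative.endswith("i"):
        raise ValueError("expected an i-final nominative")
    base = nominative[:-1]       # "las" from "lasi"
    plural_stem = base + "e"     # stem-final i becomes e in plural stems
    return {
        "sg_nom": nominative,
        "pl_ine": plural_stem + "issa",  # lase + i + ssa
        "pl_par": plural_stem + "ja",    # lase + ja
        "pl_gen": base + "ien",          # stem vowel dropped: las + ien
    }


if __name__ == "__main__":
    for label, form in i_stem_forms("lasi").items():
        print(label, form)
```

For lasi this gives laseissa, laseja and lasien.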

There is a gap in i-final words for p:v variation and front harmony.

There is a gap in the i-final paradigm with t:l variation and back vowels.

New loan words ending in a consonant may be inflected as i-stem words.

The trisyllabic i-final stems work like their o, u, y, ö counterparts; they combine the e:i and e:0 variation with the additional allomorphs for plural partitive and genitive:

I-final nominatives with e stems

Some of the words with i-final nominative forms have i:e variation in the singular stem, and i:0 in the whole plural stem:

New e-final stems

The new words with e stems work exactly like the bisyllabic o, u, y, and ö stems: no stem variation and the same allomorph set. This variant can be recognised from the singular illative:

A-final stems

The basic variation of a-final stems is a:o in plural forms:

Notably, the basic a:o paradigm does not support many ä:ö cases.

Some -A stems do not have the a:o variation, exhibiting a:0 variation for plural stems instead. This class notably contains all the -jA suffixed deverbal nouns.
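
The difference between the a:o and a:0 plural stems can be sketched as follows. The function name `a_stem_plural`, the pattern labels and the form labels are assumptions made for this example, and it only handles back-vowel words without consonant gradation (kala, koira, opettaja).

```python
def a_stem_plural(nominative: str, pattern: str) -> dict[str, str]:
    """Plural forms of a back-vowel a-final noun (illustrative sketch).

    pattern is "a:o" (a becomes o before the plural marker) or
    "a:0" (a is dropped before the plural marker).
    """
    if not nominative.endswith("a"):
        raise ValueError("expected an a-final nominative")
    base = nominative[:-1]
    if pattern == "a:o":
        stem = base + "o"
        return {
            "pl_par": stem + "ja",    # kalo + ja
            "pl_gen": stem + "jen",   # kalo + jen
            "pl_ine": stem + "issa",  # kalo + i + ssa
        }
    if pattern == "a:0":
        return {
            "pl_par": base + "ia",    # koir + i + a
            "pl_gen": base + "ien",   # koir + ien
            "pl_ine": base + "issa",  # koir + i + ssa
        }
    raise ValueError(f"unknown pattern: {pattern}")


if __name__ == "__main__":
    print(a_stem_plural("kala", "a:o"))
    print(a_stem_plural("opettaja", "a:0"))
```

Which pattern a given word takes is lexical, so the sketch takes it as an argument instead of guessing it from the word form.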

There is a gap for ä-final words with p:m variation.

There is a gap for k:j gradation in a-final stems.

Certain trisyllabic stems allow both the a:o and the a:0 variation in plural forms.

Some a stems with a:o variation have a slightly different set of allomorphs.

There is a possible class for further variation of a:o in the old dictionary that is worth re-evaluating.

There’s one more allomorph pattern.

The trisyllabic a-final words with quantitative consonant gradation allow the same illative variation as the o, u, y, and ö finals described earlier.

The a-final words ending in long vowels with a syllable boundary have a:0 variation and more allomorphs for plural genitive or illative.

Lexicalised comparatives

Lexicalised comparatives have the same special inflection pattern as comparatives have: stem final i varies with a, and mp gradates into mm. There are not many comparatives that lexicalise into nouns.
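
As a rough illustration of the i:a and mp:mm alternations, the sketch below derives a few singular forms of a back-vowel comparative such as vanhempi. The function name `comparative_forms` and the form labels are hypothetical.

```python
def comparative_forms(nominative: str) -> dict[str, str]:
    """A few singular forms of a back-vowel -mpi word (illustrative sketch)."""
    if not nominative.endswith("mpi"):
        raise ValueError("expected a nominative ending in -mpi")
    base = nominative[:-3]
    strong = base + "mpa"  # stem-final i varies with a, strong grade mp
    weak = base + "mma"    # weak grade: mp gradates into mm
    return {
        "sg_gen": weak + "n",    # vanhemma + n
        "sg_ine": weak + "ssa",  # vanhemma + ssa
        "sg_ess": strong + "na", # vanhempa + na
    }


if __name__ == "__main__":
    print(comparative_forms("vanhempi"))
```

The essive is built on the strong stem and the genitive and inessive on the weak one, which matches the general gradation behaviour described in the case inflection sections of this document.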

Long vowel stems

The long vowel stems have shortening variation in plural inflection, and special singular illatives and partitives.

Bisyllabic and longer stems with long vowels have the -seen illative suffix. This class has some of the old -UU derivations.

Monosyllabic long vowel stems have illative suffixes of form -hVn.

Opening diphthong stems

In old words ending in an opening diphthong, the diphthong simplifies in plural forms by dropping the initial vowel. For new words of this class, no stem mutations happen and they belong to the above-mentioned classes (e.g. NOUN_ZOMBIE).

THIS WAS MISSING 2015-08-23, REDIRECTING Jaska

Newer loan words

The loan words that end in a long vowel and have been adapted to Finnish orthography take a combination of the long vowel stems’ allomorphs for e.g. illatives. Sometimes the rules for these classes of words are vague.

There’s a gap in -ii final loan stems.

Some loan words don’t end in a long vowel but inflect as if they did. The official dictionary says these words should avoid the plural genitive -itten but not -iden; in reality they may be in absolutely free variation. In general the rules for loan words are vague and do not always seem to work.

The rules are even more wonky when the vowel harmony does not follow the orthography, or the orthography leaves things open to interpretation. The only way to even begin to understand the norm is to look up examples on the RILF site (to me, some of the forms in that normative guide are just bizarre).

The loan words that end in a consonant when written but in a vowel when pronounced are inflected with an apostrophe ’. With half-vowels the rule is a bit shaky, but officially the apostrophe is the only way.

I-final stems (old e stems)

Some i-final stems have i:e variation in singular forms, as they originate from -e forms; only the nominative has -i. They also have some consonant stem forms that are archaic for other classes of words. The difference between these classes is in the selection of singular partitives and plural genitives (but the boundaries of the norm are not clear-cut, and most variants are found in the wild):

It is noteworthy that in the official dictionary classification, the classes numbered 24 and 26 are identical. The distinction should probably not be retained in future versions.

The -mi stems will rarely undergo m:n variation in consonant stem forms.

The -si words that originate from old -te stems have retained the consonant gradation patterns in their stems. The si is only in the nominative stem, and this class mainly concerns stems that are old enough to have undergone the ti>si transformation.

A few -psi, -ksi, -tsi stems have consonant simplification in consonant stems. The other variation with these stems is the selection of plural genitive allomorphs.

The -ksi stem in haaksi includes k:h variation.

Consonant final nouns

The consonant stems use inverted gradation if applicable, that is, the nominatives end in consonants and their gradating consonants are in the weak grade. The singular forms of consonant-final words have an intervening e before suffixes. The basic consonant-final words have no stem modifications.

Some of the n-final stems have n:m variation.

-tOn suffixes

The caritive suffix -ton inflects with A before the singular suffixes.

Lexicalised superlatives

The lexicalised superlatives have a special inflection pattern.

Vasen inflects almost like a superlative:

-nen suffixed forms

A number of derivations end in -nen, which has a special alternation pattern.

-s final words

The s final words have some variation patterns that are determined lexically.

The basic variation is s:ks, with e before the singular suffixes.
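
A minimal sketch of the s:ks variation, using vastaus as the example word. The function name `s_ks_forms` and the form labels are made up for the illustration, and the suffixes are hard-coded in their back-vowel shape.

```python
def s_ks_forms(nominative: str) -> dict[str, str]:
    """A few singular forms of a back-vowel s:ks noun (illustrative sketch)."""
    if not nominative.endswith("s"):
        raise ValueError("expected an s-final nominative")
    # s varies with ks, and e intervenes before the singular suffixes
    vowel_stem = nominative[:-1] + "kse"
    return {
        "sg_gen": vowel_stem + "n",    # vastaukse + n
        "sg_ine": vowel_stem + "ssa",  # vastaukse + ssa
        "sg_par": nominative + "ta",   # partitive on the consonant stem
    }


if __name__ == "__main__":
    print(s_ks_forms("vastaus"))
```

The partitive is included to show that the consonant stem survives in that form (vastausta), while the other singular cases use the kse stem.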

Some of the s final stems have additional s:t:d variation in singular stems. Most notably, the UUs derivations are in this class.

Some s-final words have a special lengthening inflection.

The word mies has a special s:h variation pattern.

t-final words

The t-final words have t:0 variation in the stem, and the singular suffixes are as usual joined with e. It is common to see non-standard forms of these words.

A few t-final words have lengthening in singular stems.

Nominalised nut participles have special inflection as well.

Old e^ stems

The e final words that have lost final consonant inflect like consonant final words, including the inverse consonant gradation. This class includes all deverbal -e suffixed nouns.

Dual nominative paradigms

A handful of words can use two completely distinct inflection patterns where a bit of overlapping inflection has been cut out. These words have two nominatives, and thus often two dictionary entries: one which is a regular entry from the e^ class of words (like NOUN_ASTE), and one which is consonant final and may have inverse gradation.

Exceptions to dictionary inflection

There are a few cases where dictionaries traditionally have never indicated the correct inflection by classification. In a computational implementation we need to assign some classes or exceptional paths to them, and they are described here.

Two words have exceptions in their vowel harmony patterns:

It is not noted anywhere that the common inflection pattern for veli is exceptional:

A few of the ika-final nouns, but not all, have the shifting half vowel written as j in normative orthography.

Noun forms of numerals with special inflection

The numerals are not really nouns in this morphology (for details see [numeral-stems]), but some of their compounds are nouns, and the following classes are for those that have special stem or suffix patterns not available with other nouns. The numerals 1 and 2 are in a paradigm that is currently shared with only one other noun, haaksi, so nominals with 2 go to that class, but 1 gets a new class for being front-vowelled. 3 has its own paradigm, 4 is like koira, 5 like hiisi and 6 like kausi. The numerals 7, 8 and 9 have their own paradigm, which, other than the nominative having an extra n at the end, is the same as that of the trisyllabic a, ä stems; similarly, 10 is quite like sisar with the extra en in the nominative.

Plurales tantum

For some words, the singular forms are rare, odd, or even deemed ungrammatical; these words have separate classes. In Finnish, words that are commonly in this class are events like häät (wedding) and juhlat (party), and things that are semantically coupled, like clothes with two somethings (as in English): farkut (jeans), housut (pants). It is noteworthy that dictionaries sometimes classify common words as plurale tantum for semantic reasons: joukot (troops) versus joukko (group). We don’t need that at this point. The compounds of plurale tantum words are made from singular forms: farkkukangas (jean fabric), hääjuhla (wedding party).

Adjective-initial Compounds with Agreeing Inflection

The words in dictionary paradigm class ⁵¹ belong to an old closed class of adjective-initial compounds, which follow the same agreeing compound pattern as numbers. The number of these words is relatively small, so they have been spelled out here in full form rather than using more complex methods of agreeing compounding. Further reading: VISK § 420. FIXME: some are still missing.

Nouns and other nominals inflect in number, case, possessive and clitic, in that order. The number of combinations that this regular inflection can form is approximately 2×15×5×26=3900, so we do not show all variants in test cases and examples, but just the central ones that are interesting and likely to break.
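
The size of the full paradigm follows directly from multiplying the factors given above. A small sketch, where the dictionary keys are just descriptive labels chosen for this example:

```python
from math import prod

# Factor sizes as stated in the text above; the labels are illustrative.
FACTORS = {"number": 2, "case": 15, "possessive": 5, "clitic": 26}


def count_combinations(factors: dict[str, int]) -> int:
    """Multiply the number of choices in each inflectional slot."""
    return prod(factors.values())


if __name__ == "__main__":
    print(count_combinations(FACTORS))
```

With the factors 2, 15, 5 and 26 the product is 3900 forms per noun, which is why the test cases only sample the paradigm.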

Nominatives

Singular nominative is the dictionary reference form for most of the words.

The plural nominative attaches to singular stem. For plural words it is also the form that is used for dictionary lookups:

Singular case inflection

The nouns can inflect in 15 regular cases. Most of the cases have one or two case endings, with the only varying part being the harmony vowel. For nouns with direct consonant gradation, most of the singular case suffixes attach to the weak singular stem:
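
The harmony vowel selection mentioned above can be approximated with a short Python sketch. It is a simplification and not how the analyser itself is built: the names `harmony` and `attach` are hypothetical, the capital A in the suffix template stands for a/ä in the same way as in suffix names like -jA in this document, and neither compounds nor the lexical harmony exceptions noted earlier get special treatment.

```python
BACK_VOWELS = set("aou")
FRONT_VOWELS = set("äöy")


def harmony(word: str) -> str:
    """Return "back" or "front" from the last harmonic vowel of the word.

    Words with only neutral vowels (e, i) are treated as front.
    """
    for ch in reversed(word.lower()):
        if ch in BACK_VOWELS:
            return "back"
        if ch in FRONT_VOWELS:
            return "front"
    return "front"


def attach(stem: str, suffix_template: str) -> str:
    """Attach a suffix template, realising A as a or ä by harmony."""
    vowel = "a" if harmony(stem) == "back" else "ä"
    return stem + suffix_template.replace("A", vowel)


if __name__ == "__main__":
    for stem in ("talo", "kylä", "tie"):
        print(attach(stem, "ssA"))  # inessive
```

The example words are chosen so that no consonant gradation is involved; for gradating nouns the stem passed in would have to be the weak or strong stem as appropriate.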

For words with direct gradation, the only forms that may attach to the strong singular stem are the essive and the possessives of nominative or genitive:

Plural inflection

The strong plural stem of words with direct gradation contains only essive and comitative:

Cases with allomorphic variation

The nominal cases which can take several different suffixes are singular and plural partitives, singular and plural illatives and plural genitives. After stem variation the selection of these allomorphs is the main factor of morphological classification of nouns.

The reconstructed historical suffix for Finnish partitives is ða, ðä; according to that theory, the current variants are different realisations of the extinct ð.

For basic vowel stems the partitive suffix is a, ä.

In stems other than -a, -ä stems, the 3rd possessive suffix may appear in -an, -än form after the partitive suffix.

The consonant stems and long vowel stems regularly take -ta, -tä suffix for singular partitives. It is up to interpretation whether the partitive suffix of the -e^ stem is considered to be -tta, -ttä, or just -ta, -tä. In principle the consonant that disappeared from -e^ stems could be from that.
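
The choice between the two main singular partitive allomorphs described above can be sketched as a function of the stem shape. This is a simplification under stated assumptions: the function name `partitive_suffix` is made up, A stands for the harmony vowel a/ä, a long vowel is detected simply as a doubled final letter, and the -e^ stems, diphthong-final stems and other special classes are not covered.

```python
VOWELS = set("aeiouyäö")


def partitive_suffix(stem: str) -> str:
    """Return "tA" for consonant and long vowel stems, otherwise "A"."""
    last = stem[-1].lower()
    if last not in VOWELS:
        return "tA"  # consonant stem, e.g. sisar
    if len(stem) >= 2 and stem[-2].lower() == last:
        return "tA"  # long vowel stem, e.g. maa, vapaa
    return "A"       # basic short vowel stem, e.g. talo


if __name__ == "__main__":
    for stem in ("talo", "maa", "sisar"):
        print(stem, partitive_suffix(stem))
```

The corresponding surface forms are taloa, maata and sisarta.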

The singular illatives have more variants.

The variant with intervening h attaches to long vowel stems:

The stems with short vowel do not have the intervening h.

Bisyllabic long vowel stems have illative suffix of -seen.
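
The three singular illative shapes described above can be put side by side in a small sketch. The function name `illative_sg` and the pattern labels are assumptions for this example; the pattern is passed in explicitly, since choosing it requires knowing the inflection class of the word.

```python
def illative_sg(stem: str, pattern: str) -> str:
    """Form a singular illative from a (strong) vowel stem.

    pattern:
      "Vn"   short vowel stems: the final vowel is lengthened, then n
      "hVn"  long vowel stems with intervening h and a copy of the vowel
      "seen" bisyllabic long vowel stems
    """
    last = stem[-1]
    if pattern == "Vn":
        return stem + last + "n"
    if pattern == "hVn":
        return stem + "h" + last + "n"
    if pattern == "seen":
        return stem + "seen"
    raise ValueError(f"unknown pattern: {pattern}")


if __name__ == "__main__":
    print(illative_sg("talo", "Vn"))
    print(illative_sg("maa", "hVn"))
    print(illative_sg("vapaa", "seen"))
```

Because the V in -Vn and -hVn copies the stem vowel, no separate vowel harmony step is needed for these two patterns.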

Plural genitives have the most variants. In particular, many words have more or less free variation between a handful of choices.

The -in suffix for plural genitive that goes with singular stem is always markedly archaic. Most commonly it appears in compound words.

Plural partitive has a few variants:

And plural illative has a few variants:

Possessive suffixes

Possessives come optionally after the case suffixes. For consonant final cases the possessives assimilate or eat the final part of the case ending or stem.
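
How a possessive suffix "eats" a consonant-final case ending can be illustrated with a deliberately narrow sketch. The function name `add_possessive` is hypothetical, and the sketch assumes the input is an inflected form of a non-gradating vowel-stem noun whose final n or t, if present, belongs to the case or number ending. Consonant-final stems, gradating words and endings that change shape before a possessive are not handled.

```python
def add_possessive(form: str, possessive: str) -> str:
    """Attach a possessive suffix to an inflected form (illustrative sketch).

    A final n or t of the ending is dropped before the possessive,
    e.g. talon + ni gives taloni.
    """
    if form[-1] in ("n", "t"):
        form = form[:-1]
    return form + possessive


if __name__ == "__main__":
    print(add_possessive("talon", "ni"))     # genitive singular
    print(add_possessive("talot", "mme"))    # nominative plural
    print(add_possessive("talossa", "nsa"))  # inessive, nothing dropped
```

One consequence visible here is that the nominative singular, genitive singular and nominative plural all collapse into the same possessive form (taloni).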

The possessive suffix of form -an, -än, attaches to some long vowel stems:

Noun clitics

Clitics can attach to any word-form, including one that already has a clitic. Clitics do not modify the form they attach to and are simply concatenated to the end.
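
Since clitics are plain concatenation, stacking them is straightforward to sketch. The function name `attach_clitics` is made up for the example, and the clitics are given in their already harmonised surface shape.

```python
def attach_clitics(form: str, clitics: list[str]) -> str:
    """Concatenate clitics, in order, to an unchanged word form."""
    result = form
    for clitic in clitics:
        result += clitic  # the host form itself is never modified
    return result


if __name__ == "__main__":
    print(attach_clitics("talossa", ["kin"]))
    print(attach_clitics("talo", ["ko", "han"]))
```

The second call shows a clitic attaching to a form that already carries one (talo + ko + han).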

Noun compounding

Nouns form compounds productively. The non-final parts of regular compounds are singular nominatives, or singular or plural genitives of nouns. The final parts are nominals and inflect regularly.


This (part of) documentation was generated from src/fst/morphology/affixes/nouns.lexc