North Sami Text-to-Speech

Finite state and Constraint Grammar based Text-to-Speech processing

View the project on GitHub giellalt/speech-sme

This document looks into the numeral system, in particular case marking, and the problems it may pose for TTS.

Attributive (short forms)

Acceptable compound parts:

Digits           Text
1100-1900        -nuppelotčuođi/nuppelohkáičuođi
11 000-19 000    -nuppelotduhát/nuppelohkáiduhát
20 000-90 000    -lotduhát/logiduhát

Currently we have -lohkái/-logi. In compounds we should probably have -lot instead. Compare 11-jahkásaš: oktanuppelotjahkásaš, not *oktanuppelohkáijahkásaš.
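The preference for -lot in compounds could be sketched roughly as follows. This is an illustration only, not the project's finite-state implementation: the function name `teen_or_ten` and the small table of units are assumptions, and only the 11-jahkásaš example comes from the text above.

```python
# Hypothetical sketch: pick the short -lot form when a teen (11-19) or a
# whole ten (20-90) is the first part of a compound, the full form otherwise.
# The unit table holds nominative citation forms and is an assumption here.
UNITS = {
    1: "okta", 2: "guokte", 3: "golbma", 4: "njeallje", 5: "vihtta",
    6: "guhtta", 7: "čieža", 8: "gávcci", 9: "ovcci",
}


def teen_or_ten(n: int, in_compound: bool) -> str:
    """Spell 11-19 or a whole ten 20-90 in full or compound form."""
    if 11 <= n <= 19:
        ending = "lot" if in_compound else "lohkái"
        return UNITS[n - 10] + "nuppe" + ending
    if 20 <= n <= 90 and n % 10 == 0:
        ending = "lot" if in_compound else "logi"
        return UNITS[n // 10] + ending
    raise ValueError(f"not a teen or a whole ten: {n}")


# The compound from the text: 11-jahkásaš
print(teen_or_ten(11, in_compound=True) + "jahkásaš")
```

Keeping the choice in a single flag makes it easy to compare the two candidate outputs for the same digit string.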

Years

For years in the period 1100-1999, the Norwegian/Swedish type of reading differs from the Finnish type:

This does not extend to 2000:

We already have the Finnish type. The question is whether we need the Norwegian type. We probably do.
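One way to support both readings is to split the year into parts first and spell them out afterwards. The sketch below rests on an assumption, since the examples are not reproduced on this page: that the Norwegian/Swedish type counts hundreds (as in the 1100-1900 row, -nuppelotčuođi), while the Finnish type counts thousands and then hundreds. The function name `year_parts` and the style labels are hypothetical.

```python
# Hypothetical sketch: split a year into (count, unit) parts. Spelling the
# counts out is left to the numeral expansion proper.
def year_parts(year: int, style: str) -> list[tuple[int, str]]:
    """style is 'hundreds' (assumed Norwegian/Swedish type) or
    'thousands' (assumed Finnish type)."""
    if style not in ("hundreds", "thousands"):
        raise ValueError(f"unknown style: {style}")
    parts: list[tuple[int, str]] = []
    if style == "hundreds" and 1100 <= year <= 1999:
        # e.g. 1985 is read as 19 hundreds plus 85
        parts.append((year // 100, "čuođi"))
        rest = year % 100
    else:
        # outside 1100-1999 both styles fall back to thousands
        thousands, rest = divmod(year, 1000)
        if thousands:
            parts.append((thousands, "duhát"))
        hundreds, rest = divmod(rest, 100)
        if hundreds:
            parts.append((hundreds, "čuođi"))
    if rest:
        parts.append((rest, ""))
    return parts


print(year_parts(1985, "hundreds"))
print(year_parts(1985, "thousands"))
```

Because the range check sits inside the function, a year such as 2000 gives the same parts in both styles.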

Case marking

Some number types carry case marking. Usually a colon after the digits indicates that the number should be read in the locative, illative or comitative singular. There is, however, no marking that separates nominative from accusative and genitive.
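The colon convention can be turned into a list of candidate cases per token. In the sketch below the suffix table (`s`, `i`, `in`) and the tag names are assumptions made for illustration and would need checking against the project's own tokenizer and tag set. A token without a colon stays ambiguous between nominative, accusative and genitive.

```python
import re

# Assumed mapping from the ending after the colon to a case label.
CASE_BY_SUFFIX = {"s": "Loc", "i": "Ill", "in": "Com"}

# Digits, optionally followed by a colon and a case ending.
TOKEN = re.compile(r"^(\d+)(?::(\w+))?$")


def case_candidates(token: str) -> tuple[int, list[str]]:
    """Return the number and the cases it may be read in."""
    match = TOKEN.match(token)
    if match is None:
        raise ValueError(f"not a digit token: {token!r}")
    number, suffix = int(match.group(1)), match.group(2)
    if suffix is None:
        # No colon: nothing in the text tells these three apart.
        return number, ["Nom", "Acc", "Gen"]
    if suffix not in CASE_BY_SUFFIX:
        raise ValueError(f"unknown ending after colon: {suffix!r}")
    return number, [CASE_BY_SUFFIX[suffix]]


print(case_candidates("2015:s"))
print(case_candidates("7"))
```

The ambiguous case would then be passed on to a disambiguation step, for example rules that look at the surrounding words.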

For most numerals, accusative and nominative singular are the same anyway. The exceptions are 1, 0, and mill, for which accusative and genitive are similar. For the other numerals, genitive is used mostly with postpositions, or within an NP that is in the genitive, illative or locative singular.

Numerals for which nom/acc/gen are the same anyway: čieža (7), gávcci (8), ovcci (9), logi (10), čuođi (100).
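For these numerals the nom/acc/gen ambiguity is harmless, so an expansion step can skip disambiguation for them. A minimal sketch, using only the forms listed above; the names `INVARIANT` and `needs_case_choice` are hypothetical.

```python
# Numerals the text lists as identical in nominative, accusative and genitive.
INVARIANT = {7: "čieža", 8: "gávcci", 9: "ovcci", 10: "logi", 100: "čuođi"}


def needs_case_choice(n: int) -> bool:
    """True if choosing between nom/acc/gen can change the spoken form."""
    return n not in INVARIANT


for n in (7, 2, 100):
    print(n, needs_case_choice(n))
```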

Compound numerals

Whole tens and hundreds distinguish nominative/accusative on the one hand from genitive on the other. The final part, logi or čuođi, stays the same, while the first or middle numeral changes:
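The pattern can be sketched as a lookup of the first part by case, with the final part left untouched. The forms in the table below are supplied for illustration and are not taken from this page, so they should be verified against the project's lexicon. The function name `whole_ten_or_hundred` is hypothetical.

```python
# (nominative/accusative, genitive) forms of the first part.
# Illustrative data, an assumption to be checked against the lexicon.
FIRST_PART = {
    2: ("guokte", "guovtti"),
    3: ("golbma", "golmma"),
    5: ("vihtta", "viđa"),
    6: ("guhtta", "guđa"),
    7: ("čieža", "čieža"),
    8: ("gávcci", "gávcci"),
    9: ("ovcci", "ovcci"),
}


def whole_ten_or_hundred(count: int, unit: str, case: str) -> str:
    """Build e.g. 20 or 200: only the first part reacts to case."""
    if unit not in ("logi", "čuođi"):
        raise ValueError(f"unexpected final part: {unit}")
    if case not in ("Nom", "Acc", "Gen"):
        raise ValueError(f"unexpected case: {case}")
    nom_acc, gen = FIRST_PART[count]
    return (gen if case == "Gen" else nom_acc) + unit


print(whole_ten_or_hundred(2, "logi", "Nom"))
print(whole_ten_or_hundred(2, "logi", "Gen"))
```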

Expansions of numerals in sentences in texts


Examples of numerals in different cases

Nominative

Accusative

Genitive

Locative

Ordinals

From-to-expressions (need to be differentiated)

Chapters

Date