GiellaLT

GiellaLT provides an infrastructure for rule-based language technology aimed at minority and indigenous languages, and streamlines building anything from keyboards to speech technology. Read more about Why. See also How to get started and our Privacy document.

Preparing text for TTS

Things to consider

Collect enough text to be read aloud as training material. A model can be built from 3-12 hours of speech; a good target is 10 hours, which should equal approximately 45,000-50,000 words. Collecting and especially preparing the text may take several months.
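To track progress toward the target, the word-to-hours ratio above can be turned into a rough estimate. The following is a minimal sketch, assuming the only available figure is the guideline of roughly 45,000-50,000 words per 10 hours; the function names are hypothetical, and actual speaking rates vary by speaker, language, and text type.

```python
from typing import Tuple

# Words-per-hour range implied by the guideline above:
# roughly 45,000-50,000 words for 10 hours of speech (an assumption, not a measurement).
WORDS_PER_HOUR_LOW = 45_000 / 10
WORDS_PER_HOUR_HIGH = 50_000 / 10


def count_words(text: str) -> int:
    """Count whitespace-separated tokens as a rough word count."""
    return len(text.split())


def estimate_hours(word_count: int) -> Tuple[float, float]:
    """Return (min_hours, max_hours) of speech expected for a word count."""
    return (word_count / WORDS_PER_HOUR_HIGH, word_count / WORDS_PER_HOUR_LOW)


if __name__ == "__main__":
    low, high = estimate_hours(30_000)
    print(f"30,000 words is roughly {low:.1f}-{high:.1f} hours of speech")
```

An estimate like this is only useful for planning; the recorded duration is what matters in the end.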

Keep in mind:

  1. The text should cover digraph sequences, consonant gradation strings, etc.
  2. The text should be balanced topic-wise.
  3. It should contain numbers of different types.
  4. It should also contain loan words.
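Some of the points above can be checked mechanically while the text is being collected. The following is a minimal sketch, assuming a plain-text corpus and a hand-written list of letter sequences to look for; the function names and the example sequences are hypothetical, and the real list must come from the phonology and orthography of the language in question. Topic balance and loan words still need manual review.

```python
import re
from collections import Counter
from typing import Iterable, List


def sequence_coverage(text: str, sequences: Iterable[str]) -> Counter:
    """Count case-insensitive occurrences of each target letter sequence."""
    lowered = text.lower()
    return Counter({seq: lowered.count(seq.lower()) for seq in sequences})


def missing_sequences(text: str, sequences: Iterable[str]) -> List[str]:
    """Return the target sequences that never occur in the text."""
    targets = list(sequences)
    counts = sequence_coverage(text, targets)
    return [seq for seq in targets if counts[seq] == 0]


def count_number_tokens(text: str) -> int:
    """Count digit-based tokens such as years, decimals, and clock times."""
    return len(re.findall(r"\d+(?:[.,:]\d+)*", text))


if __name__ == "__main__":
    sample = "Nuorra ja vuoddji 12.5 km"
    # Hypothetical target list, for illustration only.
    targets = ["dj", "lj", "dd"]
    print("Missing:", missing_sequences(sample, targets))
    print("Number tokens:", count_number_tokens(sample))
```

Sequences reported as missing indicate where additional sentences should be collected or written.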

More details are forthcoming.