Lule Sami Text-to-Speech

Finite state and Constraint Grammar based Text-to-Speech processing

View the project on GitHub giellalt/speech-smj

Speech data recording, storing, pre-processing

Recording:

5-10 hours of speech
Texts that are not copyrighted, or that we are allowed to use and share under an open licence
Record preferably studio-quality audio (at least 44.1 kHz, 16-bit) in .wav format
Collect speaker metadata and informed consent forms

While recording, keep track of mistakes, hesitations, and any changes the speaker makes to the text while reading!
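
The format requirement above (44.1 kHz, 16-bit, .wav) can be checked automatically after each recording session. The following is a minimal sketch using only Python's standard `wave` module. The function names and the idea of checking a whole directory at once are assumptions made for illustration, not part of the project's tooling.

```python
import wave
from pathlib import Path

MIN_SAMPLE_RATE_HZ = 44_100   # minimum sample rate from the recording guideline
MIN_SAMPLE_WIDTH_BYTES = 2    # 16 bits = 2 bytes per sample


def check_wav(path):
    """Return a list of problems found in one WAV file (empty if none).

    Note: the standard `wave` module reads PCM WAV files only; other
    encodings (for example 32-bit float) raise `wave.Error`.
    """
    problems = []
    with wave.open(str(path), "rb") as wav:
        rate = wav.getframerate()
        width = wav.getsampwidth()
    if rate < MIN_SAMPLE_RATE_HZ:
        problems.append(f"sample rate {rate} Hz is below {MIN_SAMPLE_RATE_HZ} Hz")
    if width < MIN_SAMPLE_WIDTH_BYTES:
        problems.append(f"sample width {8 * width} bits is below 16 bits")
    return problems


def check_directory(directory):
    """Map each .wav file name in `directory` to its list of problems."""
    return {p.name: check_wav(p) for p in sorted(Path(directory).glob("*.wav"))}
```

Calling `check_directory` on a folder of recordings (the folder name is up to you) returns a dictionary in which any file with a non-empty list needs attention before it goes into the corpus.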

Storing:

Decide where to store a) the speech data and b) the metadata

Pre-processing:

Align speech data and text at the phoneme level (use WebMAUS)
Filename system: consistency!
A single large table describing the content and metadata of each file
Such a table allows mass renaming and other batch operations on the files in Python
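
As an illustration of how a file table can drive mass renaming, here is a minimal sketch using Python's standard library. The CSV column names (`old_name`, `speaker`, `session`, `utterance`) and the naming pattern `<speaker>_<session>_<utterance>.wav` are hypothetical, chosen only for this example. The project's actual table layout and naming scheme are not specified here.

```python
import csv
from pathlib import Path


def build_name(speaker, session, utterance):
    """Hypothetical naming scheme: <speaker>_<session>_<4-digit utterance>.wav"""
    return f"{speaker}_{session}_{int(utterance):04d}.wav"


def plan_renames(table_path):
    """Read the file table and return a list of (old_name, new_name) pairs.

    Assumed CSV columns: old_name, speaker, session, utterance.
    """
    with open(table_path, newline="", encoding="utf-8") as f:
        rows = list(csv.DictReader(f))
    plan = [
        (r["old_name"], build_name(r["speaker"], r["session"], r["utterance"]))
        for r in rows
    ]
    new_names = [new for _, new in plan]
    if len(set(new_names)) != len(new_names):
        raise ValueError("naming scheme produces duplicate file names")
    return plan


def apply_renames(directory, plan, dry_run=True):
    """Validate the whole plan first, then rename unless dry_run is True."""
    directory = Path(directory)
    pairs = [(directory / old, directory / new) for old, new in plan]
    for src, dst in pairs:
        if not src.exists():
            raise FileNotFoundError(src)
        # Conservative: refuse to overwrite any file that already exists.
        if dst.exists() and dst != src:
            raise FileExistsError(dst)
    if not dry_run:
        for src, dst in pairs:
            if src != dst:
                src.rename(dst)
    return plan
```

Running `apply_renames` with the default `dry_run=True` only validates the plan, so the table can be reviewed before any file is touched. Passing `dry_run=False` performs the renames.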