Finite state and Constraint Grammar based Text-to-Speech processing
View the project on GitHub giellalt/speech-sme
The pipeline is not yet fully functional. This document is both a guide to help us get where we want to be and documentation of the present status and planned functionality.
Here is a test command illustrating the whole processing pipeline, from plain text in to IPA out. Not all components are in place yet; the missing ones are substituted with alternatives to get something running:
$ echo "Iđđes dii. 9 mun doapmalan čoaggit alitnásttiid álbmotmeahcis." | \
apertium-destxt | \
hfst-proc -C -w -e -q -r sme/bin/sme.hfstol | \
vislcg3 -g sme/src/sme-dis.rle | \
grep -v '^"' | cut -d '"' -f3 | cut -d ' ' -f2 | \
hfst-optimized-lookup -q sme/bin/isme.hfstol | \
cut -f2 | grep -v '^$'
The output produced with the above pipeline is:
Iđđes+Adv+?
dii.
9
+?
doapmalan
čoaggit
alitnásttiid
álbmotmeahcin
álbmotmeahcis
..
+?
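Three lines in the output above end in `+?`. Assuming this is the marker that hfst-optimized-lookup appends when the transducer returns no result for its input (an assumption, not something this document states), a small filter can count such lines while the pipeline is being completed. The function name `count_failed` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: count the lines on stdin that end in "+?".
# Assumption: "+?" marks a token for which the lookup found nothing.
count_failed() {
    # grep -c exits with status 1 when the count is 0; "|| true" hides that.
    grep -c '[+][?]$' || true
}

# The output listed above, one token per line.
pipeline_output='Iđđes+Adv+?
dii.
9
+?
doapmalan
čoaggit
alitnásttiid
álbmotmeahcin
álbmotmeahcis
..
+?'

printf '%s\n' "$pipeline_output" | count_failed   # prints 3
```

A count of zero would be one necessary condition for the 1:1 target, though not a sufficient one.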
The target is to produce IPA, one output token for each input token.
The text output option illustrated above can be used to ensure 1:1 roundtrip correctness for the disambiguation and generation: we should get out of the pipeline the same text that we put into it.
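A roundtrip check of this kind can be sketched as a line-by-line comparison of input tokens and generated tokens. This is only a sketch under stated assumptions: the function name `roundtrip_diff` and the one-token-per-line file layout are hypothetical and not part of the project.

```shell
#!/bin/sh
# Hypothetical helper: compare input tokens with generated tokens.
#   $1: file with one input token per line
#   $2: file with one generated token per line
# Prints "line<TAB>input<TAB>generated" for every mismatch.
# Returns 0 if all lines match, 1 otherwise.
roundtrip_diff() {
    paste "$1" "$2" |
    awk -F '\t' '
        # Appending "" forces a string comparison, so "9" and "09" differ.
        ($1 "") != ($2 "") { print NR "\t" $1 "\t" $2; bad = 1 }
        END { exit (bad ? 1 : 0) }'
}

# Example with a few tokens from the test sentence.
tmpdir=$(mktemp -d)
printf '%s\n' doapmalan čoaggit álbmotmeahcis > "$tmpdir/in.txt"
printf '%s\n' doapmalan čoaggit álbmotmeahcin > "$tmpdir/out.txt"
roundtrip_diff "$tmpdir/in.txt" "$tmpdir/out.txt" || echo "roundtrip mismatch"
rm -r "$tmpdir"
```

If the two files differ in length, `paste` pads the shorter one with empty fields, so missing or extra tokens are reported as mismatches too.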
Each command is commented below:
echo "Iđđes dii. 9 mun doapmalan čoaggit alitnásttiid álbmotmeahcis."
apertium-destxt: hfst-proc requires that certain characters are escaped, and this tool does the job.
hfst-proc -C -w -e -q -r sme/bin/sme.hfstol: analyses the text, treating compounds (-e) and producing VISLCG3-formatted output (-C), with the raw analysis string added as a subreading (-r); the lemma is returned in dictionary case (-w), which is needed if generation is going to work.
vislcg3 -g sme/src/sme-dis.rle
grep + cut + cut
hfst-optimized-lookup -q sme/bin/isme.hfstol
cut + grep: ... (hfst-lookup)
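The same commands can be kept in a small script with one comment per stage, which makes it easier to swap a placeholder component for the real one later. The sketch below uses only the commands and paths from the test command above. The function names `missing_tools` and `text2tokens` are hypothetical, and the comments on the grep/cut and lookup stages describe only what those commands do mechanically, not the intent stated by the project.

```shell
#!/bin/sh
# Hypothetical helper: print each named command that is not found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Hypothetical wrapper around the test command; reads plain text on stdin.
text2tokens() {
    # Escape the characters that hfst-proc requires to be escaped.
    apertium-destxt |
    # Analyse; -C gives VISLCG3-formatted output, -r adds the raw analysis string.
    hfst-proc -C -w -e -q -r sme/bin/sme.hfstol |
    # Apply the grammar given with -g.
    vislcg3 -g sme/src/sme-dis.rle |
    # Drop lines starting with a double quote, then keep the second
    # space-separated field of the text after the second double quote.
    grep -v '^"' | cut -d '"' -f3 | cut -d ' ' -f2 |
    # Look up each remaining string in isme.hfstol.
    hfst-optimized-lookup -q sme/bin/isme.hfstol |
    # Keep the second tab-separated column and drop empty lines.
    cut -f2 | grep -v '^$'
}

missing=$(missing_tools apertium-destxt hfst-proc vislcg3 hfst-optimized-lookup)
if [ -n "$missing" ]; then
    echo "missing tools:" $missing >&2
elif [ ! -r sme/bin/sme.hfstol ] || [ ! -r sme/src/sme-dis.rle ] ||
     [ ! -r sme/bin/isme.hfstol ]; then
    echo "language files not found under sme/" >&2
else
    echo "Iđđes dii. 9 mun doapmalan čoaggit alitnásttiid álbmotmeahcis." |
        text2tokens
fi
```

Run from the directory that contains `sme/`, the script either reports what is missing or prints the same token list as the test command.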