GiellaLT provides an infrastructure for rule-based language technology aimed at minority and indigenous languages, and streamlines building anything from keyboards to speech technology.
Many FST descriptions of languages already exist; one major source is Apertium. Most of these projects, however, do not build spelling checkers or many other tools from their morphological descriptions. Since we have the infrastructure and the tools in place to make all languages work, it can be useful to take those repositories, compile their FSTs within our infrastructure, and from there build spellers, tokenisers, and many other tools.
We use git subtree for adding external repositories. To add a new language, do as follows:

1. Add the external repository with git subtree as follows:

       git subtree add --prefix src/fst/morphology/ext-Apertium-nno \
           https://github.com/apertium/apertium-nno.git master --squash

2. Edit src/fst/morphology/Makefile.am as needed to make everything build.

3. Add an update target to Makefile.am (i.e. the one in the root directory of the project); see other languages with an external data source for examples:

       update:
       	git subtree pull --prefix src/fst/morphology/ext-Apertium-nno https://github.com/apertium/apertium-nno.git master --squash
When you later want to update the code from the external repository, you can just run the command make update in the root directory of the project.
NB! Replace ext-Apertium-nno and https://github.com/apertium/apertium-nno.git in the commands above with what is correct for your language.
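One way to avoid typos when substituting these values is to generate both commands from the repository URL. The following is a minimal sketch, not part of the GiellaLT infrastructure: the function names ext_prefix and subtree_cmd are hypothetical, and it assumes an Apertium repository named apertium-<lang> and the ext-Apertium-<lang> directory naming used in the example above. It only prints the commands and does not run git.

```shell
#!/bin/sh
# Hypothetical helpers (not part of GiellaLT): print the git subtree commands
# for an external Apertium repository instead of typing them by hand.

# ext_prefix URL
# Derive the subtree directory from the repository URL, assuming the
# repository is named apertium-<lang>.
ext_prefix() {
    repo_name=$(basename "$1" .git)   # e.g. apertium-nno
    lang=${repo_name#apertium-}       # e.g. nno
    printf 'src/fst/morphology/ext-Apertium-%s\n' "$lang"
}

# subtree_cmd add|pull URL [BRANCH]
# Print the git subtree command; BRANCH defaults to master as in the example.
subtree_cmd() {
    printf 'git subtree %s --prefix %s %s %s --squash\n' \
        "$1" "$(ext_prefix "$2")" "$2" "${3:-master}"
}

# Dry run: show the command for the initial import and the one for updates.
subtree_cmd add https://github.com/apertium/apertium-nno.git
subtree_cmd pull https://github.com/apertium/apertium-nno.git
```

The printed add command is run once in the root directory of the language repository; the printed pull command is what goes into the recipe of the update target.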
NB2! The name of the directory within src/fst/morphology/ must start with ext-, to make it easy to see that the source code is from an external repo.
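The naming rule is easy to check mechanically before running git subtree add. The sketch below is only an illustration: is_ext_dir is a hypothetical helper, not something the GiellaLT build provides, and it tests nothing beyond the ext- prefix.

```shell
#!/bin/sh
# Hypothetical check (not part of GiellaLT): does a directory name follow
# the ext- convention for externally sourced code?
is_ext_dir() {
    case "$1" in
        ext-?*) return 0 ;;   # ext- followed by at least one character
        *)      return 1 ;;
    esac
}

# Example: the first name follows the convention, the second does not.
for dir in ext-Apertium-nno Apertium-nno; do
    if is_ext_dir "$dir"; then
        echo "$dir: ok"
    else
        echo "$dir: name must start with ext-"
    fi
done
```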