An OpenOffice.org hyphenator was not a fixed part of the original plan; it was indicated as an optional addition. Part of the reason was that, at the time, I could not find information on how to implement it. That information is now available (or rather, we have found it), and we will try to add Sámi hyphenation once the OOo speller is in place. Relevant links are:
Basically, the task of creating TeX hyphenation pattern files consists of the following steps:
OPatGen (see above) and associated scripts - see the user documentation above
We need to install OPatGen first, which in turn needs cweave.
Steps to install cweave:
sudo port install texlive-bin-extra
Steps to install opatgen (we have a local copy of the code):
cd $GTHOME/tools/patlib
make
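Because the build depends on cweave being installed first, it can help to check the prerequisites before running make. The following is only a minimal sketch: the helper names have_tool and missing_tools are hypothetical and not part of the GiellaLT infrastructure, and the script only reports what is missing rather than installing or building anything.

```shell
#!/bin/sh
# Sketch: check build prerequisites before running make in tools/patlib.
# have_tool and missing_tools are hypothetical helpers for illustration.

have_tool() {
    # Succeeds if the named program can be found on PATH.
    command -v "$1" >/dev/null 2>&1
}

missing_tools() {
    # Prints each listed tool that is not installed, one per line.
    for tool in "$@"; do
        have_tool "$tool" || printf '%s\n' "$tool"
    done
}

# cweave is installed via texlive-bin-extra; make drives the build itself.
missing=$(missing_tools cweave make)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Missing prerequisites:" $missing >&2
else
    echo "All prerequisites found; run make in \$GTHOME/tools/patlib"
fi
```

If the script reports cweave as missing, install it as described above before running make.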
NB! At the moment the C compilation is broken, because the code relies on a number of old, non-conformant coding practices. gcc 2.95 is required, which is no longer available on any modern system. We either need to upgrade the code or find an alternative solution. We have successfully modernised the code of tools/dic2traskelet.C (a helper tool), but the C code of the main application is more complex to update.
The OPatGen route is too time-consuming, so we need an alternative.
Probably the simplest alternative is to use the original PatGen tool, which differs from OPatGen in two important respects:
The steps needed to create the hyphenation patterns are almost the same, with the added need to escape non-ASCII characters. In more detail, the procedure looks like this: