GiellaLT provides rule-based language technology aimed at minority and indigenous languages.
Below are a couple of example tasks, along with the steps needed to carry them out.
If you want to change the build procedure (e.g. to add a feature to, or remove one from, a specific fst for all languages), work through this task.
Here is the procedure, with am-shared as an example.
The local directory am-shared is an exact copy of:
dictionary-include.am locally (for your test language) and make sure everything works.
./update-all-from-core.sh -t und
When a new fst type is called for, the procedure is roughly the same as above, with a few additions. As an example, we will create a new fst for dictionary analysers (i.e. to be used for analysing input to dictionary lookup).
But first we need to answer the following question: where do we add the code for
building the new fst? The idea so far has been to add default targets to
am-shared/topdir-include.am, and optional targets (turned on via
configure options) to separate include files such as
dictionary-include.am. Generally this practice will continue, but if the list of
default targets grows too long, even those might be split out into separate include files.
For our example, we will edit dictionary-include.am. As always, edit in a
local language dir first, to test that the new target works. When everything is done
and works fine, copy the modifications to the und template. Here are the
steps to go through:
am-shared/dictionary-include.am - the following steps will tell the system how to build the fst:
*.tmp.* to allow local overrides.
*.tmp.hfst -> *.hfst target in the local Makefile.am (if no such changes are needed, *.tmp.hfst will just be copied to
filters/ dir, and add dependencies to them all (such that the build will break properly if a filter is not available, and all required filters are rebuilt if needed).
src/Makefile.am - the following steps will tell the system when to build the fst target:
GT_ANALYSERS_HFST (for hfst transducers).
if WANT_DICTIONARIES and within that if block, write the following:
+= part will add the new fst to the list of fsts already assigned to the variable)
M4 work and will be covered in a separate tutorial
./configure with the proper option
und template, add a note in und.timestamp, and commit
$GTHOME/langs/update-all-from-core.sh -t und
Task: add plx and Hunspell conversion to the new infra, but only for a limited set of languages (sma, smj, later sme).
We want a new template named
Then, we need to fill that template with the following content:
plxtools.timestamp
am-shared/plx-include.am          # this is the real build file
tools/spellcheckers/plx/
    Makefile.am                   # includes plx-include.am
    src/                          # shared src files - rsrc, rev, version
    tmp/                          # large plx files, make clean safe
The Hunspell conversion is common to all languages, and is thus part of the
und/ template. These parts need to be added to that template:
am-shared/regex-include.am        # this is the real build file
tools/spellcheckers/filters/      # move common filters in here
    Makefile.am                   # includes the usual regex-include.am
This is pretty simple:
cd $GTLANG
touch plxtools.timestamp
svn add plxtools.timestamp
As soon as the file is created, the merge script will take notice and start merging files from that template into the language that contains the template's timestamp file.
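The timestamp mechanism described above can be illustrated with a small shell sketch. This is not the actual merge script; the function name templates_for_lang is hypothetical, and the only assumption taken from the text is that a language opts into a template by having a file named after it with a .timestamp suffix (such as und.timestamp or plxtools.timestamp).

```shell
# Hypothetical helper, NOT the real merge script: list the templates a
# language directory has opted into, based on its *.timestamp files.
templates_for_lang() {
    lang_dir=$1
    for stamp in "$lang_dir"/*.timestamp; do
        # Skip the unexpanded glob pattern when no timestamp file exists.
        [ -e "$stamp" ] || continue
        name=${stamp##*/}                  # strip the directory part
        printf '%s\n' "${name%.timestamp}" # strip the .timestamp suffix
    done
}
```

Calling templates_for_lang "$GTLANG" would then print one template name per line, which a merge loop could iterate over.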
By default, neither Hunspell nor PLX spellers should be built. To turn them on, one should use something like --enable-hunspell. See the corresponding Oahpa option for a way of doing this.
Merge the template for a specific language as follows:
cd $GTLANG
../../giella-core/scripts/merge-templates.sh -t plxtools
That is, specify the template you want to merge using the -t option, both to avoid time-consuming operations and to avoid merging several unrelated things at the same time.
Test the merged template by running
make
make check
and looking at the output.
After the known bugs have been fixed, re-merge, test and evaluate. Iterate till everything works as planned.
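The merge, build and test cycle above can be wrapped in a small helper when several languages need the same template. The following is a sketch only: the function names run and merge_and_check and the DRY_RUN variable are hypothetical, and the relative path to merge-templates.sh is taken from the command shown earlier, so it assumes each language directory sits two levels below the directory containing giella-core.

```shell
# Sketch only: merge one template into several language directories and
# test each. Set DRY_RUN=1 to print the commands instead of running them.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

merge_and_check() {
    template=$1
    shift
    for lang_dir in "$@"; do
        # A subshell keeps the caller's working directory unchanged.
        (
            run cd "$lang_dir" &&
            run ../../giella-core/scripts/merge-templates.sh -t "$template" &&
            run make &&
            run make check
        ) || return 1   # stop at the first language that fails
    done
}
```

With DRY_RUN=1 set, merge_and_check plxtools followed by the sma and smj language directories prints the eight commands it would run, which is a cheap way to review the plan before iterating for real.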
There are definitely a number of other issues. The goal is to have a portable build system with as few dependencies as possible, and with all dependencies checked for and reported properly to the user if missing.
These goals require that we follow the Autotools conventions, and use supported variables and macros where we earlier often used more homegrown solutions.
See the following sites for useful documentation and help:
make -Bnd sma-mobile.zhfst | make2graph | dot -Tpng -o sma-mobile-dag.png
Result: a visual representation of the dependency graph, making it easy to spot wrong dependency chains, and where the problem most likely is.
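In the spirit of checking for dependencies and reporting missing ones, it can help to verify that the tools used in the pipeline above are installed before running it. The following is a minimal sketch; the function name missing_tools is hypothetical, and it relies only on the standard command -v lookup.

```shell
# Print each of the named commands that cannot be found on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

For the dependency graph, one would call missing_tools make make2graph dot and only start the pipeline if the output is empty; otherwise the printed names tell the user what to install.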