GiellaLT

GiellaLT provides an infrastructure for rule-based language technology aimed at minority and indigenous languages, and streamlines building anything from keyboards to speech technology.

Adding a new language to the GitHub infrastructure

Languages reside within the GiellaLT organisation, and new languages should be added there.

Prerequisites

You need the gut tool installed to add a new language the intended way.

You also need at least admin rights to set up a new repository properly.

How to add a new language

gut template generate -t template-lang-und -d lang-XXX

Replace XXX with the code of the language you want to add. lang-XXX is only the name of the new directory and repository, but repository names should follow this pattern. The command for keyboards is similar, just with a different template.

The command will prompt you for the essential data, as follows:

__UND__: 3-letter ISO code, e.g. pma
__UND2C__: 2-letter ISO code if it exists, otherwise the 3-letter code
__UNDEFINED__: language name in English
__LICENSE__: license type, e.g. LGPLv3
__REPO__: language repository name, e.g. lang-pma
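
Because the repository name must follow the lang-XXX pattern, it can help to check the language code before calling gut. The following is a minimal sketch, not part of gut: generate_lang_template is a hypothetical helper name, and the check for exactly three lowercase letters is an assumption based on the 3-letter ISO code requested above.

```shell
#!/bin/sh
# Sketch only: generate_lang_template is a hypothetical helper, not part of gut.
generate_lang_template() {
    code=$1
    # Assumption: a valid code is exactly three lowercase ASCII letters.
    if ! printf '%s\n' "$code" | LC_ALL=C grep -Eq '^[a-z]{3}$'; then
        echo "expected a 3-letter language code, got '$code'" >&2
        return 1
    fi
    # Same command as above, with XXX replaced by the checked code.
    gut template generate -t template-lang-und -d "lang-$code"
}

# Usage: generate_lang_template pma
```

The helper only guards the directory name; gut still prompts interactively for the values listed above.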

This command can also be used to superimpose the GiellaLT directory and file structure on an existing repo, e.g. when importing an LT project into the GiellaLT infrastructure. Presently the command reports a failure in that case even though the new structure has been added, so you can ignore the error and proceed to verify, add and commit the changes.

Then do a few preparatory steps:

cd lang-XXX/
chmod a+x autogen.sh # make autogen.sh executable
git branch -m main #  gut uses branch name 'master', we use 'main'
cd ..
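
Before creating the repository on GitHub, the two preparatory steps can be verified with a small check. This is a sketch under stated assumptions: check_lang_repo is a hypothetical helper name, and it assumes the directory is already a git repository, which the git branch step above implies.

```shell
#!/bin/sh
# Sketch only: check_lang_repo is a hypothetical helper, not part of gut.
check_lang_repo() {
    dir=$1
    status=0
    # autogen.sh must be executable (see the chmod step above).
    if [ ! -x "$dir/autogen.sh" ]; then
        echo "autogen.sh is not executable: run chmod a+x autogen.sh" >&2
        status=1
    fi
    # The current branch must be named 'main', not 'master'.
    branch=$(git -C "$dir" symbolic-ref --short HEAD 2>/dev/null)
    if [ "$branch" != "main" ]; then
        echo "current branch is '${branch:-unknown}': run git branch -m main" >&2
        status=1
    fi
    return $status
}

# Usage: check_lang_repo lang-XXX && echo "ready"
```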

Once the directory has been created and its content checked, add it to the GiellaLT GitHub organisation as follows:

gut create repo -d . -o giellalt -r lang-XXX -p

Notes:

The -d option should point to the parent directory of the target. This makes it possible to add multiple language repositories at a time, provided they are all located in the same parent directory. The --clone option makes sure that the new repositories are cloned directly and become part of the local GiellaLT repositories. The regex (the -r argument) is presently required, but will probably be made optional.
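
When several repositories live in the same parent directory, it may be useful to preview which directory names a regex would select before running gut create repo. The sketch below is only an approximation: preview_repo_matches is a hypothetical helper, it assumes the regex is matched against bare directory names, and it uses grep -E, whose regex syntax may differ in details from the one gut uses.

```shell
#!/bin/sh
# Sketch only: preview_repo_matches is a hypothetical helper, not part of gut.
preview_repo_matches() {
    parent=$1
    regex=$2
    for path in "$parent"/*/; do
        [ -d "$path" ] || continue
        name=$(basename "$path")
        # Assumption: the regex is tested against the bare directory name.
        if printf '%s\n' "$name" | grep -Eq -e "$regex"; then
            printf '%s\n' "$name"
        fi
    done
}

# Usage: preview_repo_matches . '^lang-'
```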

Aftermath

After moving/pushing the new repo, remember to:

Result

The above steps will create a new directory for the specified language, and populate it with the required makefiles, autoconf files and template source files.

Before you can start doing real work, one more set of preparations remains:

cd lang-XXX
./autogen.sh
./configure

Now you can start editing the source files. Whenever you want to make sure everything compiles, run make. Run make check to ensure that all defined tests pass. Remember to update the test suites as you enhance the linguistic model!
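
The edit, build and test cycle can be wrapped in a small helper so that make check only runs after a successful build. This is a minimal sketch; build_and_check is a hypothetical name, and it assumes ./autogen.sh and ./configure have already been run in the directory.

```shell
#!/bin/sh
# Sketch only: build_and_check is a hypothetical helper.
build_and_check() {
    dir=${1:-.}
    (
        # Work in a subshell so the caller's directory is unchanged.
        cd "$dir" || exit 1
        # Stop early if compilation fails; otherwise run the test suite.
        make || exit 1
        make check
    )
}

# Usage: build_and_check lang-XXX
```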