GiellaLT provides rule-based language technology aimed at minority and indigenous languages
Ever since Windows 10, Anniversary Update 2018, it has been possible to install a Linux system on Windows. Follow these instructions to install Linux/bash on Windows 10.
Note that if you only want to use the ready-made grammatical analysers (as explained on the Linguistic analysis page), you do not need to follow this documentation. It is relevant when you want to participate in building and developing the grammatical tools yourself.
Then return here.
To access Windows files from the Linux window, run ls /mnt/ and navigate from there. A good idea would be to make an alias in the .profile file of your Linux home folder, e.g. something along the lines of:
alias lgtech="pushd /mnt/c/Users/YourUserName/Documents/lgtech"
… where YourUserName should be replaced with your own Windows user name. The path starts with /mnt/; you should check that the rest of the path is what you want.
Then writing lgtech will bring you directly to the relevant folder. You may then want to install all language technology files here.
The good thing about installing them here and not under the home directory is that you can access the files with Windows programs as well (but remember to use UTF-8 encoding!).
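The alias setup described above can be sketched as a small shell function. This is only an illustration: add_lgtech_alias is a hypothetical helper name, and the folder path in the example call is the same placeholder as above. It appends the alias to a profile file only if the target folder exists and no lgtech alias is defined there yet. Note that the shell does not accept spaces around the = sign in an alias definition.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of GiellaLT): append an "lgtech" alias to a
# profile file, but only if the target folder exists and the alias is not
# already defined in that file.
add_lgtech_alias() {
    local profile="$1"   # e.g. "$HOME/.profile"
    local target="$2"    # e.g. /mnt/c/Users/YourUserName/Documents/lgtech

    if [ ! -d "$target" ]; then
        echo "No such folder: $target" >&2
        return 1
    fi
    # -s keeps grep quiet if the profile file does not exist yet.
    if grep -qs '^alias lgtech=' "$profile"; then
        echo "alias lgtech is already defined in $profile"
        return 0
    fi
    # Single quotes around the path keep folder names with spaces intact.
    printf "alias lgtech=\"pushd '%s'\"\n" "$target" >> "$profile"
}

# Example call (replace YourUserName first):
# add_lgtech_alias "$HOME/.profile" /mnt/c/Users/YourUserName/Documents/lgtech
```

The alias becomes available the next time the profile file is read, for instance when you open a new Linux window.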
Then follow the instructions for Linux to get the things you need for participating in the development of language technology tools. Remember that if you only want to use the tools, you may stop here and instead just download the analysers; see the page on linguistic analysis.
You need a number of tools for the build chain. We assume you have installed Ubuntu on your Windows machine. If you installed some other Linux distribution, look at its documentation for how to install programs like the ones below:
sudo apt-get install autoconf automake libtool libsaxonb-java python3-pip \
python3-lxml python3-bs4 python3-html5lib libxml-twig-perl antiword xsltproc \
poppler-utils wget python3-svn wv python3-feedparser subversion openjdk-11-jdk cmake \
python3-tidylib python3-yaml libxml-libxml-perl libtext-brew-perl
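After the installation, a quick check like the following can show whether some of the tools ended up on your PATH. This is a minimal sketch: missing_tools is a hypothetical helper name, and the command names passed to it are assumptions about what a few of the packages above provide (for instance svn from subversion, pdftotext from poppler-utils and libtoolize from libtool). It does not cover the Python or Perl libraries in the list.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the names of the given commands that are
# NOT found on PATH, one per line. Prints nothing if all are present.
missing_tools() {
    local tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
        fi
    done
    return 0
}

# Assumed command names for some of the packages installed above.
missing_tools autoconf automake libtoolize xsltproc wget svn cmake java antiword pdftotext
```

If the last line prints nothing, all the listed commands were found. Any name it does print is a candidate for rerunning the apt-get command above.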
You need tools to convert your linguistic source code (lexicons, morphology, phonology, syntax, etc.) into useful tools like analysers, generators, hyphenators and spellers.
To get that, run these two commands in the terminal (e.g. after having written cd ENTER):
wget https://apertium.projectjj.com/apt/install-nightly.sh -O - | sudo bash
sudo apt-get -f install apertium-all-dev
The first command downloads a shell script and pipes it directly to bash, which runs it with root privileges, so no separate step is needed to make it executable. Together with the second command, this downloads and installs prebuilt binaries for programs for morphology, syntax and machine translation.
Rerun at regular intervals, e.g. once a year, to get the latest updates.
hfst is our default compiler, and it builds all our tools. It is open source, and it is needed for turning your morphology and lexicon into spellcheckers and other useful programs.
The following two programs are not needed; we just refer to them since the source code is compatible with them. If you don’t know whether you need them, just skip them.
/usr/local/bin/
.
In order to participate in the development work, you need an editor, a program for editing text files. Here are some candidates:
Any other editor handling UTF-8 should be fine as well.
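If you are unsure whether an editor really saved a file as UTF-8, a check along these lines may help. It is a sketch under the assumption that the iconv program is available (it normally is on Ubuntu). is_utf8 is a hypothetical helper name. Note that a file containing only plain ASCII characters also counts as valid UTF-8.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeeds only if the whole file can be decoded as
# UTF-8. iconv stops with an error when it meets an invalid byte sequence.
is_utf8() {
    iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

# Report every file given on the command line that is not valid UTF-8.
for f in "$@"; do
    is_utf8 "$f" || echo "Not valid UTF-8: $f"
done
```

Saved as a script, it could be run with the files you have just edited as arguments. It prints nothing when all of them are fine.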