GiellaLT

GiellaLT provides an infrastructure for rule-based language technology aimed at minority and indigenous languages, and streamlines building anything from keyboards to speech technology.

Translation memories

This page contains resources for use in computer-assisted translation (CAT) software (see the English Wikipedia article for an explanation).

We have integrated machine translation support for the CAT program OmegaT. OmegaT is only one of many CAT programs, though; a comparison of CAT software is available.

Translation memories

We have made translation memories for several language pairs. The collections are in the TMX format and can be used by all CAT programs listed in the software comparison article mentioned above. For easy access they are listed below; to download one, right-click (or control-click) the tmx file you want and save it. The sentence pairs are mainly taken from official documents.

Language pair                Updated (dd.mm.yyyy)
Finnish - Inari Saami        08.03.2017
Finnish - North Saami        08.03.2017
Finnish - Norwegian          01.03.2019
Finnish - Skolt Saami        21.03.2017
Finnish - South Saami        23.06.2020
North Saami - Inari Saami    09.03.2017
North Saami - Lule Saami     28.02.2019
North Saami - Norwegian      08.03.2017
North Saami - South Saami    28.08.2019
Norwegian - Finnish          01.03.2019
Norwegian - Lule Saami       09.03.2017
Norwegian - North Saami      10.12.2019
Norwegian - South Saami      07.03.2017
Russian - Komi               (missing)
South Saami - Norwegian      28.02.2019

If you use OmegaT, add the .tmx file to the tm folder of your project.
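
To inspect a translation memory outside a CAT program, the TMX file can be read with a few lines of Python. The following is a minimal sketch, not part of the GiellaLT tooling: the function name read_tmx_pairs is hypothetical, and the language codes you pass in are an assumption that has to match whatever codes the downloaded file actually uses.

```python
import xml.etree.ElementTree as ET

# Qualified name of the xml:lang attribute as ElementTree reports it.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def read_tmx_pairs(source, src_lang, tgt_lang):
    """Return (source, target) segment pairs from a TMX file or file object.

    src_lang and tgt_lang are assumed to match the language codes used
    in the file; the comparison ignores case.
    """
    src_lang, tgt_lang = src_lang.lower(), tgt_lang.lower()
    pairs = []
    root = ET.parse(source).getroot()
    for tu in root.iter("tu"):  # one translation unit per sentence pair
        segments = {}
        for tuv in tu.findall("tuv"):
            # TMX 1.4 uses xml:lang; older files use a plain lang attribute.
            lang = tuv.get(XML_LANG) or tuv.get("lang") or ""
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() also collects text nested inside inline markup.
                segments[lang.lower()] = "".join(seg.itertext()).strip()
        # Keep only units that contain both requested languages.
        if src_lang in segments and tgt_lang in segments:
            pairs.append((segments[src_lang], segments[tgt_lang]))
    return pairs
```

Calling read_tmx_pairs with the path of a downloaded file and the two language codes found in it returns a list of sentence pairs that can be counted, searched, or filtered before the file is added to a project.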

Glossary files

We have also made some glossary files, in the form of tab-separated lists.

For OmegaT, add the file to the glossary folder.
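
A tab-separated glossary is also easy to read programmatically. The sketch below assumes the common OmegaT layout of source term, target term, and an optional comment in a third column, and it assumes that lines starting with # are comments; neither assumption has been checked against the GiellaLT files themselves, and the function name read_glossary is hypothetical.

```python
import csv


def read_glossary(lines):
    """Parse tab-separated glossary lines into (source, target, comment) tuples."""
    entries = []
    # QUOTE_NONE keeps quotation marks inside terms as literal characters.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        # Skip blank lines and (assumed) comment lines starting with '#'.
        if not row or not row[0].strip() or row[0].startswith("#"):
            continue
        source = row[0].strip()
        target = row[1].strip() if len(row) > 1 else ""
        comment = row[2].strip() if len(row) > 2 else ""
        entries.append((source, target, comment))
    return entries
```

The function accepts any iterable of lines, so an open file object works directly, for example one opened with open(path, encoding="utf-8", newline="").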

Segmentation file

If you translate from North Saami and use OmegaT, you can download a segmentation file and put it in the omegat folder of your project.
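
The three installation steps on this page (tm, glossary and omegat folders) can be scripted if you set up several projects. This is only a sketch under stated assumptions: the function name install_resource and the kind labels are hypothetical, and the project directory is assumed to be an ordinary OmegaT project folder.

```python
import shutil
from pathlib import Path

# Subfolder of an OmegaT project that each kind of resource goes into,
# as described on this page. The dictionary keys are hypothetical labels.
TARGET_FOLDERS = {"tmx": "tm", "glossary": "glossary", "segmentation": "omegat"}


def install_resource(project_dir, resource_file, kind):
    """Copy a downloaded resource into the matching project subfolder."""
    if kind not in TARGET_FOLDERS:
        raise ValueError(f"unknown resource kind: {kind!r}")
    target_dir = Path(project_dir) / TARGET_FOLDERS[kind]
    # Create the subfolder if the project does not have it yet.
    target_dir.mkdir(parents=True, exist_ok=True)
    # copy2 keeps the original file name and returns the destination path.
    return Path(shutil.copy2(resource_file, target_dir))
```

The function returns the path of the copied file, so a script can report where each resource ended up.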