GiellaLT Documentation

GiellaLT provides rule-based language technology aimed at minority and indigenous languages

Corpus maintenance

This document tracks measures to improve the corpus collection and conversion process. See also the sentence alignment page, which covers that specific part of corpus maintenance.

Corpus improvement work

Directory structure etc.

Tasks

Where do we find texts

Parallel texts

Meetings in the corpus improvement project

OCR and conversion errors left over from spring 2011