Plan for setup of machine translation
**NOTE: This documentation is old. It is kept because it may contain methodological points that are still valid.**
We plan to evaluate at least two systems: Apertium (a rule-based system, cf. its wiki) and Moses (a statistical system). This document discusses the setup of Moses.
Overview
The programs should be installed on the Xserve machine, so that long runs (which may last for days) can be carried out there.
Files
We need five different programs (cf. the download information on each program's page):
- alignment.jar, our Bergen-Tromsø sentence aligner
- Mosesdecoder (the mt program itself)
- giza++ (word alignment)
- srilm (language model)
- mkcls (training of word classes, used by giza++)
They shall be installed on the Xserve, in the standard paths.
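Once the programs are installed, a quick check that they can be found on the search path may save time before a long run is started. The following is a minimal sketch, not part of the original setup. The executable names in `REQUIRED_TOOLS` are assumptions (the names the tools are commonly built under, plus `java` for running alignment.jar) and should be adjusted to the local installation.

```python
import shutil
from typing import Callable, Iterable, List, Optional

# Assumed executable names; adjust to what the local builds actually install.
# "java" is listed because alignment.jar needs a Java runtime.
REQUIRED_TOOLS = ["java", "moses", "GIZA++", "ngram-count", "mkcls"]


def find_missing(
    tools: Iterable[str],
    which: Callable[[str], Optional[str]] = shutil.which,
) -> List[str]:
    """Return the tools that cannot be found on the current PATH."""
    return [tool for tool in tools if which(tool) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED_TOOLS)
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All tools found.")
```

The `which` parameter exists only so that the lookup can be replaced in a test; in normal use the default `shutil.which` is sufficient.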
The process
- Input is a set of parallel sentences
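As an illustration of what this input may look like in practice, here is a minimal sketch that reads a parallel corpus. It assumes that the sentences are stored as two plain-text files with one sentence per line, where line n of one file corresponds to line n of the other (the layout the Moses training scripts work with). The function name `read_parallel` and the file names in the usage note are hypothetical.

```python
from pathlib import Path
from typing import List, Tuple


def _read_lines(path: Path) -> List[str]:
    """Read a UTF-8 text file into a list of lines without trailing newlines."""
    with path.open(encoding="utf-8") as handle:
        return [line.rstrip("\n") for line in handle]


def read_parallel(src_path: Path, tgt_path: Path) -> List[Tuple[str, str]]:
    """Return (source, target) sentence pairs from two line-aligned files.

    Raises ValueError if the files differ in length, since the line
    alignment would then be unreliable.
    """
    src = _read_lines(src_path)
    tgt = _read_lines(tgt_path)
    if len(src) != len(tgt):
        raise ValueError(f"line counts differ: {len(src)} vs {len(tgt)}")
    pairs = []
    for s, t in zip(src, tgt):
        s, t = s.strip(), t.strip()
        # Skip pairs where either side is empty; they carry no training signal.
        if s and t:
            pairs.append((s, t))
    return pairs
```

A call such as `read_parallel(Path("corpus.sme"), Path("corpus.nob"))` would then give the sentence pairs for one language pair.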
Setup
- Put the files where they belong
- Set up paths and access rights
- Modify makefiles
Create directories in gtsvn/mt
Today we have the following directories:
- courses
- dev
- doc
- giza
- grantapplications
- script
Needed:
- rename giza to wordalign, and create a corresponding directory, sentencealign, for sentence alignment.
- create directories for the language pairs, and for the machine runs
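The new directories could be created along the following lines. This is only a sketch under stated assumptions: the language pair names are the ones used under "MT systems, usage" on this page, while the directory name `runs` and the choice of one subdirectory per language pair under it are hypothetical. Renaming giza to wordalign is left out, since in a Subversion working copy that would be done with `svn mv`.

```python
from pathlib import Path
from typing import List, Sequence

# Language pairs named on this page.
LANGUAGE_PAIRS = ["smenob", "nobsme", "engsme"]
# Hypothetical name for the directory holding the machine runs.
RUNS_DIR = "runs"


def create_layout(root: Path, pairs: Sequence[str] = LANGUAGE_PAIRS) -> List[Path]:
    """Create the missing directories under root and return the new ones."""
    wanted = [root / "sentencealign"]
    wanted += [root / pair for pair in pairs]
    wanted += [root / RUNS_DIR / pair for pair in pairs]
    created = []
    for path in wanted:
        if not path.exists():
            # parents=True also creates the runs directory itself when needed.
            path.mkdir(parents=True)
            created.append(path)
    return created
```

Calling `create_layout(Path("gtsvn/mt"))` a second time returns an empty list, so the function can be rerun safely; the new directories would still have to be added to version control afterwards.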
MT systems, usage
smenob
- A gisting system, i.e. one meant to give the reader an idea of what is written
nobsme
engsme
- Only KDE input